T-SQL: parsing a text file and grouping rows by a text string marker
I am importing a large text file that is made up of multiple "reports". Each report consists of several lines of data. The only way I know that a new report has started is that the line begins with XX. All the lines that follow belong to that XX header line, until the next XX line. I am trying to assign a grouping ID so that I can process the data and parse it into the database.
CREATE TABLE RawData(
ID int IDENTITY(1,1) NOT NULL
,Grp1 int NULL
,Grp2 int NULL
,Rowdata varchar(max) NULL
)
-- each VALUES row must be wrapped in parentheses
INSERT INTO RawData(Rowdata) VALUES ('XX Monday');
INSERT INTO RawData(Rowdata) VALUES ('Tues day');
INSERT INTO RawData(Rowdata) VALUES ('We d ne s day');
INSERT INTO RawData(Rowdata) VALUES ('Thurs day');
INSERT INTO RawData(Rowdata) VALUES ('F r i d day');
INSERT INTO RawData(Rowdata) VALUES ('XX January');
INSERT INTO RawData(Rowdata) VALUES ('Feb r u a');
INSERT INTO RawData(Rowdata) VALUES ('XX Sun d a y');
INSERT INTO RawData(Rowdata) VALUES ('Sat ur day');
I need to write a script that updates the Grp1 field based on where the XX rows are. When it is done, I would like the table to look like this:
ID Grp1 Grp2 RowData
1 1 1 XX Monday
2 1 2 Tues day
3 1 3 We d ne s day
4 1 4 Thurs day
5 1 5 F r i d day
6 2 1 XX January
7 2 2 Feb r u a
8 3 1 XX Sun d a y
9 3 2 Sat ur day
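I know that for the Grp2 field I can use DENSE_RANK. The problem I am having is how to fill in all the Grp1 values. I can run an update on the rows where I see 'XX', but that does not populate the rows below them.
Thanks for any suggestions/help.

This should do the trick: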
-- sample data
DECLARE @RawData TABLE
(
ID int IDENTITY(1,1) NOT NULL
,Grp1 int NULL
,Grp2 int NULL
,Rowdata varchar(max) NULL
);
INSERT INTO @RawData(Rowdata)
VALUES ('XX Monday'),('Tues day'),('We d ne s day'),('Thurs day'),('F r i d day'),
('XX January'),('Feb r u a'),('XX Sun d a y'),('Sat ur day');
-- solution
WITH rr AS
(
SELECT ID, thisVal = ROW_NUMBER() OVER (ORDER BY ID)
FROM @rawData
WHERE RowData LIKE 'XX %'
),
makeGrp1 AS
(
SELECT
ID,
Grp1 = (SELECT MAX(thisVal) FROM rr WHERE r.id >= rr.id),
RowData
FROM @rawData r
)
SELECT
ID,
Grp1,
Grp2 = ROW_NUMBER() OVER (PARTITION BY Grp1 ORDER BY ID),
RowData
FROM makeGrp1;
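On SQL Server 2012 or later, the same grouping can also be sketched with a windowed running total instead of the correlated subquery: count the 'XX ' marker rows seen so far, in ID order. This is an alternative technique, not the query above. The CTE name `marked` is made up for illustration, and the sketch assumes that ID reflects the original line order of the file.

```sql
-- sample data (same shape as in the answer above)
DECLARE @RawData TABLE
(
    ID int IDENTITY(1,1) NOT NULL
   ,Grp1 int NULL
   ,Grp2 int NULL
   ,Rowdata varchar(max) NULL
);
INSERT INTO @RawData(Rowdata)
VALUES ('XX Monday'),('Tues day'),('We d ne s day'),('Thurs day'),('F r i d day'),
       ('XX January'),('Feb r u a'),('XX Sun d a y'),('Sat ur day');

-- "marked" is a hypothetical CTE name used only for this sketch
WITH marked AS
(
    SELECT
        ID,
        Rowdata,
        -- running count of marker rows up to and including the current row
        Grp1 = SUM(CASE WHEN Rowdata LIKE 'XX %' THEN 1 ELSE 0 END)
               OVER (ORDER BY ID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    FROM @RawData
)
SELECT
    ID,
    Grp1,
    -- position of the row inside its report
    Grp2 = ROW_NUMBER() OVER (PARTITION BY Grp1 ORDER BY ID),
    Rowdata
FROM marked
ORDER BY ID;
```

One behavioral difference worth knowing: rows that appear before the first 'XX ' line get Grp1 = 0 here, whereas the correlated MAX subquery returns NULL for them.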
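Update: below is code that updates the @RawData table; I just re-read the requirements. I am leaving the original solution in place because it will help you better understand how my update works: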
-- sample data
DECLARE @RawData TABLE
(
ID int IDENTITY(1,1) NOT NULL
,Grp1 int NULL
,Grp2 int NULL
,Rowdata varchar(max) NULL
);
INSERT INTO @RawData(Rowdata)
VALUES ('XX Monday'),('Tues day'),('We d ne s day'),('Thurs day'),('F r i d day'),
('XX January'),('Feb r u a'),('XX Sun d a y'),('Sat ur day');
-- Solution to update the @RawData Table
WITH rr AS
(
SELECT ID, thisVal = ROW_NUMBER() OVER (ORDER BY ID)
FROM @rawData
WHERE RowData LIKE 'XX %'
),
makeGroups AS
(
SELECT
ID,
Grp1 = (SELECT MAX(thisVal) FROM rr WHERE r.id >= rr.id),
Grp2 = ROW_NUMBER()
OVER (PARTITION BY (SELECT MAX(thisVal) FROM rr WHERE r.id >= rr.id) ORDER BY ID)
FROM @rawData r
)
-- update through the alias so the target row source is unambiguous
UPDATE rd
SET Grp1 = mg.Grp1, Grp2 = mg.Grp2
FROM makeGroups mg
JOIN @RawData rd ON mg.ID = rd.ID;
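After an update like this, a quick consistency check can catch grouping mistakes. The sketch below is not part of the original answer: the table variable `@Checked` is hypothetical and is loaded with the Grp1/Grp2 values from the question's desired output so that the block runs on its own. In practice the same WHERE clause would be pointed at the updated table.

```sql
-- hypothetical stand-in for the table after the update
DECLARE @Checked TABLE
(
    ID int NOT NULL
   ,Grp1 int NULL
   ,Grp2 int NULL
   ,Rowdata varchar(max) NULL
);
INSERT INTO @Checked (ID, Grp1, Grp2, Rowdata)
VALUES (1,1,1,'XX Monday'),(2,1,2,'Tues day'),(3,1,3,'We d ne s day'),
       (4,1,4,'Thurs day'),(5,1,5,'F r i d day'),
       (6,2,1,'XX January'),(7,2,2,'Feb r u a'),
       (8,3,1,'XX Sun d a y'),(9,3,2,'Sat ur day');

-- list rows that break the expected pattern:
--   every row has both group values,
--   the first row of each group is an 'XX ' header,
--   and no later row in a group is a header
SELECT ID, Grp1, Grp2, Rowdata
FROM @Checked
WHERE Grp1 IS NULL
   OR Grp2 IS NULL
   OR (Grp2 = 1 AND Rowdata NOT LIKE 'XX %')
   OR (Grp2 > 1 AND Rowdata LIKE 'XX %')
ORDER BY ID;
```

An empty result means the groups are consistent with the marker rows; any row returned points at a spot where the grouping went wrong.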
After the update, the raw data matches the desired output shown in the question.
Nice work, John - this is a very efficient solution. One small correction: in the SET statement, you want Grp2 = B.Grp2 rather than Grp2 = B.Grp1.
@JJones Happy to help.