T-SQL: Parse a text file and group rows by a text-string marker

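I am importing a large text file made up of multiple "reports". Each report consists of several lines of data. The only way I know a new report has started is that the line begins with XX. All the lines that follow belong to that XX header line, until the next XX line. I am trying to assign a grouping ID so that I can process the data and parse it into the database.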

CREATE TABLE RawData(
    ID int IDENTITY(1,1) NOT NULL
    ,Grp1 int NULL
    ,Grp2 int NULL
    ,Rowdata varchar(max) NULL
)

INSERT INTO RawData(Rowdata) VALUES ('XX Monday');
INSERT INTO RawData(Rowdata) VALUES ('Tues day');
INSERT INTO RawData(Rowdata) VALUES ('We d ne s day');
INSERT INTO RawData(Rowdata) VALUES ('Thurs day');
INSERT INTO RawData(Rowdata) VALUES ('F r i d day');
INSERT INTO RawData(Rowdata) VALUES ('XX January');
INSERT INTO RawData(Rowdata) VALUES ('Feb r u a');
INSERT INTO RawData(Rowdata) VALUES ('XX Sun d a y');
INSERT INTO RawData(Rowdata) VALUES ('Sat ur day');
I need to write a script that updates the Grp1 field based on where the XX rows are. When it is done, I would like the table to look like this:

ID   Grp1   Grp2   RowData
1    1      1      XX Monday
2    1      2      Tues day
3    1      3      We d ne s day
4    1      4      Thurs day
5    1      5      F r i d day
6    2      1      XX January
7    2      2      Feb r u a
8    3      1      XX Sun d a y
9    3      2      Sat ur day
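I know that for the Grp2 field I can use DENSE_RANK. The problem I am having is how to fill in all the values for Grp1. I can run an update on the rows where I see "XX", but that does not fill in the values for the rows below them.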


Thanks for any advice or help.

This should do the trick:

-- sample data
DECLARE @RawData TABLE 
(
    ID int IDENTITY(1,1) NOT NULL
    ,Grp1 int NULL
    ,Grp2 int NULL
    ,Rowdata varchar(max) NULL
);
INSERT INTO @RawData(Rowdata) 
VALUES ('XX Monday'),('Tues day'),('We d ne s day'),('Thurs day'),('F r i d day'),
       ('XX January'),('Feb r u a'),('XX Sun d a y'),('Sat ur day');

-- solution
WITH rr AS
(
  -- Number the marker rows (those starting with 'XX ') in ID order
  SELECT ID, thisVal = ROW_NUMBER() OVER (ORDER BY ID)
  FROM @RawData
  WHERE Rowdata LIKE 'XX %'
),
makeGrp1 AS
(
  SELECT 
    r.ID,
    -- Each row takes the number of the closest marker row at or before it
    Grp1 = (SELECT MAX(rr.thisVal) FROM rr WHERE r.ID >= rr.ID),
    r.Rowdata
  FROM @RawData r
)
SELECT 
  ID,
  Grp1,
  Grp2 = ROW_NUMBER() OVER (PARTITION BY Grp1 ORDER BY ID),
  Rowdata
FROM makeGrp1;
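Update: below is the code that updates the @RawData table; I had just re-read the requirements. I am keeping the original solution because it will help you understand how the update works: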

-- sample data
DECLARE @RawData TABLE 
(
    ID int IDENTITY(1,1) NOT NULL
    ,Grp1 int NULL
    ,Grp2 int NULL
    ,Rowdata varchar(max) NULL
);
INSERT INTO @RawData(Rowdata) 
VALUES ('XX Monday'),('Tues day'),('We d ne s day'),('Thurs day'),('F r i d day'),
       ('XX January'),('Feb r u a'),('XX Sun d a y'),('Sat ur day');

-- Solution to update the @RawData Table
WITH rr AS
(
  -- Number the marker rows (those starting with 'XX ') in ID order
  SELECT ID, thisVal = ROW_NUMBER() OVER (ORDER BY ID)
  FROM @RawData
  WHERE Rowdata LIKE 'XX %'
),
makeGroups AS
(
  SELECT 
    r.ID,
    Grp1 = (SELECT MAX(rr.thisVal) FROM rr WHERE r.ID >= rr.ID),
    Grp2 = ROW_NUMBER() 
      OVER (PARTITION BY (SELECT MAX(rr.thisVal) FROM rr WHERE r.ID >= rr.ID) ORDER BY r.ID)
  FROM @RawData r
)
-- Update through the alias so the target table is unambiguous
UPDATE rd
SET Grp1 = mg.Grp1, Grp2 = mg.Grp2
FROM makeGroups mg
JOIN @RawData rd ON mg.ID = rd.ID;

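After the update, the raw data matches the desired output shown in the question: Grp1 is 1, 2 and 3 for the three reports, and Grp2 restarts at 1 within each report.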
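
For comparison, here is a minimal sketch of a different technique that is not taken from the answers above: a running windowed SUM over a marker flag. It assumes SQL Server 2012 or later (needed for the ROWS frame in an aggregate OVER clause) and that ID order reflects the line order of the imported file. The CTE names flagged and numbered and the columns NewGrp1 and NewGrp2 are hypothetical names chosen for illustration.

```sql
-- sample data (same shape as in the question)
DECLARE @RawData TABLE
(
    ID int IDENTITY(1,1) NOT NULL
    ,Grp1 int NULL
    ,Grp2 int NULL
    ,Rowdata varchar(max) NULL
);
INSERT INTO @RawData(Rowdata)
VALUES ('XX Monday'),('Tues day'),('We d ne s day'),('Thurs day'),('F r i d day'),
       ('XX January'),('Feb r u a'),('XX Sun d a y'),('Sat ur day');

-- sketch: running count of marker rows (assumes SQL Server 2012+)
WITH flagged AS
(
  SELECT
    ID,
    -- Count how many 'XX ' rows have been seen up to and including this row
    NewGrp1 = SUM(CASE WHEN Rowdata LIKE 'XX %' THEN 1 ELSE 0 END)
              OVER (ORDER BY ID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
  FROM @RawData
),
numbered AS
(
  SELECT
    ID,
    NewGrp1,
    -- Position of the row inside its report, starting at 1 on the marker row
    NewGrp2 = ROW_NUMBER() OVER (PARTITION BY NewGrp1 ORDER BY ID)
  FROM flagged
)
UPDATE rd
SET Grp1 = n.NewGrp1, Grp2 = n.NewGrp2
FROM @RawData rd
JOIN numbered n ON n.ID = rd.ID;

-- inspect the result
SELECT ID, Grp1, Grp2, Rowdata
FROM @RawData
ORDER BY ID;
```

One behavioral difference to keep in mind: any rows that appear before the first XX line get Grp1 = 0 with this sketch, whereas the correlated-subquery version leaves them NULL.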

Nice work, John, this is a very efficient solution. One small correction: in the SET statement, you want Grp2 = B.Grp2 rather than Grp2 = B.Grp1.
@JJones Happy to help.