SQL: splitting one column into multiple columns


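I am working with a Power BI audit log report file. The file contains a column, "AuditDate", that packs multiple fields into a single value. I need to split that column into multiple columns using SQL.

The column has values like this: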

AuditDate
------------
"{""Id"":""44de2468"",""RecordType"":20,""CreationTime"":""2018-08-03T12:30:34"",""Operation"":""ViewReport"",""OrganizationId"":""779558"",""UserType"":0,""UserKey"":""FFFA3DA"",""Workload"":""PowerBI"",""UserId"":""john@abc.com"",""ClientIP"":""9.5.3.26"",""UserAgent"":""Mozilla\/5.0 (Windows NT 10.0;"",""Activity"":""ViewReport"",""ItemName"":""Sales"",""WorkSpaceName"":""TeamITO"",""DatasetName"":""Sales1"",""ReportName"":""Sales1"",""WorkspaceId"":""e8eaa0ca"",""ObjectId"":""Sales1"",""DatasetId"":""4c5d-ad45-eb6546"",""ReportId"":""4cb0-99ad-de41b5160c47"",""IsSuccess"":true,""DatapoolRefreshScheduleType"":""None"",""DatapoolType"":""Undefined""}"
Basically, I need to split this column into parts like this:

Id     RecordType      CreationTime    Operation     OrganizationId  UserType
------------------------------------------------------------------------------
44de2468    20     2018-08-03T12:30:34   ViewReport     779558               0

Can someone help me with the SQL query?

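This is pretty simple; all you need is a string "splitter" (a.k.a. "tokenizer"). If you are on SQL Server 2016+ you can use STRING_SPLIT; on a pre-2016 system you can use a custom splitter function instead (versions exist for 2005+ and for 2012+). The solution looks like this: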

DECLARE @AuditDate VARCHAR(8000) = 
'"{""Id"":""44de2468"",""RecordType"":20,""CreationTime"":""2018-08-03T12:30:34"",""Operation"":""ViewReport"",""OrganizationId"":""779558"",""UserType"":0,""UserKey"":""FFFA3DA"",""Workload"":""PowerBI"",""UserId"":""john@abc.com"",""ClientIP"":""9.5.3.26"",""UserAgent"":""Mozilla\/5.0 (Windows NT 10.0;"",""Activity"":""ViewReport"",""ItemName"":""Sales"",""WorkSpaceName"":""TeamITO"",""DatasetName"":""Sales1"",""ReportName"":""Sales1"",""WorkspaceId"":""e8eaa0ca"",""ObjectId"":""Sales1"",""DatasetId"":""4c5d-ad45-eb6546"",""ReportId"":""4cb0-99ad-de41b5160c47"",""IsSuccess"":true,""DatapoolRefreshScheduleType"":""None"",""DatapoolType"":""Undefined""}"'

SELECT 
  Id             = MAX(CASE split.attrib WHEN 'ID'             THEN split.val END),
  RecordType     = MAX(CASE split.attrib WHEN 'RecordType'     THEN split.val END),
  CreationTime   = MAX(CASE split.attrib WHEN 'CreationTime'   THEN split.val END),
  Operation      = MAX(CASE split.attrib WHEN 'Operation'      THEN split.val END),
  OrganizationId = MAX(CASE split.attrib WHEN 'OrganizationId' THEN split.val END),
  UserType       = MAX(CASE split.attrib WHEN 'UserType'       THEN split.val END)
FROM
(
  SELECT      attrib = REPLACE(REPLACE(SUBSTRING(split.value, 1, mid.point-1),'{',''),'"',''),
              val    = REPLACE(REPLACE(SUBSTRING(split.value, mid.point+1, 8000),'{',''),'"','')
  FROM        STRING_SPLIT(@AuditDate,',') AS split
  CROSS APPLY (VALUES(CHARINDEX(':', split.value))) AS mid(point)
  WHERE       REPLACE(REPLACE(SUBSTRING(split.value, 1, mid.point-1),'{',''),'"','') IN
                ('id','RecordType','CreationTime','Operation','OrganizationID','UserType')
) AS split;
Result:

Id         RecordType  CreationTime          Operation   OrganizationId  UserType
---------- ----------- --------------------- ----------- --------------- ---------
44de2468   20          2018-08-03T12:30:34   ViewReport  779558          0
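
The same splitter idea can be applied to every row of a table instead of a single variable. The following is a minimal sketch, not part of the original answer: the table variable @AuditLog, its RowId column, and the shortened sample rows are assumptions made for illustration. It uses NULLIF so that a token without a colon yields NULL rather than an invalid-length error, and it will still misparse any value that itself contains a comma.

```sql
-- Hypothetical table; replace with the real table and a real key column.
DECLARE @AuditLog TABLE
(
  RowId     INT IDENTITY(1,1) PRIMARY KEY,
  AuditDate VARCHAR(8000)
);

-- Shortened sample rows in the same doubled-quote format as the question.
INSERT @AuditLog (AuditDate) VALUES
('"{""Id"":""44de2468"",""RecordType"":20,""Operation"":""ViewReport""}"'),
('"{""Id"":""44de2467"",""RecordType"":40,""Operation"":""EditReport""}"');

SELECT
  a.RowId,
  Id         = MAX(CASE kv.attrib WHEN 'Id'         THEN kv.val END),
  RecordType = MAX(CASE kv.attrib WHEN 'RecordType' THEN kv.val END),
  Operation  = MAX(CASE kv.attrib WHEN 'Operation'  THEN kv.val END)
FROM @AuditLog AS a
-- One row per comma-separated "key:value" token.
CROSS APPLY STRING_SPLIT(a.AuditDate, ',') AS s
-- Position of the first colon; NULL when the token has none.
CROSS APPLY (VALUES (NULLIF(CHARINDEX(':', s.value), 0))) AS mid(point)
-- Strip braces and quotes from both halves of the token.
CROSS APPLY (VALUES (
  REPLACE(REPLACE(LEFT(s.value, mid.point - 1), '{', ''), '"', ''),
  REPLACE(REPLACE(REPLACE(SUBSTRING(s.value, mid.point + 1, 8000), '{', ''), '}', ''), '"', '')
)) AS kv(attrib, val)
WHERE mid.point IS NOT NULL
GROUP BY a.RowId   -- pivot the tokens back to one row per source row
ORDER BY a.RowId;
```

Grouping by the row key is what keeps the pivot from collapsing all rows into one; without a unique key per row, the MAX(CASE ...) aggregation would mix values from different audit records.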

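It looks like you are dealing with a malformed JSON column. Those doubled double quotes are a nuisance.

However, if you can clean up the format, you can use the JSON functions in your query.

First, set up the data (using the data you provided in another copy of this question):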
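
The setup script itself is not shown here. A minimal sketch of what it might look like follows; the column types and the raw values are assumptions, reconstructed by re-doubling the quotes in the cleaned rows shown in the results further down.

```sql
-- Assumed setup: column types and raw values are reconstructed, not original.
DECLARE @t TABLE
(
  RecordType VARCHAR(20),
  AuditDate  NVARCHAR(MAX)
);

-- Every inner double quote is doubled, which is what makes the JSON invalid.
INSERT @t (RecordType, AuditDate) VALUES
('View',   N'{""Id"":""44de2468"",""Type"":20,""CreationDate"":""2018-08-23""}'),
('Edit',   N'{""Id"":""44de2467"",""Type"":40,""CreationDate"":""2018-08-24""}'),
('Print',  N'{""Id"":""44de2768"",""Type"":60,""CreationDate"":""2018-05-06""}'),
('Delete', N'{""Id"":""44de2488"",""Type"":30,""CreationDate"":""2018-07-20""}');
```

Because @t is a table variable, this declaration has to run in the same batch as the statements that follow.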

Now clean up the malformed JSON by replacing each doubled double quote with a single double quote:

UPDATE @t
SET AuditDate = REPLACE(AuditDate,'""','"');
Verify that the JSON looks good:

SELECT * FROM @t

--Results:
+------------+---------------------------------------------------------+
| RecordType |                        AuditDate                        |
+------------+---------------------------------------------------------+
| View       | {"Id":"44de2468","Type":20,"CreationDate":"2018-08-23"} |
| Edit       | {"Id":"44de2467","Type":40,"CreationDate":"2018-08-24"} |
| Print      | {"Id":"44de2768","Type":60,"CreationDate":"2018-05-06"} |
| Delete     | {"Id":"44de2488","Type":30,"CreationDate":"2018-07-20"} |
+------------+---------------------------------------------------------+
Then use JSON_VALUE() to extract the parts you are interested in:

SELECT 
    RecordType
  , JSON_VALUE(AuditDate, '$.Id') AS [Id]
  , JSON_VALUE(AuditDate, '$.Type') AS [Type]
  , JSON_VALUE(AuditDate, '$.CreationDate') AS CreationDate
FROM @t

--Results
+------------+----------+------+--------------+
| RecordType |    Id    | Type | CreationDate |
+------------+----------+------+--------------+
| View       | 44de2468 |   20 | 2018-08-23   |
| Edit       | 44de2467 |   40 | 2018-08-24   |
| Print      | 44de2768 |   60 | 2018-05-06   |
| Delete     | 44de2488 |   30 | 2018-07-20   |
+------------+----------+------+--------------+
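
If updating the stored data is not an option, the cleanup can also happen inline in the query. The sketch below is one possible variation under stated assumptions: the table variable @raw and its rows are hypothetical, ISJSON is used so that rows which are still not valid JSON after cleanup produce NULLs instead of raising an error, and TRY_CONVERT turns the extracted strings into typed values.

```sql
-- Hypothetical sample data; the last row is deliberately not JSON.
DECLARE @raw TABLE (RecordType VARCHAR(20), AuditDate NVARCHAR(MAX));

INSERT @raw (RecordType, AuditDate) VALUES
('View',  N'{""Id"":""44de2468"",""Type"":20,""CreationDate"":""2018-08-23""}'),
('Edit',  N'{""Id"":""44de2467"",""Type"":40,""CreationDate"":""2018-08-24""}'),
('Other', N'not json');

SELECT
    r.RecordType
  , JSON_VALUE(j.Doc, '$.Id')                                AS [Id]
  , TRY_CONVERT(INT,  JSON_VALUE(j.Doc, '$.Type'))           AS [Type]
  , TRY_CONVERT(DATE, JSON_VALUE(j.Doc, '$.CreationDate'))   AS CreationDate
FROM @raw AS r
-- Undo the quote doubling without touching the stored value.
CROSS APPLY (VALUES (REPLACE(r.AuditDate, '""', '"'))) AS c(Cleaned)
-- Pass the document on only when it is valid JSON; otherwise use NULL.
CROSS APPLY (VALUES (CASE WHEN ISJSON(c.Cleaned) = 1 THEN c.Cleaned END)) AS j(Doc);
```

The 'Other' row should come back with NULL in all three extracted columns, which makes bad rows easy to spot.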

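With SQL Server 2016 this is quite easy, as it has decent JSON support. The only problem is that your string is not valid JSON. Evidently some engine doubled all the inner quotes (an escaping technique).

If this is under your control, you should try to change the column's format to proper JSON. Ideally, the writing application should deliver these audits as proper JSON. At the very least you could add a second column and keep it in sync with a trigger. As a last resort, you can use REPLACE to repair the string: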

REPLACE(REPLACE(REPLACE(@YourString,'"{','{'),'}"','}'),'""','"');
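With many rows this might take a while... which is why it is better to store the data as proper JSON in the first place.

Just to illustrate the principle: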

DECLARE @YourString NVARCHAR(MAX)=N'"{""Id"":""44de2468"",""RecordType"":20,""CreationTime"":""2018-08-03T12:30:34"",""Operation"":""ViewReport"",""OrganizationId"":""779558"",""UserType"":0,""UserKey"":""FFFA3DA"",""Workload"":""PowerBI"",""UserId"":""john@abc.com"",""ClientIP"":""9.5.3.26"",""UserAgent"":""Mozilla\/5.0 (Windows NT 10.0;"",""Activity"":""ViewReport"",""ItemName"":""Sales"",""WorkSpaceName"":""TeamITO"",""DatasetName"":""Sales1"",""ReportName"":""Sales1"",""WorkspaceId"":""e8eaa0ca"",""ObjectId"":""Sales1"",""DatasetId"":""4c5d-ad45-eb6546"",""ReportId"":""4cb0-99ad-de41b5160c47"",""IsSuccess"":true,""DatapoolRefreshScheduleType"":""None"",""DatapoolType"":""Undefined""}"';

SET @YourString = REPLACE(REPLACE(REPLACE(@YourString,'"{','{'),'}"','}'),'""','"');
Now your string looks like this:

{"Id":"44de2468","RecordType":20,"CreationTime":"2018-08-03T12:30:34","Operation":"ViewReport","OrganizationId":"779558","UserType":0,"UserKey":"FFFA3DA","Workload":"PowerBI","UserId":"john@abc.com","ClientIP":"9.5.3.26","UserAgent":"Mozilla\/5.0 (Windows NT 10.0;","Activity":"ViewReport","ItemName":"Sales","WorkSpaceName":"TeamITO","DatasetName":"Sales1","ReportName":"Sales1","WorkspaceId":"e8eaa0ca","ObjectId":"Sales1","DatasetId":"4c5d-ad45-eb6546","ReportId":"4cb0-99ad-de41b5160c47","IsSuccess":true,"DatapoolRefreshScheduleType":"None","DatapoolType":"Undefined"}
This query returns all columns as a derived list:

SELECT * 
FROM OPENJSON(@YourString);
The result is a list with a type hint (the actual type of "value" is nvarchar):

Even better, you can add a WITH clause like this:

SELECT * 
FROM OPENJSON(@YourString)
WITH 
(   
    Id             varchar(200)  '$.Id',  
    RecordType     int           '$.RecordType',  
    CreationTime   datetime      '$.CreationTime'
    --Add all your known columns here...
)
This way you get typed values at the same time.
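
To run this against a whole table rather than a single variable, OPENJSON can be combined with CROSS APPLY. This is a sketch under assumptions: the table variable @AuditLog, its columns, and the shortened sample rows are hypothetical, and the output column AuditRecordType is renamed only to avoid clashing with the table's own RecordType column.

```sql
-- Hypothetical table; the real one has more columns and 5000+ rows.
DECLARE @AuditLog TABLE (RecordType VARCHAR(20), AuditDate NVARCHAR(MAX));

-- Raw values in the question's format: outer quotes plus doubled inner quotes.
INSERT @AuditLog (RecordType, AuditDate) VALUES
('View', N'"{""Id"":""44de2468"",""RecordType"":20,""CreationTime"":""2018-08-03T12:30:34"",""Operation"":""ViewReport""}"'),
('Edit', N'"{""Id"":""44de2467"",""RecordType"":40,""CreationTime"":""2018-08-04T09:15:00"",""Operation"":""EditReport""}"');

SELECT
    a.RecordType
  , j.Id
  , j.AuditRecordType
  , j.CreationTime
  , j.Operation
FROM @AuditLog AS a
-- Repair the string per row: strip the outer quotes, then undo the doubling.
CROSS APPLY (VALUES (
  REPLACE(REPLACE(REPLACE(a.AuditDate, '"{', '{'), '}"', '}'), '""', '"')
)) AS c(Doc)
-- Shred the repaired JSON into typed columns.
CROSS APPLY OPENJSON(c.Doc)
WITH
(
    Id              VARCHAR(200) '$.Id',
    AuditRecordType INT          '$.RecordType',
    CreationTime    DATETIME2(0) '$.CreationTime',
    Operation       VARCHAR(100) '$.Operation'
) AS j;
```

OPENJSON needs database compatibility level 130 or higher and raises an error on text that is not valid JSON, so rows that the REPLACE chain cannot repair would have to be filtered out first, for example with ISJSON.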


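Comments:

What version of SQL Server are you using?

SQL Server 2016

Hi Alan, thank you for the quick answer. Let me explain exactly what I need. The table has a few other columns besides "AuditDate". The table looks like this: Recordtype, Operation, Createddate, AuditDate. The table has more than 5000 rows. So if I select * from the table, the final output should be Recordtype, Operation, Createddate, followed by Id, Recordtype, Creationtime, Operation, etc., for all rows.

@sateesh Alan's answer shows you how to add columns. I think that with a little effort you can get there. But if you run into errors, now that you know how to get to the final result, please open a new question.

@sateesh - in addition to what scsimon said ^^^, my solution uses a basic "splitter" (a.k.a. "tokenizer"), APPLY, and a pivoting technique that I learned from Jeff Moden's article. Reverse engineer it and play with the code; it will be easy to modify to fit your needs.

@AlanBurstein, thanks.. I was able to get what I needed from that.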