How to convert JSON with arrays to CSV using Python or R
I am trying to convert JSON that contains arrays to CSV, but because the arrays can hold different content, I have not found a solution so far. Here is a sample of the JSON:
[
{
"name": "Doc 1",
"description": "This is Document 1",
"createdby": "User 1",
"uid": "101",
"created": "2020-01-01T12:00:00.000Z",
"changed": "2020-01-01T13:00:00.000Z",
"dim1": false,
"dim2": false,
"changedby": "User 1",
"path": "/1/2/3"
},
{
"name": "Doc 2",
"description": "This is Document 2",
"createdby": "User 1",
"uid": "102",
"created": "2020-01-01T12:00:00.000Z",
"changed": "2020-01-01T13:00:00.000Z",
"dim1": false,
"dim2": false,
"reference": [
{
"description": "Test1.csv",
"uid": "9000.csv",
"current": true
}
],
"changedby": "User 4",
"path": "/1/2/4"
},
{
"name": "Doc 3",
"description": "This is Document 3",
"createdby": "User 5",
"uid": "105",
"created": "2020-01-01T12:00:00.000Z",
"changed": "2020-01-01T13:00:00.000Z",
"dim1": false,
"dim2": false,
"reference": [
{
"description": "Test1.csv",
"uid": "9000.csv",
"current": true
},
{
"description": "Test6.csv",
"uid": "9005.csv",
"current": true
}
],
"changedby": "User 4",
"path": "/1/2/4"
},
{
"name": "Doc 4",
"description": "This is Document 4",
"createdby": "User 2",
"uid": "103",
"created": "2020-01-01T12:00:00.000Z",
"changed": "2020-01-01T13:00:00.000Z",
"dim1": false,
"dim2": false,
"reference": [
{
"description": "Test2.sql",
"uid": "9001.sql",
"connection": {
"type": "manual",
"system": "SQL",
"name": "Test2",
"user": "sqlread1",
"server": "server1.domain.com",
"port": "1433",
"sid": "300",
"dim3": null
},
"current": false
}
],
"changedby": "User 4",
"path": "/1/2/5"
},
{
"name": "Doc 5",
"description": "This is Document 5",
"createdby": "User 3",
"uid": "104",
"created": "2020-01-01T12:00:00.000Z",
"changed": "2020-01-01T13:00:00.000Z",
"dim1": false,
"dim2": false,
"reference": [
{
"description": "Test3.sql",
"uid": "9002.sql",
"connection": {
"type": "direct",
"system": "SQL",
"name": "Test3",
"user": "sqlread2",
"server": "server2.domain.com",
"port": "1433",
"sid": "301",
"dim3": null
},
"current": false
},
{
"description": "Test4.sql",
"uid": "9003.sql",
"connection": {
"type": "manual",
"system": "SQL",
"name": "Test4",
"user": "sqlread3",
"server": "server2.domain.com",
"port": "1433",
"sid": "302",
"dim3": null
},
"current": false
},
{
"description": "Test5.sql",
"uid": "9004.sql",
"connection": {
"type": "direct",
"system": "SQL",
"name": "Test4",
"user": "sqlread4",
"server": "server2.domain.com",
"port": "1433",
"sid": "303",
"dim3": null
},
"current": false
},
{
"description": "Test6.csv",
"uid": "9005.csv",
"current": true
}
],
"changedby": "User 4",
"path": "/1/2/4"
}]
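In this JSON the array is called "reference". The array may be absent, or its entries may have 3 fields, or 11 fields when an entry contains a "connection" group, and short and long entries can be mixed within the same array.
When I use this code in Python I can flatten the JSON, but the array ends up in a single column: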
import pandas as pd

# read_json leaves each "reference" array as a list in a single column
df = pd.read_json('Sample.json')
print(df)
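But what I want should look like this:
For every additional entry in the array, the row should be duplicated, so that the array contents land in their own columns but on separate rows.
Is it possible to make the script generic?
Thanks!
Mike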
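You can also do it like this:
Yes, everything is possible! :) What you are trying to do is usually called flattening.
You can find an example here:
They split the array into multiple columns; I want multiple rows, not multiple columns. The result is the same as with my json_normalize code: the array is not "exploded".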
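One way to get the row-per-array-entry layout described in the question is to combine DataFrame.explode with pd.json_normalize. The following is only a minimal sketch, assuming pandas 1.1 or later; the helper name explode_records and the "reference." column prefix are illustrative choices, not part of the original posts.

```python
import pandas as pd


def explode_records(records, array_key="reference"):
    """Return one row per array entry; records without the array keep one row."""
    df = pd.DataFrame(records)
    if array_key not in df.columns:
        return df
    # explode() copies the parent row once per list element;
    # a missing array (NaN) stays as a single row.
    df = df.explode(array_key, ignore_index=True)
    # Replace non-dict cells (NaN) with {} so every cell can be flattened.
    entries = [e if isinstance(e, dict) else {} for e in df[array_key]]
    # Nested groups such as "connection" become dotted column names.
    flat = pd.json_normalize(entries).add_prefix(array_key + ".")
    return pd.concat([df.drop(columns=array_key), flat], axis=1)
```

The prefix keeps the array's own "description" and "uid" from colliding with the document-level columns of the same name. The resulting frame can be written out with to_csv(..., index=False) as usual.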
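import json
import pandas as pd

# Load the JSON file into a list of dicts
with open("yourjson.json", "r") as read_file:
    a = json.load(read_file)

# One row per document; nested arrays stay in a single column
df = pd.DataFrame(a)
print(df)

# to_csv returns None when given a path, so the result is not assigned
df.to_csv('mynewjsonfile.csv', index=False)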
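For the "generic" part of the question, a recursive flattener that does not depend on knowing the array's name might look like the sketch below. It uses only the standard library; the function names flatten and to_csv_text are hypothetical, and it assumes that every list should multiply rows while every nested object should add dotted columns. Note that two separate arrays in the same record would produce a cross product of rows.

```python
import csv
import io


def flatten(obj, prefix=""):
    """Yield flat dicts: each list multiplies rows, each dict adds columns."""
    if isinstance(obj, dict):
        rows = [{}]
        for key, value in obj.items():
            parts = list(flatten(value, f"{prefix}{key}."))
            # Cross product: each existing row is copied once per part.
            rows = [{**row, **part} for row in rows for part in parts]
        yield from rows
    elif isinstance(obj, list) and obj:
        for item in obj:
            yield from flatten(item, prefix)
    else:
        # Scalar, None, or empty list: one cell, named without the trailing dot.
        yield {prefix[:-1]: None if obj == [] else obj}


def to_csv_text(records):
    """Flatten records and return the CSV as a string."""
    rows = list(flatten(records))
    # Collect column names in first-seen order, since rows may differ.
    columns = list(dict.fromkeys(k for row in rows for k in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Passing the list loaded with json.load to to_csv_text returns the CSV text, which can then be written to a file opened with newline="".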