Python: converting nested JSON to CSV
This question has probably been asked here several times. I have been trying to flatten a nested JSON file and convert it to CSV, but the closest I have come is listing the field names (Mycount, from, Mysize, Allhits, aggs) without any values:
Output.csv:
""
Mycount
from
Mysize
Allhits
aggs
This is how I have been trying to convert the JSON to CSV:
import json
import csv

def get_leaves(item, key=None):
    if isinstance(item, dict):
        leaves = {}
        for i in item.keys():
            leaves.update(get_leaves(item[i], i))
        return leaves
    elif isinstance(item, list):
        leaves = {}
        for i in item:
            leaves.update(get_leaves(i, key))
        return leaves
    else:
        return {key : item}

with open('path/to/my/file.json') as f_input:
    json_data = json.load(f_input)

# Parsing all entries to get the complete fieldname list
fieldnames = set()

for entry in json_data:
    fieldnames.update(get_leaves(entry).keys())

with open('/path/to/myoutput.csv', 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=sorted(fieldnames))
    csv_output.writeheader()
    csv_output.writerows(get_leaves(entry) for entry in json_data)
The JSON structure looks like this:
{"Mycount":538,
"from":0,
"Mysize":1000,
"Allhits":[{
"isVerified":true,
"backgroundColor":"FF720B",
"name":"yourShop",
"Id":"12345678",
"ActionItems":[{
"subtitle":"Click here to start",
"body":null,
"language":"de",
"title":"Start here",
"isDefault":true}],
"tintColor":"FFFFFF",
"shoppingHours":[{"hours":[{"day":["SUNDAY"],"timeRange":[{"allDay":false,"from":25200,"to":68400}]},{"day":["MONDAY"],"timeRange":[{"allDay":false,"from":25200,"to":68400}]},{"day":["SATURDAY"],"timeRange":[{"allDay":false,"from":25200,"to":68400}]},{"day":["FRIDAY"],"timeRange":[{"allDay":false,"from":25200,"to":68400}]},{"day":["THURSDAY"],"timeRange":[{"allDay":false,"from":25200,"to":68400}]},{"day":["WEDNESDAY"],"timeRange":[{"allDay":false,"from":25200,"to":68400}]},{"day":["TUESDAY"],"timeRange":[{"allDay":false,"from":25200,"to":68400}]}]}],
"LogoUrl":"https://url/to/my/logo.png",
"coverage":[{
"country":"*",
"language":"*",
"ratio":1}],
"shoppingHours2":[{"hours":[{"day":["SUNDAY"],"timeRange":[{"allDay":false,"from":25200,"to":68400}]},{"day":["MONDAY"],"timeRange":[{"allDay":false,"from":25200,"to":68400}]},{"day":["SATURDAY"],"timeRange":[{"allDay":false,"from":25200,"to":68400}]},{"day":["FRIDAY"],"timeRange":[{"allDay":false,"from":25200,"to":68400}]},{"day":["THURSDAY"],"timeRange":[{"allDay":false,"from":25200,"to":68400}]},{"day":["WEDNESDAY"],"timeRange":[{"allDay":false,"from":25200,"to":68400}]},{"day":["TUESDAY"],"timeRange":[{"allDay":false,"from":25200,"to":68400}]}]}],
"group":"shop_open",
"timeZone":"CET",
"phone":"+1234567890",
"modTime":1234567890,
"intId":"+123456789",
"Logo2Url":"https://link/to/my/logo.png"}],
"aggs":{}}
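Is this easy to achieve with the pandas module? I am still learning Python, so I would really appreciate your guidance. I need to extract id, intId, name and ratio, together with the values of those fields, from this JSON file into a CSV.
The desired output should be (or it could contain all field names and values, and I could then pull the fields I need directly from the CSV):
This is a version with just one record, but my output file must contain a row for every ID in the JSON file.
Thanks in advance for your help.
Edit
I also tried the following: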
import json
import csv

x = '/path/to/myrecords.json'
x = json.loads(x)

f.writerow(["name", "id", "intId", "ratio"])
f = csv.writer(open("/path/to/my/output.csv", "w", newline=''))

for x in x:
    f.writerow([x["Allhits"]["name"],
                x["Allhits"]["id"],
                x["Allhits"]["ActionItems"]["intId"],
                x["Allhits"]["ActionItems"]["ratio"]])
But I get this error at the x = json.loads(x) step:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/myusername/anaconda3/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/Users/myusername/anaconda3/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Users/myusername/anaconda3/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
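If you need to flatten the whole JSON (including arrays), you can do it recursively like this:
import json
import csv

def flatten(item, prefix=None):
    result = {}
    # Treat a list like a dict keyed by position: 0, 1, 2, ...
    if isinstance(item, list):
        item = {i: item[i] for i in range(0, len(item))}
    for key, val in item.items():
        prefixed_key = f"{prefix}{key}" if prefix else str(key)
        if isinstance(val, list) or isinstance(val, dict):
            # Recurse, joining nested key names with "_"
            result = {**result, **flatten(val, f"{prefixed_key}_")}
        else:
            result[prefixed_key] = val
    return result

with open('test.json') as f_input, open('result.csv', 'w', newline='') as f_output:
    writer = csv.writer(f_output)
    hits = json.load(f_input)["Allhits"]
    header_written = False
    for hit in hits:
        flat = flatten(hit)
        if not header_written:
            writer.writerow(flat.keys())
            header_written = True
        writer.writerow(flat.values())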
With this, you get this monster of a CSV:
isVerified,backgroundColor,name,Id,ActionItems_0_subtitle,ActionItems_0_body,ActionItems_0_language,ActionItems_0_title,ActionItems_0_isDefault,tintColor,shoppingHours_0_hours_0_day_0,shoppingHours_0_hours_0_timeRange_0_allDay,shoppingHours_0_hours_0_timeRange_0_from,shoppingHours_0_hours_0_timeRange_0_to,shoppingHours_0_hours_1_day_0,shoppingHours_0_hours_1_timeRange_0_allDay,shoppingHours_0_hours_1_timeRange_0_from,shoppingHours_0_hours_1_timeRange_0_to,shoppingHours_0_hours_2_day_0,shoppingHours_0_hours_2_timeRange_0_allDay,shoppingHours_0_hours_2_timeRange_0_from,shoppingHours_0_hours_2_timeRange_0_to,shoppingHours_0_hours_3_day_0,shoppingHours_0_hours_3_timeRange_0_allDay,shoppingHours_0_hours_3_timeRange_0_from,shoppingHours_0_hours_3_timeRange_0_to,shoppingHours_0_hours_4_day_0,shoppingHours_0_hours_4_timeRange_0_allDay,shoppingHours_0_hours_4_timeRange_0_from,shoppingHours_0_hours_4_timeRange_0_to,shoppingHours_0_hours_5_day_0,shoppingHours_0_hours_5_timeRange_0_allDay,shoppingHours_0_hours_5_timeRange_0_from,shoppingHours_0_hours_5_timeRange_0_to,shoppingHours_0_hours_6_day_0,shoppingHours_0_hours_6_timeRange_0_allDay,shoppingHours_0_hours_6_timeRange_0_from,shoppingHours_0_hours_6_timeRange_0_to,LogoUrl,coverage_0_country,coverage_0_language,coverage_0_ratio,shoppingHours2_0_hours_0_day_0,shoppingHours2_0_hours_0_timeRange_0_allDay,shoppingHours2_0_hours_0_timeRange_0_from,shoppingHours2_0_hours_0_timeRange_0_to,shoppingHours2_0_hours_1_day_0,shoppingHours2_0_hours_1_timeRange_0_allDay,shoppingHours2_0_hours_1_timeRange_0_from,shoppingHours2_0_hours_1_timeRange_0_to,shoppingHours2_0_hours_2_day_0,shoppingHours2_0_hours_2_timeRange_0_allDay,shoppingHours2_0_hours_2_timeRange_0_from,shoppingHours2_0_hours_2_timeRange_0_to,shoppingHours2_0_hours_3_day_0,shoppingHours2_0_hours_3_timeRange_0_allDay,shoppingHours2_0_hours_3_timeRange_0_from,shoppingHours2_0_hours_3_timeRange_0_to,shoppingHours2_0_hours_4_day_0,shoppingHours2_0_hours_4_timeRange_0_allDay,shoppingHours2_0_hours_4_timeRange_0_from,shoppingHours2_0_hours_4_timeRange_0_to,shoppingHours2_0_hours_5_day_0,shoppingHours2_0_hours_5_timeRange_0_allDay,shoppingHours2_0_hours_5_timeRange_0_from,shoppingHours2_0_hours_5_timeRange_0_to,shoppingHours2_0_hours_6_day_0,shoppingHours2_0_hours_6_timeRange_0_allDay,shoppingHours2_0_hours_6_timeRange_0_from,shoppingHours2_0_hours_6_timeRange_0_to,group,timeZone,phone,modTime,intId,Logo2Url
True,FF720B,yourShop,12345678,Click here to start,,de,Start here,True,FFFFFF,SUNDAY,False,25200,68400,MONDAY,False,25200,68400,SATURDAY,False,25200,68400,FRIDAY,False,25200,68400,THURSDAY,False,25200,68400,WEDNESDAY,False,25200,68400,TUESDAY,False,25200,68400,https://url/to/my/logo.png,*,*,1,SUNDAY,False,25200,68400,MONDAY,False,25200,68400,SATURDAY,False,25200,68400,FRIDAY,False,25200,68400,THURSDAY,False,25200,68400,WEDNESDAY,False,25200,68400,TUESDAY,False,25200,68400,shop_open,CET,+1234567890,1234567890,+123456789,https://link/to/my/logo.png
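One thing to watch with the flattening approach: the header row comes from the first hit only, so if later hits flatten to a different set of keys the columns may no longer line up. The sample data only contains one hit, so whether this matters for your file is an assumption. A minimal sketch of one way to handle it is to collect the union of all keys first and write with csv.DictWriter. The helper name write_flat_csv and the in-memory sample below are hypothetical, and this flatten is a slightly reworked variant, not the exact function from the answer.

```python
import csv
import io


def flatten(item, prefix=""):
    """Recursively flatten dicts and lists into a single-level dict."""
    result = {}
    # Treat a list like a dict keyed by position: 0, 1, 2, ...
    pairs = enumerate(item) if isinstance(item, list) else item.items()
    for key, val in pairs:
        name = f"{prefix}{key}"
        if isinstance(val, (dict, list)):
            result.update(flatten(val, f"{name}_"))
        else:
            result[name] = val
    return result


def write_flat_csv(hits, f_output):
    """Write one row per hit; the header is the union of all flattened keys."""
    rows = [flatten(hit) for hit in hits]
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:  # keep first-seen column order
                fieldnames.append(key)
    # restval fills in columns that a given row does not have
    writer = csv.DictWriter(f_output, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical sample: the second hit has an extra coverage entry.
sample_hits = [
    {"Id": "1", "coverage": [{"ratio": 1}]},
    {"Id": "2", "coverage": [{"ratio": 1}, {"ratio": 2}]},
]
buffer = io.StringIO()
write_flat_csv(sample_hits, buffer)
csv_text = buffer.getvalue()
```

With a real file you would pass the "Allhits" list and an open file object (opened with newline='') instead of the in-memory buffer.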
However, if you only need specific keys, you can simply iterate over your Allhits list and retrieve what you need, like this:
import json
import csv

with open('test.json') as f_input, open('result.csv', 'w', newline='') as f_output:
    writer = csv.writer(f_output)
    hits = json.load(f_input)["Allhits"]
    # Header row, then one row per hit
    writer.writerow(["Id", "intId", "name", "ratio"])
    for hit in hits:
        writer.writerow([hit["Id"], hit["intId"], hit["name"], hit["coverage"][0]["ratio"]])
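The snippet above relies on every hit having all four keys and a non-empty coverage list. If that might not hold for your data (an assumption on my part, since the sample shows only one hit), a more forgiving variant could fall back to an empty value instead of raising KeyError or IndexError. This is only a sketch: extract_row is a hypothetical helper and the second record is made up for illustration.

```python
import csv
import io


def extract_row(hit):
    """Return [Id, intId, name, ratio] for one hit, using "" for missing values."""
    coverage = hit.get("coverage") or [{}]  # tolerate a missing or empty list
    return [
        hit.get("Id", ""),
        hit.get("intId", ""),
        hit.get("name", ""),
        coverage[0].get("ratio", ""),
    ]


# Hypothetical in-memory data shaped like the "Allhits" list in the question.
hits = [
    {"Id": "12345678", "intId": "+123456789", "name": "yourShop",
     "coverage": [{"country": "*", "language": "*", "ratio": 1}]},
    {"Id": "87654321", "name": "otherShop", "coverage": []},
]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Id", "intId", "name", "ratio"])
writer.writerows(extract_row(hit) for hit in hits)
csv_text = buffer.getvalue()
```

Whether a blank cell is the right fallback depends on what consumes the CSV; raising an error may be preferable if a missing ratio means the input is broken.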
Try this. It loops over the Allhits list and extracts the minimal set of data you need:
import json
import csv

with open('/path/to/myrecords.json') as f_input:
    json_data = json.load(f_input)

with open('/path/to/my/output.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(["id", "intId", "name", "ratio"])

    for x in json_data['Allhits']:
        csv_output.writerow([x["Id"], x["intId"], x["name"], x["coverage"][0]["ratio"]])
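json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
This happens because json.loads expects a string containing JSON data, whereas here the file name is passed as x. To read from a file, open it and call json.load on the file object instead.
Because "Allhits": [{…}] is a single-element list containing a dictionary, replace x["Allhits"]["name"] with x["Allhits"][0]["name"], and do the same when accessing other elements such as "Id".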
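To make both points concrete, here is a minimal sketch using a small in-memory string instead of a file. The JSON text is a cut-down, hypothetical version of the document in the question, and the variable names are mine.

```python
import json

# A small JSON document shaped like the one in the question (hypothetical values).
text = '{"Mycount": 1, "Allhits": [{"Id": "12345678", "name": "yourShop"}], "aggs": {}}'

# json.loads parses a string that contains JSON ...
data = json.loads(text)

# ... but a file path is not JSON, so parsing it raises JSONDecodeError.
try:
    json.loads("/path/to/myrecords.json")
    path_parsed = True
except json.JSONDecodeError:
    path_parsed = False

# Iterating over a dict yields its keys, not records.
top_level_keys = list(data)

# "Allhits" is a list, so index into it (or loop over it) before using a key.
first_name = data["Allhits"][0]["name"]
names = [hit["name"] for hit in data["Allhits"]]
```

The top_level_keys line also suggests why the first attempt in the question printed only Mycount, from, Mysize, Allhits and aggs: looping over the loaded dict visits its keys, not the entries inside Allhits.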
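Why not retrieve all the required values into a list by key and write them to the CSV file?
Does this answer your question?
I have already tried that but I am still struggling (see the edit at the bottom of my post). Can you help?
@Olvin, could you share some code? Given that I am only a Python beginner, that would help a lot! Thanks
@Baobab1988, you are doing for x in x. First of all, it won't work, but even if it did, your example contains just a single JSON object, not an array. What are you trying to iterate over?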