Python: converting a JSON structure into a nested structure
How can I convert the following JSON format into the target format below? I have 50,000 entries.
Basically, I want to take the unique countries from the array and, for each country, collect all entries with that country name into one array. Original JSON:
[
{
"unilist": [
{
"country": "United States",
"name": "The College of New Jersey",
"web_page": "http://www.tcnj.edu"
},
{
"country": "United States",
"name": "Abilene Christian University",
"web_page": "http://www.acu.edu/"
},
{
"country": "United States",
"name": "Adelphi University",
"web_page": "http://www.adelphi.edu/"
},
{
"country": "China",
"name": "Harbin Medical University",
"web_page": "http://www.hrbmu.edu.cn/"
},
{
"country": "China",
"name": "Harbin Normal University",
"web_page": "http://www.hrbnu.edu.cn/"
}
...
]
}
]
Target format:
{
"unilist" : {
"United States" : [
{"name" : "The College of New Jersey", "web_page" : "http://www.tcnj.edu"},
{"name" : "Abilene Christian University", "web_page" : "http://www.acu.edu/"},
{"name" : "Adelphi University", "web_page" : "http://www.adelphi.edu/"}
],
"China" : [
{"name" : "Harbin Medical University", "web_page" : "http://www.hrbnu.edu.cn/"}
],
...
}
}
Update
My attempt (in Python 2.7.11) is based on the answer below, but it does not work as expected; I get the following TypeError:
from collections import defaultdict
import json
from pprint import pprint

with open('old_list.json') as orig_json:

    newlist = defaultdict(list)
    for country in orig_json[0]['unilist']:
        newlist[country['country']].append({'name': country['name'], 'web_page': country['web_page']})

    with open('new_list.json', 'w') as fp:
        json.dump(newlist,fp)

    pprint.pprint(dict(newlist))
TypeError:
Traceback (most recent call last):
File "convert.py", line 8, in <module>
for country in orig_json[0]['unilist']:
TypeError: 'file' object has no attribute '__getitem__'
This produces almost the target output; only the "unilist" key is missing. But at least it groups the entries by country:
import json
from collections import defaultdict

with open('original.json', 'r') as original:
    # The file holds a one-element list, so parse it and take element 0
    # rather than slicing the outer brackets off the raw text, which
    # breaks if the file ends with a newline.
    oj = json.load(original)[0]

newlist = defaultdict(list)
for country in oj['unilist']:
    # Group each entry under its country, keeping only name and web_page.
    newlist[country['country']].append({'name': country['name'],
                                        'web_page': country['web_page']})

with open('new.json', 'w') as outfile:
    json.dump(newlist, outfile)
This saves newlist to a JSON file, "new.json".
Output:
{'China': [{'name': 'Harbin Medical University',
'web_page': 'http://www.hrbmu.edu.cn/'},
{'name': 'Harbin Normal University',
'web_page': 'http://www.hrbnu.edu.cn/'}],
'United States': [{'name': 'The College of New Jersey',
'web_page': 'http://www.tcnj.edu'},
{'name': 'Abilene Christian University',
'web_page': 'http://www.acu.edu/'},
{'name': 'Adelphi University',
'web_page': 'http://www.adelphi.edu/'}]}
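If I find a better way to get the exact target output, I will update this answer. In the meantime, I hope this helps.

Comments:
OK, thanks, I will test this and let you know the result. How do I open and define my file entries, and how do I define defaultdict?
Oh, sorry, I should have included the module import: from collections import defaultdict. Added in an edit.
Sorry, I am a newbie. Could you help me read my file into the code and then save the result as an output file?
I updated my question based on your answer. Could you look at the error the code throws? Why does it happen?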
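On the last question: the TypeError most likely arises because orig_json in the question's code is the open file object, not the parsed data, so indexing it with [0] fails. The file has to be parsed with json.load first. Separately, after "from pprint import pprint" the call should be pprint(...), not pprint.pprint(...). The following is a minimal sketch, not a definitive implementation. The helper names group_by_country and convert are hypothetical, and it assumes the input file has the shape shown in the question (a one-element list whose object has a "unilist" key). It also wraps the grouped result under "unilist" to match the target format.

```python
import json
from collections import defaultdict


def group_by_country(records):
    """Group records by their 'country' value, keeping name and web_page."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record['country']].append(
            {'name': record['name'], 'web_page': record['web_page']})
    return dict(grouped)


def convert(in_path, out_path):
    """Read the original JSON file and write the nested version."""
    # json.load turns the file contents into Python lists and dicts;
    # indexing the file object itself is what raised the TypeError.
    with open(in_path) as src:
        data = json.load(src)
    # Assumption: the top level is a list holding one object
    # with a 'unilist' key, as in the question.
    result = {'unilist': group_by_country(data[0]['unilist'])}
    with open(out_path, 'w') as dst:
        json.dump(result, dst, indent=2)
    return result
```

With the file names from the question, this would be called as convert('old_list.json', 'new_list.json'). It makes a single pass over the entries, so 50,000 records should not be a problem.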