Python: converting a JSON structure into a nested structure


python json


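How can I convert the following JSON format into the target format shown below? I have 50,000 entries.
Basically, I want to take each unique country from the array and collect all entries that share that country name into one array under it.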

Original JSON:

[
    {
        "unilist": [
                {
                    "country": "United States",
                    "name": "The College of New Jersey",
                    "web_page": "http://www.tcnj.edu"
                },
                {
                    "country": "United States",
                    "name": "Abilene Christian University",
                    "web_page": "http://www.acu.edu/"
                },
                {
                    "country": "United States",
                    "name": "Adelphi University",
                    "web_page": "http://www.adelphi.edu/"
                },
                {
                    "country": "China",
                    "name": "Harbin Medical University",
                    "web_page": "http://www.hrbmu.edu.cn/"
                },
                {
                    "country": "China",
                    "name": "Harbin Normal University",
                    "web_page": "http://www.hrbnu.edu.cn/"
                }
                ...
                ]
    }
]

Target format:

{
"unilist" : {
        "United States" : [
          {"name" : "The College of New Jersey", "web_page" : "http://www.tcnj.edu"},
          {"name" : "Abilene Christian University", "web_page" : "http://www.acu.edu/"},
          {"name" : "Adelphi University", "web_page" : "http://www.adelphi.edu/"}
        ],
        "China" : [
          {"name" : "Harbin Medical University", "web_page" : "http://www.hrbmu.edu.cn/"}
        ],
        ...
    }
}

Update: my attempt (in Python 2.7.11), based on the answer below, does not work as expected. I get the following TypeError:

from collections import defaultdict
import json
from pprint import pprint

with open('old_list.json') as orig_json:    
    newlist = defaultdict(list)

for country in orig_json[0]['unilist']:
    newlist[country['country']].append({'name': country['name'], 'web_page': country['web_page']})

with open('new_list.json', 'w') as fp:
    json.dump(newlist, fp)


pprint.pprint(dict(newlist))

The TypeError:

Traceback (most recent call last):
  File "convert.py", line 8, in <module>
    for country in orig_json[0]['unilist']:
TypeError: 'file' object has no attribute '__getitem__'


This produces almost the same output as the target, except that the "unilist" key is missing. But at least it groups the entries by country:

import json
from collections import defaultdict

with open('original.json', 'r') as original:
    # The file holds a one-element list; parse it and take the first element
    # (slicing the brackets off the raw text breaks on a trailing newline)
    oj = json.load(original)[0]

# Missing keys start out as an empty list, so we can append right away
newlist = defaultdict(list)

# Group every entry under its country, keeping only name and web_page
for country in oj['unilist']:
    newlist[country['country']].append({'name': country['name'],
                                        'web_page': country['web_page']})

with open('new.json', 'w') as outfile:
    json.dump(newlist, outfile)

This saves newlist to a JSON file named "new.json".

Output:

{'China': [{'name': 'Harbin Medical University',
            'web_page': 'http://www.hrbmu.edu.cn/'},
           {'name': 'Harbin Normal University',
            'web_page': 'http://www.hrbnu.edu.cn/'}],
 'United States': [{'name': 'The College of New Jersey',
                    'web_page': 'http://www.tcnj.edu'},
                   {'name': 'Abilene Christian University',
                    'web_page': 'http://www.acu.edu/'},
                   {'name': 'Adelphi University',
                    'web_page': 'http://www.adelphi.edu/'}]}

If I find a better way to get the exact target output, I will update this answer. In the meantime, I hope this helps.
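For the exact target format, one possible approach is to wrap the grouped dictionary under a "unilist" key before dumping it. The following is only a minimal sketch, written for Python 3: the function name group_by_country and the inline sample data are assumptions made for illustration, not part of the original code.

```python
import json
from collections import defaultdict


def group_by_country(data):
    """Regroup [{'unilist': [...]}] into {'unilist': {country: [entries]}}.

    group_by_country is a hypothetical helper name used for this sketch.
    """
    grouped = defaultdict(list)
    for wrapper in data:                  # the outer list may hold several objects
        for entry in wrapper['unilist']:
            grouped[entry['country']].append(
                {'name': entry['name'], 'web_page': entry['web_page']})
    # Convert to a plain dict and restore the top-level "unilist" key
    return {'unilist': dict(grouped)}


# Small inline sample standing in for the parsed contents of the real file
sample = [{"unilist": [
    {"country": "United States", "name": "Adelphi University",
     "web_page": "http://www.adelphi.edu/"},
    {"country": "China", "name": "Harbin Medical University",
     "web_page": "http://www.hrbmu.edu.cn/"},
    {"country": "China", "name": "Harbin Normal University",
     "web_page": "http://www.hrbnu.edu.cn/"},
]}]

result = group_by_country(sample)
print(json.dumps(result, indent=2))
```

With the real data, the sample would be replaced by the result of json.load() on the input file, and result would be written out with json.dump().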

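OK, thanks, I will test this and let you know the result. How do I open my file and define the entries, and how do I define defaultdict?

Oh, sorry, I should have included the module import: from collections import defaultdict. Added it to the edit.

Sorry, I am a newbie. Can you help me read my file into the code and then save the result to an output file?

I updated my question based on your answer. Can you look at the error the code throws? Why does this happen?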
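On the TypeError in the updated question: open() returns a file object, not the parsed data, so indexing it with [0] fails. The contents have to be parsed with json.load() first. The sketch below illustrates this under a few assumptions: it is written for Python 3, where the error message is worded differently than in 2.7, and it uses io.StringIO with a one-entry sample as a stand-in for the real file.

```python
import io
import json

# Stand-in for open('old_list.json'); a real file object behaves the same way
fake_file = io.StringIO(
    '[{"unilist": [{"country": "China", '
    '"name": "Harbin Medical University", '
    '"web_page": "http://www.hrbmu.edu.cn/"}]}]')

# Indexing the file object itself raises TypeError, as in the question
try:
    fake_file[0]
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Parse the JSON first, then index the resulting list
fake_file.seek(0)
data = json.load(fake_file)
first_country = data[0]['unilist'][0]['country']
print(first_country)
```

In the question's script, this would mean calling json.load(orig_json) inside the with block and looping over the parsed result. Separately, because the script uses "from pprint import pprint", the last line should call pprint(dict(newlist)) rather than pprint.pprint(...), which would raise an AttributeError.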