Parsing JSON nested arrays in Python while keeping the mapping to the JSON object

I have a large JSON file with the following structure:

{
    "Project": {
        "AAA": {
            "Version": [
                {
                    "id": "00001",
                    "name": "08.12.2019",
                    "description": null,
                    "released": true,
                    "releaseDate": "2019-08-12"
                },
                {
                    "id": "00002",
                    "name": "2019.8.26",
                    "description": null,
                    "released": true,
                    "releaseDate": "2019-08-26"
                }
            ]
        },
        "BBB": {
            "Version": [
                {
                    "id": "00003",
                    "name": "AABBY3",
                    "description": "2019 Accounting Year End",
                    "released": false,
                    "releaseDate": null
                },
                {
                    "id": "00004",
                    "name": "AACCZ4",
                    "description": "Financial Statements 2019",
                    "released": false,
                    "releaseDate": null
                },
                {
                    "id": "00005",
                    "name": "AADDZ5",
                    "description": null,
                    "released": false,
                    "releaseDate": null
                }
            ]
        }
    }
}
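
Because of the nested arrays, I am having trouble converting it into a pandas DataFrame. For each Project, how can I extract all the data in each Version while keeping a reference to the Project it belongs to?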

So far I have only managed to get a DataFrame with the following structure:

df.head(3)
Out[10]: 
      description     id    name releaseDate  released
0  Version 5.4.1.  10703  V5R4M1  2010-09-15      True
1   Version 5.5.1  10704  V5R5M1  2015-04-20      True
2   Version 6.1.1  10705  V6R1M1  2016-10-14      True

using the following code:

import json
import pandas as pd

with open("fixVer2.json", "r") as read_file:
    data = json.load(read_file)

prj_list = ['AAA', 'BBB', 'CCC', 'DDD']

# Collect every version record from every project into one flat list
d_list = []
for x in prj_list:
    d = data['Project'][x]['Version']
    for el in d:
        d_list.append(el)

df = pd.DataFrame(d_list)

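However, the same "name" can appear in several projects with different release dates, so I need to keep the Project name in order to identify the correct "releaseDate" for each "name".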

Desired output:

      description     id    name releaseDate  released  Project
0  Version 5.4.1.  10703  V5R4M1  2010-09-15      True  CCC
1   Version 5.5.1  10704  V5R5M1  2015-04-20      True  CCC
2   Version 6.1.1  10705  V6R1M1  2016-10-14      True  CCC

I am not sure how to parse the nested arrays, keep the Project name, and combine everything into a single DataFrame or other Python structure.

You can adapt your solution so that it adds the project name to each element before appending it:

d_list = []
for x in prj_list:
    d = data['Project'][x]['Version']
    for el in d:
        el['Project'] = x
        d_list.append(el)

Or use a list comprehension:

prj_list = ['AAA', 'BBB']

# Merge each version record with its project name into a new dict
d_list = [{**el, 'Project': x} for x in prj_list for el in data['Project'][x]['Version']]
df = pd.DataFrame(d_list)
print(df)
      id        name                description  released releaseDate Project
0  00001  08.12.2019                       None      True  2019-08-12     AAA
1  00002   2019.8.26                       None      True  2019-08-26     AAA
2  00003      AABBY3   2019 Accounting Year End     False        None     BBB
3  00004      AACCZ4  Financial Statements 2019     False        None     BBB
4  00005      AADDZ5                       None     False        None     BBB
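
Two details of the loop version are worth keeping in mind: `el['Project'] = x` modifies the dictionaries inside `data` in place, and `data['Project'][x]` raises a `KeyError` when a name in `prj_list` is not present in the file (the question's list contains 'CCC' and 'DDD', which the sample shown does not). The following is a minimal sketch that copies each record and skips missing projects. The helper name `flatten_versions` and the reduced sample data are assumptions made for illustration, not part of the original answer.

```python
def flatten_versions(data, projects):
    """Return one dict per version, each tagged with its project name."""
    rows = []
    for project in projects:
        # .get() avoids a KeyError for projects that are not in the file
        versions = data["Project"].get(project, {}).get("Version", [])
        for version in versions:
            # {**version, ...} builds a new dict, so `data` is left unchanged
            rows.append({**version, "Project": project})
    return rows


# Reduced sample in the shape shown in the question (hypothetical subset)
sample = {
    "Project": {
        "AAA": {"Version": [{"id": "00001", "name": "08.12.2019"}]},
        "BBB": {"Version": [{"id": "00003", "name": "AABBY3"}]},
    }
}

rows = flatten_versions(sample, ["AAA", "BBB", "CCC", "DDD"])
print(rows)
```

The resulting list can be passed to `pd.DataFrame(rows)` in the same way as `d_list` above.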

Try this:

import json
import pandas as pd

with open("test.json", "r") as read_file:
    data = json.load(read_file)['Project']
d_list = []
# Iterate over every project found in the file and tag each version record
for name, dat in data.items():
    for d in dat['Version']:
        d['Project'] = name
        d_list.append(d)
df = pd.DataFrame(d_list)
print(df)

  Project                description     id        name releaseDate  released
0     AAA                       None  00001  08.12.2019  2019-08-12      True
1     AAA                       None  00002   2019.8.26  2019-08-26      True
2     BBB   2019 Accounting Year End  00003      AABBY3        None     False
3     BBB  Financial Statements 2019  00004      AACCZ4        None     False
4     BBB                       None  00005      AADDZ5        None     False

With this approach you do not need to maintain a separate list of projects. Hope this helps.
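
Once the `Project` column is in place, the original goal (finding the right `releaseDate` for a `name` that occurs in more than one project) becomes a lookup on the pair (`Project`, `name`). Below is a rough sketch that builds the frame with `DataFrame.assign` and `pd.concat` instead of a manual loop, then indexes by both columns. The sample data, including the duplicated name `V1` and its dates, is made up for illustration.

```python
import pandas as pd

# Hypothetical data in the question's shape; "V1" appears in both projects
data = {
    "Project": {
        "AAA": {"Version": [
            {"id": "00001", "name": "V1", "releaseDate": "2019-08-12"},
            {"id": "00002", "name": "V2", "releaseDate": "2019-08-26"},
        ]},
        "BBB": {"Version": [
            {"id": "00003", "name": "V1", "releaseDate": "2020-01-15"},
        ]},
    }
}

# One DataFrame per project, each tagged with its project name, then stacked
frames = [
    pd.DataFrame(project["Version"]).assign(Project=name)
    for name, project in data["Project"].items()
]
df = pd.concat(frames, ignore_index=True)

# Index by (Project, name) so duplicated names no longer collide
release_dates = df.set_index(["Project", "name"])["releaseDate"]
print(release_dates.loc[("AAA", "V1")])
print(release_dates.loc[("BBB", "V1")])
```

Like the loop above, this iterates over whatever projects the file contains, so no separate project list is required.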
