
Deserializing Data from Nested JSON with Python and Pandas

Tags: python, json, python-3.x, pandas, json-deserialization

I have time-series data in nested JSON, and I am struggling to get it into a flat DataFrame.

Input data: the data looks like the sample shown further down.

Expected output: a flat pandas DataFrame with columns country | date | cases | deaths | recovered.

What I have tried: I can use df = pd.json_normalize(json_data, max_level=1), but that leaves the nested data embedded as lists in the cells. I can also use df = pd.json_normalize(json_data), but that creates a new column for every date, which is not sustainable as time goes on.


There must be an elegant way to do this; writing Python loops is a last resort.

Here is a subset of the data: the first entry (Afghanistan) in the country JSON data:

content = [{"country":"Afghanistan","province":None,"timeline":{"cases":{"3/13/20":7,"3/14/20":11,"3/15/20":16,"3/16/20":21,"3/17/20":22,"3/18/20":22,"3/19/20":22,"3/20/20":24,"3/21/20":24,"3/22/20":40,"3/23/20":40,"3/24/20":74,"3/25/20":84,"3/26/20":94,"3/27/20":110,"3/28/20":110,"3/29/20":120,"3/30/20":170,"3/31/20":174,"4/1/20":237,"4/2/20":273,"4/3/20":281,"4/4/20":299,"4/5/20":349,"4/6/20":367,"4/7/20":423,"4/8/20":444,"4/9/20":484,"4/10/20":521,"4/11/20":555},"deaths":{"3/13/20":0,"3/14/20":0,"3/15/20":0,"3/16/20":0,"3/17/20":0,"3/18/20":0,"3/19/20":0,"3/20/20":0,"3/21/20":0,"3/22/20":1,"3/23/20":1,"3/24/20":1,"3/25/20":2,"3/26/20":4,"3/27/20":4,"3/28/20":4,"3/29/20":4,"3/30/20":4,"3/31/20":4,"4/1/20":4,"4/2/20":6,"4/3/20":6,"4/4/20":7,"4/5/20":7,"4/6/20":11,"4/7/20":14,"4/8/20":14,"4/9/20":15,"4/10/20":15,"4/11/20":18},"recovered":{"3/13/20":0,"3/14/20":0,"3/15/20":0,"3/16/20":1,"3/17/20":1,"3/18/20":1,"3/19/20":1,"3/20/20":1,"3/21/20":1,"3/22/20":1,"3/23/20":1,"3/24/20":1,"3/25/20":2,"3/26/20":2,"3/27/20":2,"3/28/20":2,"3/29/20":2,"3/30/20":2,"3/31/20":5,"4/1/20":5,"4/2/20":10,"4/3/20":10,"4/4/20":10,"4/5/20":15,"4/6/20":18,"4/7/20":18,"4/8/20":29,"4/9/20":32,"4/10/20":32,"4/11/20":32}}}]
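To see concretely why `pd.json_normalize` alone falls short, here is a minimal sketch on a trimmed two-date sample in the same shape as the data above (the sample itself is made up for brevity):

```python
import pandas as pd

# Trimmed two-date sample, same shape as the full data
sample = [{"country": "Afghanistan", "province": None,
           "timeline": {"cases": {"3/13/20": 7, "3/14/20": 11},
                        "deaths": {"3/13/20": 0, "3/14/20": 0},
                        "recovered": {"3/13/20": 0, "3/14/20": 0}}}]

# max_level=1 stops flattening at the timeline level,
# leaving the per-date dicts embedded in the cells
shallow = pd.json_normalize(sample, max_level=1)
print(sorted(shallow.columns))
# ['country', 'province', 'timeline.cases', 'timeline.deaths', 'timeline.recovered']

# Fully normalizing instead creates one column per metric per date
deep = pd.json_normalize(sample)
print(deep.shape)  # (1, 8): country, province, plus 3 metrics x 2 dates
```

With the real data (30 dates per metric and growing), the fully normalized frame gains three new columns every day, which is the unsustainable shape the question describes.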
One approach is to read the timeline data into a DataFrame, then assign the country and province data:

res = pd.DataFrame(content[0]['timeline']).assign(country = content[0]['country'],
                                                  province = content[0]['province']
                                                  )

res.head()


         cases    deaths    recovered   country    province
3/13/20   7          0        0        Afghanistan  None
3/14/20   11         0        0        Afghanistan  None
3/15/20   16         0        0        Afghanistan  None
3/16/20   21         0        1        Afghanistan  None
3/17/20   22         0        1        Afghanistan  None

Note that the whole dataset is wrapped in a list, hence the index 0.
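From there, the date index can be promoted to a regular column to match the desired country | date | cases | deaths | recovered layout. A sketch, using a trimmed two-date sample in place of the full `content` list (`rename_axis` and `reset_index` are standard pandas; the final column order is my assumption from the question):

```python
import pandas as pd

content = [{"country": "Afghanistan", "province": None,
            "timeline": {"cases": {"3/13/20": 7, "3/14/20": 11},
                         "deaths": {"3/13/20": 0, "3/14/20": 0},
                         "recovered": {"3/13/20": 0, "3/14/20": 0}}}]

res = (pd.DataFrame(content[0]['timeline'])
         .assign(country=content[0]['country'],
                 province=content[0]['province'])
         .rename_axis('date')   # the dict keys became the row index
         .reset_index())        # promote the dates to a regular column

# Reorder to the layout asked for in the question
res = res[['country', 'date', 'cases', 'deaths', 'recovered']]
print(res)
```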

Comments:

Share a sample of the JSON data in your question so that someone can work with it and offer a possible solution.

"There must be an elegant way to do this. Writing Python loops is a last resort." I think the most elegant way may well be a loop.

@sammywemmy the raw JSON URL is in the question! I provided not only the raw JSON but also the code to download it.

What you should do is share a sample, not the whole data. You have already downloaded the data; cut out a small piece representative of what you want to achieve and share it in your question. Don't get me wrong, it is cool that you shared what you did. It is just easier for anyone who wants to answer your question to work with a subset, and it lets you extrapolate any answer to the whole dataset.

This is useful, +1 upvoted. But I still don't have a solution to my problem. How do I apply this intermediate step to deliver the final DataFrame?

Our entire JSON is wrapped in a list, so if you apply this particular code to it, where content represents the whole JSON, you should have our DataFrame. That will be quite a lot and hopefully should fit in memory. Scratch that, it should fit in memory. Like this: results = pd.DataFrame(); for i in range(len(json_data)): df = pd.DataFrame(json_data[i]['timeline']).assign(country=json_data[i]['country'], province=json_data[i]['province']); results = results.append(df, ignore_index=False)
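The loop suggested in the comments uses DataFrame.append, which was deprecated and then removed in pandas 2.0. A pd.concat-based sketch of the same idea, using a small two-entry stand-in for the full json_data list (the stand-in values are made up):

```python
import pandas as pd

# Small two-entry stand-in for the full json_data list
json_data = [
    {"country": "Afghanistan", "province": None,
     "timeline": {"cases": {"3/13/20": 7}, "deaths": {"3/13/20": 0},
                  "recovered": {"3/13/20": 0}}},
    {"country": "Albania", "province": None,
     "timeline": {"cases": {"3/13/20": 33}, "deaths": {"3/13/20": 1},
                  "recovered": {"3/13/20": 0}}},
]

# Build one frame per country entry, then concatenate once at the end;
# repeated append in a loop is quadratic, a single concat is not.
frames = [pd.DataFrame(entry['timeline'])
            .assign(country=entry['country'], province=entry['province'])
          for entry in json_data]
results = pd.concat(frames)
print(results)
```

Concatenating once at the end also sidesteps the memory concern raised in the comments, since no intermediate copies of the growing result frame are made.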