Pandas: split a dataframe column with dict values into columns

I am trying to split and convert a dataframe column that contains lists of dict values into new columns. Using some of the values as a reference seems to fail because some rows are NaN: when those rows are hit, an error about not being able to iterate over a float is thrown, and if I fillna first, it just changes to a str-related error.

I first tried using:

df_new = df.explode('freshness_grades')

df_new = pd.concat([df_new.drop('freshness_grades', axis=1), pd.DataFrame(df_new['freshness_grades'].tolist())], axis=1)
I did this to essentially turn the list of dictionaries into individual dictionaries.

    _id  freshness_grades
0   57ea8d0d9c624c035f96f45e    [{'creation_date': '2019-04-20T06:02:02.865000+00:00', 'end_date': '2015-07-23T18:43:00+00:00', 'grade': 'A', 'start_date': '2015-03-05T01:54:47+00:00'}, {'creation_date': '2019-04-20T06:02:02.865000+00:00', 'end_date': '2015-08-22T18:43:00+00:00', 'grade': 'B', 'start_date': '2015-07-23T18:43:00+00:00'}, {'creation_date': '2019-04-20T06:02:02.865000+00:00', 'end_date': '2015-10-21T18:43:00+00:00', 'grade': 'C', 'start_date': '2015-08-22T18:43:00+00:00'}, {'creation_date': '2019-04-20T06:02:02.865000+00:00', 'end_date': '2016-02-02T12:12:00+00:00', 'grade': 'D', 'start_date': '2015-10-21T18:43:00+00:00'}, {'creation_date': '2019-04-20T06:02:02.865000+00:00', 'end_date': '2016-07-22T18:43:00+00:00', 'grade': 'E', 'start_date': '2016-02-02T12:12:00+00:00'}, {'creation_date': '2019-04-20T06:02:02.865000+00:00', 'grade': 'F', 'start_date': '2016-07-22T18:43:00+00:00'}]
1   57ea8d0e9c624c035f96f460    [{'creation_date': '2019-06-25T10:54:40.387000+00:00', 'end_date': '2015-07-20T14:04:00+00:00', 'grade': 'A', 'start_date': '2015-07-14T08:48:49+00:00'}, {'creation_date': '2019-06-25T10:54:40.387000+00:00', 'end_date': '2015-08-19T14:04:00+00:00', 'grade': 'B', 'start_date': '2015-07-20T14:04:00+00:00'}, {'creation_date': '2019-06-25T10:54:40.387000+00:00', 'end_date': '2015-10-18T14:04:00+00:00', 'grade': 'C', 'start_date': '2015-08-19T14:04:00+00:00'}, {'creation_date': '2019-06-25T10:54:40.387000+00:00', 'end_date': '2016-02-02T12:12:00+00:00', 'grade': 'D', 'start_date': '2015-10-18T14:04:00+00:00'}, {'creation_date': '2019-06-25T10:54:40.387000+00:00', 'end_date': '2016-07-19T14:04:00+00:00', 'grade': 'E', 'start_date': '2016-02-02T12:12:00+00:00'}, {'creation_date': '2019-06-25T10:54:40.387000+00:00', 'grade': 'F', 'start_date': '2016-07-19T14:04:00+00:00'}]
2   57ea8d0e9c624c035f96f462    [{'creation_date': '2019-04-20T06:02:03.600000+00:00', 'end_date': '2015-09-29T09:46:00+00:00', 'grade': 'A', 'start_date': '2015-07-27T15:21:32+00:00'}, {'creation_date': '2019-04-20T06:02:03.600000+00:00', 'end_date': '2015-10-29T09:46:00+00:00', 'grade': 'B', 'start_date': '2015-09-29T09:46:00+00:00'}, {'creation_date': '2019-04-20T06:02:03.600000+00:00', 'end_date': '2015-12-04T12:12:00+00:00', 'grade': 'C', 'start_date': '2015-10-29T09:46:00+00:00'}, {'creation_date': '2019-04-20T06:02:03.600000+00:00', 'end_date': '2016-02-02T12:12:00+00:00', 'grade': 'D', 'start_date': '2015-12-04T12:12:00+00:00'}, {'creation_date': '2019-04-20T06:02:03.600000+00:00', 'end_date': '2016-09-28T09:46:00+00:00', 'grade': 'E', 'start_date': '2016-02-02T12:12:00+00:00'}, {'creation_date': '2019-04-20T06:02:03.600000+00:00', 'grade': 'F', 'start_date': '2016-09-28T09:46:00+00:00'}]
3   57ea8d0f9c624c035f96f466    [{'creation_date': '2019-04-20T06:02:04.305000+00:00', 'end_date': '2015-09-29T09:46:00+00:00', 'grade': 'A', 'start_date': '2015-09-09T13:20:14+00:00'}, {'creation_date': '2019-04-20T06:02:04.305000+00:00', 'end_date': '2015-10-29T09:46:00+00:00', 'grade': 'B', 'start_date': '2015-09-29T09:46:00+00:00'}, {'creation_date': '2019-04-20T06:02:04.305000+00:00', 'end_date': '2015-12-04T12:12:00+00:00', 'grade': 'C', 'start_date': '2015-10-29T09:46:00+00:00'}, {'creation_date': '2019-04-20T06:02:04.305000+00:00', 'end_date': '2016-02-02T12:12:00+00:00', 'grade': 'D', 'start_date': '2015-12-04T12:12:00+00:00'}, {'creation_date': '2019-04-20T06:02:04.305000+00:00', 'end_date': '2016-09-28T09:46:00+00:00', 'grade': 'E', 'start_date': '2016-02-02T12:12:00+00:00'}, {'creation_date': '2019-04-20T06:02:04.305000+00:00', 'grade': 'F', 'start_date': '2016-09-28T09:46:00+00:00'}]
4   57ea8d109c624c035f96f468    [{'creation_date': '2019-04-20T06:02:04.673000+00:00', 'end_date': '2015-11-04T12:12:00+00:00', 'grade': 'A', 'start_date': '2015-10-30T07:43:46+00:00'}, {'creation_date': '2019-04-20T06:02:04.673000+00:00', 'end_date': '2015-11-11T12:12:00+00:00', 'grade': 'B', 'start_date': '2015-11-04T12:12:00+00:00'}, {'creation_date': '2019-04-20T06:02:04.673000+00:00', 'end_date': '2015-12-04T12:12:00+00:00', 'grade': 'C', 'start_date': '2015-11-11T12:12:00+00:00'}, {'creation_date': '2019-04-20T06:02:04.673000+00:00', 'end_date': '2016-02-02T12:12:00+00:00', 'grade': 'D', 'start_date': '2015-12-04T12:12:00+00:00'}, {'creation_date': '2019-04-20T06:02:04.673000+00:00', 'end_date': '2016-11-03T12:12:00+00:00', 'grade': 'E', 'start_date': '2016-02-02T12:12:00+00:00'}, {'creation_date': '2019-04-20T06:02:04.673000+00:00', 'grade': 'F', 'start_date': '2016-11-03T12:12:00+00:00'}]
5   5f1eb63dbed8bd4f99e2a280    NaN
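For experimenting, a trimmed-down frame in the same shape as the above (two of the grade dicts for the first _id, plus the NaN row) can be built like this:

import pandas as pd
import numpy as np

# minimal sample: one row with a short list of grade dicts, one NaN row
df = pd.DataFrame({
    '_id': ['57ea8d0d9c624c035f96f45e', '5f1eb63dbed8bd4f99e2a280'],
    'freshness_grades': [
        [{'creation_date': '2019-04-20T06:02:02.865000+00:00',
          'end_date': '2015-07-23T18:43:00+00:00',
          'grade': 'A',
          'start_date': '2015-03-05T01:54:47+00:00'},
         {'creation_date': '2019-04-20T06:02:02.865000+00:00',
          'grade': 'F',
          'start_date': '2016-07-22T18:43:00+00:00'}],
        np.nan,
    ],
})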
Taking the first row as an example, I would like to end up with:

   _id                        creation_date                     end_date                    grade    start_date
0  57ea8d0d9c624c035f96f45e   2019-04-20T06:02:02.865000+00:00  2015-07-23T18:43:00+00:00   A        2015-03-05T01:54:47+00:00
0  57ea8d0d9c624c035f96f45e   2019-04-20T06:02:02.865000+00:00  2015-08-22T18:43:00+00:00   B        2015-07-23T18:43:00+00:00
...
I started with explode, and that step works perfectly.

However, I did not try it with reset_index(). What fails is the pd.concat(), which I believe is related either to the NaN values or to the fact that there are actually multiple dictionaries in the list, e.g. {}, {}, {} after the explode().
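A quick way to confirm that mix (using the trimmed frame above) is to inspect the element types after the explode; the NaN row survives as a plain float sitting next to the dicts, which is what the later reshaping steps choke on:

# after explode, the NaN row is still a scalar NaN (a float),
# alongside the dicts produced from the exploded lists
exploded = df.explode('freshness_grades').reset_index(drop=True)
print(exploded['freshness_grades'].apply(type).unique())
# [<class 'dict'> <class 'float'>]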

  • json_normalize does not work on a column that contains NaN
    • fill the NaN rows with an empty {} before normalizing, as in the snippet below
# explode the list of dicts into separate rows
df = df.explode('freshness_grades').reset_index(drop=True)
# now fill the NaN with an empty dict
df.freshness_grades = df.freshness_grades.fillna({i: {} for i in df.index})
# then normalize the column
df = df.join(pd.json_normalize(df.freshness_grades))
# drop the original column
df.drop(columns=['freshness_grades'], inplace=True)
Output
   _id                        creation_date                     end_date                    grade    start_date
0  57ea8d0d9c624c035f96f45e   2019-04-20T06:02:02.865000+00:00  2015-07-23T18:43:00+00:00   A        2015-03-05T01:54:47+00:00
1  57ea8d0d9c624c035f96f45e   2019-04-20T06:02:02.865000+00:00  2015-08-22T18:43:00+00:00   B        2015-07-23T18:43:00+00:00
2  57ea8d0d9c624c035f96f45e   2019-04-20T06:02:02.865000+00:00  2015-10-21T18:43:00+00:00   C        2015-08-22T18:43:00+00:00
3  57ea8d0d9c624c035f96f45e   2019-04-20T06:02:02.865000+00:00  2016-02-02T12:12:00+00:00   D        2015-10-21T18:43:00+00:00
4  57ea8d0d9c624c035f96f45e   2019-04-20T06:02:02.865000+00:00  2016-07-22T18:43:00+00:00   E        2016-02-02T12:12:00+00:00
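As a variant, if the rows with no grades at all can simply be discarded rather than kept, dropping them up front avoids the empty-dict fill; a minimal sketch, starting again from the original frame:

# drop the NaN rows first, then explode and normalize as above
out = (
    df.dropna(subset=['freshness_grades'])
      .explode('freshness_grades')
      .reset_index(drop=True)
)
out = out.join(pd.json_normalize(out['freshness_grades']))
out = out.drop(columns=['freshness_grades'])

Unlike the fillna approach above, the grade-less _id then disappears from the result entirely instead of appearing as an all-NaN row.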