Python pandas: expand a cell containing a list of dictionaries into rows, with each key as a column


I have a DataFrame like this:

     col1                                     col2       col3
0    "[{'key1':'val1'}, {'key1':'val2'}]"        a          g
1    "[{'key1':'val3'}, {'key1':'val4'}]"        b          h
2    "[{'key1':'val5'}, {'key1':'val6'}]"        c          i
I would like to turn it into this:

     col2       col3   key1
0    a          g      val1 
1    a          g      val2
2    b          h      val3
3    b          h      val4
4    c          i      val5
5    c          i      val6
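This is slightly simplified: the dictionaries in col1 have more keys than shown, and there are more than two other columns.

I have seen similar solutions in other posts, but they all assume that col1 holds a regular list. I am still fairly new to pandas and don't know how to adapt them to my case. Any help is appreciated. Thanks.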

Update: I found a solution.

First, I convert the strings into lists of dictionaries:

import json

# Parse each JSON string into a list of dictionaries
df['col1'] = df['col1'].apply(json.loads)
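Note that the sample strings shown in the question use single quotes, which json.loads rejects as invalid JSON. A minimal sketch of a more tolerant parser, assuming the real cells may look like either form; the helper name parse_cell and the two-row sample are hypothetical, not part of the original post:

```python
import ast
import json

import pandas as pd


def parse_cell(text):
    """Parse a stringified list of dicts: try JSON first, then Python literal syntax."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Single-quoted text such as "[{'key1': 'val1'}]" is a Python literal, not JSON
        return ast.literal_eval(text)


df = pd.DataFrame({
    'col1': ["[{'key1':'val1'}, {'key1':'val2'}]", '[{"key1": "val3"}]'],
    'col2': ['a', 'b'],
})
df['col1'] = df['col1'].apply(parse_cell)
```

ast.literal_eval only evaluates literals (strings, numbers, lists, dicts and so on), so it does not execute arbitrary code the way eval would.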
Then I explode it so that each dictionary gets its own row:

res = df.explode('col1')
Then I create one column for each key in the dictionaries:

res[['key1','key2','key3']] = res['col1'].apply(lambda x: self._explode_dict(x))
This is my _explode_dict(row) function. Its purpose is to avoid the errors that occur when an empty dictionary is passed to pd.Series:

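def _explode_dict(self, row):
    # Non-empty dictionary: one Series entry per key
    if isinstance(row, dict) and row:
        return pd.Series(row)
    # Empty dictionary or missing value: blank placeholder for every expected key
    return pd.Series({
        'key1': '',
        'key2': '',
        'key3': '',
    })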
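Putting the steps together, here is one possible end-to-end sketch on the question's sample data. It makes two substitutions that are assumptions rather than the author's code: ast.literal_eval replaces json.loads because the sample strings are single-quoted, and the per-row pd.Series helper is replaced by building a single DataFrame from all the dictionaries at once, which leaves missing values for empty or missing cells. The names records and keys are illustrative.

```python
import ast

import pandas as pd

df = pd.DataFrame({
    'col1': [
        "[{'key1':'val1'}, {'key1':'val2'}]",
        "[{'key1':'val3'}, {'key1':'val4'}]",
        "[{'key1':'val5'}, {'key1':'val6'}]",
    ],
    'col2': ['a', 'b', 'c'],
    'col3': ['g', 'h', 'i'],
})

# 1. Parse each string into a list of dicts (single quotes, so not valid JSON)
df['col1'] = df['col1'].apply(ast.literal_eval)

# 2. One row per dictionary; reset the index so it is unique again
res = df.explode('col1').reset_index(drop=True)

# 3. One column per key; anything that is not a dict becomes an empty record
records = [d if isinstance(d, dict) else {} for d in res['col1']]
keys = pd.DataFrame(records, index=res.index)

# 4. Attach the key columns and drop the original nested column
res = res.drop(columns='col1').join(keys)
print(res)
```

On this sample, res ends up with six rows and the columns col2, col3 and key1. The join would raise if a dictionary key had the same name as an existing column, so real data may need a prefix or suffix.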
df = df.explode('col1').reset_index(drop=True)
and then
df.col1 = df.col1.str.get('key1')
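The explode and str.get lines above can be tried end to end as follows. This is a minimal sketch that assumes col1 already holds parsed lists of dictionaries rather than strings; the sample data is a shortened copy of the question's, and writing the result to a new key1 column before dropping col1 is a choice made here, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [
        [{'key1': 'val1'}, {'key1': 'val2'}],
        [{'key1': 'val3'}, {'key1': 'val4'}],
    ],
    'col2': ['a', 'b'],
    'col3': ['g', 'h'],
})

# One row per dictionary, with a fresh 0..n-1 index
df = df.explode('col1').reset_index(drop=True)

# Series.str.get looks up the key on each dictionary element
df['key1'] = df['col1'].str.get('key1')

# The nested column is no longer needed
df = df.drop(columns='col1')
```

This extracts a single key per call, so dictionaries with several keys would need one str.get call per key or a different approach.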