Python 如何使用具有不同dict和dict列表的数据分解Panda列
我有一个熊猫数据框,它有不同的值集,比如第一个是列表或数组,还有其他元素Python 如何使用具有不同dict和dict列表的数据分解Panda列,python,pandas,explode,Python,Pandas,Explode,我有一个熊猫数据框,它有不同的值集,比如第一个是列表或数组,还有其他元素 >>> df_3['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record'] 0 [{'Internalid': '24348', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL
>>> df_3['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record']
0 [{'Internalid': '24348', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3127'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel4434'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5545'}]}}, {'Internalid': '24349', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3125'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel4268'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5418'}]}}, {'Internalid': '24350', 'isDelete': 'false', 'fields': {'field': [{'id': 'CATEGOR_LEVEL_1', 'value': 'MR'}, {'id': 'LOW_PRODSERV', 'value': 'RES'}, {'id': 'LOW_LEVEL_2', 'value': 'keylevel221'}, {'id': 'LOW_LEVEL_3', 'value': 'keylevel3122'}, {'id': 'LOW_LEVEL_4', 'value': 'keylevel425'}, {'id': 'LOW_LEVEL_5', 'value': 'keylevel5221'}]}}]
0 {'isDelete': 'false', 'fields': {'field': [{'id': 'S_EAST', 'value': 'N'}, {'id': 'W_EST', 'value': 'N'}, {'id': 'M_WEST', 'value': 'N'}, {'id': 'N_EAST', 'value': 'N'}, {'id': 'LOW_AREYOU_ASSET', 'value': '-1'}, {'id': 'LOW_SWART_PROG', 'value': '-1'}]}}
0 {'isDelete': 'false', 'fields': {'field': {'id': 'LOW_COD_CONDUCT', 'value': '-1'}}}
0 {'isDelete': 'false', 'fields': {'field': [{'id': 'LOW_SUPPLIER_TYPE', 'value': '2'}, {'id': 'LOW_DO_INT_BOTH', 'value': '1'}]}}
我想把它分解成多行。第一行是列表还是其他行
>>> type(df_3)
<class 'pandas.core.frame.DataFrame'>
>>> type(df_3['integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record'])
<class 'pandas.core.series.Series'>
我试着把这个柱炸开
>>> df_3.explode('integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib64/python3.6/site-packages/pandas/core/frame.py", line 6318, in explode
result = df[column].explode()
File "/usr/local/lib64/python3.6/site-packages/pandas/core/series.py", line 3504, in explode
values, counts = reshape.explode(np.asarray(self.array))
File "pandas/_libs/reshape.pyx", line 129, in pandas._libs.reshape.explode
KeyError: 0
df_3.explode('integration-outbound:IntegrationEntity.integrationEntityDetails.supplier.forms.form.records.record'))
回溯(最近一次呼叫最后一次):
文件“”,第1行,在
explode中的文件“/usr/local/lib64/python3.6/site packages/pandas/core/frame.py”,第6318行
结果=df[列]。分解()
explode中的文件“/usr/local/lib64/python3.6/site packages/pandas/core/series.py”,第3504行
值,计数=重塑.explode(np.asarray(self.array))
文件“pandas/_libs/reforme.pyx”,第129行,在pandas._libs.reforme.explode中
关键错误:0
我可以浏览每一行,试着找出它是否是一个列表,并实现一些东西,但它似乎不正确
if str(type(df_3.loc[i,'{}'.format(c)])) == "<class 'list'>":
if str(类型(df_3.loc[i',{}.格式(c)])==”:
有什么方法可以对此类数据使用分解函数吗?我可以这样做,但是分解的行都被过滤到数据帧的顶部(以防在较低的行中有更多列表类型的对象) 它基本上和您所说的一样:检查对象类型是否为list,如果是,则分解。然后将分解后的序列与其余数据(即非列表)连接起来。较长的数据帧似乎不会对性能造成太大影响 输出:
Column
0 {'Internalid': '24348', 'isDelete': 'false', '...
0 {'Internalid': '24349', 'isDelete': 'false', '...
0 {'Internalid': '24350', 'isDelete': 'false', '...
1 {'isDelete': 'false', 'fields': {'field': [{'i...
2 {'isDelete': 'false', 'fields': {'field': {'id...
3 {'isDelete': 'false', 'fields': {'field': [{'i...
使用读取xml的替代方法
从pandas\u read\u xml导入展平,完全展平
df=展平(df)
很高兴它有帮助:)
pd.concat((df.iloc[[type(item) == list for item in df['Column']]].explode('Column'),
df.iloc[[type(item) != list for item in df['Column']]]))
Column
0 {'Internalid': '24348', 'isDelete': 'false', '...
0 {'Internalid': '24349', 'isDelete': 'false', '...
0 {'Internalid': '24350', 'isDelete': 'false', '...
1 {'isDelete': 'false', 'fields': {'field': [{'i...
2 {'isDelete': 'false', 'fields': {'field': {'id...
3 {'isDelete': 'false', 'fields': {'field': [{'i...