Python pandas groupby indices behave differently with 2 rows vs. 3 rows
I have two DataFrames, df1 and df2. They have the same format; the only difference is that the first has 3 rows and the second has 2 rows.
import pandas as pd

df1 = pd.DataFrame({
    'Company': ['Foo Inc.', 'Foo Inc.', 'Foo Inc.'],
    'ID': ['123456', '123456', '123456'],
    'Employee': ['John Doe', 'Richard Roe', 'Jane Doe'],
    'Position': ['Executive Director', 'Director', 'Company Secretary']
})
df2 = pd.DataFrame({
    'Company': ['Bar Inc.', 'Bar Inc.'],
    'ID': ['56789', '56789'],
    'Employee': ['Mark Moe', 'Larry Loe'],
    'Position': ['Tax Consultant', 'Company Secretary']
})
print(df1)
    Company     Employee      ID            Position
0  Foo Inc.     John Doe  123456  Executive Director
1  Foo Inc.  Richard Roe  123456            Director
2  Foo Inc.     Jane Doe  123456   Company Secretary
print(df2)
    Company   Employee     ID           Position
0  Bar Inc.   Mark Moe  56789     Tax Consultant
1  Bar Inc.  Larry Loe  56789  Company Secretary
When I try the following, it works for the first DataFrame but not for the second:
gb1 = df1.set_index(['Company', 'ID', 'Employee']).groupby(['Company', 'ID'])
gb2 = df2.set_index(['Company', 'ID', 'Employee']).groupby(['Company', 'ID'])
for (name, id), new_df in gb1:
    print(name)
    print(id)
for (name, id), new_df in gb2:
    print(name)
    print(id)
Foo Inc.
123456
      3     print(id)
      4
----> 5 for (name, id), new_df in gb2:
      6     print(name)
      7     print(id)
ValueError: too many values to unpack (expected 2)
This is because their indices are different:
gb1.indices
>>> {('Foo Inc.', '123456'): array([0, 1, 2], dtype=int64)}
gb2.indices
>>> {'Company': array([0], dtype=int64), 'ID': array([1], dtype=int64)}
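Am I missing something? Is this a bug?

When you call set_index, the keys you group by are no longer columns; they are index levels. So pass them to groupby with level=. If you group by index names without level=, pandas may not treat the list as key names: a list whose length equals the number of rows can be taken as one label per row, which would explain why gb2.indices shows 'Company' and 'ID' as the group names and why the tuple unpacking fails only on the 2-row frame.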
gb2 = df2.set_index(['Company', 'ID', 'Employee']).groupby(level=['Company', 'ID'])
for (name, id), new_df in gb2:
    print(name)
    print(id)
Bar Inc.
56789
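
As a quick check that the level= form does not depend on the row count, here is a minimal sketch that runs it on both frames from the question. The helper name group_keys is made up for illustration and is not part of pandas. The last part passes an explicit NumPy array with one label per row, which is the "values used as-is" interpretation that seems to be behind the odd gb2.indices output. Whether a plain list of index names triggers that interpretation appears to depend on the pandas version, so that exact case is not reproduced here.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'Company': ['Foo Inc.', 'Foo Inc.', 'Foo Inc.'],
    'ID': ['123456', '123456', '123456'],
    'Employee': ['John Doe', 'Richard Roe', 'Jane Doe'],
    'Position': ['Executive Director', 'Director', 'Company Secretary']
})
df2 = pd.DataFrame({
    'Company': ['Bar Inc.', 'Bar Inc.'],
    'ID': ['56789', '56789'],
    'Employee': ['Mark Moe', 'Larry Loe'],
    'Position': ['Tax Consultant', 'Company Secretary']
})


def group_keys(df):
    """Hypothetical helper: list the (Company, ID) keys when grouping on index levels."""
    indexed = df.set_index(['Company', 'ID', 'Employee'])
    # level= names index levels explicitly, so the list cannot be
    # mistaken for an array of per-row labels.
    grouped = indexed.groupby(level=['Company', 'ID'])
    return [key for key, _ in grouped]


keys1 = group_keys(df1)
keys2 = group_keys(df2)
print(keys1)
print(keys2)

# An array with one entry per row is used as-is as the grouping values:
# row 0 goes to group 'first', row 1 to group 'second'.
labels = np.array(['first', 'second'])
label_sizes = df2.groupby(labels).size().to_dict()
print(label_sizes)
```

Both frames yield a single two-element key, so the (name, id) unpacking in the loop works for either one. Another option, if the index is not needed for grouping, is to call groupby(['Company', 'ID']) on the original columns before set_index.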