Python: error when using groupby
I have the following DataFrame "df1":
id_client product
client1 product1
client1 product4
client1 product5
client2 product1
client2 product6
client3 product1
First, I want to group by id_client and collect the matching products into a list:
id_client product
client1 [product1,product4,product5]
client2 [product1,product6]
client3 [product1]
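
For reference, one way this grouping could be written is sketched below. It assumes the sample data shown above; the variable name products_per_client is made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "id_client": ["client1", "client1", "client1", "client2", "client2", "client3"],
    "product": ["product1", "product4", "product5", "product1", "product6", "product1"],
})

# Take the 'product' column of each group and turn it into a Python list.
# reset_index() moves id_client from the index back into a regular column.
products_per_client = df1.groupby("id_client")["product"].apply(list).reset_index()
print(products_per_client)
```
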
Then, for each element of each list, I want to add a new row to a new DataFrame "df2", where nb_product is the length of the list the element came from.
So, first I created a new dictionary:
nb_of_combination = {}
nb_of_combination['product'] = []
nb_of_combination['nb_product'] = []
Then I declared the following function:
def nb_of_combination(my_list):
    nb_comb = len(my_list)
    for row in my_list:
        nb_of_combination['product'].append(row)
        nb_of_combination['nb_product'].append(nb_comb)
Then I group "df1" by the field "id_client" and apply the function "nb_of_combination":
df1 = df1.groupby('id_client',as_index=False).apply(lambda x: nb_of_combination(list(x.product)))
But I get the following error:
df1 = df1.groupby('id_client',as_index=False).apply(lambda x: nb_of_combination(list(x.product)))
File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 660, in apply
return self._python_apply_general(f)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 667, in _python_apply_general
not_indexed_same=mutated)
File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 2821, in _wrap_applied_output
v = next(v for v in values if v is not None)
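
The last frame shown looks for the first result that is not None. The traceback is cut off here, so the following is only a guess at the cause: the applied function has no return statement, so every group yields None and next() has nothing to return. A plain-Python sketch of that single line, with a made-up values list standing in for the per-group results:

```python
# Stand-in for the per-group results (an assumption): the applied function
# only appends to a dictionary and never returns, so each group gives None.
values = [None, None, None]

raised = False
try:
    # Same expression as the last line of the traceback.
    first_non_none = next(v for v in values if v is not None)
except StopIteration:
    # The generator is empty because every value is None.
    raised = True
    print("next() raised StopIteration: no non-None value in values")
```
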
I really don't understand, because:
df2 = pd.DataFrame(nb_of_combination)
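seems to work fine.

Your approach is overly complicated. You can get what you want by calling transform with the function count and assigning the result back to the original df as a new column. transform returns a Series aligned with the original df, see: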
In [89]:
df['nb_product'] = df.groupby('id_client')['product'].transform('count')
df
Out[89]:
id_client product nb_product
0 client1 product1 3
1 client1 product4 3
2 client1 product5 3
3 client2 product1 2
4 client2 product6 2
5 client3 product1 1
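
If a separate "df2" holding only product and nb_product is still wanted, a minimal sketch along the same lines could look like this. It assumes the sample data from the question; the intermediate name nb_product is introduced here for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "id_client": ["client1", "client1", "client1", "client2", "client2", "client3"],
    "product": ["product1", "product4", "product5", "product1", "product6", "product1"],
})

# Count the products in each client's group; transform keeps the result
# aligned with the rows of df1 (one value per original row).
nb_product = df1.groupby("id_client")["product"].transform("count")

# Hypothetical df2: one row per product occurrence, with the size of the
# group that the row came from.
df2 = pd.DataFrame({"product": df1["product"], "nb_product": nb_product})
print(df2)
```

This avoids the module-level dictionary and the side-effecting function, so nothing depends on what apply does with the function's return value.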