Python: How do I divide pandas groupby counts by the counts of each group?
I have a dataframe of transactions and customer social groups:
print(df.sample(10))
           Shop  Transaction_value Social Group
7           KFC                  7         Rich
22  Burger King                342         Rich
19  Burger King                  6         Rich
5           KFC                  2         Poor
14    McDonalds                245         Rich
2           KFC                  3         Poor
16    McDonalds                 56         Poor
6           KFC                  6         Poor
20  Burger King                 23         Poor
8           KFC                  5         Poor
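I did a groupby that counts the transactions per social group for each shop, which tells me the most common social group per shop: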
(df.groupby(['Shop', 'Social Group'])['Transaction_value'].count())
Shop         Social Group
Burger King  Poor            7
             Rich            3
KFC          Poor            6
             Rich            3
McDonalds    Poor            3
             Rich            6
I want to divide the numbers above by the value_counts() of each social group:
df['Social Group'].value_counts()
Poor 16
Rich 12
So in my first groupby, wherever there is Poor, I want to divide by 16, and wherever there is Rich, I want to divide by 12.
I would end up with a dataframe like this:
Shop         Social Group
Burger King  Poor            0.44
             Rich            0.25
KFC          Poor            0.38
             Rich            0.25
McDonalds    Poor            0.19
             Rich            0.50
I tried div() for this. I expected the indexes of the two objects to line up, but it does not work:
(df.groupby(['Shop', 'Social Group'])['Transaction_value']
.count()
.div(df['Social Group'].value_counts()))
ValueError: cannot join with no overlapping index names
Is what I am trying to do possible with built-in functions?
I think I could do it with a for loop, but that would take a long time.
My df:
df.to_dict()
{'Shop': {0: 'KFC',
1: 'KFC',
2: 'KFC',
3: 'KFC',
4: 'KFC',
5: 'KFC',
6: 'KFC',
7: 'KFC',
8: 'KFC',
9: 'McDonalds',
10: 'McDonalds',
11: 'McDonalds',
12: 'McDonalds',
13: 'McDonalds',
14: 'McDonalds',
15: 'McDonalds',
16: 'McDonalds',
17: 'McDonalds',
18: 'Burger King',
19: 'Burger King',
20: 'Burger King',
21: 'Burger King',
22: 'Burger King',
23: 'Burger King',
24: 'Burger King',
25: 'Burger King',
26: 'Burger King',
27: 'Burger King'},
'Transaction_value': {0: 1,
1: 2,
2: 3,
3: 34,
4: 2,
5: 2,
6: 6,
7: 7,
8: 5,
9: 4,
10: 3,
11: 2,
12: 12,
13: 31,
14: 245,
15: 123,
16: 56,
17: 67,
18: 68,
19: 6,
20: 23,
21: 44,
22: 342,
23: 234,
24: 3,
25: 234,
26: 666,
27: 88},
'Social Group': {0: 'Poor',
1: 'Rich',
2: 'Poor',
3: 'Poor',
4: 'Rich',
5: 'Poor',
6: 'Poor',
7: 'Rich',
8: 'Poor',
9: 'Rich',
10: 'Rich',
11: 'Rich',
12: 'Rich',
13: 'Rich',
14: 'Rich',
15: 'Poor',
16: 'Poor',
17: 'Poor',
18: 'Poor',
19: 'Rich',
20: 'Poor',
21: 'Poor',
22: 'Rich',
23: 'Poor',
24: 'Poor',
25: 'Rich',
26: 'Poor',
27: 'Poor'}}
You are close. You need level=1 so that the division is matched against the second level of the MultiIndex:
s = df['Social Group'].value_counts()
s1 = df.groupby(['Shop', 'Social Group'])['Transaction_value'].count().div(s, level=1)
print(s1)
Shop         Social Group
Burger King  Poor            0.4375
             Rich            0.2500
KFC          Poor            0.3750
             Rich            0.2500
McDonalds    Poor            0.1875
             Rich            0.5000
dtype: float64
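
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's data in compact form and applies the same div(..., level=...) call. The names counts, totals and share are my own, and passing the level by name rather than by position is just a readability choice; level=1 from the answer above should give the same result. The last step is an extra sanity check that is not part of the original answer.

```python
import pandas as pd

# Same 28 rows as the df.to_dict() dump in the question, written compactly.
shops = ["KFC"] * 9 + ["McDonalds"] * 9 + ["Burger King"] * 10
groups = (
    ["Poor", "Rich", "Poor", "Poor", "Rich", "Poor", "Poor", "Rich", "Poor"]  # KFC
    + ["Rich"] * 6 + ["Poor"] * 3                                             # McDonalds
    + ["Poor", "Rich", "Poor", "Poor", "Rich",
       "Poor", "Poor", "Rich", "Poor", "Poor"]                                # Burger King
)
values = [1, 2, 3, 34, 2, 2, 6, 7, 5,
          4, 3, 2, 12, 31, 245, 123, 56, 67,
          68, 6, 23, 44, 342, 234, 3, 234, 666, 88]
df = pd.DataFrame({"Shop": shops, "Transaction_value": values, "Social Group": groups})

# Transactions per (Shop, Social Group): a Series with a two-level MultiIndex.
counts = df.groupby(["Shop", "Social Group"])["Transaction_value"].count()

# Transactions per Social Group: a Series with a plain single-level index.
totals = df["Social Group"].value_counts()

# Align the totals on the "Social Group" level and broadcast them across shops.
share = counts.div(totals, level="Social Group")
print(share)

# Sanity check: within each social group the shares should add up to 1.
check = share.groupby(level="Social Group").sum()
print(check)
```

Because every transaction of a social group belongs to exactly one shop, the shares within each group summing to 1 is a quick way to confirm the division was aligned on the intended level.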
Thanks a lot, div() is really useful here. Are there multiply(), subtract() or add() counterparts of div()?
@SCool - You are right, there are such functions -
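
The list in the comment above is cut off, so the following is only a sketch of what it most likely refers to: pandas Series also has add(), sub() and mul() methods that accept the same level argument as div(). The small counts and per_group Series below are illustrative stand-ins built by hand (values taken from the KFC and McDonalds rows of the question), not output from the original thread.

```python
import pandas as pd

# Hand-built stand-in for the groupby result: a Series with a two-level MultiIndex.
idx = pd.MultiIndex.from_product(
    [["KFC", "McDonalds"], ["Poor", "Rich"]], names=["Shop", "Social Group"]
)
counts = pd.Series([6, 3, 3, 6], index=idx)

# Hand-built stand-in for value_counts(): one number per social group.
per_group = pd.Series({"Poor": 16, "Rich": 12})

# Each method aligns per_group on level 1 ("Social Group"), just like div().
added = counts.add(per_group, level=1)
subtracted = counts.sub(per_group, level=1)
multiplied = counts.mul(per_group, level=1)

print(added)
print(subtracted)
print(multiplied)
```

In each case the Poor rows are combined with 16 and the Rich rows with 12, regardless of the shop.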