pandas (Python): how to group by multiple columns
This is my problem: I have a CSV file that looks like this:
SELL,NUMBER,TYPE,MONTH
-1484829.72,25782,E,3
-1337196.63,26688,E,3
-1110271.83,15750,E,3
-1079426.55,16117,E,3
-964656.26,11344,D,1
-883818.81,10285,D,2
-836068.57,14668,E,3
-818612.27,13806,E,3
-765820.92,14973,E,3
-737911.62,8685,D,2
-728828.93,8975,D,1
-632200.31,12384,E
41831481.50,18425,E,2
1835587.70,33516,E,1
1910671.45,20342,E,6
1916569.50,24088,E,6
1922369.40,25101,E,1
2011347.65,23814,E,3
2087659.35,18108,D,3
2126371.86,34803,E,2
2165531.50,35389,E,3
2231818.85,37515,E,3
2282611.90,32422,E,6
2284141.50,21199,A,1
2288121.05,32497,E,6
I want to group by TYPE and sum the SELL and NUMBER columns, keeping negative and positive values separate.
I ran this command:
end_result = info.groupby(['TEXTOCANAL']).agg({
    'SELLS': (('negative', lambda x: x[x < 0].sum()), ('positiv', lambda x: x[x > 0].sum())),
    'NUMBERS': (('negative', lambda x: x[info['SELLS'] < 0].sum()), ('positive', lambda x: x[info['SELLS'] > 0].sum())),
})
But I want to build this grouping with the MONTH column added as well.
Something like this:
MONTH  1                                          2
       SELLS                 NUMBERS              SELLS                 NUMBERS
       negative   positive   negative  positive   negative   positive   negative  positive
TYPE
A       -1710.60    5145.25        17         9    -xxx.xx      xx.xx         xx        xx
B         -95.40    3391.10         1        29
C       -3802.25   36428.40       191      1063
D           0.00      30.80         0         7
E      -19143.30  102175.05       687      1532
Any ideas?
Thanks in advance for your help.
This should work:
import numpy as np

end_result = (
    info.groupby(['TYPE', 'MONTH', np.sign(info.SELL)])  # split negative and positive SELL
    [['SELL', 'NUMBER']].sum()           # select the columns to aggregate
                                         # (redundant here: they are the only two columns left)
                                         # groupby moves TYPE, MONTH and the sign into the index
    .unstack([1, 2])                     # move MONTH and the sign into the columns
    .reorder_levels([1, 0, 2], axis=1)   # column levels: MONTH, measure, pos/neg last
    .sort_index(axis=1)                  # keep each month's columns together
    .rename({-1: 'negative', 1: 'positive'}, axis=1, level=-1)
)
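To make the np.sign grouping above easier to try out, here is a minimal self-contained sketch. The five sample rows are hypothetical (they only borrow the SELL/NUMBER/TYPE/MONTH layout from the question), and the level name `sign` is an assumption added so that the new index level does not share a name with the SELL column.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows in the question's SELL/NUMBER/TYPE/MONTH layout.
info = pd.DataFrame({
    "SELL":   [-100.0, 250.0, -40.0, 60.0, 500.0],
    "NUMBER": [10, 20, 4, 6, 50],
    "TYPE":   ["D", "D", "E", "E", "E"],
    "MONTH":  [1, 1, 1, 2, 2],
})

# np.sign maps each SELL value to -1.0, 0.0 or 1.0. Renaming the Series
# (an assumption of this sketch) gives the grouping level its own name.
sign = np.sign(info["SELL"]).rename("sign")

end_result = (
    info.groupby(["TYPE", "MONTH", sign])[["SELL", "NUMBER"]]
    .sum()
    .unstack(["MONTH", "sign"])          # MONTH and sign become column levels
    .reorder_levels([1, 0, 2], axis=1)   # column order: MONTH, measure, sign
    .sort_index(axis=1)                  # keep each month's columns together
    .rename(columns={-1.0: "negative", 1.0: "positive"}, level="sign")
)
print(end_result)
```

Combinations that never occur in the data (for example type D in month 2) show up as NaN, and a SELL of exactly zero would land in its own 0.0 group rather than under either label.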
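Similar to the other answer. I did not know about np.sign; that is a neat trick. Another way is to assign a flag column with np.where that marks each row as positive or negative. Then group by all the non-numeric columns and use .unstack([1,2]) to move the second and third index levels into the columns.
Output (originally posted as an image, because a MultiIndex is messy to show as text).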
Hi, was any of the answers useful?
# Alternative: flag each row with np.where, then group and unstack
import numpy as np
info = (info.assign(flag=np.where(info['SELL'] > 0, 'positive', 'negative'))
        .groupby(['TYPE', 'MONTH', 'flag'])[['SELL', 'NUMBER']].sum()
        .unstack([1, 2]))
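The flag-column idea can probably also be written with `pivot_table` instead of groupby plus unstack. This is a swapped-in alternative that neither answer proposed, shown here only as a sketch; the sample rows are hypothetical and `flagged` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows in the question's SELL/NUMBER/TYPE/MONTH layout.
info = pd.DataFrame({
    "SELL":   [-100.0, 250.0, -40.0, 60.0, 500.0],
    "NUMBER": [10, 20, 4, 6, 50],
    "TYPE":   ["D", "D", "E", "E", "E"],
    "MONTH":  [1, 1, 1, 2, 2],
})

# Label each row; note that a SELL of exactly 0 is labelled "negative" here.
flagged = info.assign(flag=np.where(info["SELL"] > 0, "positive", "negative"))

# pivot_table groups by TYPE (rows) and MONTH/flag (columns) in one call.
end_result = flagged.pivot_table(
    index="TYPE",
    columns=["MONTH", "flag"],
    values=["SELL", "NUMBER"],
    aggfunc="sum",
)
print(end_result)
```

The column levels come out as measure, MONTH, flag; if MONTH should be the top level as in the question, `reorder_levels([1, 0, 2], axis=1)` followed by `sort_index(axis=1)` would rearrange them.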