pandas (Python): how to group by multiple columns

python pandas

Here is my problem:

I have a CSV file that looks like this:

SELL,NUMBER,TYPE,MONTH
-1484829.72,25782,E,3
-1337196.63,26688,E,3
-1110271.83,15750,E,3
-1079426.55,16117,E,3
-964656.26,11344,D,1
-883818.81,10285,D,2
-836068.57,14668,E,3
-818612.27,13806,E,3
-765820.92,14973,E,3
-737911.62,8685,D,2
-728828.93,8975,D,1
-632200.31,12384,E
41831481.50,18425,E,2
1835587.70,33516,E,1
1910671.45,20342,E,6
1916569.50,24088,E,6
1922369.40,25101,E,1
2011347.65,23814,E,3
2087659.35,18108,D,3
2126371.86,34803,E,2
2165531.50,35389,E,3
2231818.85,37515,E,3
2282611.90,32422,E,6
2284141.50,21199,A,1
2288121.05,32497,E,6

I want to create a groupby that sums the SELL and NUMBER columns, with negative and positive values kept separate.

I ran this command:

end_result = info.groupby(['TEXTOCANAL']).agg({
    'SELLS': (('negative', lambda x: x[x < 0].sum()), ('positive', lambda x: x[x > 0].sum())),
    'NUMBERS': (('negative', lambda x: x[info['SELLS'] < 0].sum()), ('positive', lambda x: x[info['SELLS'] > 0].sum())),
})

But I would like to build this grouping with the MONTH column added as well.

Something like this:

MONTH  1                                            2
       SELLS                  NUMBERS               SELLS                NUMBERS
       negative    positive   negative  positive    negative  positive   negative  positive
TYPE
A      -1710.60    5145.25    17        9           -xxx.xx   xx.xx      xx        xx
B      -95.40      3391.10    1         29
C      -3802.25    36428.40   191       1063
D      0.00        30.80      0         7
E      -19143.30   102175.05  687       1532

Any ideas?

Thanks in advance for your help.

This should work:

import numpy as np

end_result = (
    info.groupby(['TYPE', 'MONTH', np.sign(info['SELL']).rename('sign')])  # split negative and positive SELL
    [['SELL', 'NUMBER']].sum()  # select columns to aggregate (double brackets: a list of columns)
    # in this case selecting columns is redundant,
    # since those are the only two columns left
    # groupby moves TYPE, MONTH and sign into the index
    .unstack([1, 2])  # move MONTH and sign into the columns
    .reorder_levels([1, 0, 2], axis=1)  # column levels: MONTH, column name, sign
    .sort_index(axis=1)
    .rename({-1: 'negative', 1: 'positive'}, axis=1, level=-1)  # label the sign level
)
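
To try a variant of the chain above without the full file, here is a minimal self-contained sketch. It assumes a handful of rows copied from the question's sample, loaded through io.StringIO rather than a real CSV path. The `sign` name for the helper key is my own choice, and the column level order (MONTH, column name, sign) is picked to match the layout asked for in the question.

```python
import io

import numpy as np
import pandas as pd

# A few rows copied from the question's sample data (assumption: a subset only).
csv_text = """SELL,NUMBER,TYPE,MONTH
-964656.26,11344,D,1
-728828.93,8975,D,1
-883818.81,10285,D,2
2087659.35,18108,D,3
-1484829.72,25782,E,3
2011347.65,23814,E,3
1835587.70,33516,E,1
"""
info = pd.read_csv(io.StringIO(csv_text))

# np.sign maps each SELL value to -1.0, 0.0 or 1.0, which serves as a group key.
sign = np.sign(info["SELL"]).rename("sign")

end_result = (
    info.groupby(["TYPE", "MONTH", sign])[["SELL", "NUMBER"]]
    .sum()
    .unstack([1, 2])                     # MONTH and sign become column levels
    .reorder_levels([1, 0, 2], axis=1)   # MONTH on top, then column name, then sign
    .sort_index(axis=1)
    .rename({-1: "negative", 1: "positive"}, axis=1, level=-1)
)
print(end_result)
```

Combinations that never occur in the data (for example type E with a negative SELL in month 1) show up as NaN in the result; chaining `.fillna(0)` is one way to get zeros there instead, if that is what you want.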

Similar to 里奇耶夫's answer. I didn't know about np.sign; that's a neat trick.

Another approach:

Assign a flag column with np.where to mark each row as positive or negative. Then group by all the non-numeric columns and use .unstack([1, 2]) to move the second and third group keys into the columns.

info = (info.assign(flag=np.where(info['SELL'] > 0, 'positive', 'negative'))
            .groupby(['TYPE', 'MONTH', 'flag'])[['SELL', 'NUMBER']].sum()
            .unstack([1, 2]))

Output (posted as an image in the original answer, since a MultiIndex is messy to paste as text).

Hello, was any of the answers helpful?
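
One difference between the two group keys used in the answers above is how a SELL of exactly zero is treated. The small sketch below uses a made-up three-value Series (not data from the question) to show it: np.sign gives zero its own group, while np.where with the `> 0` test puts zero under 'negative'.

```python
import numpy as np
import pandas as pd

# Hypothetical values, chosen only to include a zero.
sell = pd.Series([-5.0, 0.0, 7.5], name="SELL")

sign_key = np.sign(sell)                                # -1.0, 0.0, 1.0
flag_key = np.where(sell > 0, "positive", "negative")   # zero is labelled "negative"

print(sign_key.tolist())
print(flag_key.tolist())
```

The sample data in the question has no zero SELL values, so both answers should give the same split there.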