Python: is Groupby.cumsum() blank when the summed column equals zero?

Tags: python, pandas, group-by, sum, series

I have a DataFrame.groupby().cumsum() on a DataFrame that looks like this:

   Col_A  Col_B  Col_C
1      A      0
2      A      1      1
3      A      1      2
4      A      1      3
5      B      0      0
6      B      1      1
7      B      0
8      B      1      2
9      C      1      1
10     C      1      2
11     C      1      3
12     C      0
The running sum of Col_B is df.groupby(['Col_A'])['Col_B'].cumsum(). However, wherever Col_B == 0, the .cumsum() result is blank. How can I still record the .cumsum() value on those rows?

The resulting DataFrame should look like:

   Col_A  Col_B  Col_C
1      A      0      0
2      A      1      1
3      A      1      2
4      A      1      3
5      B      0      0
6      B      1      1
7      B      0      1
8      B      1      2
9      C      1      1
10     C      1      2
11     C      1      3
12     C      0      3

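I think you need to filter out the rows where Col_B is 0 first, for example with query, and take the grouped cumsum of the remaining rows.

Finally, replace the NaNs by forward-filling with ffill (fillna with method='ffill'). The leading values are still NaN after that, so they are replaced with fillna(0), and as a last step the column is converted to int: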

# Forward-fill the NaNs, set any leading NaNs to 0, then cast to int.
df['Col_C'] = df['Col_C'].ffill().fillna(0).astype(int)
print(df)
   Col_A  Col_B  Col_C
1      A      0      0
2      A      1      1
3      A      1      2
4      A      1      3
5      B      0      3
6      B      1      1
7      B      0      1
8      B      1      2
9      C      1      1
10     C      1      2
11     C      1      3
12     C      0      3
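
Note that the plain ffill() above carries the 3 from group A into row 5, while the question expects 0 there. A minimal sketch of a group-wise variant follows, assuming the forward fill should not cross Col_A boundaries. The frame construction is rebuilt from the question's table and is not part of the original answer.

```python
import pandas as pd

# Frame rebuilt from the question's table (index 1..12).
df = pd.DataFrame(
    {
        "Col_A": list("AAAABBBBCCCC"),
        "Col_B": [0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0],
    },
    index=range(1, 13),
)

# Cumulative sum over the non-zero rows only; filtered-out rows become NaN
# when the result is aligned back onto the full index.
df["Col_C"] = df.query("Col_B != 0").groupby("Col_A")["Col_B"].cumsum()

# Forward-fill inside each Col_A group so values never cross a group boundary,
# then set the remaining leading NaNs to 0 and cast to int.
df["Col_C"] = df.groupby("Col_A")["Col_C"].ffill().fillna(0).astype(int)
print(df)
```

With this variant, rows 1 and 5 both start at 0, which matches the output requested in the question.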

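A value of 0 in a column is not the same as a completely blank value. If the column contains NAs, the .cumsum() for those entries really should be NA (or "blank", as you call it). You can check whether the entire column is NA and set the values accordingly.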


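This solution is correct as implemented. I thought .ffill() after .groupby() might be the right approach, and indeed it was. Thanks!
Glad I could help, have a nice weekend!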
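# Filtering step: cumulative sum over the rows where Col_B != 0 only;
# the filtered-out rows come back as NaN after index alignment.
df['Col_C'] = df.query('Col_B != 0').groupby(['Col_A'])['Col_B'].cumsum()
print(df)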
   Col_A  Col_B  Col_C
1      A      0    NaN
2      A      1    1.0
3      A      1    2.0
4      A      1    3.0
5      B      0    NaN
6      B      1    1.0
7      B      0    NaN
8      B      1    2.0
9      C      1    1.0
10     C      1    2.0
11     C      1    3.0
12     C      0    NaN
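
The same filter can also be written with a boolean mask instead of query. The sketch below is only an illustration of that equivalence; the names mask, via_mask and via_query are made up here, and the frame is rebuilt from the question's table.

```python
import pandas as pd

# Frame rebuilt from the question's table (index 1..12).
df = pd.DataFrame(
    {
        "Col_A": list("AAAABBBBCCCC"),
        "Col_B": [0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0],
    },
    index=range(1, 13),
)

# Boolean mask selecting the rows where Col_B is non-zero.
mask = df["Col_B"] != 0

# Grouped cumulative sum over the filtered rows, written two ways.
via_mask = df[mask].groupby("Col_A")["Col_B"].cumsum()
via_query = df.query("Col_B != 0").groupby("Col_A")["Col_B"].cumsum()

# Both filters keep the same rows, so the two results are identical.
print(via_mask.equals(via_query))
```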
From the pandas documentation:

DataFrame.cumsum(axis=None, skipna=True, *args, **kwargs)
Return cumulative sum over requested axis.

skipna : boolean, default True
Exclude NA/null values. If an entire row/column is NA, the result will be NA
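
The difference between zeros and missing values can be seen in a small sketch. It assumes that the blanks in the question come from NaN entries in Col_B rather than real zeros, which the question does not confirm. The names zeros and blanks are illustrative only.

```python
import pandas as pd

# Frame rebuilt from the question's table, with real zeros in Col_B.
zeros = pd.DataFrame(
    {
        "Col_A": list("AAAABBBBCCCC"),
        "Col_B": [0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0],
    },
    index=range(1, 13),
)

# Same frame, but every zero is a missing value instead (assumption).
blanks = zeros.copy()
blanks["Col_B"] = zeros["Col_B"].where(zeros["Col_B"] != 0)

# With real zeros the grouped cumsum has a value on every row.
zeros["Col_C"] = zeros.groupby("Col_A")["Col_B"].cumsum()

# With NaN inputs those rows stay NaN, while the running total of the
# remaining rows in the group continues (skipna behaviour).
blanks["Col_C"] = blanks.groupby("Col_A")["Col_B"].cumsum()

print(zeros)
print(blanks)
```

In the first frame Col_C already matches the output the question asks for, so if Col_B holds NaN, filling it with 0 before the grouped cumsum may be enough.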