Python: cumulative sum with a window function

I am using the tips dataset from seaborn:

import pandas as pd
import seaborn as sns
tips = sns.load_dataset("tips")
tips['rowid'] = tips.index
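I want to create a column holding a cumulative count of the people who tipped 3 or more, were male, and came for dinner. The count should not include the current row (hence the 1 PRECEDING in the query below).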

The SQL equivalent is:

SELECT *, 
    SUM(CASE WHEN tip >= 3 AND sex='Male' AND time='Dinner' THEN 1 ELSE NULL END) 
            OVER (PARTITION BY sex, time ORDER BY rowid ROWS BETWEEN unbounded PRECEDING AND 1 PRECEDING) as cnt
FROM tips
ORDER BY rowid ;
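How can I achieve the same result in pandas? From what I have read, I might need some combination of rolling and transform functions, but I have not managed to make it work.

The final dataframe should contain the following:

Edit: dataframe slice requested by ansev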

    total_bill  tip sex smoker  day time    size    rowid   cnt
index                                   
0   16.99   1.01    Female  No  Sun Dinner  2   0   NaN
1   10.34   1.66    Male    No  Sun Dinner  3   1   NaN
2   21.01   3.50    Male    No  Sun Dinner  3   2   NaN
3   23.68   3.31    Male    No  Sun Dinner  2   3   1.0
4   24.59   3.61    Female  No  Sun Dinner  4   4   NaN
5   25.29   4.71    Male    No  Sun Dinner  4   5   2.0
6   8.77    2.00    Male    No  Sun Dinner  2   6   3.0
7   26.88   3.12    Male    No  Sun Dinner  4   7   3.0
8   15.04   1.96    Male    No  Sun Dinner  2   8   4.0
9   14.78   3.23    Male    No  Sun Dinner  2   9   4.0
10  10.27   1.71    Male    No  Sun Dinner  2   10  5.0
11  35.26   5.00    Female  No  Sun Dinner  4   11  NaN
12  15.42   1.57    Male    No  Sun Dinner  2   12  5.0
13  18.43   3.00    Male    No  Sun Dinner  4   13  5.0
14  14.83   3.02    Female  No  Sun Dinner  2   14  NaN
15  21.58   3.92    Male    No  Sun Dinner  2   15  6.0
16  10.33   1.67    Female  No  Sun Dinner  3   16  NaN
17  16.29   3.71    Male    No  Sun Dinner  3   17  7.0
18  16.97   3.50    Female  No  Sun Dinner  3   18  NaN
19  20.65   3.35    Male    No  Sat Dinner  3   19  8.0
20  17.92   4.08    Male    No  Sat Dinner  2   20  9.0
21  20.29   2.75    Female  No  Sat Dinner  2   21  NaN
22  15.77   2.23    Female  No  Sat Dinner  2   22  NaN
23  39.42   7.58    Male    No  Sat Dinner  4   23  10.0
24  19.82   3.18    Male    No  Sat Dinner  2   24  11.0
25  17.81   2.34    Male    No  Sat Dinner  4   25  12.0
26  13.37   2.00    Male    No  Sat Dinner  2   26  12.0
27  12.69   2.00    Male    No  Sat Dinner  2   27  12.0
28  21.70   4.30    Male    No  Sat Dinner  2   28  12.0
29  19.65   3.00    Female  No  Sat Dinner  2   29  NaN
I think you need:

df['cnt'] = ( df.loc[df['sex'].eq('Male') & df['time'].eq('Dinner'),'tip']
                .ge(3)
                .cumsum()
                .shift() )

# if the rows are not already ordered by rowid
#df['cnt'] = ( df.sort_values('rowid')
#                .loc[df['sex'].eq('Male') & df['time'].eq('Dinner'),'tip']
#                .ge(3)
#                .cumsum()
#                .shift() )
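
To see what each step of that chain contributes, here is a minimal sketch on the first eight rows of the slice from the question. The intermediate names (mask, hits, running, previous) are introduced only for illustration and are not part of the answer. The sketch builds the frame by hand so it does not depend on seaborn being able to download the dataset.

```python
import pandas as pd

# First eight rows of the slice shown in the question (only the columns used here).
df = pd.DataFrame({
    "tip":  [1.01, 1.66, 3.50, 3.31, 3.61, 4.71, 2.00, 3.12],
    "sex":  ["Female", "Male", "Male", "Male", "Female", "Male", "Male", "Male"],
    "time": ["Dinner"] * 8,
})

# Rows that belong to the Male/Dinner partition.
mask = df["sex"].eq("Male") & df["time"].eq("Dinner")

hits = df.loc[mask, "tip"].ge(3)   # True where a Male/Dinner row tipped >= 3
running = hits.cumsum()            # running count that includes the current row
previous = running.shift()         # move down one row to exclude the current row

# Assignment aligns on the index, so rows outside the mask get NaN.
df["cnt"] = previous
print(df)
```

On this data the second Male/Dinner row (index 2) ends up with 0.0, because the only earlier matching row tipped less than 3. The SQL query would report NULL there, since SUM over nothing but NULLs is NULL.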
Update

df['cnt']=( df.loc[df['sex'].eq('Male') & df['time'].eq('Dinner'),'tip']
              .ge(3)
              .cumsum()
              .shift()
              .where(lambda x: x.gt(0))
            )

#       total_bill   tip     sex smoker  day    time  size  rowid   cnt
#index                                                                 
#0           16.99  1.01  Female     No  Sun  Dinner     2      0   NaN
#1           10.34  1.66    Male     No  Sun  Dinner     3      1   NaN
#2           21.01  3.50    Male     No  Sun  Dinner     3      2   NaN
#3           23.68  3.31    Male     No  Sun  Dinner     2      3   1.0
#4           24.59  3.61  Female     No  Sun  Dinner     4      4   NaN
#5           25.29  4.71    Male     No  Sun  Dinner     4      5   2.0
#6            8.77  2.00    Male     No  Sun  Dinner     2      6   3.0
#7           26.88  3.12    Male     No  Sun  Dinner     4      7   3.0
#8           15.04  1.96    Male     No  Sun  Dinner     2      8   4.0
#9           14.78  3.23    Male     No  Sun  Dinner     2      9   4.0
#10          10.27  1.71    Male     No  Sun  Dinner     2     10   5.0
#11          35.26  5.00  Female     No  Sun  Dinner     4     11   NaN
#12          15.42  1.57    Male     No  Sun  Dinner     2     12   5.0
#13          18.43  3.00    Male     No  Sun  Dinner     4     13   5.0
#14          14.83  3.02  Female     No  Sun  Dinner     2     14   NaN
#15          21.58  3.92    Male     No  Sun  Dinner     2     15   6.0
#16          10.33  1.67  Female     No  Sun  Dinner     3     16   NaN
#17          16.29  3.71    Male     No  Sun  Dinner     3     17   7.0
#18          16.97  3.50  Female     No  Sun  Dinner     3     18   NaN
#19          20.65  3.35    Male     No  Sat  Dinner     3     19   8.0
#20          17.92  4.08    Male     No  Sat  Dinner     2     20   9.0
#21          20.29  2.75  Female     No  Sat  Dinner     2     21   NaN
#22          15.77  2.23  Female     No  Sat  Dinner     2     22   NaN
#23          39.42  7.58    Male     No  Sat  Dinner     4     23  10.0
#24          19.82  3.18    Male     No  Sat  Dinner     2     24  11.0
#25          17.81  2.34    Male     No  Sat  Dinner     4     25  12.0
#26          13.37  2.00    Male     No  Sat  Dinner     2     26  12.0
#27          12.69  2.00    Male     No  Sat  Dinner     2     27  12.0
#28          21.70  4.30    Male     No  Sat  Dinner     2     28  12.0
#29          19.65  3.00  Female     No  Sat  Dinner     2     29   NaN
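
As an alternative, the window clause can be mirrored more literally. The following is only a sketch of a different technique, not the approach used in the answer: it builds a 1.0/NaN flag for the CASE expression, groups by sex and time for PARTITION BY, and uses an expanding sum followed by shift() for ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING. The names cond and flag are hypothetical, and the frame is again the first eight rows of the question's slice, assumed to be already ordered by rowid.

```python
import numpy as np
import pandas as pd

# First eight rows of the slice shown in the question (only the columns used here).
df = pd.DataFrame({
    "tip":  [1.01, 1.66, 3.50, 3.31, 3.61, 4.71, 2.00, 3.12],
    "sex":  ["Female", "Male", "Male", "Male", "Female", "Male", "Male", "Male"],
    "time": ["Dinner"] * 8,
})

# CASE WHEN tip >= 3 AND sex = 'Male' AND time = 'Dinner' THEN 1 ELSE NULL END
cond = df["tip"].ge(3) & df["sex"].eq("Male") & df["time"].eq("Dinner")
flag = pd.Series(np.where(cond, 1.0, np.nan), index=df.index)

# PARTITION BY sex, time with a frame that ends one row before the current row.
# expanding().sum() skips NaN and stays NaN until a window holds at least one
# non-NaN value, which matches SUM returning NULL over all-NULL input.
df["cnt"] = flag.groupby([df["sex"], df["time"]]).transform(
    lambda s: s.expanding().sum().shift()
)
print(df)
```

Because the flag is NaN rather than 0 for non-matching rows, no extra where() step is needed to blank out leading zeros. If the frame is not ordered by rowid, it would need a sort_values('rowid') first.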

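Can you copy and paste the dataframe? Then I can use pd.read_clipboard() to check my answer and help you :)

The dataset contains 244 rows; I wrote the code that loads it from the seaborn library :). Your answer looks good, I am checking it ;)

@ansev dataframe added :)

Thanks. Do you want to remove the initial 0?

If possible, please remove it!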