Python Pandas DataFrame: how do I get a rolling sum grouped by value?


python pandas


Using some COVID-19 data, how should I compute a 14-day rolling sum of case counts?

Here is my current code:

import pandas as pd
import matplotlib.pyplot as plt

url = 'https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv'
all_counties = pd.read_csv(url, dtype={"fips": str})
all_counties.date = pd.to_datetime(all_counties.date)
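# .copy() so that the in-place edits below act on an independent frame
# and do not trigger SettingWithCopyWarning
oregon = all_counties.loc[all_counties['state'] == 'Oregon'].copy()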

oregon.set_index('date', inplace=True)
oregon['delta']=oregon.groupby(['state','county'])['cases'].diff().fillna(0)
oregon.head()

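This code computes the daily change in case counts (thanks to an answer to a previous question).

The next step is to compute the rolling 14-day sum, which I tried as follows: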
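To make the grouped diff concrete, here is a minimal sketch on a hypothetical toy frame (the `toy` name and its values are made up for illustration and are not part of the NYT data). It shows that `diff()` restarts at the first row of each county:

```python
import pandas as pd

# Hypothetical toy data: cumulative case counts for two counties.
toy = pd.DataFrame({
    "county": ["A", "A", "A", "B", "B", "B"],
    "cases":  [1, 3, 6, 10, 10, 15],
})

# diff() runs within each county, so the first row of every county
# has no predecessor (NaN), which fillna(0) turns into 0.
toy["delta"] = toy.groupby("county")["cases"].diff().fillna(0)

print(toy["delta"].tolist())  # [0.0, 2.0, 3.0, 0.0, 0.0, 5.0]
```

Note that `fillna(0)` reports zero new cases on each county's first recorded day, even if its cumulative count starts above zero.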

oregon['rolling_14']=oregon.groupby(['state','county'])['delta'].rolling(min_periods=1, window=14).sum()

Unfortunately, it fails. It does work if I have data for a single county:

county['rolling_14']=county.rolling(min_periods=1, window=14).sum()

But it does not work when the DataFrame contains data for multiple counties.

The result of groupby().rolling() has two extra index levels, namely state and county. Remove them and the assignment will work:

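# drop=True discards the group-key levels instead of turning them into columns,
# so the result stays a Series that can be assigned to a single column
oregon['rolling_14'] = (oregon.groupby(['state','county'])['delta']
                            .rolling(min_periods=1, window=14).sum()
                            .reset_index(level=['state','county'], drop=True)
                       )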
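The effect of the extra index levels can be seen in a small sketch. The toy frame below is hypothetical, uses a window of 2 rows instead of 14, and keeps the default unique integer index so that the alignment on assignment is unambiguous:

```python
import pandas as pd

# Hypothetical toy frame with a unique default index (0..5).
toy = pd.DataFrame({
    "state":  ["OR"] * 6,
    "county": ["A", "B", "A", "B", "A", "B"],
    "delta":  [1.0, 10.0, 2.0, 20.0, 3.0, 30.0],
})

rolled = (toy.groupby(["state", "county"])["delta"]
             .rolling(window=2, min_periods=1).sum())

# The group keys are prepended to the original row labels.
print(list(rolled.index.names))  # ['state', 'county', None]

# Dropping the two key levels leaves only the original row labels,
# so the assignment can align row by row.
toy["rolling_2"] = rolled.reset_index(level=["state", "county"], drop=True)

print(toy["rolling_2"].tolist())  # [1.0, 10.0, 3.0, 30.0, 5.0, 50.0]
```

The window counts rows, not calendar days, so a 14-row window only equals 14 days when each county has exactly one row per day.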

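Also, since you call groupby more than once, you can create the (lazy) groupby object once and reuse it, which slightly improves both the runtime and the code: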

groups = oregon.groupby(['state','county'])
oregon['delta'] = groups['cases'].diff().fillna(0)

# drop=True keeps the result a Series indexed like the original rows
oregon['rolling_14'] = (groups['delta']
                            .rolling(min_periods=1, window=14).sum()
                            .reset_index(level=['state','county'], drop=True)
                       )
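One caveat worth hedging: assigning a Series to a column aligns on index labels, and that alignment generally needs the labels to be unique. With date as the index, several counties share each date, so the assignment may fail with a duplicate-label error. A possible alternative, sketched here on a hypothetical toy frame with a 2-row window, is `transform`, which returns its result in the original row order and so avoids the realignment:

```python
import pandas as pd

# Hypothetical toy frame: the date index repeats across counties.
dates = pd.to_datetime(["2020-03-01", "2020-03-01",
                        "2020-03-02", "2020-03-02",
                        "2020-03-03", "2020-03-03"])
toy = pd.DataFrame(
    {
        "state":  ["OR"] * 6,
        "county": ["A", "B"] * 3,
        "delta":  [1.0, 10.0, 2.0, 20.0, 3.0, 30.0],
    },
    index=pd.Index(dates, name="date"),
)

# transform applies the rolling sum inside each county and hands back
# a Series shaped and ordered like the input rows.
toy["rolling_2"] = (
    toy.groupby(["state", "county"])["delta"]
       .transform(lambda s: s.rolling(window=2, min_periods=1).sum())
)

print(toy["rolling_2"].tolist())  # [1.0, 10.0, 3.0, 30.0, 5.0, 50.0]
```

The lambda makes this slower than the built-in groupby rolling on large frames, so it is a trade of speed for not having to manage the index.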