Python: need help determining the "average usage" of drugs before and after COVID-19 infection

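The goal here is to get the "average tablet usage" per patient and drug, measured as the number of days between consecutive usage dates. My dataset looks like this: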

import pandas as pd

df = pd.DataFrame({'Patient': ['John','Smith','John','Smith','John','Smith','John','Smith','John','Smith','John','Smith'],
                   'Drug': ['brufen','tylenol','brufen','tylenol','brufen','tylenol','tylenol','brufen','tylenol','brufen','tylenol','brufen'],
                   'Date': [20200101, 20200102, 20200105, 20200108, 20200113, 20200110, 20200120, 20200125, 20200124, 20200126, 20200126, 20200127],
                   'Tablets': [1,1,1,1,1,1,1,1,1,1,1,1]})
# Parse the integer YYYYMMDD values into real datetimes
df['Date'] = pd.to_datetime(df['Date'], format='%Y%m%d')
df.head()
Below is the result I am looking for. Any help is much appreciated.

# The averages 3 and 1 come from the John/tylenol and Smith/brufen rows of the data above
df_result = pd.DataFrame({'Patient': ['John','Smith','John','Smith'],
                          'Drug': ['brufen','tylenol','tylenol','brufen'],
                          'Average_Days_Distance_In_Usage': [6, 4, 3, 1]})
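You can combine groupby and transform with shift to get, for each row, the number of days since the previous record of the same patient and drug. Then group by patient and drug again and take the mean.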

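# Sort so that consecutive rows within each (Patient, Drug) group are in date order
df = df.sort_values(["Patient","Drug","Date"]).copy(deep=True)
# Days since the previous record in the same group (NaN for the first record of each group)
df["Days"] = df.groupby(["Patient","Drug"])['Date'].transform(lambda x: x - x.shift()).dt.days
# mean() skips the NaN values, so only the actual gaps are averaged
result_df = df.groupby(["Patient","Drug"])["Days"].mean().reset_index()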
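
For reference, here is a self-contained sketch of the same idea that can be run end to end. It uses the groupby `diff()` method instead of `transform` with `shift`, which should give the same per-row gaps. The output column name `Average_Days_Distance_In_Usage` is taken from the question's expected result. The shortened `["John", "Smith"] * 6` construction is just a compact way to write the same sample data.

```python
import pandas as pd

# Sample data from the question (Patient alternates John, Smith)
df = pd.DataFrame({
    "Patient": ["John", "Smith"] * 6,
    "Drug": ["brufen", "tylenol", "brufen", "tylenol", "brufen", "tylenol",
             "tylenol", "brufen", "tylenol", "brufen", "tylenol", "brufen"],
    "Date": [20200101, 20200102, 20200105, 20200108, 20200113, 20200110,
             20200120, 20200125, 20200124, 20200126, 20200126, 20200127],
    "Tablets": [1] * 12,
})
df["Date"] = pd.to_datetime(df["Date"], format="%Y%m%d")

# Sort so that diff() compares each record with the previous one in its group
df = df.sort_values(["Patient", "Drug", "Date"])

# Gap in days to the previous record of the same patient and drug;
# the first record of each group has no predecessor and becomes NaN
df["Days"] = df.groupby(["Patient", "Drug"])["Date"].diff().dt.days

# Average the gaps per group (NaN values are ignored by mean)
result_df = (
    df.groupby(["Patient", "Drug"])["Days"]
      .mean()
      .reset_index(name="Average_Days_Distance_In_Usage")
)
print(result_df)
```

With the sample data this should give 6 for John/brufen, 3 for John/tylenol, 1 for Smith/brufen and 4 for Smith/tylenol. The rows come out sorted by the group keys rather than in the order shown in the question.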