Python: resampling a DataFrame based on two columns
I have a pandas DataFrame with two columns, Date and Rating, as shown below:
         Date  Rating
0  2020-07-28       9
1  2020-07-28      10
2  2020-07-27       8
3  2020-07-26      10
4  2020-07-26       9
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
This is the result I want:
         Date  Amount of Ratings  Average rating
0  2020-07-28                  2             9.5
1  2020-07-27                  1               8
2  2020-07-26                  2             9.5
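How can I do this?
I changed the index to a DatetimeIndex and used count() to count the rows, but that counts every column. I also want the Rating column resampled to a daily average rating.
This is what I tried: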
df = df.set_index(pd.to_datetime(df['Date']))
df_resampled = df.resample('D').count()
Output:
            Date  Rating
Date
2020-07-21    17      17
2020-07-22    14      14
2020-07-23    16      16
2020-07-24    14      14
2020-07-25     9       9
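Use agg() on the resampled object, with a different function for each column:
# 'Date' is still a regular column after set_index(pd.to_datetime(df['Date'])), so it can be counted
df_resampled = df.resample('D').agg({'Date': 'count', 'Rating': 'mean'})
df_resampled = df_resampled.rename(columns={'Date': 'Amount of Ratings', 'Rating': 'Average rating'})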
Output:
            Amount of Ratings  Average rating
Date
2020-07-26                  2             9.5
2020-07-27                  1             8.0
2020-07-28                  2             9.5
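For reference, the resample approach can be reproduced end to end on the five sample rows from the question. This is a minimal sketch rather than the answerer's exact code: it selects the Rating column before resampling, so only that column is aggregated, and the variable name daily is an illustrative choice. One behaviour to keep in mind is that resample('D') emits a row for every calendar day in the range, so a day without ratings would show a count of 0 and a NaN average.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "Date": ["2020-07-28", "2020-07-28", "2020-07-27", "2020-07-26", "2020-07-26"],
    "Rating": [9, 10, 8, 10, 9],
})
df["Date"] = pd.to_datetime(df["Date"])

# Index by date, keep only the Rating column, then aggregate per calendar day
daily = (
    df.set_index("Date")["Rating"]
      .sort_index()
      .resample("D")
      .agg(["count", "mean"])
      .rename(columns={"count": "Amount of Ratings", "mean": "Average rating"})
)
print(daily)
```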
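You can also solve this with groupby and agg:
# Select the Rating column so that exactly two result columns (mean, count) are produced
df2 = df.groupby('Date')['Rating'].agg(['mean', 'count'])
df2.columns = ['Average rating', 'Amount of Ratings']
df2 = df2.reset_index()
df2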
Output:
         Date  Average rating  Amount of Ratings
0  2020-07-26             9.5                  2
1  2020-07-27             8.0                  1
2  2020-07-28             9.5                  2
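As a variation on the groupby answer, pandas named aggregation can produce the final column names directly, which avoids assigning df2.columns by hand. The sketch below assumes the sample data from the question; the name summary is illustrative. Because the target column names contain spaces, they are passed through dictionary unpacking. Unlike resample, groupby only returns dates that actually occur in the data.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "Date": ["2020-07-28", "2020-07-28", "2020-07-27", "2020-07-26", "2020-07-26"],
    "Rating": [9, 10, 8, 10, 9],
})
df["Date"] = pd.to_datetime(df["Date"])

# Named aggregation: output column name -> (input column, aggregation function)
summary = df.groupby("Date", as_index=False).agg(
    **{
        "Amount of Ratings": ("Rating", "count"),
        "Average rating": ("Rating", "mean"),
    }
)
print(summary)
```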