Pandas groupby: count of unique datetimes in pandas
I have a data frame as shown below
Doctor Start B_ID Session Finish NoShow
A 2020-01-18 12:00:00 1 S1 2020-01-18 12:33:00 no
A 2020-01-18 12:20:00 2 S1 2020-01-18 12:52:00 no
A 2020-01-18 13:00:00 3 S1 2020-01-18 13:23:00 no
A 2020-01-18 13:00:00 4 S1 2020-01-18 13:37:00 yes
A 2020-01-18 13:35:00 5 S1 2020-01-18 13:56:00 no
A 2020-01-18 14:10:00 6 S1 2020-01-18 14:15:00 no
A 2020-01-18 14:10:00 7 S1 2020-01-18 14:28:00 yes
A 2020-01-18 14:10:00 8 S1 2020-01-18 14:40:00 yes
A 2020-01-18 14:10:00 9 S1 2020-01-18 15:01:00 no
A 2020-01-19 12:00:00 12 S2 2020-01-19 12:20:00 no
A 2020-01-19 12:30:00 13 S2 2020-01-19 12:40:00 no
A 2020-01-19 13:00:00 14 S2 2020-01-19 13:20:00 yes
A 2020-01-19 13:40:00 15 S2 2020-01-19 13:46:00 no
A 2020-01-19 14:00:00 16 S2 2020-01-19 14:10:00 yes
A 2020-01-19 14:00:00 17 S2 2020-01-19 14:20:00 no
A 2020-01-19 14:00:00 19 S2 2020-01-19 14:40:00 yes
B 2020-01-18 12:00:00 21 S3 2020-01-18 12:33:00 no
B 2020-01-18 12:30:00 22 S3 2020-01-18 12:52:00 no
B 2020-01-18 13:10:00 23 S3 2020-01-18 13:25:00 no
B 2020-01-18 13:10:00 24 S3 2020-01-18 13:39:00 no
B 2020-01-18 13:30:00 25 S3 2020-01-18 13:56:00 yes
B 2020-01-18 14:05:00 26 S3 2020-01-18 14:15:00 no
B 2020-01-18 14:30:00 27 S3 2020-01-18 14:48:00 yes
From the above, I would like to prepare the data frame below.
Expected output:
Doctor Day No_of_slots No_of_bookings No_of_NoShow
A 2020-01-18 5 9 3
A 2020-01-19 5 7 3
B 2020-01-18 6 7 2
where
No_of_slots = total number of slots, based on unique Start times
No_of_bookings = total number of bookings
No_of_NoShow = number of rows where NoShow == 'yes'
If possible, could you look into this question?
@ALI - it seems some loop-based solution is necessary, which is really complicated. Loop solutions are not easy in pandas... I tried a few things, but still could not create a solution. Sorry :(
Use groupby with agg and named aggregation. To count the 'yes' values, sum a helper column, new, created by comparing NoShow with 'yes' and converting the result to a numeric type:
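import pandas as pd

# Parse the timestamp columns so that the .dt accessor is available
df['Start'] = pd.to_datetime(df['Start'])
df['Finish'] = pd.to_datetime(df['Finish'])

# Calendar day of each booking, used as the second grouping key
d = df['Start'].dt.date.rename('Day')
df1 = (df.assign(new=df['NoShow'].eq('yes').astype(int))  # 1 for a no-show, else 0
         .groupby(['Doctor', d])
         .agg(No_of_slots=('Start', 'nunique'),    # distinct Start times
              No_of_bookings=('Start', 'size'),    # rows in the group
              No_of_NoShow=('new', 'sum'))         # number of 'yes' rows
         .reset_index())
print(df1)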
Doctor Day No_of_slots No_of_bookings No_of_NoShow
0 A 2020-01-18 5 9 3
1 A 2020-01-19 5 7 3
2 B 2020-01-18 6 7 2
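For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same named-aggregation approach. It rebuilds only two of the groups from the question (Doctor A on 2020-01-19 and Doctor B on 2020-01-18) and leaves out the B_ID, Session and Finish columns. The function name summarize is hypothetical and only wraps the steps for reuse, and astype(int) is used here for the boolean-to-integer conversion.

```python
import pandas as pd

# Subset of the question's rows: Doctor A on 2020-01-19, Doctor B on 2020-01-18.
df = pd.DataFrame({
    "Doctor": ["A"] * 7 + ["B"] * 7,
    "Start": [
        "2020-01-19 12:00:00", "2020-01-19 12:30:00", "2020-01-19 13:00:00",
        "2020-01-19 13:40:00", "2020-01-19 14:00:00", "2020-01-19 14:00:00",
        "2020-01-19 14:00:00",
        "2020-01-18 12:00:00", "2020-01-18 12:30:00", "2020-01-18 13:10:00",
        "2020-01-18 13:10:00", "2020-01-18 13:30:00", "2020-01-18 14:05:00",
        "2020-01-18 14:30:00",
    ],
    "NoShow": ["no", "no", "yes", "no", "yes", "no", "yes",
               "no", "no", "no", "no", "yes", "no", "yes"],
})
df["Start"] = pd.to_datetime(df["Start"])


def summarize(frame):
    """Per doctor and day: unique start slots, bookings, and no-shows.

    Hypothetical helper; it wraps the groupby/agg steps shown above.
    """
    day = frame["Start"].dt.date.rename("Day")
    return (
        frame.assign(new=frame["NoShow"].eq("yes").astype(int))
        .groupby(["Doctor", day])
        .agg(
            No_of_slots=("Start", "nunique"),   # distinct Start times
            No_of_bookings=("Start", "size"),   # rows in the group
            No_of_NoShow=("new", "sum"),        # number of 'yes' rows
        )
        .reset_index()
    )


summary = summarize(df)
print(summary)
```

For these two groups the counts should agree with the corresponding rows of the output above. Note that Day holds plain Python date objects; if a datetime64 column is preferred, df["Start"].dt.normalize() could serve as the grouping key instead.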