Python: finding the mean for weekday groups in a DataFrame


My dataset looks like this:

         tripduration           starttime   User Type
0                 732   7/1/2015 00:00:03  Subscriber
1                 322   7/1/2015 00:00:06  Subscriber
2                 790   7/1/2015 00:00:17  Subscriber
3                1228   7/1/2015 00:00:23  Subscriber
4                1383   7/1/2015 00:00:44  Subscriber
5                 603   7/1/2015 00:01:00  Subscriber
6                 520   7/1/2015 00:01:03  Subscriber
7                 289   7/1/2015 00:01:06  Subscriber
8                1771   7/1/2015 00:01:25    Customer
9                 813   7/1/2015 00:01:41  Subscriber
10               1735   7/1/2015 00:01:50    Customer
11                832   7/1/2015 00:01:58  Subscriber
12               1210   7/1/2015 00:02:06  Subscriber
13                746   7/1/2015 00:02:07  Subscriber
14                749   7/1/2015 00:02:26  Subscriber
15                463   7/1/2015 00:02:26  Subscriber
16                331   7/1/2015 00:02:35  Subscriber
17                951   7/1/2015 00:02:43    Customer
18               1352   7/1/2015 00:02:47    Customer
19                275   7/1/2015 00:02:47  Subscriber
20                199   7/1/2015 00:03:05  Subscriber
21                383   7/1/2015 00:03:16    Customer
22               4210   7/1/2015 00:03:27  Subscriber
23                584   7/1/2015 00:03:34  Subscriber
24                735   7/1/2015 00:03:48  Subscriber
25                827   7/1/2015 00:03:56  Subscriber
26                677   7/1/2015 00:03:57  Subscriber
27               2371   7/1/2015 00:03:58    Customer
28                666   7/1/2015 00:04:03  Subscriber
29                999   7/1/2015 00:04:17  Subscriber
...               ...                 ...         ...
1085646           243  7/31/2015 23:57:25  Subscriber
1085647          1378  7/31/2015 23:57:29    Customer
1085648           230  7/31/2015 23:57:32  Subscriber
1085649          1669  7/31/2015 23:57:33  Subscriber
1085650           493  7/31/2015 23:57:44  Subscriber
1085651           822  7/31/2015 23:57:54  Subscriber
1085652           617  7/31/2015 23:58:03  Subscriber
1085653           349  7/31/2015 23:58:08  Subscriber
1085654           818  7/31/2015 23:58:12    Customer
1085655          2062  7/31/2015 23:58:15  Subscriber
1085656           945  7/31/2015 23:58:18    Customer
1085657           346  7/31/2015 23:58:24  Subscriber
1085658           399  7/31/2015 23:58:27  Subscriber
1085659           641  7/31/2015 23:58:42  Subscriber
1085660          1872  7/31/2015 23:58:43  Subscriber
1085661         12065  7/31/2015 23:58:51    Customer
1085662           265  7/31/2015 23:58:53  Subscriber
1085663           936  7/31/2015 23:58:58  Subscriber
1085664           395  7/31/2015 23:59:04  Subscriber
1085665           238  7/31/2015 23:59:10  Subscriber
1085666           551  7/31/2015 23:59:24  Subscriber
1085667           423  7/31/2015 23:59:23    Customer
1085668          1623  7/31/2015 23:59:24  Subscriber
1085669          1632  7/31/2015 23:59:24  Subscriber
1085670           305  7/31/2015 23:59:38  Subscriber
1085671           275  7/31/2015 23:59:40  Subscriber
1085672           530  7/31/2015 23:59:41  Subscriber
1085673           273  7/31/2015 23:59:42    Customer
1085674          1273  7/31/2015 23:59:56  Subscriber
1085675          1667  7/31/2015 23:59:59  Subscriber

My question: what is the average trip duration for subscribers on weekdays (Monday to Friday)?

My code: the function a4() should return the average as a float rounded to two decimal places.

I am stuck on picking out the weekdays (Monday to Friday) so that I can compute the mean of tripduration. I tried to parse starttime with parser.parse(df1['starttime']), but got this error:

TypeError: Parser must be a string or character stream, not Series

What is the right way to get the weekday average?

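I think you first need to convert the starttime column to datetimes with pd.to_datetime, and then filter the rows with boolean indexing.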

If you need a single scalar value across all weekdays, use loc to select the tripduration column for the filtered rows and take its mean:

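import pandas as pd

def a4(rides):
    # Parse the text timestamps into datetime64 values
    rides['starttime'] = pd.to_datetime(rides['starttime'])
    # dayofweek is Monday=0 ... Sunday=6, so < 5 keeps Monday to Friday
    m = (rides['starttime'].dt.dayofweek < 5) & (rides['User Type'] == 'Subscriber')
    return round(rides.loc[m, 'tripduration'].mean(), 2)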

print (a4(rides))
825.33
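
To make the steps above easy to try without the full dataset, here is a minimal sketch on a handful of made-up rows. The sample values and the names start, is_weekday and is_subscriber are assumptions for illustration only; the explicit format string is also my assumption about how the timestamps are laid out, based on the rows shown in the question.

```python
import pandas as pd

# Hypothetical sample rows in the same layout as the question's data.
rides = pd.DataFrame({
    "tripduration": [600, 900, 1200, 300, 1500],
    "starttime": [
        "7/1/2015 00:00:03",   # Wednesday
        "7/3/2015 08:15:00",   # Friday
        "7/4/2015 10:30:00",   # Saturday
        "7/6/2015 17:45:10",   # Monday
        "7/6/2015 18:00:00",   # Monday
    ],
    "User Type": ["Subscriber", "Subscriber", "Subscriber", "Customer", "Subscriber"],
})

# pd.to_datetime converts the whole Series at once, which is what
# parser.parse cannot do (it expects a single string).
start = pd.to_datetime(rides["starttime"], format="%m/%d/%Y %H:%M:%S")

is_weekday = start.dt.dayofweek < 5                  # Monday=0 ... Sunday=6
is_subscriber = rides["User Type"] == "Subscriber"

# Mean trip duration over rows that are both weekday and Subscriber.
weekday_subscriber_mean = round(
    rides.loc[is_weekday & is_subscriber, "tripduration"].mean(), 2
)
print(weekday_subscriber_mean)  # 1000.0
```

Only the Wednesday, Friday and second Monday rows pass both conditions, so the result is (600 + 900 + 1500) / 3.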

If you need the mean for each weekday separately, group by dayofweek:

def a4(rides):
    rides['starttime'] = pd.to_datetime(rides['starttime'])
    df1 = rides[(rides['User Type'] == 'Subscriber') & (rides['starttime'].dt.dayofweek < 5)]
    return df1.groupby(df1['starttime'].dt.dayofweek)['tripduration'].mean().round(2)

print (a4(rides))
starttime
2    840.96
4    809.71
Name: tripduration, dtype: float64
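
Note that groupby only returns the days that actually occur in the filtered data, which is presumably why the output above lists only days 2 and 4. If you want a row for every weekday, one option is to reindex the result. The sketch below uses made-up rows, and the names per_day and all_weekdays are hypothetical.

```python
import pandas as pd

# Hypothetical sample: two Wednesday trips, one Friday trip, one Saturday trip.
rides = pd.DataFrame({
    "tripduration": [600, 900, 1200, 300],
    "starttime": pd.to_datetime(
        ["7/1/2015 00:00:03", "7/1/2015 09:00:00",
         "7/3/2015 08:15:00", "7/4/2015 10:30:00"],
        format="%m/%d/%Y %H:%M:%S",
    ),
    "User Type": ["Subscriber"] * 4,
})

# Keep Monday to Friday only, then average per day number.
weekdays = rides[rides["starttime"].dt.dayofweek < 5]
per_day = (
    weekdays.groupby(weekdays["starttime"].dt.dayofweek)["tripduration"]
    .mean()
    .round(2)
)

# Days with no trips are absent from per_day; reindex adds them as NaN.
all_weekdays = per_day.reindex(list(range(5)))
print(all_weekdays)
```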

If you want day names rather than day numbers, use:

def a4(rides):
    rides['starttime'] = pd.to_datetime(rides['starttime'])
    df1 = rides[(rides['User Type'] == 'Subscriber') & (rides['starttime'].dt.dayofweek < 5)]
    # dt.weekday_name was removed in pandas 1.0; dt.day_name() is its replacement
    return df1.groupby(df1['starttime'].dt.day_name())['tripduration'].mean().round(2)

print (a4(rides))
starttime
Friday       809.71
Wednesday    840.96
Name: tripduration, dtype: float64

Use boolean indexing to filter the rows, then groupby on dayofweek to compute the mean:

import numpy as np
import pandas as pd

# parse_dates takes a list of column names, not a bare string
df = pd.read_csv(...., parse_dates=['starttime'])
df = df[(df.starttime.dt.dayofweek < 5) & df['User Type'].eq('Subscriber')]
g = np.round(df.groupby(df.starttime.dt.dayofweek).tripduration.mean(), 2)

The grouping step on its own is:

df.groupby(df.starttime.dt.dayofweek).tripduration.mean()
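
As a self-contained illustration of parsing the dates while reading the file, the sketch below feeds read_csv an in-memory CSV instead of a real path. The first three rows are copied from the question's data; the fourth row, the comma-separated layout and the name csv_text are assumptions made for the example.

```python
import io

import pandas as pd

# In-memory stand-in for the real file (layout assumed to be comma-separated).
csv_text = """tripduration,starttime,User Type
732,7/1/2015 00:00:03,Subscriber
322,7/1/2015 00:00:06,Subscriber
1771,7/1/2015 00:01:25,Customer
790,7/4/2015 12:00:00,Subscriber
"""

# parse_dates converts the listed columns to datetime64 while loading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["starttime"])

# Keep weekday (Monday=0 ... Friday=4) trips made by subscribers.
mask = (df["starttime"].dt.dayofweek < 5) & df["User Type"].eq("Subscriber")
filtered = df[mask]

# Mean trip duration per day number, rounded to two decimals.
g = filtered.groupby(filtered["starttime"].dt.dayofweek)["tripduration"].mean().round(2)
print(g)
```

Here only the two Wednesday subscriber rows survive the filter, so g holds a single entry for day 2.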