Counting unique weekdays from a timestamp column in a dataframe in Python

python pandas

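I want to count how many unique weekdays are present in the timestamps. For my input, I expect the output to be 4 (since 8/5 and 8/6 fall on a weekend).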

Use pd.to_datetime to convert to datetime, get the unique dayofweek values, and count how many of them are below 5:

out = (df.captureTime.apply(pd.to_datetime).dt.dayofweek.unique() < 5).sum()
print(out)

4
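The question's input is not shown on this page, so the following is a minimal sketch with assumed sample data: six timestamps from 2017-08-01 to 2017-08-06, chosen to match the dates mentioned in the question and the output rows shown further down. It spells out the same steps one at a time.

```python
import pandas as pd

# Assumed sample data (not from the original question): one timestamp per day,
# Tuesday 2017-08-01 through Sunday 2017-08-06.
df = pd.DataFrame({
    "captureTime": [
        "2017-08-01 00:05:00",
        "2017-08-02 00:05:00",
        "2017-08-03 00:05:00",
        "2017-08-04 00:05:00",
        "2017-08-05 00:05:00",  # Saturday
        "2017-08-06 00:05:00",  # Sunday
    ]
})

# Parse the strings, map each timestamp to its day-of-week number
# (Monday=0 ... Sunday=6), and keep only the distinct numbers.
day_numbers = pd.to_datetime(df["captureTime"]).dt.dayofweek.unique()

# Count the distinct day-of-week numbers that are weekdays (< 5).
out = int((day_numbers < 5).sum())
print(out)  # 4
```

Note that this counts distinct day-of-week values, so the result can never exceed 5. For data spanning more than one week, counting distinct calendar dates is probably what is wanted instead.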

One way is pandas Series.dt.weekday:

df['captureTime'] = pd.to_datetime(df['captureTime'])
np.sum(df['captureTime'].dt.weekday.isin([0,1,2,3,4]))

It returns 4.

If you need to capture the dates, you can use boolean indexing:

df[df['captureTime'].dt.weekday.isin([0,1,2,3,4])]

    captureTime
0   2017-08-01 00:05:00
1   2017-08-02 00:05:00
2   2017-08-03 00:05:00
3   2017-08-04 00:05:00
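The isin count above tallies rows, not distinct dates. A small sketch of the difference, using assumed sample data with one extra timestamp on 2017-08-01 (the extra row is hypothetical and added only for illustration):

```python
import pandas as pd

# Assumed sample data: the six days from the example, plus a second
# timestamp on 2017-08-01 to show how repeated dates affect the count.
df = pd.DataFrame({
    "captureTime": pd.to_datetime([
        "2017-08-01 00:05:00",
        "2017-08-01 09:30:00",  # hypothetical extra row, same date
        "2017-08-02 00:05:00",
        "2017-08-03 00:05:00",
        "2017-08-04 00:05:00",
        "2017-08-05 00:05:00",  # Saturday
        "2017-08-06 00:05:00",  # Sunday
    ])
})

# Boolean mask: True where the timestamp falls on Monday (0) to Friday (4).
is_weekday = df["captureTime"].dt.weekday < 5

# Rows that fall on a weekday (this is what the isin-based count tallies).
weekday_rows = df[is_weekday]
n_weekday_rows = int(is_weekday.sum())

# Distinct calendar dates among those rows: drop the time of day, then count.
n_unique_dates = int(weekday_rows["captureTime"].dt.normalize().nunique())

print(n_weekday_rows)   # 5
print(n_unique_dates)   # 4
```

With one timestamp per day the two numbers agree, which is why the answer above returns 4 for the question's data.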

Use:

print

The above counts every weekday entry once. If you want identical datetime values to be counted only once, you can use

np.is_busday(df['captureTime'].unique().astype('datetime64[D]')).sum()

Or, if you want to drop datetimes that share the same date component, convert to the datetime64[D] dtype before calling np.unique:

np.is_busday(np.unique(df['captureTime'].values.astype('datetime64[D]'))).sum()
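To make the three variants concrete, here is a sketch on assumed data that contains an exact duplicate and a second time on the same date. The rows are hypothetical, and np.unique on the raw values stands in for Series.unique() here.

```python
import numpy as np
import pandas as pd

# Assumed sample data, chosen so that the three counts differ.
df = pd.DataFrame({
    "captureTime": pd.to_datetime([
        "2017-08-01 00:05:00",
        "2017-08-01 00:05:00",  # exact duplicate of the first row
        "2017-08-01 12:00:00",  # same date, different time
        "2017-08-02 00:05:00",
        "2017-08-05 00:05:00",  # Saturday
    ])
})

values = df["captureTime"].values          # NumPy datetime64 array
dates = values.astype("datetime64[D]")     # truncate to calendar dates

# np.is_busday uses a Monday-Friday week mask by default (no holidays).
per_row = int(np.is_busday(dates).sum())

# Deduplicate full datetimes first, then truncate to dates.
per_unique_datetime = int(
    np.is_busday(np.unique(values).astype("datetime64[D]")).sum()
)

# Truncate to dates first, then deduplicate.
per_unique_date = int(np.is_busday(np.unique(dates)).sum())

print(per_row, per_unique_datetime, per_unique_date)  # 4 3 2
```

Only the last variant counts each calendar date once regardless of how many timestamps fall on it.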

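Assuming captureTime already holds datetime objects, you can do this:

s = df['captureTime'].dt.weekday
s[s >= 5].count()  # 5 and 6 correspond to Saturday and Sunday; use s < 5 to count weekday rows instead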

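@Vaishali thanks, dt.weekday is also a good way.

@ejshin1 If nothing else, at least I've got that :p

@Alexander I believe this is what the OP wants: "count unique weekdays..."

This is definitely the best. It takes only 35 µs per loop, compared with 1.26 ms per loop for my solution.

If I apply the last line you suggested, with my dates running from July 17 to August 25, my expected output is 30, since there are 6 weeks of 5 weekdays each. However, it returns 180.

@ejshin1: Please post df[['captureTime']].to_dict('list') so that we can try to reproduce the problem. Perhaps a smarter way to debug is to look at np.unique(df['captureTime'].values.astype('datetime64[D]'))[:41]. There are 40 days between 2017-07-17 and 2017-08-25 (endpoints included). So, by the pigeonhole principle, if np.unique returns more than 40 items, at least two of them must fall on the same date. I don't think that is possible, but if it does happen, it is an important first step toward understanding the problem. (If np.is_busday(...).sum() returns 180, then np.unique must be returning more than 40 items...)

Another (perhaps more likely) possibility is that df['captureTime'] contains dates outside the range 2017-07-17 to 2017-08-25. Looking at np.unique(df['captureTime'].values.astype('datetime64[D]'))[:41] or ((df['captureTime'] < pd.Timestamp('2017-07-17')) | (df['captureTime'] >= pd.Timestamp('2017-08-26'))).any() will tell us whether that is the case.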
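The checks suggested in the comments can be sketched as follows. The data here is purely hypothetical: six timestamps per calendar day from 2017-07-17 to 2017-08-25. It illustrates one way a per-row count could reach 180 while the per-date count is 30; the actual cause in the commenter's data is not known from this thread.

```python
import numpy as np
import pandas as pd

# Hypothetical data: six timestamps (every four hours) on each calendar day
# from 2017-07-17 to 2017-08-25 inclusive.
days = pd.date_range("2017-07-17", "2017-08-25", freq="D")
stamps = [d + pd.Timedelta(hours=h) for d in days for h in range(0, 24, 4)]
df = pd.DataFrame({"captureTime": stamps})

# Truncate to calendar dates and deduplicate.
dates = df["captureTime"].values.astype("datetime64[D]")
unique_dates = np.unique(dates)
n_unique_dates = len(unique_dates)  # at most 40 if all dates are in range

# Check whether any timestamp lies outside the expected range.
out_of_range = bool(
    ((df["captureTime"] < pd.Timestamp("2017-07-17"))
     | (df["captureTime"] >= pd.Timestamp("2017-08-26"))).any()
)

# Weekday count per row versus per distinct date.
per_row = int(np.is_busday(dates).sum())
per_unique_date = int(np.is_busday(unique_dates).sum())

print(n_unique_dates, out_of_range)  # 40 False
print(per_row, per_unique_date)      # 180 30
```

If n_unique_dates exceeds 40 or out_of_range is True on the real data, the column covers more dates than expected. If per_row is a multiple of per_unique_date, the count was probably taken before deduplicating by date.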
