Python pandas: read dates from a CSV and use groupby to get total business days per month


data.CSV

ID  Activity Month  Activity Date
0   04/2019         04-01-2019
1   05/2019         05-13-2019
2   05/2019         05-25-2019
3   06/2019         06-10-2019
4   06/2019         06-19-2019
5   07/2019         07-15-2019
6   07/2019         07-18-2019
7   07/2019         07-29-2019
8   08/2019         06-03-2019
9   08/2019         06-15-2019
10  08/2019         06-20-2019

My plan

Read the CSV:

import pandas as pd

df = pd.read_csv('data.csv')

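Convert to datetime (the dates are month-first, e.g. 05-13-2019, so dayfirst=True would be wrong here):

df['Activity Date'] = pd.to_datetime(df['Activity Date'], format='%m-%d-%Y')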
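The sample dates are month-first (05-13-2019 cannot be read day-first), so an explicit format is a safe way to parse them. A minimal sketch, assuming the column holds MM-DD-YYYY strings as in the sample data:

```python
import pandas as pd

# A few values copied from the sample data (month-day-year).
dates = pd.Series(['04-01-2019', '05-13-2019', '06-19-2019'])

# An explicit format avoids any day-first / month-first guessing.
parsed = pd.to_datetime(dates, format='%m-%d-%Y')

print(parsed.dt.month.tolist())  # [4, 5, 6]
print(parsed.dt.day.tolist())    # [1, 13, 19]
```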

Group by the 'Activity Month' column:

grouped=df.groupby(['Activity Month'])['Activity Date'].count()

print(grouped)

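While grouping the dates, perform the business-day calculation:

This is the part I don't know how to do. I'm lost.

The code I use to calculate business days: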

import calendar
import datetime

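first_day = datetime.date(2019, 4, 1)
cal = calendar.Calendar()
# itermonthdays2 yields (day_of_month, weekday) pairs; day 0 is padding, weekday < 5 means Mon-Fri
working_days = len([d for d in cal.itermonthdays2(first_day.year, first_day.month) if d[0] != 0 and d[1] < 5])
print("Total business days for month (" + str(first_day.month) + ") is " + str(working_days) + " days")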

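I'm not entirely clear on the problem statement here, but if you want to calculate the number of business days for each Activity Month, you can wrap your calculation in a function and apply that function to the Activity Month column (a lambda expression or function passed to apply is basically a for-loop over every row of the specified column).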
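To illustrate the point about apply acting like a loop, here is a small sketch. The helper month_number is hypothetical and only used for this illustration; it assumes 'MM/YYYY' strings as in the sample data.

```python
import pandas as pd

months = pd.Series(['04/2019', '05/2019', '06/2019'])

def month_number(value):
    # Hypothetical helper: extract the month from an 'MM/YYYY' string.
    return int(value.split('/')[0])

# Series.apply calls the function once per element...
applied = months.apply(month_number)

# ...which gives the same values as an explicit loop.
looped = [month_number(value) for value in months]

print(applied.tolist())  # [4, 5, 6]
print(looped)            # [4, 5, 6]
```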


However, storing repeated information in every cell is a bad idea. It is better to simply return working_days rather than embed it in a string.
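A sketch of what returning the plain number could look like. The function name count_business_days and the column name 'Business Days' are assumptions made for this example, and the count only excludes weekends, not public holidays.

```python
import calendar
import pandas as pd

def count_business_days(month_str):
    """Return the number of Mon-Fri days for an 'MM/YYYY' string."""
    month, year = (int(part) for part in month_str.split('/'))
    days = calendar.Calendar().itermonthdays2(year, month)
    # day == 0 marks padding from adjacent months; weekday 0-4 is Mon-Fri.
    return sum(1 for day, weekday in days if day != 0 and weekday < 5)

grouped = pd.DataFrame({'Activity Month': ['04/2019', '05/2019', '06/2019']})
# Store the integer in its own column so it can be used in later arithmetic.
grouped['Business Days'] = grouped['Activity Month'].apply(count_business_days)
print(grouped)
```

Keeping the value numeric means it can be summed, compared, or divided by the activity count later, and a display string can still be built from it when needed.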

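So that I don't forget this: I'm on my phone and will check in a few hours and give you an answer. I've been working with this library recently too!

Thanks @CeliusS for the effort. I think this is what I wanted. It's just for learning, so this is fine for now. Thanks anyway!

Anyway, may I know why we need to add ".reset_index()", please? I tried removing it, but then the code doesn't work.

When you run a groupby operation, it makes Activity Month the index of the result. reset_index() replaces the current index with row numbers and puts the original index back in as a new column (unless you pass drop=True).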
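The effect of reset_index() described above can be seen on a small frame. This is a sketch using three rows copied from the sample data:

```python
import pandas as pd

df = pd.DataFrame({
    'Activity Month': ['04/2019', '05/2019', '05/2019'],
    'Activity Date': ['04-01-2019', '05-13-2019', '05-25-2019'],
})

counts = df.groupby(['Activity Month'])['Activity Date'].count()
# 'Activity Month' is now the index of a Series, not a column.
print(counts.index.name)   # Activity Month

flat = counts.reset_index()
# After reset_index() it is a regular column again.
print(list(flat.columns))  # ['Activity Month', 'Activity Date']
```

Without reset_index(), grouped['Activity Month'] would look up a label in the Series index rather than select a column, which is why the code fails when it is removed.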
Desired output for the months in the data:

Total business days for month (4) is 22 days
Total business days for month (5) is 23 days
Total business days for month (6) is 20 days
Total business days for month (7) is 23 days
Total business days for month (8) is 22 days
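# groupby makes 'Activity Month' the index; reset_index() turns it back into a column
grouped = df.groupby(['Activity Month'])['Activity Date'].count().reset_index()

def get_business_days(month_str):
    # 'Activity Month' values look like '04/2019' (MM/YYYY)
    month, year = (int(part) for part in month_str.split('/'))
    cal = calendar.Calendar()
    # itermonthdays2 yields (day_of_month, weekday); day 0 is padding, weekday < 5 means Mon-Fri
    working_days = len([d for d in cal.itermonthdays2(year, month) if d[0] != 0 and d[1] < 5])
    return "Total business days for month (" + str(month) + ") is " + str(working_days) + " days"

grouped['Activity Month'].apply(get_business_days)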
0    Total business days for month (4) is 22 days
1    Total business days for month (5) is 23 days
2    Total business days for month (6) is 20 days
3    Total business days for month (7) is 23 days
4    Total business days for month (8) is 22 days
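As a cross-check on these numbers, the same counts can be obtained with NumPy's busday_count. This is an alternative technique, not the method used in the answer. The helper name business_days_numpy is hypothetical, and the default weekmask (Mon-Fri, no holidays) is assumed.

```python
import numpy as np

def business_days_numpy(month_str):
    """Count Mon-Fri days in the month given as an 'MM/YYYY' string."""
    month, year = (int(part) for part in month_str.split('/'))
    start = np.datetime64(f'{year:04d}-{month:02d}', 'M')
    # Adding 1 to a month-resolution datetime64 moves to the next month,
    # including the December -> January rollover.
    end = start + 1
    # busday_count counts business days in the half-open range [start, end).
    return int(np.busday_count(start.astype('datetime64[D]'),
                               end.astype('datetime64[D]')))

months = ['04/2019', '05/2019', '06/2019', '07/2019', '08/2019']
print([business_days_numpy(m) for m in months])  # [22, 23, 20, 23, 22]
```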