Pandas: compute the difference between each date's total hours and a fixed number
I have the following DataFrame. What I'm trying to achieve is that, for each "Date_worked", the hours in the "Time_spent" column sum to 7. For example, on 6/10/2019 the hours already sum to 7, so no adjustment is needed. On 6/12/2019 the hours sum to 4.25, so I need to insert a row whose "Tab_description" is "Difference" and whose "Time_spent" is the gap, 2.75. 6/13/2019 and 6/14/2019 already reach 7, so nothing needs to be done there. For 6/19/2019 I need to do the same as for 6/12/2019: insert a row with 6 to bring the total to 7. Thanks for your help.
Date_worked Tab_description Time_spent
0 6/10/2019 Perform planning procedures 7.0
1 6/11/2019 Perform planning procedures 7.0
2 6/12/2019 Time off (away from the office) 2.25
3 6/12/2019 Staff meeting 1.0
4 6/12/2019 Accounting & Risk Management Luncheon 1.0
5 6/13/2019 Perform planning procedures 7.0
6 6/14/2019 Time off (away from the office) 2.0
7 6/14/2019 Review policies and procedures 5.0
8 6/17/2019 Time off (away from the office) 7.0
9 6/18/2019 Perform planning procedures 7.0
10 6/19/2019 Staff meeting 1.0
11 6/20/2019 Time off (away from the office) 2.0
12 6/21/2019 Time off (away from the office) 1.0
13 6/24/2019 Staff meeting (FY 20 planning) 7.0
14 6/25/2019 FCR Kick-off meeting 1.0
15 6/26/2019 Time off (away from the office) 1.5
16 6/26/2019 Staff meeting 1.0
17 6/28/2019 Time off (away from the office) 1.0
There are many ways to do this; I'll show you how with groupby and concat.
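First, let's work out the total hours per date and the difference from 7. Note that in this copy of the data row 2 reads 0.25 rather than the question's 2.25, which is why the 6/12/2019 difference later comes out as -4.75.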
print(df)
Date_worked Tab_description Time_spent
0 6/10/2019 Perform planning procedures 7.00
1 6/11/2019 Perform planning procedures 7.00
2 6/12/2019 Time off (away from the office) 0.25
3 6/12/2019 Staff meeting 1.00
4 6/12/2019 Accounting & Risk Management Luncheon 1.00
5 6/13/2019 Perform planning procedures 7.00
6 6/14/2019 Time off (away from the office) 2.00
7 6/14/2019 Review policies and procedures 5.00
8 6/17/2019 Time off (away from the office) 7.00
9 6/18/2019 Perform planning procedures 7.00
10 6/19/2019 Staff meeting 1.00
11 6/20/2019 Time off (away from the office) 2.00
12 6/21/2019 Time off (away from the office) 1.00
13 6/24/2019 Staff meeting (FY 20 planning) 7.00
14 6/25/2019 FCR Kick-off meeting 1.00
15 6/26/2019 Time off (away from the office) 1.50
16 6/26/2019 Staff meeting 1.00
17 6/28/2019 Time off (away from the office) 1.00
We start with a groupby, a simple sum, and a subtraction, and assign the result to a new variable called df2.
# Total hours per date
df2 = df.groupby('Date_worked')['Time_spent'].sum().reset_index()
# Negative variance means the date is short of the 7-hour target
df2['variance'] = df2['Time_spent'] - 7.00
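To make the groupby step concrete, here is a minimal self-contained sketch. It assumes a five-row subset of the question's data (with the question's 2.25 for the first 6/12/2019 row) rather than the full frame, so the numbers differ slightly from the output shown in this answer.

```python
import pandas as pd

# Assumed subset of the question's data, for illustration only
df = pd.DataFrame({
    "Date_worked": ["6/10/2019", "6/12/2019", "6/12/2019", "6/12/2019", "6/19/2019"],
    "Tab_description": [
        "Perform planning procedures",
        "Time off (away from the office)",
        "Staff meeting",
        "Accounting & Risk Management Luncheon",
        "Staff meeting",
    ],
    "Time_spent": [7.0, 2.25, 1.0, 1.0, 1.0],
})

# One row per date holding that date's total hours
df2 = df.groupby("Date_worked")["Time_spent"].sum().reset_index()

# Total minus the 7-hour target: zero when the day is full, negative when short
df2["variance"] = df2["Time_spent"] - 7.00

print(df2)
```

With this subset, 6/10/2019 gets a variance of 0, 6/12/2019 gets -2.75, and 6/19/2019 gets -6.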
Now we create the Tab_description column and fill in the description you asked for.
df2.loc[df2['variance'] != 0, 'Tab_description'] = 'Difference'
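The reason this line sets up the later dropna() is that assigning through .loc with a boolean mask to a column that does not exist yet creates the column and leaves NaN in every row the mask excludes. A small sketch of that behaviour, using made-up per-date totals that mirror the shape of df2:

```python
import pandas as pd

# Hypothetical per-date totals in the same shape as df2
df2 = pd.DataFrame({
    "Date_worked": ["6/10/2019", "6/12/2019", "6/19/2019"],
    "Time_spent": [7.0, 4.25, 1.0],
    "variance": [0.0, -2.75, -6.0],
})

# Only rows with a non-zero variance get a label; the rest stay NaN
df2.loc[df2["variance"] != 0, "Tab_description"] = "Difference"

# dropna() then keeps just the dates that need a Difference row
flagged = df2.dropna()
print(flagged)
```

Dates that already total 7 hours never receive a label, so they are dropped before the concat.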
Then, inside the concat, we drop all the NaN rows, drop the 'Time_spent' column, and rename the 'variance' column to 'Time_spent'.
# pd.concat returns a new DataFrame, so keep the result
result = pd.concat(
    [
        df,
        df2.dropna()
        .drop("Time_spent", axis=1)
        .rename(columns={"variance": "Time_spent"}),
    ],
    sort=False,
)
print(result)
Date_worked Tab_description Time_spent
0 6/10/2019 Perform planning procedures 7.00
1 6/11/2019 Perform planning procedures 7.00
2 6/12/2019 Time off (away from the office) 0.25
3 6/12/2019 Staff meeting 1.00
4 6/12/2019 Accounting & Risk Management Luncheon 1.00
5 6/13/2019 Perform planning procedures 7.00
6 6/14/2019 Time off (away from the office) 2.00
7 6/14/2019 Review policies and procedures 5.00
8 6/17/2019 Time off (away from the office) 7.00
9 6/18/2019 Perform planning procedures 7.00
10 6/19/2019 Staff meeting 1.00
11 6/20/2019 Time off (away from the office) 2.00
12 6/21/2019 Time off (away from the office) 1.00
13 6/24/2019 Staff meeting (FY 20 planning) 7.00
14 6/25/2019 FCR Kick-off meeting 1.00
15 6/26/2019 Time off (away from the office) 1.50
16 6/26/2019 Staff meeting 1.00
17 6/28/2019 Time off (away from the office) 1.00
2 6/12/2019 Difference -4.75
7 6/19/2019 Difference -6.00
8 6/20/2019 Difference -5.00
9 6/21/2019 Difference -6.00
11 6/25/2019 Difference -6.00
12 6/26/2019 Difference -4.50
13 6/28/2019 Difference -6.00
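The output above reports each gap as a negative number (total minus 7) and appends the Difference rows at the end. The question asks for a positive top-up so that every date sums to 7, so here is one possible variation. It is a sketch, not the original answer's code: the five-row sample, the TARGET_HOURS name, and the temporary _date column are assumptions made for illustration.

```python
import pandas as pd

TARGET_HOURS = 7.0  # daily target taken from the question

# Assumed subset of the question's data, for illustration only
df = pd.DataFrame({
    "Date_worked": ["6/10/2019", "6/12/2019", "6/12/2019", "6/12/2019", "6/19/2019"],
    "Tab_description": [
        "Perform planning procedures",
        "Time off (away from the office)",
        "Staff meeting",
        "Accounting & Risk Management Luncheon",
        "Staff meeting",
    ],
    "Time_spent": [7.0, 2.25, 1.0, 1.0, 1.0],
})

# Hours still missing per date, as a positive number
totals = df.groupby("Date_worked", sort=False)["Time_spent"].sum()
shortfall = (TARGET_HOURS - totals).reset_index()

# Keep only dates that are short and label the new rows
shortfall = shortfall[shortfall["Time_spent"] > 0].assign(Tab_description="Difference")

result = pd.concat([df, shortfall], ignore_index=True, sort=False)

# A stable sort on the parsed date keeps each day's original rows
# ahead of its Difference row
result["_date"] = pd.to_datetime(result["Date_worked"], format="%m/%d/%Y")
result = (
    result.sort_values("_date", kind="mergesort")
    .drop(columns="_date")
    .reset_index(drop=True)
)
print(result)
```

Parsing the dates before sorting avoids relying on string order, which would misplace dates once the data spans more than one month. Days with no rows at all are not covered by this sketch.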
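Happy to help; I've spent a lot of time working with timesheet and labor data. If this answer is sufficient, feel free to click the green tick so the question can be closed.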