Python pandas - How to merge DataFrames on datetime columns with different formats?
I need to merge two DataFrames by date. The first DataFrame looks like this:
Time Stamp HP_1H_mean Coolant1_1H_mean Extreme_1H_mean
0 2019-07-26 07:00:00 410.637966 414.607081 0.0
1 2019-07-26 08:00:00 403.521735 424.787366 0.0
2 2019-07-26 09:00:00 403.143925 425.739639 0.0
3 2019-07-26 10:00:00 410.542895 426.210538 0.0
...
17 2019-07-27 00:00:00 0.000000 0.000000 0.0
18 2019-07-27 01:00:00 0.000000 0.000000 0.0
19 2019-07-27 02:00:00 0.000000 0.000000 0.0
20 2019-07-27 03:00:00 0.000000 0.000000 0.0
The second looks like this:
Time Stamp Qty Compl
0 2019-07-26 150
1 2019-07-27 20
2 2019-07-29 230
3 2019-07-30 230
4 2019-07-31 170
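Both `Time Stamp` columns are `datetime64[ns]`. I want to do a left merge and forward-fill each day's value into all the other rows for that day. My problem is that on merging, `Qty Compl` from the second df is applied at midnight of each day, and some days have no midnight timestamp, such as the first day in the first DataFrame.
Is there a way to merge so that every row falling on the same day is matched? The desired output looks like this: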
Time Stamp HP_1H_mean Coolant1_1H_mean Extreme_1H_mean Qty Compl
0 2019-07-26 07:00:00 410.637966 414.607081 0.0 150
1 2019-07-26 08:00:00 403.521735 424.787366 0.0 150
2 2019-07-26 09:00:00 403.143925 425.739639 0.0 150
3 2019-07-26 10:00:00 410.542895 426.210538 0.0 150
...
17 2019-07-27 00:00:00 0.000000 0.000000 0.0 20
18 2019-07-27 01:00:00 0.000000 0.000000 0.0 20
19 2019-07-27 02:00:00 0.000000 0.000000 0.0 20
20 2019-07-27 03:00:00 0.000000 0.000000 0.0 20
Use `merge_asof` with both DataFrames sorted by datetime:
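import pandas as pd

# Convert to datetime only if the columns are not already datetime64[ns]
df1['Time Stamp'] = pd.to_datetime(df1['Time Stamp'])
df2['Time Stamp'] = pd.to_datetime(df2['Time Stamp'])

# merge_asof requires both frames to be sorted by the key column
df1 = df1.sort_values('Time Stamp')
df2 = df2.sort_values('Time Stamp')

# For each row of df1, take the last df2 row whose 'Time Stamp' is at or before it
df = pd.merge_asof(df1, df2, on='Time Stamp')
print(df)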
Time Stamp HP_1H_mean Coolant1_1H_mean Extreme_1H_mean \
0 2019-07-26 07:00:00 410.637966 414.607081 0.0
1 2019-07-26 08:00:00 403.521735 424.787366 0.0
2 2019-07-26 09:00:00 403.143925 425.739639 0.0
3 2019-07-26 10:00:00 410.542895 426.210538 0.0
4 2019-07-27 00:00:00 0.000000 0.000000 0.0
5 2019-07-27 01:00:00 0.000000 0.000000 0.0
6 2019-07-27 02:00:00 0.000000 0.000000 0.0
7 2019-07-27 03:00:00 0.000000 0.000000 0.0
Qty Compl
0 150
1 150
2 150
3 150
4 20
5 20
6 20
7 20
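Note that `merge_asof` with its default backward search carries the last available value forward, so an hourly row on a date missing from `df2` (2019-07-28 in the sample) would receive the previous day's quantity. If only exact same-day matches are wanted, one alternative, which is not part of the answer above, is to join on the calendar date. A minimal sketch, assuming a cut-down version of the question's data; the helper column name `Date` and the 2019-07-28 row are made up for illustration:

```python
import pandas as pd

# Small sample modeled on the question's data (hourly rows vs. daily rows).
df1 = pd.DataFrame({
    "Time Stamp": pd.to_datetime([
        "2019-07-26 07:00:00", "2019-07-26 08:00:00",
        "2019-07-27 00:00:00", "2019-07-27 01:00:00",
        "2019-07-28 05:00:00",  # hypothetical row: this date is absent from df2
    ]),
    "HP_1H_mean": [410.637966, 403.521735, 0.0, 0.0, 0.0],
})
df2 = pd.DataFrame({
    "Time Stamp": pd.to_datetime(["2019-07-26", "2019-07-27", "2019-07-29"]),
    "Qty Compl": [150, 20, 230],
})

# Truncate each hourly timestamp to midnight so it equals the daily key.
left = df1.assign(Date=df1["Time Stamp"].dt.normalize())
right = df2.rename(columns={"Time Stamp": "Date"})

# Ordinary left join on the calendar date; the helper column is dropped afterwards.
same_day = left.merge(right, on="Date", how="left").drop(columns="Date")
print(same_day)
```

Here the row on the unmatched date ends up with `NaN` in `Qty Compl` instead of inheriting the previous day's value, which may or may not be what you want.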
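Awesome, I had never used `merge_asof`. Does it always default to a left merge? Thanks for the help, this is great. I'll accept once the timer runs out... you were too fast ;)
@55thSwiss - Yes, it is similar to a left join, except that we match on the nearest key rather than on equal keys.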
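The left-join behaviour described in the last comment can be checked on a tiny example. A sketch with made-up frames (`left`, `right` and their values are illustrative only): every row of the left frame is kept, a left key earlier than all right keys gets `NaN`, and the default `direction='backward'` picks the latest right key at or before each left key.

```python
import pandas as pd

left = pd.DataFrame({
    "Time Stamp": pd.to_datetime([
        "2019-07-25 23:00:00",  # earlier than every key in `right`
        "2019-07-26 07:00:00",
        "2019-07-27 03:00:00",
    ]),
})
right = pd.DataFrame({
    "Time Stamp": pd.to_datetime(["2019-07-26 00:00:00", "2019-07-27 00:00:00"]),
    "Qty Compl": [150, 20],
})

# Both frames are already sorted by the key column, as merge_asof requires.
backward = pd.merge_asof(left, right, on="Time Stamp")  # direction="backward" is the default
forward = pd.merge_asof(left, right, on="Time Stamp", direction="forward")

print(backward)  # first row has no earlier key, so Qty Compl is NaN
print(forward)   # last row has no later key, so Qty Compl is NaN
```

Passing `direction="forward"` or `direction="nearest"` changes which neighbouring key is chosen, but the number of output rows always equals the number of rows in the left frame.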