Python: What is the best way to merge/join/concatenate two DataFrames with duplicate columns but different datetime indexes?


I have two DataFrames: one holds past data and the other holds forecasts. I want to combine them so that there are no duplicate columns.

My data looks like this:

Past = 
                      X RealData
2019-03-27 12:30:00 8.295   True
2019-03-27 13:00:00 7.707   True
2019-03-27 13:30:00 7.518   True
2019-03-27 14:00:00 7.518   True
2019-03-27 14:30:00 7.518   True
2019-03-27 15:00:00 7.455   True
2019-03-27 15:30:00 7.518   True
2019-03-27 16:00:00 20.244  True
2019-03-27 16:30:00 20.895  True
2019-03-27 17:00:00 21.630  True
2019-03-27 17:30:00 24.360  True
2019-03-27 18:00:00 24.591  True
2019-03-27 18:30:00 26.460  True
2019-03-27 19:00:00 14.280  True
2019-03-27 19:30:00 12.180  True
2019-03-27 20:00:00 11.550  True
2019-03-27 20:30:00 9.051   True
2019-03-27 21:00:00 8.673   True
2019-03-27 21:30:00 7.791   True

Future = 
                        X           RealData
2019-03-27 22:30:00 8.450913    False
2019-03-27 23:00:00 8.494944    False
2019-03-27 23:30:00 9.058649    False
2019-03-28 00:00:00 22.055525   False
2019-03-28 00:30:00 23.344284   False
2019-03-28 01:00:00 24.793011   False
2019-03-28 01:30:00 26.203117   False
2019-03-28 02:00:00 27.897289   False
2019-03-28 02:30:00 14.187933   False
2019-03-28 03:00:00 14.110393   False
Currently, I am trying:

past_future = pd.concat([Future, Past], axis=1, sort=True)
and I get:

                  X RealData    X   RealData
2019-03-27 12:30:00 8.295   True    NaN NaN
2019-03-27 13:00:00 7.707   True    NaN NaN
2019-03-27 13:30:00 7.518   True    NaN NaN
2019-03-27 14:00:00 7.518   True    NaN NaN
2019-03-27 14:30:00 7.518   True    NaN NaN
2019-03-27 15:00:00 7.455   True    NaN NaN
2019-03-27 15:30:00 7.518   True    NaN NaN
2019-03-27 16:00:00 20.244  True    NaN NaN
2019-03-27 16:30:00 20.895  True    NaN NaN
2019-03-27 17:00:00 21.630  True    NaN NaN
2019-03-27 17:30:00 24.360  True    NaN NaN
2019-03-27 18:00:00 24.591  True    NaN NaN
2019-03-27 18:30:00 26.460  True    NaN NaN
2019-03-27 19:00:00 14.280  True    NaN NaN
2019-03-27 19:30:00 12.180  True    NaN NaN
2019-03-27 20:00:00 11.550  True    NaN NaN
2019-03-27 20:30:00 9.051   True    NaN NaN
2019-03-27 21:00:00 8.673   True    NaN NaN
2019-03-27 21:30:00 7.791   True    NaN NaN
2019-03-27 22:30:00 NaN NaN 8.450913    False
2019-03-27 23:00:00 NaN NaN 8.494944    False
2019-03-27 23:30:00 NaN NaN 9.058649    False
2019-03-28 00:00:00 NaN NaN 22.055525   False
2019-03-28 00:30:00 NaN NaN 23.344284   False
2019-03-28 01:00:00 NaN NaN 24.793011   False
2019-03-28 01:30:00 NaN NaN 26.203117   False
2019-03-28 02:00:00 NaN NaN 27.897289   False
2019-03-28 02:30:00 NaN NaN 14.187933   False
2019-03-28 03:00:00 NaN NaN 14.110393   False
My expected output has only two columns:

                      X         RealData
2019-03-27 12:30:00 8.295   True
2019-03-27 13:00:00 7.707   True
2019-03-27 13:30:00 7.518   True
2019-03-27 14:00:00 7.518   True
...                 ...         ...
2019-03-27 22:30:00 8.450913    False
2019-03-27 23:00:00 8.494944    False
2019-03-27 23:30:00 9.058649    False

Do you have any idea how to handle this?

My simple suggestion: keep everything tidy and in order. Then everything is easy.

import pandas as pd

df1 = pd.read_csv('c:/4/a1.csv')
df2 = pd.read_csv('c:/4/a2.csv')
df2.dtypes


Just to formalize what ags29 wrote here:


Although Wojciech Moszczyński's answer is much more thorough, this seems to do the job quite well.

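What is your expected output?
pd.concat([Future, Past]).drop_duplicates()
?
@anky_91 That does not work for me, unless a kwarg is missing from the parentheses.
You can try something like
output = pd.concat([Future.reset_index(), Past.reset_index()], axis=0)
and then use
output.set_index('index', inplace=True)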
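# parse the date column of each frame as datetimes
df1.date = pd.to_datetime(df1.date)
df2.date = pd.to_datetime(df2.date)
df2.dtypes

# use the date column as the index of each frame
df1.set_index(df1.date, inplace=True)
df2.set_index(df2.date, inplace=True)

# stack the rows; DataFrame.append was removed in pandas 2.0, pd.concat does the same job
df = pd.concat([df1, df2])
df = df.sort_index()  # sort_index returns a new frame, so keep the result
# when a date occurs in both frames, keep the row from df2
df.drop_duplicates('date', keep='last', inplace=True)
df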
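
The snippet above removes repeated timestamps through a `date` column. In the question the timestamps live only in the index, so a variant that deduplicates on the index may be closer to what is needed. This is a minimal sketch, not the answerer's code: the names `observed` and `forecast` are stand-ins, and the overlapping 21:30 forecast row with value 7.9 is invented purely to demonstrate the overlap (the frames in the question do not overlap).

```python
import pandas as pd

# Stand-in frames; the 21:30 timestamp is deliberately present in both (hypothetical overlap).
idx_a = pd.to_datetime(["2019-03-27 21:00:00", "2019-03-27 21:30:00"])
idx_b = pd.to_datetime(["2019-03-27 21:30:00", "2019-03-27 22:30:00"])
observed = pd.DataFrame({"X": [8.673, 7.791], "RealData": [True, True]}, index=idx_a)
forecast = pd.DataFrame({"X": [7.9, 8.450913], "RealData": [False, False]}, index=idx_b)

# Stack rows with the observed data first, so keep="first" prefers measured values.
combined = pd.concat([observed, forecast])

# Index.duplicated marks repeated timestamps; the mask keeps the first occurrence of each.
deduped = combined[~combined.index.duplicated(keep="first")].sort_index()
print(deduped)
```

Switching to `keep="last"` would instead prefer the forecast row, which matches the `keep='last'` choice in the answer above.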
output = pd.concat([Future.reset_index(), Past.reset_index()], axis=0)
output.set_index('index', inplace=True)
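
For context on why the original attempt produced four columns: `axis=1` places the frames side by side and aligns them on the index, whereas `axis=0` stacks rows and aligns on column names. The following is a small self-contained sketch with shortened copies of the question's data (the lowercase names `past` and `future` are stand-ins). It assumes the datetime index can be kept as-is, in which case the `reset_index` / `set_index` round trip is probably not required.

```python
import pandas as pd

# Shortened stand-ins for the question's Past and Future frames (values copied from the post).
past = pd.DataFrame(
    {"X": [8.295, 7.707, 7.518], "RealData": [True, True, True]},
    index=pd.to_datetime(
        ["2019-03-27 12:30:00", "2019-03-27 13:00:00", "2019-03-27 13:30:00"]
    ),
)
future = pd.DataFrame(
    {"X": [8.450913, 8.494944], "RealData": [False, False]},
    index=pd.to_datetime(["2019-03-27 22:30:00", "2019-03-27 23:00:00"]),
)

# axis=0 stacks the rows, so X and RealData each appear only once;
# sort_index then puts the timestamps in chronological order.
past_future = pd.concat([future, past], axis=0).sort_index()
print(past_future)
```

Because both frames share the same column names, the result has exactly the two columns asked for, with the past rows followed by the forecast rows.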