Python: how to join two pandas time series

I have a price DataFrame (df1) that looks like this:

                        price
2007-01-01 00:00:00  0.789510
2007-01-01 04:00:00  0.789380
2007-01-01 20:00:00  0.789485
2007-01-02 01:00:00  0.791290
2007-01-02 02:00:00  0.791630
2007-01-02 16:00:00  0.793100
2007-01-02 17:00:00  0.793605
2007-01-03 18:00:00  0.780640
2007-01-03 19:00:00  0.780005
2007-01-03 20:00:00  0.779410

and a Series of closing prices (s1) that looks like this:

2007-01-01 15:00:00    0.7882
2007-01-02 15:00:00    0.7962
2007-01-03 15:00:00    0.7909
2007-01-04 15:00:00    0.7862
2007-01-05 15:00:00    0.7787
2007-01-08 15:00:00    0.7812
2007-01-09 15:00:00    0.7800
2007-01-10 15:00:00    0.7769

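I want to add the closing prices from s1 to df1 so that df1's index is preserved and, for each datetime stamp in df1, the latest closing price from s1 is added.

The resulting DataFrame should therefore look like this: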

                        price  closing_price
2007-01-01 00:00:00  0.789510        0.7882
2007-01-01 04:00:00  0.789380        0.7882
2007-01-01 20:00:00  0.789485        0.7962
2007-01-02 01:00:00  0.791290        0.7962
2007-01-02 02:00:00  0.791630        0.7962
2007-01-02 16:00:00  0.793100        0.7909
2007-01-02 17:00:00  0.793605        0.7909
2007-01-03 18:00:00  0.780640        0.7862
2007-01-03 19:00:00  0.780005        0.7862
2007-01-03 20:00:00  0.779410        0.7862

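You need to concatenate the closing prices to the DataFrame column-wise (axis=1), which aligns the two objects on their index. Then forward-fill the closing prices. Finally, filter out the rows where price is null.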

import pandas as pd

# Daily closing prices, stamped at 15:00 on each business day
s1 = pd.Series([0.7882, 0.7962, 0.7909, 0.7862, 0.7787, 0.7812, 0.7800, 0.7769],
               index=pd.date_range('2007-01-01 15:00', periods=8, freq='B'), name='close')

df1 = pd.DataFrame({'price': {
  pd.Timestamp('2007-01-01 00:00:00'): 0.789510,
  pd.Timestamp('2007-01-01 04:00:00'): 0.789380,
  pd.Timestamp('2007-01-01 20:00:00'): 0.789485,
  pd.Timestamp('2007-01-02 01:00:00'): 0.791290,
  pd.Timestamp('2007-01-02 02:00:00'): 0.791630,
  pd.Timestamp('2007-01-02 16:00:00'): 0.793100,
  pd.Timestamp('2007-01-02 17:00:00'): 0.793605,
  pd.Timestamp('2007-01-03 18:00:00'): 0.780640,
  pd.Timestamp('2007-01-03 19:00:00'): 0.780005,
  pd.Timestamp('2007-01-03 20:00:00'): 0.779410}})

# Align both objects on the union of their timestamps
df = pd.concat([df1, s1], axis=1)
# Forward-fill: each row takes the most recent close at or before its timestamp
# (assigning the result back avoids chained in-place modification)
df['close'] = df['close'].ffill()
# Keep only the rows that came from df1
df = df[df['price'].notnull()]
>>> df
                        price   close
2007-01-01 00:00:00  0.789510     NaN
2007-01-01 04:00:00  0.789380     NaN
2007-01-01 20:00:00  0.789485  0.7882
2007-01-02 01:00:00  0.791290  0.7882
2007-01-02 02:00:00  0.791630  0.7882
2007-01-02 16:00:00  0.793100  0.7962
2007-01-02 17:00:00  0.793605  0.7962
2007-01-03 18:00:00  0.780640  0.7909
2007-01-03 19:00:00  0.780005  0.7909
2007-01-03 20:00:00  0.779410  0.7909
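
Forward-filling gives each row the most recent close at or before its timestamp, whereas the desired output in the question pairs each row with the next close. The following is a minimal sketch of a variant, assuming that a back-fill is what is wanted and that df1's index is unique. The shortened sample data (with one deliberately missing price) and the names `combined` and `result` are illustrative and not part of the original answer.

```python
import pandas as pd

# Illustrative sample: closing prices stamped at 15:00
s1 = pd.Series(
    [0.7882, 0.7962, 0.7909, 0.7862],
    index=pd.to_datetime(['2007-01-01 15:00', '2007-01-02 15:00',
                          '2007-01-03 15:00', '2007-01-04 15:00']),
    name='close')

# Illustrative sample: intraday prices, one of them missing on purpose
df1 = pd.DataFrame(
    {'price': [0.789510, None, 0.789485, 0.793100, 0.780640]},
    index=pd.to_datetime(['2007-01-01 00:00', '2007-01-01 04:00',
                          '2007-01-01 20:00', '2007-01-02 16:00',
                          '2007-01-03 18:00']))

# Align on the union of timestamps and sort explicitly by time
combined = pd.concat([df1, s1], axis=1).sort_index()

# Back-fill: each row takes the next close at or after its timestamp
combined['close'] = combined['close'].bfill()

# Select rows by df1's index instead of filtering on non-null price,
# so rows whose price is NaN are kept (assumes df1.index is unique)
result = combined.loc[df1.index]
print(result)
```

Selecting by `df1.index` rather than by `price.notnull()` is a design choice for data where the price column itself has gaps; with duplicate timestamps in df1 this selection would need a different approach.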

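This is not really a "join" problem but a "reindex" problem. pandas supports this directly, and it can be done in one line. With method='bfill', each timestamp in df1 picks up the next available closing price in s1.

df1['close'] = s1.reindex(df1.index, method='bfill')
This yields: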

                        price   close
2007-01-01 00:00:00  0.789510  0.7882
2007-01-01 04:00:00  0.789380  0.7882
2007-01-01 20:00:00  0.789485  0.7962
2007-01-02 01:00:00  0.791290  0.7962
2007-01-02 02:00:00  0.791630  0.7962
2007-01-02 16:00:00  0.793100  0.7909
2007-01-02 17:00:00  0.793605  0.7909
2007-01-03 18:00:00  0.780640  0.7862
2007-01-03 19:00:00  0.780005  0.7862
2007-01-03 20:00:00  0.779410  0.7862
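
To make the fill direction explicit, here is a small self-contained sketch that runs the reindex both ways on a shortened copy of the sample data. It assumes s1's index is sorted, since reindexing with a fill method requires a monotonic index. The column names `closing_price` and `previous_close` are illustrative.

```python
import pandas as pd

# Shortened sample data for illustration
s1 = pd.Series(
    [0.7882, 0.7962, 0.7909, 0.7862],
    index=pd.to_datetime(['2007-01-01 15:00', '2007-01-02 15:00',
                          '2007-01-03 15:00', '2007-01-04 15:00']),
    name='close')

df1 = pd.DataFrame(
    {'price': [0.789510, 0.789485, 0.793100, 0.780640]},
    index=pd.to_datetime(['2007-01-01 00:00', '2007-01-01 20:00',
                          '2007-01-02 16:00', '2007-01-03 18:00']))

out = df1.assign(
    # bfill: the next close at or after each timestamp
    closing_price=s1.reindex(df1.index, method='bfill'),
    # ffill: the most recent close at or before each timestamp
    previous_close=s1.reindex(df1.index, method='ffill'))
print(out)
```

The first row has no earlier close to look back to, so `previous_close` is NaN there, while `closing_price` follows the pattern of the desired output in the question.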

import pandas as pd; pd.concat([df1, s1]); is that what you are looking for?

I think that would just append the series to the end of the DataFrame.

Ah, I understand the problem better now.

Thanks Alexander. Unfortunately, my real df1 contains a lot of NaNs before the concatenation, as well as many 15:00:00 timestamps, so I don't think this solution will work the way I had hoped.
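
For the situation described in the last comment (NaN prices and timestamps in df1 that fall exactly on 15:00:00), one option that was not proposed in the thread is pandas.merge_asof with direction='forward'. The sketch below assumes both indexes are sorted datetimes of the same type and that a price stamped exactly at 15:00:00 should receive that same day's close. The sample data and the names `closes` and `result` are illustrative.

```python
import pandas as pd

# Illustrative closing prices as a one-column DataFrame
closes = pd.DataFrame(
    {'closing_price': [0.7882, 0.7962, 0.7909, 0.7862]},
    index=pd.to_datetime(['2007-01-01 15:00', '2007-01-02 15:00',
                          '2007-01-03 15:00', '2007-01-04 15:00']))

# Illustrative prices: one NaN, and one timestamp exactly at 15:00:00
df1 = pd.DataFrame(
    {'price': [0.789380, float('nan'), 0.789485, 0.793100]},
    index=pd.to_datetime(['2007-01-01 04:00', '2007-01-01 15:00',
                          '2007-01-01 20:00', '2007-01-02 16:00']))

# merge_asof needs both sides sorted on the key (here: the index).
# direction='forward' matches each left row with the first right row
# whose timestamp is at or after it; exact matches are allowed by default.
result = pd.merge_asof(
    df1.sort_index(), closes.sort_index(),
    left_index=True, right_index=True,
    direction='forward')
print(result)
```

Because the match is made on timestamps only, missing values in the price column do not affect it. If an exact 15:00:00 timestamp should instead take the following day's close, passing allow_exact_matches=False changes that behavior.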