
How to shift time series data by one month in Python

Tags: python, pandas, time-series

I tried using `DataFrame.shift()` with `freq='M'`, but when I offset by 1 month the dates move to the end of the month rather than to the same day of the next month.

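Is there a way to offset by exactly one calendar month? That is, if I have a time series DataFrame whose first index value is August 23, then after shifting by one month I want the value from the September 23 index to appear against the August 23 index.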
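To illustrate the behaviour described above, here is a minimal sketch on a made-up two-row series (the data and variable names are assumptions for illustration only). `freq='M'` is the alias for a month-end offset, written out here as `pd.offsets.MonthEnd()` because some newer pandas versions warn about the `'M'` spelling; `pd.DateOffset(months=1)` instead moves each timestamp to the same day of the next month.

```python
import pandas as pd

# Hypothetical two-row series, only to show how the index moves
idx = pd.to_datetime(["2019-08-23 10:00", "2019-08-24 10:00"])
s = pd.Series([1.0, 2.0], index=idx)

# Month-end offset (what freq='M' stands for): the index rolls to the month end
month_end = s.shift(1, freq=pd.offsets.MonthEnd())

# Calendar-month offset: the index keeps the same day of the month
same_day = s.shift(1, freq=pd.DateOffset(months=1))

print(month_end.index[0])  # 2019-08-31 10:00:00
print(same_day.index[0])   # 2019-09-23 10:00:00
```

Note that both calls move the index labels forward and leave the values untouched, so neither one by itself pulls next month's values back onto this month's rows.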

Please suggest a way to do this. It would save a lot of time; otherwise I will have to use a loop.

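I want to create a new column in this DataFrame such that the value in the new column for index 20-10-01 10:00:00 and ticker AAPL is the value of column "c" at time 20-11-01 10:00:00 for ticker AAPL, and likewise for the other rows. Sample data: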

Timestamp('2019-10-01 10:00:00+0000', tz='UTC'): 56.5675,
Timestamp('2019-10-01 16:00:00+0000', tz='UTC'): 56.2725,
Timestamp('2019-10-01 22:00:00+0000', tz='UTC'): 56.2925,
Timestamp('2019-10-02 04:00:00+0000', tz='UTC'): 55.6525,
Timestamp('2019-10-02 10:00:00+0000', tz='UTC'): 54.8025,
Timestamp('2019-10-02 16:00:00+0000', tz='UTC'): 54.625,
Timestamp('2019-10-02 22:00:00+0000', tz='UTC'): 54.625,
Timestamp('2019-10-03 04:00:00+0000', tz='UTC'): 54.825,
Timestamp('2019-10-03 10:00:00+0000', tz='UTC'): 54.7075,
Timestamp('2019-10-03 16:00:00+0000', tz='UTC'): 55.1575,
Timestamp('2019-10-03 22:00:00+0000', tz='UTC'): 55.125,
Timestamp('2019-10-04 04:00:00+0000', tz='UTC'): 55.88,
Timestamp('2019-10-04 10:00:00+0000', tz='UTC'): 56.51,
Timestamp('2019-10-04 16:00:00+0000', tz='UTC'): 56.77,
Timestamp('2019-10-04 22:00:00+0000', tz='UTC'): 56.7375,
Timestamp('2019-10-07 04:00:00+0000', tz='UTC'): 56.5,
Timestamp('2019-10-07 10:00:00+0000', tz='UTC'): 57.3525,
Timestamp('2019-10-07 16:00:00+0000', tz='UTC'): 56.7875,
Timestamp('2019-10-07 22:00:00+0000', tz='UTC'): 56.86,
Timestamp('2019-10-08 04:00:00+0000', tz='UTC'): 56.75,
Timestamp('2019-10-08 10:00:00+0000', tz='UTC'): 56.525,
Timestamp('2019-10-08 16:00:00+0000', tz='UTC'): 55.9775,
Timestamp('2019-10-08 22:00:00+0000', tz='UTC'): 55.925,
Timestamp('2019-10-09 04:00:00+0000', tz='UTC'): 56.75,
Timestamp('2019-10-09 10:00:00+0000', tz='UTC'): 56.6783,
Timestamp('2019-10-09 16:00:00+0000', tz='UTC'): 56.77,
Timestamp('2019-10-09 22:00:00+0000', tz='UTC'): 56.075,
Timestamp('2019-10-10 04:00:00+0000', tz='UTC'): 56.875,
Timestamp('2019-10-10 10:00:00+0000', tz='UTC'): 57.5175,
Timestamp('2019-10-10 16:00:00+0000', tz='UTC'): 57.71,
Timestamp('2019-10-10 22:00:00+0000', tz='UTC'): 57.8125,
Timestamp('2019-10-11 04:00:00+0000', tz='UTC'): 58.235,
Timestamp('2019-10-11 10:00:00+0000', tz='UTC'): 58.62,
Timestamp('2019-10-11 16:00:00+0000', tz='UTC'): 59.1825,
Timestamp('2019-10-11 22:00:00+0000', tz='UTC'): 59.3125,
Timestamp('2019-10-14 04:00:00+0000', tz='UTC'): 58.5925,
Timestamp('2019-10-14 10:00:00+0000', tz='UTC'): 59.25,
Timestamp('2019-10-14 16:00:00+0000', tz='UTC'): 58.975,
Timestamp('2019-10-14 22:00:00+0000', tz='UTC'): 59.1125,
Timestamp('2019-10-15 04:00:00+0000', tz='UTC'): 59.2525,
Timestamp('2019-10-15 10:00:00+0000', tz='UTC'): 58.9238,
Timestamp('2019-10-15 16:00:00+0000', tz='UTC'): 58.9,
Timestamp('2019-10-15 22:00:00+0000', tz='UTC'): 58.75,
Timestamp('2019-10-16 04:00:00+0000', tz='UTC'): 58.565,
Timestamp('2019-10-16 10:00:00+0000', tz='UTC'): 58.59,
Timestamp('2019-10-16 16:00:00+0000', tz='UTC'): 58.6825,
Timestamp('2019-10-16 22:00:00+0000', tz='UTC'): 58.5875,
Timestamp('2019-10-17 04:00:00+0000', tz='UTC'): 58.9375,
Timestamp('2019-10-17 10:00:00+0000', tz='UTC'): 58.48,
Timestamp('2019-10-17 16:00:00+0000', tz='UTC'): 58.8375,
Timestamp('2019-10-17 22:00:00+0000', tz='UTC'): 58.8025,
Timestamp('2019-10-18 04:00:00+0000', tz='UTC'): 58.7275,
Timestamp('2019-10-18 10:00:00+0000', tz='UTC'): 58.7838,
Timestamp('2019-10-18 16:00:00+0000', tz='UTC'): 59.0675,
Timestamp('2019-10-18 22:00:00+0000', tz='UTC'): 59.0525,
Timestamp('2019-10-21 04:00:00+0000', tz='UTC'): 59.3775,
Timestamp('2019-10-21 10:00:00+0000', tz='UTC'): 60.1825,
Timestamp('2019-10-21 16:00:00+0000', tz='UTC'): 60.165,
Timestamp('2019-10-21 22:00:00+0000', tz='UTC'): 60.1725,
Timestamp('2019-10-22 04:00:00+0000', tz='UTC'): 60.1975,
Timestamp('2019-10-22 10:00:00+0000', tz='UTC'): 60.2975,
Timestamp('2019-10-22 16:00:00+0000', tz='UTC'): 59.8025,
Timestamp('2019-10-22 22:00:00+0000', tz='UTC'): 59.755,
Timestamp('2019-10-23 04:00:00+0000', tz='UTC'): 60.3975,
Timestamp('2019-10-23 10:00:00+0000', tz='UTC'): 60.6265,
Timestamp('2019-10-23 16:00:00+0000', tz='UTC'): 60.8875,
Timestamp('2019-10-23 22:00:00+0000', tz='UTC'): 61.0275,
Timestamp('2019-10-24 04:00:00+0000', tz='UTC'): 61.0525,
Timestamp('2019-10-24 10:00:00+0000', tz='UTC'): 60.82,
Timestamp('2019-10-24 16:00:00+0000', tz='UTC'): 60.8125,
Timestamp('2019-10-24 22:00:00+0000', tz='UTC'): 60.8225,
Timestamp('2019-10-25 04:00:00+0000', tz='UTC'): 60.75,
Timestamp('2019-10-25 10:00:00+0000', tz='UTC'): 61.3425,
Timestamp('2019-10-25 16:00:00+0000', tz='UTC'): 61.7,
Timestamp('2019-10-25 22:00:00+0000', tz='UTC'): 61.6875,
Timestamp('2019-10-28 04:00:00+0000', tz='UTC'): 61.8575,
Timestamp('2019-10-28 10:00:00+0000', tz='UTC'): 62.1388,
Timestamp('2019-10-28 16:00:00+0000', tz='UTC'): 62.285,
Timestamp('2019-10-28 22:00:00+0000', tz='UTC'): 62.2875,
Timestamp('2019-10-29 04:00:00+0000', tz='UTC'): 62.15,
Timestamp('2019-10-29 10:00:00+0000', tz='UTC'): 60.7952,
Timestamp('2019-10-29 16:00:00+0000', tz='UTC'): 60.9525,
Timestamp('2019-10-29 22:00:00+0000', tz='UTC'): 60.9575,
Timestamp('2019-10-30 04:00:00+0000', tz='UTC'): 60.9575,
Timestamp('2019-10-30 10:00:00+0000', tz='UTC'): 60.5125,
Timestamp('2019-10-30 16:00:00+0000', tz='UTC'): 62.05,
Timestamp('2019-10-30 22:00:00+0000', tz='UTC'): 62.0475,
Timestamp('2019-10-31 04:00:00+0000', tz='UTC'): 61.76,
Timestamp('2019-10-31 10:00:00+0000', tz='UTC'): 62.0523,
Timestamp('2019-10-31 16:00:00+0000', tz='UTC'): 62.105,
Timestamp('2019-10-31 22:00:00+0000', tz='UTC'): 62.14,
Timestamp('2019-11-01 04:00:00+0000', tz='UTC'): 62.35,
Timestamp('2019-11-01 10:00:00+0000', tz='UTC'): 63.3099,
Timestamp('2019-11-01 16:00:00+0000', tz='UTC'): 63.9725,
Timestamp('2019-11-01 22:00:00+0000', tz='UTC'): 64.025,
Timestamp('2019-11-04 10:00:00+0000', tz='UTC'): 64.2388,
Timestamp('2019-11-04 16:00:00+0000', tz='UTC'): 64.375,
Timestamp('2019-11-04 22:00:00+0000', tz='UTC'): 64.4975,
Timestamp('2019-11-05 04:00:00+0000', tz='UTC'): 64.575}}
This is the dataset. The expected new column is: 62.35, 63.3099, 63.9725, 64.025, and so on. I want the values from one month ahead, but using
df['new_column']=df.shift(1,freq='M')['c']
does not do the job.

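The problem is fairly simple, but you need to do some specific work on the dates to get `n`:

  • Find the number of rows to shift by, which I call `n`, using `pd.DateOffset(months=1)`.
  • Then shift the column up by `n` rows, i.e. shift by `-n`.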

Note that I built the DataFrame and produced the result with:

    import pandas as pd

    df = pd.DataFrame(
        {pd.Timestamp('2019-10-01 10:00:00+0000', tz='UTC'): 56.5675,
        pd.Timestamp('2019-10-01 16:00:00+0000', tz='UTC'): 56.2725,
        pd.Timestamp('2019-10-01 22:00:00+0000', tz='UTC'): 56.2925,
        pd.Timestamp('2019-10-02 04:00:00+0000', tz='UTC'): 55.6525,
        pd.Timestamp('2019-10-02 10:00:00+0000', tz='UTC'): 54.8025,
        pd.Timestamp('2019-10-02 16:00:00+0000', tz='UTC'): 54.625,
        pd.Timestamp('2019-10-02 22:00:00+0000', tz='UTC'): 54.625,
        pd.Timestamp('2019-10-03 04:00:00+0000', tz='UTC'): 54.825,
        pd.Timestamp('2019-10-03 10:00:00+0000', tz='UTC'): 54.7075,
        pd.Timestamp('2019-10-03 16:00:00+0000', tz='UTC'): 55.1575,
        pd.Timestamp('2019-10-03 22:00:00+0000', tz='UTC'): 55.125,
        pd.Timestamp('2019-10-04 04:00:00+0000', tz='UTC'): 55.88,
        pd.Timestamp('2019-10-04 10:00:00+0000', tz='UTC'): 56.51,
        pd.Timestamp('2019-10-04 16:00:00+0000', tz='UTC'): 56.77,
        pd.Timestamp('2019-10-04 22:00:00+0000', tz='UTC'): 56.7375,
        pd.Timestamp('2019-10-07 04:00:00+0000', tz='UTC'): 56.5,
        pd.Timestamp('2019-10-07 10:00:00+0000', tz='UTC'): 57.3525,
        pd.Timestamp('2019-10-07 16:00:00+0000', tz='UTC'): 56.7875,
        pd.Timestamp('2019-10-07 22:00:00+0000', tz='UTC'): 56.86,
        pd.Timestamp('2019-10-08 04:00:00+0000', tz='UTC'): 56.75,
        pd.Timestamp('2019-10-08 10:00:00+0000', tz='UTC'): 56.525,
        pd.Timestamp('2019-10-08 16:00:00+0000', tz='UTC'): 55.9775,
        pd.Timestamp('2019-10-08 22:00:00+0000', tz='UTC'): 55.925,
        pd.Timestamp('2019-10-09 04:00:00+0000', tz='UTC'): 56.75,
        pd.Timestamp('2019-10-09 10:00:00+0000', tz='UTC'): 56.6783,
        pd.Timestamp('2019-10-09 16:00:00+0000', tz='UTC'): 56.77,
        pd.Timestamp('2019-10-09 22:00:00+0000', tz='UTC'): 56.075,
        pd.Timestamp('2019-10-10 04:00:00+0000', tz='UTC'): 56.875,
        pd.Timestamp('2019-10-10 10:00:00+0000', tz='UTC'): 57.5175,
        pd.Timestamp('2019-10-10 16:00:00+0000', tz='UTC'): 57.71,
        pd.Timestamp('2019-10-10 22:00:00+0000', tz='UTC'): 57.8125,
        pd.Timestamp('2019-10-11 04:00:00+0000', tz='UTC'): 58.235,
        pd.Timestamp('2019-10-11 10:00:00+0000', tz='UTC'): 58.62,
        pd.Timestamp('2019-10-11 16:00:00+0000', tz='UTC'): 59.1825,
        pd.Timestamp('2019-10-11 22:00:00+0000', tz='UTC'): 59.3125,
        pd.Timestamp('2019-10-14 04:00:00+0000', tz='UTC'): 58.5925,
        pd.Timestamp('2019-10-14 10:00:00+0000', tz='UTC'): 59.25,
        pd.Timestamp('2019-10-14 16:00:00+0000', tz='UTC'): 58.975,
        pd.Timestamp('2019-10-14 22:00:00+0000', tz='UTC'): 59.1125,
        pd.Timestamp('2019-10-15 04:00:00+0000', tz='UTC'): 59.2525,
        pd.Timestamp('2019-10-15 10:00:00+0000', tz='UTC'): 58.9238,
        pd.Timestamp('2019-10-15 16:00:00+0000', tz='UTC'): 58.9,
        pd.Timestamp('2019-10-15 22:00:00+0000', tz='UTC'): 58.75,
        pd.Timestamp('2019-10-16 04:00:00+0000', tz='UTC'): 58.565,
        pd.Timestamp('2019-10-16 10:00:00+0000', tz='UTC'): 58.59,
        pd.Timestamp('2019-10-16 16:00:00+0000', tz='UTC'): 58.6825,
        pd.Timestamp('2019-10-16 22:00:00+0000', tz='UTC'): 58.5875,
        pd.Timestamp('2019-10-17 04:00:00+0000', tz='UTC'): 58.9375,
        pd.Timestamp('2019-10-17 10:00:00+0000', tz='UTC'): 58.48,
        pd.Timestamp('2019-10-17 16:00:00+0000', tz='UTC'): 58.8375,
        pd.Timestamp('2019-10-17 22:00:00+0000', tz='UTC'): 58.8025,
        pd.Timestamp('2019-10-18 04:00:00+0000', tz='UTC'): 58.7275,
        pd.Timestamp('2019-10-18 10:00:00+0000', tz='UTC'): 58.7838,
        pd.Timestamp('2019-10-18 16:00:00+0000', tz='UTC'): 59.0675,
        pd.Timestamp('2019-10-18 22:00:00+0000', tz='UTC'): 59.0525,
        pd.Timestamp('2019-10-21 04:00:00+0000', tz='UTC'): 59.3775,
        pd.Timestamp('2019-10-21 10:00:00+0000', tz='UTC'): 60.1825,
        pd.Timestamp('2019-10-21 16:00:00+0000', tz='UTC'): 60.165,
        pd.Timestamp('2019-10-21 22:00:00+0000', tz='UTC'): 60.1725,
        pd.Timestamp('2019-10-22 04:00:00+0000', tz='UTC'): 60.1975,
        pd.Timestamp('2019-10-22 10:00:00+0000', tz='UTC'): 60.2975,
        pd.Timestamp('2019-10-22 16:00:00+0000', tz='UTC'): 59.8025,
        pd.Timestamp('2019-10-22 22:00:00+0000', tz='UTC'): 59.755,
        pd.Timestamp('2019-10-23 04:00:00+0000', tz='UTC'): 60.3975,
        pd.Timestamp('2019-10-23 10:00:00+0000', tz='UTC'): 60.6265,
        pd.Timestamp('2019-10-23 16:00:00+0000', tz='UTC'): 60.8875,
        pd.Timestamp('2019-10-23 22:00:00+0000', tz='UTC'): 61.0275,
        pd.Timestamp('2019-10-24 04:00:00+0000', tz='UTC'): 61.0525,
        pd.Timestamp('2019-10-24 10:00:00+0000', tz='UTC'): 60.82,
        pd.Timestamp('2019-10-24 16:00:00+0000', tz='UTC'): 60.8125,
        pd.Timestamp('2019-10-24 22:00:00+0000', tz='UTC'): 60.8225,
        pd.Timestamp('2019-10-25 04:00:00+0000', tz='UTC'): 60.75,
        pd.Timestamp('2019-10-25 10:00:00+0000', tz='UTC'): 61.3425,
        pd.Timestamp('2019-10-25 16:00:00+0000', tz='UTC'): 61.7,
        pd.Timestamp('2019-10-25 22:00:00+0000', tz='UTC'): 61.6875,
        pd.Timestamp('2019-10-28 04:00:00+0000', tz='UTC'): 61.8575,
        pd.Timestamp('2019-10-28 10:00:00+0000', tz='UTC'): 62.1388,
        pd.Timestamp('2019-10-28 16:00:00+0000', tz='UTC'): 62.285,
        pd.Timestamp('2019-10-28 22:00:00+0000', tz='UTC'): 62.2875,
        pd.Timestamp('2019-10-29 04:00:00+0000', tz='UTC'): 62.15,
        pd.Timestamp('2019-10-29 10:00:00+0000', tz='UTC'): 60.7952,
        pd.Timestamp('2019-10-29 16:00:00+0000', tz='UTC'): 60.9525,
        pd.Timestamp('2019-10-29 22:00:00+0000', tz='UTC'): 60.9575,
        pd.Timestamp('2019-10-30 04:00:00+0000', tz='UTC'): 60.9575,
        pd.Timestamp('2019-10-30 10:00:00+0000', tz='UTC'): 60.5125,
        pd.Timestamp('2019-10-30 16:00:00+0000', tz='UTC'): 62.05,
        pd.Timestamp('2019-10-30 22:00:00+0000', tz='UTC'): 62.0475,
        pd.Timestamp('2019-10-31 04:00:00+0000', tz='UTC'): 61.76,
        pd.Timestamp('2019-10-31 10:00:00+0000', tz='UTC'): 62.0523,
        pd.Timestamp('2019-10-31 16:00:00+0000', tz='UTC'): 62.105,
        pd.Timestamp('2019-10-31 22:00:00+0000', tz='UTC'): 62.14,
        pd.Timestamp('2019-11-01 04:00:00+0000', tz='UTC'): 62.35,
        pd.Timestamp('2019-11-01 10:00:00+0000', tz='UTC'): 63.3099,
        pd.Timestamp('2019-11-01 16:00:00+0000', tz='UTC'): 63.9725,
        pd.Timestamp('2019-11-01 22:00:00+0000', tz='UTC'): 64.025,
        pd.Timestamp('2019-11-04 10:00:00+0000', tz='UTC'): 64.2388,
        pd.Timestamp('2019-11-04 16:00:00+0000', tz='UTC'): 64.375,
        pd.Timestamp('2019-11-04 22:00:00+0000', tz='UTC'): 64.4975,
        pd.Timestamp('2019-11-05 04:00:00+0000', tz='UTC'): 64.575}, index=['c']).T
    df = df.reset_index().rename({'index': 'Date'}, axis=1)
    
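    # and then my answer:

    # Drop the time of day (and the timezone) so only calendar dates are compared
    df['Date'] = pd.to_datetime(pd.to_datetime(df['Date']).dt.date)
    # n = label of the first row whose date also appears among the dates moved forward one month
    n = df[df['Date'].isin(df['Date'] + pd.DateOffset(months=1))].index[0]
    # Shift column 'c' up by n rows
    df['new_column'] = df['c'].shift(-n)
    df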
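
As a possible alternative to counting rows, the lookup can be done by label: add `pd.DateOffset(months=1)` to the index and reindex column `c` on the result. This is only a sketch, not part of the answer above; it uses a small made-up subset of the sample data and assumes a unique `DatetimeIndex`. Rows whose target timestamp does not exist in the data get `NaN`.

```python
import pandas as pd

# Small subset of the sample data, chosen for illustration
idx = pd.to_datetime([
    "2019-10-01 10:00", "2019-10-01 16:00", "2019-10-31 10:00",
    "2019-11-01 10:00", "2019-11-01 16:00",
]).tz_localize("UTC")
df = pd.DataFrame({"c": [56.5675, 56.2725, 62.0523, 63.3099, 63.9725]}, index=idx)

# Timestamp one calendar month after each row
target = df.index + pd.DateOffset(months=1)

# Look up 'c' at those timestamps; missing targets become NaN
df["new_column"] = df["c"].reindex(target).to_numpy()
print(df)
```

Unlike a fixed row shift, this matches on the exact timestamp, so weekends, holidays, or months of different lengths produce `NaN` instead of a value from a neighbouring row.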
    

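Assuming every day has the same set of timestamps and no timestamp values are missing, the approach below may work, since you only need to shift the rows by the number of days times the number of unique timestamps per day.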

    import pandas as pd
    
    # Dummy data
    # I assumed you have 4 unique values for a day's timestamp and don't have any missing values
    lst1 = list(pd.date_range('2020-08-01 04:00:00', periods=60))
    lst2 = list(pd.date_range('2020-08-01 10:00:00', periods=60))
    lst3 = list(pd.date_range('2020-08-01 16:00:00', periods=60))
    lst4 = list(pd.date_range('2020-08-01 22:00:00', periods=60))
    
    lst1.extend(lst2)
    lst1.extend(lst3)
    lst1.extend(lst4)
    
    data = {
        'date': lst1,
        'value': [v for v in range(0,240)]
    }
    
    # Preprocessing
    df = pd.DataFrame(data)
    df = df.sort_values(by=['date'])
    df.reset_index(drop=True, inplace=True)
    
    def update(row,df):
      row['value'] = df.loc[row.name]['value']
      return row
    
    # factor is = X days of shift * Y unique time stamps per day 
    factor = 31 * 4 
    df.apply(update,axis=1,args=[df.shift(-factor)])
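
Under the same assumptions (four timestamps on every day, none missing), the row-wise `apply` can likely be replaced by a plain column shift. The sketch below rebuilds the same dummy data; the names `timestamps_per_day`, `days_in_month`, and `shifted` are introduced here for illustration only.

```python
import pandas as pd

# Same dummy data: 60 days, 4 timestamps per day
stamps = []
for hour in ("04:00", "10:00", "16:00", "22:00"):
    stamps.extend(pd.date_range(f"2020-08-01 {hour}", periods=60))

df = pd.DataFrame({"date": stamps, "value": list(range(240))})
df = df.sort_values("date").reset_index(drop=True)

timestamps_per_day = 4
days_in_month = 31  # August 2020; must be adjusted for other months
factor = days_in_month * timestamps_per_day

# Vectorized equivalent of the row-wise update: pull values up by `factor` rows
df["shifted"] = df["value"].shift(-factor)
print(df.head())
```

The fixed `factor` is only valid while every row's month has the same number of days, so rows in a 30-day month would need a different shift.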
    

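Comments:

Can you provide the code/text of the DataFrame and the expected output? No images, please. Provide, for example, the output of `df.head(20).to_dict()` instead.

@anon01 I have edited the question.

@anon01, is it clear now? `c` is the name of a column; it is a multi-index DataFrame.

@anon01 Is it clear now?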
    