Python: calculating 4-hour percentage returns from 1-minute data

I am trying to calculate 4-hour returns from 1-minute data:

frequency = int(1*60*4)
a = data[1:]['close'] / data[:-frequency]['close'].values - 1
But I get an error:

ValueError: operands could not be broadcast together with shapes (253062,) (252823,) 
When I do the same thing for 1-minute returns, it works:

a = data[1:]['close'] / data[:-1]['close'].values - 1
How can I fix this?

Here is the sample data:

          date             open     high     low      close  t
0       20150101 130000  1.20965  1.20977  1.20962  1.20962  0
1       20150101 130100  1.20963  1.20968  1.20962  1.20962  0
2       20150101 130200  1.20965  1.20970  1.20961  1.20961  0
3       20150101 130400  1.20959  1.21008  1.20959  1.20983  0
4       20150101 130500  1.20988  1.20988  1.20988  1.20988  0
5       20150101 130600  1.20984  1.20984  1.20982  1.20982  0
6       20150101 130700  1.20986  1.20999  1.20986  1.20987  0
7       20150101 130800  1.20998  1.21022  1.20987  1.21008  0
8       20150101 130900  1.20996  1.20996  1.20996  1.20996  0
9       20150101 131000  1.21013  1.21019  1.20967  1.20977  0
10      20150101 131100  1.20976  1.20999  1.20976  1.20988  0
11      20150101 131200  1.20987  1.20987  1.20987  1.20987  0
12      20150101 131300  1.20997  1.21006  1.20993  1.21006  0
13      20150101 131400  1.21007  1.21007  1.21006  1.21006  0
14      20150101 131600  1.21004  1.21004  1.21004  1.21004  0
15      20150101 131800  1.21003  1.21004  1.20979  1.20979  0
16      20150101 132700  1.21003  1.21003  1.20979  1.21003  0
17      20150101 132800  1.20979  1.21003  1.20979  1.21003  0
18      20150101 132900  1.21001  1.21003  1.20999  1.21003  0
19      20150101 133100  1.21033  1.21041  1.21033  1.21033  0
20      20150101 133200  1.21028  1.21035  1.21006  1.21035  0
21      20150101 133500  1.21005  1.21006  1.21005  1.21005  0
22      20150101 133600  1.21004  1.21006  1.21004  1.21006  0
23      20150101 133700  1.20991  1.21006  1.20991  1.21004  0
24      20150101 133800  1.21003  1.21004  1.20982  1.21004  0
25      20150101 133900  1.21019  1.21034  1.21019  1.21034  0
26      20150101 134000  1.21030  1.21034  1.21030  1.21032  0
27      20150101 134100  1.21006  1.21007  1.21006  1.21007  0
28      20150101 134300  1.21003  1.21006  1.21003  1.21006  0
29      20150101 134400  1.21003  1.21007  1.21003  1.21007  0
...                 ...      ...      ...      ...      ... ..
253033  20150904 162900  1.11511  1.11515  1.11499  1.11509  0
253034  20150904 163000  1.11509  1.11511  1.11507  1.11507  0
253035  20150904 163100  1.11507  1.11530  1.11507  1.11524  0
253036  20150904 163200  1.11521  1.11546  1.11520  1.11537  0
253037  20150904 163300  1.11533  1.11533  1.11520  1.11528  0
253038  20150904 163400  1.11528  1.11528  1.11528  1.11528  0
253039  20150904 163500  1.11527  1.11527  1.11486  1.11491  0
253040  20150904 163600  1.11492  1.11517  1.11489  1.11513  0
253041  20150904 163700  1.11513  1.11513  1.11499  1.11503  0
253042  20150904 163800  1.11498  1.11502  1.11482  1.11491  0
253043  20150904 163900  1.11490  1.11491  1.11489  1.11491  0
253044  20150904 164000  1.11490  1.11491  1.11490  1.11490  0
253045  20150904 164100  1.11488  1.11488  1.11477  1.11480  0
253046  20150904 164200  1.11482  1.11483  1.11481  1.11483  0
253047  20150904 164300  1.11482  1.11484  1.11482  1.11483  0
253048  20150904 164400  1.11483  1.11484  1.11480  1.11484  0
253049  20150904 164500  1.11480  1.11502  1.11480  1.11501  0
253050  20150904 164600  1.11502  1.11506  1.11488  1.11506  0
253051  20150904 164700  1.11501  1.11501  1.11496  1.11501  0
253052  20150904 164800  1.11501  1.11510  1.11499  1.11505  0
253053  20150904 164900  1.11504  1.11520  1.11503  1.11504  0
253054  20150904 165000  1.11506  1.11513  1.11502  1.11511  0
253055  20150904 165100  1.11509  1.11509  1.11500  1.11501  0
253056  20150904 165200  1.11500  1.11510  1.11500  1.11500  0
253057  20150904 165300  1.11516  1.11516  1.11498  1.11505  0
253058  20150904 165400  1.11503  1.11503  1.11454  1.11469  0
253059  20150904 165500  1.11472  1.11472  1.11454  1.11462  0
253060  20150904 165600  1.11462  1.11487  1.11447  1.11479  0
253061  20150904 165700  1.11484  1.11506  1.11477  1.11497  0
253062  20150904 165800  1.11495  1.11497  1.11432  1.11435  0

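Instead of arranging the slices yourself, you can use the pct_change method to compute the percentage change.

If you convert the date strings to timestamps and set the date column as the index, you can pass the parameter freq='4H' to specify a 4-hour frequency: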

data['date'] = pd.to_datetime(data['date'], format='%Y%m%d %H%M%S')
data = data.set_index('date')
a = data['close'].pct_change(freq='4H')
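
The sample data skips some minutes (for example, there is no 130300 bar), so a time-based 4-hour return and a fixed 240-row return are not always the same thing. The following is a minimal sketch of that difference, not the original poster's data: the series `close` is a hypothetical stand-in built with evenly spaced prices, and `pd.Timedelta(hours=4)` is used in place of the `'4H'` string only so the example does not depend on how a given pandas version spells the hourly alias.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: 300 one-minute bars starting at 13:00,
# with the 13:03 bar dropped to mimic the gaps visible in the sample data.
idx = pd.date_range("2015-01-01 13:00", periods=300, freq="min")
close = pd.Series(np.linspace(1.2000, 1.2100, 300), index=idx)
close = close.drop(pd.Timestamp("2015-01-01 13:03"))

four_hours = pd.Timedelta(hours=4)

# Time-based: compares each bar with the bar stamped exactly 4 hours earlier.
by_time = close.pct_change(freq=four_hours, fill_method=None)

# Row-based: compares each bar with the bar 240 rows earlier, whatever its timestamp.
by_rows = close.pct_change(periods=240, fill_method=None)

t = pd.Timestamp("2015-01-01 17:03")
print(by_time.loc[t])  # NaN, because there is no 13:03 bar to compare against
print(by_rows.loc[t])  # return relative to the 13:02 bar, i.e. 4 hours and 1 minute earlier
```

Which of the two is appropriate depends on whether a missing bar should produce a missing return or be silently bridged by the nearest earlier row.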

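data[:-frequency]['close'].values is an array of length 252823, while data[1:]['close'] is a Series of length 253062.

To divide one by the other element by element, the two must have the same length. So, to compute this without pct_change, you would need to use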

a = data[frequency:]['close'] / data[:-frequency]['close'].values - 1

For example,

In [182]: N = 7; s = pd.Series(range(N), index=pd.date_range('2000-1-1', periods=N, freq='H')); s
Out[182]: 
2000-01-01 00:00:00    0
2000-01-01 01:00:00    1
2000-01-01 02:00:00    2
2000-01-01 03:00:00    3
2000-01-01 04:00:00    4
2000-01-01 05:00:00    5
2000-01-01 06:00:00    6
Freq: H, dtype: int64
Compare the result of pct_change:

In [183]: s.pct_change(freq='4H')
Out[183]: 
2000-01-01 00:00:00         NaN
2000-01-01 01:00:00         NaN
2000-01-01 02:00:00         NaN
2000-01-01 03:00:00         NaN
2000-01-01 04:00:00         inf
2000-01-01 05:00:00    4.000000
2000-01-01 06:00:00    2.000000
2000-01-01 07:00:00         NaN
2000-01-01 08:00:00         NaN
2000-01-01 09:00:00         NaN
2000-01-01 10:00:00         NaN
Freq: H, dtype: float64
with the result of the division:

In [184]: s[4:] / s[:-4].values - 1
Out[184]: 
2000-01-01 04:00:00         inf
2000-01-01 05:00:00    4.000000
2000-01-01 06:00:00    2.000000
Freq: H, dtype: float64

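Can you add sample data? In the first case your slices have different lengths; try a=data[1:1-frequency]['close']/data[:-frequency]['close'].values-1

@gionni The error disappears, but when I add it to the DataFrame the data is not assigned correctly. The first `frequency` rows should be empty but are filled, while the last `frequency` rows are empty. How can I fix this? @cmaher I added sample data.

Try using shift, or pct_change with periods=frequency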
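
The shift and pct_change suggestion from the comments can be sketched as follows. This is an illustrative example on a tiny hypothetical DataFrame (with `frequency = 3` standing in for 240, and column names `ret_shift` and `ret_pct` invented for the demonstration). Because both results keep the original index, assigning them back to the DataFrame leaves the first `frequency` rows empty rather than the last ones.

```python
import numpy as np
import pandas as pd

frequency = 3  # small stand-in for 240 so the result is easy to read
data = pd.DataFrame({"close": [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]})

# shift(frequency) moves each price down by `frequency` rows while keeping the index,
# so row i is divided by row i - frequency and the first rows have nothing to compare to.
data["ret_shift"] = data["close"] / data["close"].shift(frequency) - 1

# pct_change with periods=frequency computes the same quantity in one call.
data["ret_pct"] = data["close"].pct_change(periods=frequency, fill_method=None)

print(data)
```

Here the first three rows of both return columns are NaN and the remaining rows are 7.0, since each price is eight times the price three rows earlier.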