Python 3.x 如何使用熊猫数据帧和买卖价格计算成交量加权平均价格(VWAP)?
如果我的表如下所示,如何创建另一个名为vwap的列来计算vwapPython 3.x 如何使用熊猫数据帧和买卖价格计算成交量加权平均价格(VWAP)?,python-3.x,pandas,numpy,dataframe,quantitative-finance,Python 3.x,Pandas,Numpy,Dataframe,Quantitative Finance,如果我的表如下所示,如何创建另一个名为vwap的列来计算vwap time bid_size bid ask ask_size trade trade_size phase 0 2019-01-07 07:45:01.064515 495 152.52 152.54 19 NaN NaN OPEN 1 2019-01-07 07:45:01.110072 31 1
time bid_size bid ask ask_size trade trade_size phase
0 2019-01-07 07:45:01.064515 495 152.52 152.54 19 NaN NaN OPEN
1 2019-01-07 07:45:01.110072 31 152.53 152.54 19 NaN NaN OPEN
2 2019-01-07 07:45:01.116596 32 152.53 152.54 19 NaN NaN OPEN
3 2019-01-07 07:45:01.116860 32 152.53 152.54 21 NaN NaN OPEN
4 2019-01-07 07:45:01.116905 34 152.53 152.54 21 NaN NaN OPEN
5 2019-01-07 07:45:01.116982 34 152.53 152.54 31 NaN NaN OPEN
6 2019-01-07 07:45:01.147901 38 152.53 152.54 31 NaN NaN OPEN
7 2019-01-07 07:45:01.189971 38 152.53 152.54 31 ask 15.0 OPEN
8 2019-01-07 07:45:01.189971 38 152.53 152.54 16 NaN NaN OPEN
9 2019-01-07 07:45:01.190766 37 152.53 152.54 16 NaN NaN OPEN
10 2019-01-07 07:45:01.190856 37 152.53 152.54 15 NaN NaN OPEN
11 2019-01-07 07:45:01.190856 37 152.53 152.54 16 ask 1.0 OPEN
12 2019-01-07 07:45:01.193938 37 152.53 152.55 108 NaN NaN OPEN
13 2019-01-07 07:45:01.193938 37 152.53 152.54 15 ask 15.0 OPEN
14 2019-01-07 07:45:01.194326 2 152.54 152.55 108 NaN NaN OPEN
15 2019-01-07 07:45:01.194453 2 152.54 152.55 97 NaN NaN OPEN
16 2019-01-07 07:45:01.194479 6 152.54 152.55 97 NaN NaN OPEN
17 2019-01-07 07:45:01.194507 19 152.54 152.55 97 NaN NaN OPEN
18 2019-01-07 07:45:01.194532 19 152.54 152.55 77 NaN NaN OPEN
19 2019-01-07 07:45:01.194598 19 152.54 152.55 79 NaN NaN OPEN
抱歉,该表不清楚,但最右边的第二列是trade_size,左边是trade,它显示了交易的一侧(买入或卖出)。如果trade_size和trade均为NaN,则表示在该时间戳没有发生交易
如果df['trade]==“ask”,则交易价格将是“ask”列中的价格,如果df['trade]==“bid”,则交易价格将是“bid”列中的价格。既然有两种价格,我可以问一下如何计算vwap,df['vwap']
我的想法是使用np.cumsum()。谢谢大家! 这里有一种可能的方法 追加
VMAP
列,其中包含NaN
s
df['VMAP'] = np.nan
计算VMAP
(基于方程式)并根据ask
或bid
赋值
编辑
作为@edinho
,VMAP
与trade\u price
列相同。好的,在这里
df['trade_price'] = df.apply(lambda x: x['bid'] if x['trade']=='bid' else x['ask'], axis=1)
df['vwap'] = (df['trade_price'] * df['trade_size']).cumsum() / df['trade_size'].fillna(0).cumsum()
第一行:它将交易价格保存在一个新列中,以便以后更容易检索。
如果您愿意,您可以删除这一行并创建一个函数(可能更容易阅读)。但我更愿意看到中间结果。
问:为什么即使没有交易,它也有价值?
答:因为lambda的书写方式。
else
捕获ask
价格。但这不会有什么不同,因为下一步就是这样
第二行:在这里进行实际计算。
第一部分计算到那一刻为止的总交易量(如你所说,使用累计金额使生活更轻松)。
第二部分计算截至该时刻的总交易量(同样是累计金额)。
如果您愿意,您可以打断此行并创建更多中间列。
问:为什么
fillna(0)
?答:所以总体积不会得到
NaNs
,也不会得到除法错误
问:为什么在vwap
栏中有这么多nan
A:因为没有交易的线路。您可以用0s
填充它们,但最好保留“无交易”信息
注:你可能会得到一个错误的结果,因为它只在同一个方向上考虑数量和价格。但是,您可以尝试反转一些信号,以按照您预期的方式固定音量(例如:将ask
价格更改为负值)
此代码输出:
trade_price vwap
1 152.54 NaN
2 152.54 NaN
3 152.54 NaN
4 152.54 NaN
5 152.54 NaN
6 152.54 NaN
7 152.54 NaN
8 152.54 152.54
9 152.54 NaN
10 152.54 NaN
11 152.54 NaN
12 152.54 152.54
13 152.55 NaN
14 152.54 152.54
15 152.55 NaN
16 152.55 NaN
17 152.55 NaN
18 152.55 NaN
19 152.55 NaN
20 152.55 NaN
根据trade
列中的值,您可以使用从正确的列(bid
或ask
)中向您提供价格。请注意,当没有交易发生时,这会给出出价,但因为这会乘以NaN
交易规模,所以这无关紧要。我还向前填充了VWAP
volume = df['trade_size']
price = np.where(df['trade'].eq('ask'), df['ask'], df['bid'])
df = df.assign(VWAP=((volume * price).cumsum() / vol.cumsum()).ffill())
>>> df
time bid_size bid ask ask_size trade trade_size phase VWAP
0 2019-01-07 07:45:01.064515 495 152.52 152.54 19 NaN NaN OPEN NaN
1 2019-01-07 07:45:01.110072 31 152.53 152.54 19 NaN NaN OPEN NaN
2 2019-01-07 07:45:01.116596 32 152.53 152.54 19 NaN NaN OPEN NaN
3 2019-01-07 07:45:01.116860 32 152.53 152.54 21 NaN NaN OPEN NaN
4 2019-01-07 07:45:01.116905 34 152.53 152.54 21 NaN NaN OPEN NaN
5 2019-01-07 07:45:01.116982 34 152.53 152.54 31 NaN NaN OPEN NaN
6 2019-01-07 07:45:01.147901 38 152.53 152.54 31 NaN NaN OPEN NaN
7 2019-01-07 07:45:01.189971 38 152.53 152.54 31 ask 15.0 OPEN 152.54
8 2019-01-07 07:45:01.189971 38 152.53 152.54 16 NaN NaN OPEN 152.54
9 2019-01-07 07:45:01.190766 37 152.53 152.54 16 NaN NaN OPEN 152.54
10 2019-01-07 07:45:01.190856 37 152.53 152.54 15 NaN NaN OPEN 152.54
11 2019-01-07 07:45:01.190856 37 152.53 152.54 16 ask 1.0 OPEN 152.54
12 2019-01-07 07:45:01.193938 37 152.53 152.55 108 NaN NaN OPEN 152.54
13 2019-01-07 07:45:01.193938 37 152.53 152.54 15 ask 15.0 OPEN 152.54
14 2019-01-07 07:45:01.194326 2 152.54 152.55 108 NaN NaN OPEN 152.54
15 2019-01-07 07:45:01.194453 2 152.54 152.55 97 NaN NaN OPEN 152.54
16 2019-01-07 07:45:01.194479 6 152.54 152.55 97 NaN NaN OPEN 152.54
17 2019-01-07 07:45:01.194507 19 152.54 152.55 97 NaN NaN OPEN 152.54
18 2019-01-07 07:45:01.194532 19 152.54 152.55 77 NaN NaN OPEN 152.54
19 2019-01-07 07:45:01.194598 19 152.54 152.55 79 NaN NaN OPEN 152.54
VMAP的方程式是什么?你应该使用哪个价格?出价还是要价?如果df['trade']==“ask”,使用df['ask]。否则,如果df['trade]==“bid”,则使用df[“bid”]。如果df['trade']==NaN,则表示无交易“vol.cumsum()中的vol应该是“volume”
volume = df['trade_size']
price = np.where(df['trade'].eq('ask'), df['ask'], df['bid'])
df = df.assign(VWAP=((volume * price).cumsum() / vol.cumsum()).ffill())
>>> df
time bid_size bid ask ask_size trade trade_size phase VWAP
0 2019-01-07 07:45:01.064515 495 152.52 152.54 19 NaN NaN OPEN NaN
1 2019-01-07 07:45:01.110072 31 152.53 152.54 19 NaN NaN OPEN NaN
2 2019-01-07 07:45:01.116596 32 152.53 152.54 19 NaN NaN OPEN NaN
3 2019-01-07 07:45:01.116860 32 152.53 152.54 21 NaN NaN OPEN NaN
4 2019-01-07 07:45:01.116905 34 152.53 152.54 21 NaN NaN OPEN NaN
5 2019-01-07 07:45:01.116982 34 152.53 152.54 31 NaN NaN OPEN NaN
6 2019-01-07 07:45:01.147901 38 152.53 152.54 31 NaN NaN OPEN NaN
7 2019-01-07 07:45:01.189971 38 152.53 152.54 31 ask 15.0 OPEN 152.54
8 2019-01-07 07:45:01.189971 38 152.53 152.54 16 NaN NaN OPEN 152.54
9 2019-01-07 07:45:01.190766 37 152.53 152.54 16 NaN NaN OPEN 152.54
10 2019-01-07 07:45:01.190856 37 152.53 152.54 15 NaN NaN OPEN 152.54
11 2019-01-07 07:45:01.190856 37 152.53 152.54 16 ask 1.0 OPEN 152.54
12 2019-01-07 07:45:01.193938 37 152.53 152.55 108 NaN NaN OPEN 152.54
13 2019-01-07 07:45:01.193938 37 152.53 152.54 15 ask 15.0 OPEN 152.54
14 2019-01-07 07:45:01.194326 2 152.54 152.55 108 NaN NaN OPEN 152.54
15 2019-01-07 07:45:01.194453 2 152.54 152.55 97 NaN NaN OPEN 152.54
16 2019-01-07 07:45:01.194479 6 152.54 152.55 97 NaN NaN OPEN 152.54
17 2019-01-07 07:45:01.194507 19 152.54 152.55 97 NaN NaN OPEN 152.54
18 2019-01-07 07:45:01.194532 19 152.54 152.55 77 NaN NaN OPEN 152.54
19 2019-01-07 07:45:01.194598 19 152.54 152.55 79 NaN NaN OPEN 152.54