Python: vectorized simulator using time series financial data


python pandas dataframe


I have a DataFrame holding one day of tick data from multiple tickers: df = pd.DataFrame({'time': [090000, 090000, 090000, 090001 … 150000], 'ticker': [A, B, A, C … Z], 'price': [10, 20, 10, 30 … 40], 'volume': [-10, 10, 20, -100 … 50]})
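The one-liner above is only shorthand: integer literals such as 090000 are not valid Python, and a CSV reader would typically load them as 90000. A minimal sketch of one way to turn such a column into the DatetimeIndex that the code further down relies on, assuming the times are HHMMSS values (the rows are just the sample values shown above, with the elided ones left out):

```python
import pandas as pd

# Sample rows only; 'time' is stored as integers, so 090000 shows up as 90000.
raw = pd.DataFrame({
    "time": [90000, 90000, 90000, 90001, 150000],
    "ticker": ["A", "B", "A", "C", "Z"],
    "price": [10, 20, 10, 30, 40],
    "volume": [-10, 10, 20, -100, 50],
})

# Restore the leading zero, then parse HHMMSS.
# With no date in the format, pandas fills in 1900-01-01.
raw["time"] = pd.to_datetime(raw["time"].astype(str).str.zfill(6), format="%H%M%S")
df = raw.set_index("time")
print(df.index[0])
```

If the file also carries a trade date, combining it with the time before parsing would avoid the placeholder date.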

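I built a simulator with for-loops, but of course running a backtest on it takes a long time. I made a list of tickers and looped over each one, creating a sub-DataFrame for that ticker and testing the logic on it.

How can I speed this up with vectorization? Any tips on what to address would be greatly appreciated. Thanks, everyone.

This is my attempt at vectorizing the code (but it only cut the run time by about half):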

import pandas as pd

def logic(selected_df, time_ticker):
   # Timestamps of the two seconds before time_ticker, e.g. for 09:00:00 look at 08:59:58 and 08:59:59 but not 09:00:00
   window = pd.date_range(end=time_ticker, periods=2, freq='s').shift(-1)
   adj_df = selected_df.loc[selected_df.index.isin(window)]
   # buy if more than 5 transactions occurred during those 2 seconds and their summed volume is greater than 100 shares
   if len(adj_df) > 5 and adj_df.volume.sum() > 100:
      buy()  # order routine, not shown here

def simulator(df, ticker):
   selected_df = df.loc[df.ticker == ticker]  # from main df, make sub-dataframe for the selected ticker
   time_idx = selected_df.index.unique()  # distinct times, e.g. 09:00:00 ~ 15:00:00
   for time_ticker in time_idx:
      logic(selected_df, time_ticker)

def main():
   df = pd.read_csv(PATH_FILE_BOOK)  # load df; PATH_FILE_BOOK is the CSV path, defined elsewhere
   # assumption: 'time' holds HHMMSS values; logic() needs a DatetimeIndex
   df.index = pd.to_datetime(df['time'].astype(str).str.zfill(6), format='%H%M%S')
   for ticker in df['ticker'].unique():  # loop over every ticker
      simulator(df, ticker)

main()
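
One possible direction, sketched under assumptions rather than offered as a definitive answer: instead of calling the logic once per ticker and timestamp, aggregate the trades per (ticker, second) once, then shift those aggregates forward by one and two seconds so that every row sees the totals of its two preceding seconds. The function name `buy_signals`, its parameters, and the sample rows are hypothetical, and the sketch assumes `time` is a datetime column with one-second resolution.

```python
import pandas as pd

def buy_signals(df, min_trades=5, min_volume=100):
    """Boolean Series indexed by (ticker, time): True where the two preceding
    seconds held more than min_trades trades and more than min_volume volume."""
    # One row per (ticker, second): trade count and summed volume.
    per_sec = df.groupby(["ticker", "time"]).agg(
        n=("volume", "size"), vol=("volume", "sum")
    )
    # Accumulate the aggregates of seconds t-1 and t-2 for every (ticker, t).
    window = pd.DataFrame(0, index=per_sec.index, columns=per_sec.columns)
    for lag in (1, 2):
        lagged = per_sec.reset_index()
        lagged["time"] = lagged["time"] + pd.Timedelta(seconds=lag)
        lagged = lagged.set_index(["ticker", "time"])
        # Seconds with no trades contribute zero.
        window = window + lagged.reindex(per_sec.index, fill_value=0)
    return (window["n"] > min_trades) & (window["vol"] > min_volume)

# Hypothetical sample: ticker A trades 6 times (volume 120) in 09:00:00-09:00:01.
times = ["090000"] * 3 + ["090001"] * 3 + ["090002", "090003"] + ["090000", "090002"]
sample = pd.DataFrame({
    "time": pd.to_datetime(times, format="%H%M%S"),
    "ticker": ["A"] * 8 + ["B"] * 2,
    "price": [10] * 8 + [20] * 2,
    "volume": [20, 30, -10, 40, 25, 15, 5, 5, 10, 10],
})
signals = buy_signals(sample)
print(signals[signals])
```

As in the loop version, only timestamps at which the ticker actually traded are evaluated. The remaining Python loop runs over the two lags, not over rows, so its cost does not grow with the number of tickers or ticks. Acting on the signals (the `buy()` step) is left out, since it depends on state that the question does not show.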