Python performance: speeding up a loop (tags: python, hadoop, multiprocessing, python-multithreading)

The function below is supposed to run a kernel regression of stock prices on time for every 38-day window.

import numpy as np
import pandas as pd
from statsmodels.nonparametric.kernel_regression import KernelReg

def kernelregressions(assetprices, windowperiod):  # regresses prices on time
    # time index runs from 1 up to the length of the dataset
    timeindex = np.linspace(1, assetprices.shape[0], assetprices.shape[0])
    # empty frame to store the smoothed prices
    smoothprices = pd.DataFrame(index=timeindex, columns=assetprices.columns, dtype="float")
    for symbol in assetprices.columns:
        # regressions are done for each 38-day period
        for i in range(0, assetprices.shape[0] - windowperiod - 1, windowperiod + 1):
            smoothprice = KernelReg(assetprices[symbol].iloc[i:i + windowperiod],
                                    timeindex[i:i + windowperiod], var_type="c", bw="cv_ls").fit()[0]
            smoothprices[symbol].iloc[i:i + windowperiod] = smoothprice
    return smoothprices  # returns smooth prices

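The "assetprices" input is a data frame with one column of prices per stock, and in my case the window period is 38.

When the data frame has 100 stocks and a price series of 500 days, the function above runs forever.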

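Is there any Python module, package, etc. (or a better way to loop) that I could use to speed up this function? Would multiprocessing or multithreading be applicable here?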

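You can use multithreading instead of the loop over assetprices.columns, since the iterations do not share any variables. But if I understand correctly, this code only runs once every 38 days, so isn't that long enough?

It does not run once every 38 days. The kernel regression is not done once for the whole price series but once for every 38 observations of prices, i.e. from time 1 to t=38, then t=39 to 77, then t=78 to t=116, and so on up to t=500.

But you can still use multithreading for the loop, and if you use hadoop as the tag suggests, you can do it easily.

Right, thanks. But could you write out the multithreading code? I don't know how it works... I am still a beginner in Python.

You should ask that in another post. Only one question is allowed per post. There are also people who know Python better than I do who can help you. But for simple multithreading, why not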
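As a rough illustration of the per-column split suggested in the comments, here is a minimal sketch using the standard library's concurrent.futures. It is not code from the thread. The names window_starts, smooth_window, smooth_column and smooth_all are hypothetical, the input is assumed to be a plain dict of symbol to price list rather than a data frame, and smooth_window is only a stand-in (a window mean) so the sketch runs without statsmodels; the KernelReg call from the question would go in its place.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import repeat


def window_starts(n_rows, windowperiod):
    # Same start offsets as the inner loop in the question.
    return range(0, n_rows - windowperiod - 1, windowperiod + 1)


def smooth_window(prices, times):
    # Placeholder smoother (window mean), NOT a kernel regression.
    # In the real function this body would be something like:
    #   KernelReg(prices, times, var_type="c", bw="cv_ls").fit()[0]
    mean = sum(prices) / len(prices)
    return [mean] * len(prices)


def smooth_column(prices, windowperiod):
    # Work for one symbol: smooth each window independently.
    times = list(range(1, len(prices) + 1))
    smoothed = [None] * len(prices)  # rows outside any window stay None
    for i in window_starts(len(prices), windowperiod):
        smoothed[i:i + windowperiod] = smooth_window(
            prices[i:i + windowperiod], times[i:i + windowperiod])
    return smoothed


def smooth_all(columns, windowperiod, max_workers=4):
    # columns: dict mapping symbol -> list of prices (assumed layout).
    # Each symbol is handed to a worker; no state is shared between them.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(smooth_column, columns.values(), repeat(windowperiod))
        return dict(zip(columns.keys(), results))
```

Whether threads actually shorten the run depends on the work inside smooth_window: in standard CPython, pure-Python CPU-bound code is serialized by the global interpreter lock, so threads may give little gain. ProcessPoolExecutor offers the same map interface and can be swapped in, provided the worker functions live at module level and the call is made under an `if __name__ == "__main__":` guard on platforms that spawn new processes.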