Python: running functions in parallel

I want to run two functions in parallel. These functions are executed many times in a loop. Here is my code:

#get the html content of the first rental
previous_url_rental=BeautifulSoup(urllib.urlopen(rentals[0]))

#for each rental on the page
for rental_num in xrange(1, len(rentals)):
    #get the html content of the page
    url_rental=BeautifulSoup(urllib.urlopen(rentals[rental_num]))
    #get and save the rental data in the csv file
    writer.writerow(get_data_rental(previous_url_rental))
    previous_url_rental=url_rental

#save last rental
writer.writerow(get_data_rental(previous_url_rental))

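There are two main steps:

1/ Get the HTML content of the page:

url_rental=BeautifulSoup(urllib.urlopen(rentals[rental_num]))

2/ Retrieve and save the data from the HTML content of the previous page (not the current one), because the two steps depend on each other:

writer.writerow(get_data_rental(previous_url_rental))

I would like to run these two lines in parallel: the first process would fetch the HTML content of page n+1 while the second process retrieves and saves the data of page n. I have searched and found a post on this, but I don't understand how to use it.

Thanks for your time.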

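To run functions in parallel in Python (i.e. on multiple CPUs), you need to use the multiprocessing module.

However, I doubt it is worth the effort for just two instances.

If you can run more than two processes in parallel, use the Pool class from that module; there is an example in the documentation.

Each worker in the pool would retrieve and save the data from one page, then pick up the next job. This is not trivial, though, because the writer must be able to handle several writes at the same time. So you will probably also need a Queue to serialize the writes: each worker only retrieves a page, extracts the information and sends the result to the queue, and the writer processes the queue.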
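A minimal sketch of the worker-pool idea, written for Python 3. It makes several assumptions: fetch_page and get_data_rental are hypothetical stand-ins for the asker's download and extraction code, the example.com URLs are invented, and it uses multiprocessing.pool.ThreadPool (the thread-backed pool with the same interface as Pool) because fetching pages is mostly waiting on the network. Instead of an explicit Queue, the workers hand their rows back through imap and only the main thread writes, which serializes the writes in the same spirit.

```python
import csv
import io
from multiprocessing.pool import ThreadPool


def fetch_page(url):
    # Hypothetical stand-in for BeautifulSoup(urlopen(url)); no network access.
    return "<html>%s</html>" % url


def get_data_rental(page):
    # Hypothetical stand-in for the asker's extraction function.
    return [page, len(page)]


def fetch_and_extract(url):
    # Runs in a worker: download one page and build its CSV row.
    return get_data_rental(fetch_page(url))


def scrape(rentals, out_file, workers=4):
    writer = csv.writer(out_file)
    with ThreadPool(workers) as pool:
        # imap yields results in input order. Only this thread touches the
        # writer, so no two writes can happen at the same time.
        for row in pool.imap(fetch_and_extract, rentals):
            writer.writerow(row)


# Invented URLs, written to an in-memory buffer instead of a real CSV file.
rentals = ["http://example.com/rental/%d" % i for i in range(5)]
buffer = io.StringIO()
scrape(rentals, buffer)
rows = buffer.getvalue().splitlines()
```

Switching to multiprocessing.Pool should mostly be a matter of changing the import, provided the worker function is defined at module level and the script is guarded with if __name__ == "__main__":.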

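Maybe Python's standard threading module would be of interest to you? Using a queue, as the previous answer suggests, seems like a good idea to me.

This is how I use the threading library (without a queue); you can add one if you want: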

#!/usr/bin/python

import threading
from threading import Thread
import time

fetch_stop = threading.Event()
process_stop = threading.Event()

def fetch_rental(arg1, stop_event):
    while not stop_event.is_set():
        # fetch content from url and add to Queue
        time.sleep(0.1)  # placeholder body; replace with the real work

def process_rental(arg1, stop_event):
    while not stop_event.is_set():
        # get item(s) from Queue, process them, and write to CSV
        time.sleep(0.1)  # placeholder body; replace with the real work


try:
    Thread(target=fetch_rental,   name="Fetch rental",   args=(2, fetch_stop  )).start()
    Thread(target=process_rental, name="Process rental", args=(2, process_stop)).start()
    while True:
        time.sleep(10)  # wait here while the threads run
except KeyboardInterrupt:
    # Ctrl+C: ask both threads to leave their loops so the program can exit
    fetch_stop.set()
    process_stop.set()

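Now you can interact with the threads using Locks and Events (see the documentation).

Once page n has been downloaded, it can be added to a list or a queue. The second thread can then be notified that there is a new page to process.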
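The hand-off described above could be sketched as follows, assuming Python 3. One thread fetches pages and puts them on a queue.Queue, while a second thread takes them off and writes the CSV rows, so page n+1 is being fetched while page n is being processed. fetch_page, get_data_rental, the DONE sentinel and the example.com URLs are all hypothetical names introduced for illustration.

```python
import csv
import io
import queue
import threading

DONE = object()  # sentinel: tells the processor that no more pages will come


def fetch_page(url):
    # Hypothetical stand-in for BeautifulSoup(urlopen(url)); no network access.
    return "<html>%s</html>" % url


def get_data_rental(page):
    # Hypothetical stand-in for the asker's extraction function.
    return [page, len(page)]


def fetcher(rentals, pages):
    # Producer: fetch each page and hand it over through the queue.
    for url in rentals:
        pages.put(fetch_page(url))  # blocks while the queue is full
    pages.put(DONE)


def processor(pages, writer):
    # Consumer: the only thread that writes, so rows never interleave.
    while True:
        page = pages.get()  # blocks until a page is available
        if page is DONE:
            break
        writer.writerow(get_data_rental(page))


def run_pipeline(rentals, out_file):
    pages = queue.Queue(maxsize=2)  # keeps the fetcher at most two pages ahead
    writer = csv.writer(out_file)
    threads = [
        threading.Thread(target=fetcher, args=(rentals, pages)),
        threading.Thread(target=processor, args=(pages, writer)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


# Invented URLs, written to an in-memory buffer instead of a real CSV file.
rentals = ["http://example.com/rental/%d" % i for i in range(3)]
buffer = io.StringIO()
run_pipeline(rentals, buffer)
lines = buffer.getvalue().splitlines()
```

The blocking get() call does the notifying here: the processor sleeps until the fetcher puts a page on the queue, so no separate Event is needed for the hand-off.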

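What you are looking at is multiprocessing, which is a lengthy process. Also, why do you want to do both at the same time? You can't write a row if the data has not been retrieved yet.

That is why I want to fetch page n+1 while writing the data of page n. Is that possible?

How big is the data we are talking about? How many pages?

There are 50 variables (int or string) to retrieve for each rental.

This looks complicated and probably won't save much time.

I would like to try it anyway and see for myself. But I don't understand how to adapt the examples in the link you just shared. Could you apply them to my specific task, or show me other examples? Thanks a lot.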