Python mechanize + asynchronous browser calls

I am looking for a way to issue a large number of asynchronous web requests without waiting for the responses.

This is my current code:

import mechanize
from mechanize._opener import urlopen
from mechanize._form import ParseResponse
from multiprocessing import Pool

brow = mechanize.Browser()
brow.open('https://website.com')

#Login
brow.select_form(nr = 0)

brow.form['username'] = 'user'
brow.form['password'] = 'password'
brow.submit()

while True:
    # async open the browser until some state is fulfilled
    brow.open('https://website.com/needthiswebsite')

The problem with the code above is that if I try to open two browsers, bro2 has to wait for bro1 to finish opening. (It is blocking.)

Attempted solution:

#PSEUDO-CODE

#GLOBAL VARIABLE STATE
boolean state = true

while(state):
    #async open the browser until some state is fulfilled
    #I spam this function until I get a positive answer from one of the calls
    pool = Pool(processes = 1)
    result = pool.apply_async(openWebsite,[brow1],callback = updateState)

def openWebsite(browser):
   result = browser.open('https://website.com/needthiswebsite')
   if result.something() == WHATIWANT:
        return true
   return false

def updateState(state):
    state = true

I tried to implement a solution to my problem similar to the one in the answer to a question on Stack Overflow.

The problem is that I get an error when I try to use pool.apply_async(brow.open()).

Error message:

PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

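I have tried many ways to fix the PicklingError, but nothing seems to work.

  • Is it possible to do this with mechanize?
  • Should I switch to another library, such as urllib2?

Any help is much appreciated :)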

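The mechanize.Browser object is not picklable, so it cannot be passed to pool.apply_async (or to any other method that needs to send the object to a child process), as the pickle.dumps traceback below shows.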
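
One way to find out in advance whether an object can be handed to a worker process is simply to try pickling it. The sketch below is illustrative only: it does not import mechanize, `FakeBrowser` is a hypothetical stand-in that holds a lock (locks cannot be pickled), and `is_picklable` is a helper name made up for this example.

```python
import pickle
import threading


class FakeBrowser(object):
    """Hypothetical stand-in for a browser object with unpicklable state."""

    def __init__(self):
        # Lock objects cannot be serialized by pickle
        self._lock = threading.Lock()


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


browser_ok = is_picklable(FakeBrowser())
url_ok = is_picklable('https://website.com/needthiswebsite')
```

Plain data such as URL strings passes this check, which is why sending the URL to the worker (rather than the browser) avoids the error.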


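Ideally, you would log in with the Browser object in the parent process and then make parallel requests across several processes. Making the object picklable would probably take a lot of effort, though, if it is possible at all. Even if you managed to remove the instancemethod object that causes the current error, there may well be more unpicklable objects inside Browser beyond it.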

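Thanks. This tells me that mechanize may not be the solution to my problem: I need to log in before I start making these asynchronous requests to the website, so logging in on every call would not work.

@geb12 You could try batching the requests together, e.g. passing 1000/2000 links at a time to each function call.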
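
The batching idea from the comment could look roughly like this. It is a sketch under assumptions, not code from the thread: `chunked` and `check_batch` are hypothetical names, and `open_url` stands in for whatever call opens a link and reports success, so that the login cost is paid once per batch rather than once per link.

```python
def chunked(items, size):
    """Yield consecutive slices of items, each at most size long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def check_batch(urls, open_url):
    """Hypothetical worker: return the URLs in this batch that succeed.

    In a real worker the browser would be created and logged in once
    here, before the loop; open_url is an assumed stand-in for that.
    """
    return [url for url in urls if open_url(url)]


links = ['https://website.com/page%d' % i for i in range(5)]
batches = list(chunked(links, 2))
```

Each element of `batches` is a plain list of strings, so it can be sent to a worker process without any pickling problems.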

Pickling a Browser object directly shows the failure:

>>> import mechanize
>>> import pickle
>>> b = mechanize.Browser()
>>> pickle.dumps(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  File "/usr/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 725, in save_inst
    save(stuff)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 663, in _batch_setitems
    save(v)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 600, in save_list
    self._batch_appends(iter(obj))
  File "/usr/lib/python2.7/pickle.py", line 615, in _batch_appends
    save(x)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 725, in save_inst
    save(stuff)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 663, in _batch_setitems
    save(v)
  File "/usr/lib/python2.7/pickle.py", line 306, in save
    rv = reduce(self.proto)
  File "/usr/lib/python2.7/copy_reg.py", line 70, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle instancemethod objects
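
Creating the Browser and logging in inside the worker function means that only the URL string has to be sent to the child process:

import mechanize

def openWebsite(url):
    # Build the browser inside the worker so nothing unpicklable
    # has to cross the process boundary
    brow = mechanize.Browser()
    brow.open('https://website.com')

    # Login
    brow.select_form(nr=0)

    brow.form['username'] = 'user'
    brow.form['password'] = 'password'
    brow.submit()

    result = brow.open(url)
    # something() and WHATIWANT are the question's placeholders
    # for the real success check
    if result.something() == WHATIWANT:
        return True
    return False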
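
A worker like the one above can be driven without a global flag or a callback by iterating over the results as they complete. The following is a minimal sketch, not part of the original answer: `open_website` is a hypothetical stand-in that does no network access, `check` and `first_success` are made-up names, and `ThreadPool` is used because threads need no pickling at all. `multiprocessing.Pool` exposes the same `imap_unordered` method, so it could be swapped in as long as the worker is a module-level function that takes only picklable arguments.

```python
from multiprocessing.pool import ThreadPool


def open_website(url):
    # Hypothetical stand-in for the real worker, which would log in
    # and inspect the response; here success depends on the URL only
    return url.endswith('/needthiswebsite')


def check(url):
    """Pair each URL with its result so the caller knows which one succeeded."""
    return url, open_website(url)


def first_success(urls, workers=4):
    """Return the first URL whose worker reports True, or None."""
    pool = ThreadPool(processes=workers)
    try:
        # Results arrive in completion order, not submission order
        for url, ok in pool.imap_unordered(check, urls):
            if ok:
                return url
        return None
    finally:
        # Stop any remaining work once an answer is known
        pool.terminate()
        pool.join()


found = first_success([
    'https://website.com/other',
    'https://website.com/needthiswebsite',
])
```

Returning the result from the loop replaces the `updateState` callback in the question, which could not have worked as written because assigning to `state` inside a function only rebinds a local name.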