Logging in to Quora using Python mechanize
I am trying to log in to Quora using the mechanize module. Here is the code I am using:
import mechanize
import cookielib
br = mechanize.Browser() # create a browser object
br.set_handle_equiv(True)
br.set_handle_redirect(True)
br.set_handle_referer(True)
br.set_handle_robots(False)
br.addheaders = [('User-Agent', 'Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)')]  # mechanize reads request headers from addheaders
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)
res = br.open('http://www.quora.com')
br.select_form(nr = 0)
br.form['email'] = 'uuuu'
br.form['password'] = 'pppp'
res = br.submit()
print res.read()
I get the following error:
Traceback (most recent call last):
File "mech.py", line 29, in <module>
res = br.submit()
File "/usr/local/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 541, in submit
return self.open(self.click(*args, **kwds))
File "/usr/local/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 203, in open
return self._mech_open(url, data, timeout=timeout)
File "/usr/local/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 255, in _mech_open
raise response
mechanize._response.httperror_seek_wrapper: HTTP Error 500: Internal Server Error
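One answer to the same question says this happens because this particular form is submitted through a JavaScript XHR POST request, with request parameters like the following: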
json:{"args":[],"kwargs":{"email":"<email>","password":"<password>","passwordless":1}}
formkey:62c4f0d88246bfd81b27cf0dca410d75
window_id:dep4-4597603286175583039
_lm_transaction_id:0.4317954108119011
_lm_window_id:dep4-4597603286175583039
__vcon_json:["hmac","t1cKg1QhQsYPCA"]
__vcon_method:do_login
js_init:{}
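To make the shape of that XHR body concrete, here is a minimal sketch that only assembles the form-encoded payload from the fields listed above. The function name `build_login_payload` and its parameters are hypothetical; the values of `formkey`, `window_id`, `_lm_transaction_id` and the `hmac` entry are page-specific and would have to be read from the page actually loaded. The endpoint the request is sent to is not given here, so the sketch does not send anything.

```python
import json

try:  # Python 3
    from urllib.parse import urlencode, parse_qs
except ImportError:  # Python 2.7, as used in the question
    from urllib import urlencode
    from urlparse import parse_qs


def build_login_payload(email, password, formkey, window_id,
                        transaction_id, vcon_hmac):
    """Return a form-encoded body mirroring the do_login parameters above.

    All arguments except email and password are page-specific values
    (assumed to be extracted from the loaded page, never hard-coded).
    """
    fields = [
        # The credentials travel as a JSON string inside the "json" field.
        ("json", json.dumps({
            "args": [],
            "kwargs": {"email": email, "password": password, "passwordless": 1},
        })),
        ("formkey", formkey),
        ("window_id", window_id),
        ("_lm_transaction_id", transaction_id),
        ("_lm_window_id", window_id),
        ("__vcon_json", json.dumps(["hmac", vcon_hmac])),
        ("__vcon_method", "do_login"),
        ("js_init", "{}"),
    ]
    return urlencode(fields)


if __name__ == "__main__":
    # Placeholder values copied from the parameter dump above.
    body = build_login_payload(
        "user@example.com", "pppp",
        "62c4f0d88246bfd81b27cf0dca410d75",
        "dep4-4597603286175583039",
        "0.4317954108119011",
        "t1cKg1QhQsYPCA",
    )
    print(parse_qs(body)["__vcon_method"])
```

Because mechanize only submits the static HTML form, it never produces these extra fields, which is consistent with the server rejecting the plain `br.submit()` call.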
Is there a module that solves this problem?
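P.S. I have tried selenium, but it runs slowly.
Have you tried using a headless browser with selenium (for example, PhantomJS)?
No, I haven't tried that, but would it be significantly faster? Is there no other way to do this?
You should definitely try PhantomJS. It is as simple as changing webdriver.Firefox to webdriver.PhantomJS (you will also need to download the PhantomJS driver, of course).
mechanize also works for me with similar code, except that I only use: br.addheaders = [('User-agent', 'Firefox')]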