Python: importing data from Google Scholar (BibTeX) using cookies


The code is as follows:

import cookielib
import urllib2 
from bs4 import  BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0'}
url='http://scholar.google.co.in/scholar_setprefs?sciifh=1&scisig=AAGBfm0AAAAAU9jcmEN2h2yuBuZqQK8Es5dQG3ksjutw&inststart=0&num=10&scis=yes&scisf=4&hl=en&lang=all&instq=&save='

filename = "cookies.txt"
request = urllib2.Request(url, None, headers)
cookies = cookielib.MozillaCookieJar(filename, None, None)
cookies.load()
cookie_handler= urllib2.HTTPCookieProcessor(cookies)
redirect_handler= urllib2.HTTPRedirectHandler()
opener = urllib2.build_opener(redirect_handler,cookie_handler)
response = opener.open(request)
print response.read()

Error output:

C:\Python27\lib\_MozillaCookieJar.py:109: UserWarning: cookielib bug!
Traceback (most recent call last):
  File "C:\Python27\lib\_MozillaCookieJar.py", line 71, in _really_load
    line.split("\t")
ValueError: need more than 1 value to unpack

  _warn_unhandled_exception()
Traceback (most recent call last):
  File "C:\Users\new user\Desktop\pythonprac\working\googlescholar.py", line 10, in <module>
    cookies.load()
  File "C:\Python27\lib\cookielib.py", line 1763, in load
    self._really_load(f, filename, ignore_discard, ignore_expires)
  File "C:\Python27\lib\_MozillaCookieJar.py", line 111, in _really_load
    (filename, line))
cookielib.LoadError: invalid Netscape format cookies file 'cookies.txt': '.scholar.google.com     TRUE    /       FALSE   2147483647      GSP     ID=353e8f974d766dcd:CF=2'
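
I found this code online. I am trying to download BibTeX data from Google Scholar into a txt file. To do that, I need to save the user settings in a cookie, which I store in cookies.txt. However, loading that file produces the error above.
Please advise how to handle this cookie error and how to use cookies to save user-defined preferences for scholar.google.com.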

May I suggest using a different set of libraries?

from bs4 import BeautifulSoup
import requests

url = ('http://scholar.google.co.in/scholar_setprefs?sciifh=1&'
       'scisig=AAGBfm0AAAAAU9jcmEN2h2yuBuZqQK8Es5dQG3ksjutw'
       '&inststart=0&num=10&scis=yes&scisf=4&hl=en&lang=all&instq=&save=')

# First request: the response carries the cookies set by the server.
page = requests.get(url)
cookies = page.cookies

# Second request: send those cookies back with the same URL.
page = requests.get(url, cookies=cookies)

print page.content
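
With cookies = page.cookies I retrieve the cookies and store them in the cookies variable. I then request the same page again, passing that variable. If you have a cookies.txt file, you can load it as a dict.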
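
Loading a cookies.txt file as a dict could look roughly like the sketch below. It is written for Python 3, where the cookielib module is named http.cookiejar, and it assumes the file is already in valid Netscape format. The function name cookies_txt_to_dict is hypothetical and only used for illustration.

```python
import http.cookiejar


def cookies_txt_to_dict(path):
    """Read a Netscape-format cookies.txt file into a {name: value} dict.

    Hypothetical helper for illustration; assumes the file is valid.
    """
    jar = http.cookiejar.MozillaCookieJar(path)
    # Keep session cookies and expired cookies as well, so that nothing
    # in the file is silently dropped while experimenting.
    jar.load(ignore_discard=True, ignore_expires=True)
    return {cookie.name: cookie.value for cookie in jar}
```

The resulting dict can then be passed as the cookies argument, for example requests.get(url, cookies=cookies_txt_to_dict("cookies.txt")). A dict drops the domain and path information, so passing the loaded cookie jar itself may be preferable when cookies for several sites live in the same file.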


If you want to do this with the standard library modules urllib2 and cookielib, make sure the first line of your cookies.txt file is

# Netscape HTTP Cookie File

Otherwise cookielib will not load it.
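
The traceback in the question fails at line.split("\t"), and the error message prints the offending line with spaces between the fields. This suggests that, besides the header, the loader also needs the seven fields of each cookie line to be separated by tabs. The sketch below, written for Python 3 (http.cookiejar instead of cookielib), shows one way to repair such a file. The helpers normalize_cookies_txt and loads_cleanly are hypothetical names, and the repair assumes that no field contains whitespace.

```python
import http.cookiejar
import os
import tempfile
import warnings

HEADER = "# Netscape HTTP Cookie File"


def normalize_cookies_txt(text):
    """Return text with the magic header first and tab-separated fields.

    Assumption: no cookie field contains whitespace, so a line that splits
    into exactly seven whitespace-separated parts is a cookie line.
    """
    fixed = [HEADER]
    for line in text.splitlines():
        if line.strip() == HEADER:
            continue  # the header is re-added exactly once, at the top
        fields = line.split()
        if len(fields) == 7 and not line.lstrip().startswith("#"):
            fixed.append("\t".join(fields))
        else:
            fixed.append(line)  # comments, blanks, anything unrecognised
    return "\n".join(fixed) + "\n"


def loads_cleanly(text):
    """Return True if MozillaCookieJar accepts text as a cookies file."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write(text)
    try:
        with warnings.catch_warnings():
            # The loader emits a UserWarning before raising LoadError.
            warnings.simplefilter("ignore")
            http.cookiejar.MozillaCookieJar(tmp.name).load()
        return True
    except http.cookiejar.LoadError:
        return False
    finally:
        os.unlink(tmp.name)
```

Calling loads_cleanly on the space-separated line from the error message returns False with or without the header, while the output of normalize_cookies_txt for that same line loads without error.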