Python scrapy: Error downloading AND TypeError: to_bytes must receive a unicode, str or bytes object, got NoneType

Tags: python, proxy, scrapy

This is my first time using Scrapy with a proxy. When I test the code an error occurs, but I cannot find where the code is wrong.

PyCharm shows me this error: Error downloading <GET https://movie.douban.com/subject/25754848/reviews> and TypeError: to_bytes must receive a unicode, str or bytes object, got NoneType

Here is the middleware code:

import requests
import lxml
from bs4 import BeautifulSoup
from scrapy import signals

class ProxyMiddleware(object):

    def process_request(self, request, spider):
        # Fetch a proxy address from the local proxy pool for each request
        url = 'http://127.0.0.1:5000/get'
        r = requests.get(url)
        request.meta['proxy'] = BeautifulSoup(r.text, "lxml").get_text()
Code note: I have a proxy pool. While it is running I can fetch a different proxy IP and port, such as "113.122.136.41:808", from that address.
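
Before digging into the traceback, it is worth checking what the pool endpoint actually returns. A minimal sanity check (assuming the pool runs at the same http://127.0.0.1:5000/get address used in the middleware):

import requests

# Print exactly what the proxy pool hands back, quotes and all.
r = requests.get('http://127.0.0.1:5000/get')
print(repr(r.text))  # e.g. '113.122.136.41:808' -- note there is no 'http://' scheme

If the pool ever returns an empty body, get_text() yields an empty string, and an empty or scheme-less proxy value is exactly what trips up Scrapy's URL parsing in the traceback below.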

Here is the error and traceback:

2017-04-16 10:20:06 [scrapy.core.scraper] ERROR: Error downloading <GET https://movie.douban.com/subject/25754848/reviews>
Traceback (most recent call last):
  File "C:\Users\empra\AppData\Local\Programs\Python\Python36\lib\site-packages\twisted\internet\defer.py", line 1299, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "C:\Users\empra\AppData\Local\Programs\Python\Python36\lib\site-packages\twisted\python\failure.py", line 393, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "C:\Users\empra\AppData\Local\Programs\Python\Python36\lib\site-packages\scrapy\core\downloader\middleware.py", line 43, in process_request
    defer.returnValue((yield download_func(request=request,spider=spider)))
  File "C:\Users\empra\AppData\Local\Programs\Python\Python36\lib\site-packages\scrapy\utils\defer.py", line 45, in mustbe_deferred
    result = f(*args, **kw)
  File "C:\Users\empra\AppData\Local\Programs\Python\Python36\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 65, in download_request
    return handler.download_request(request, spider)
  File "C:\Users\empra\AppData\Local\Programs\Python\Python36\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 61, in download_request
    return agent.download_request(request)
  File "C:\Users\empra\AppData\Local\Programs\Python\Python36\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 260, in download_request
    agent = self._get_agent(request, timeout)
  File "C:\Users\empra\AppData\Local\Programs\Python\Python36\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 240, in _get_agent
    _, _, proxyHost, proxyPort, proxyParams = _parse(proxy)
  File "C:\Users\empra\AppData\Local\Programs\Python\Python36\lib\site-packages\scrapy\core\downloader\webclient.py", line 37, in _parse
    return _parsed_url_args(parsed)
  File "C:\Users\empra\AppData\Local\Programs\Python\Python36\lib\site-packages\scrapy\core\downloader\webclient.py", line 20, in _parsed_url_args
    host = b(parsed.hostname)
  File "C:\Users\empra\AppData\Local\Programs\Python\Python36\lib\site-packages\scrapy\core\downloader\webclient.py", line 17, in <lambda>
    b = lambda s: to_bytes(s, encoding='ascii')
  File "C:\Users\empra\AppData\Local\Programs\Python\Python36\lib\site-packages\scrapy\utils\python.py", line 117, in to_bytes
    'object, got %s' % type(text).__name__)
TypeError: to_bytes must receive a unicode, str or bytes object, got NoneType
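
The last few frames show where things go wrong: Scrapy runs request.meta['proxy'] through urlparse in webclient._parse, and for a bare "ip:port" string urlparse finds no netloc, so parsed.hostname is None, which to_bytes then rejects. A quick demonstration of that parsing behavior:

from urllib.parse import urlparse

# Without a scheme, everything lands in `path` and hostname is None.
print(urlparse('113.122.136.41:808').hostname)         # None
print(urlparse('http://113.122.136.41:808').hostname)  # 113.122.136.41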

I can tell you how to convert the byte stream from the URL into unicode:

import requests
import lxml
from bs4 import BeautifulSoup
from scrapy import signals

class ProxyMiddleware(object):

    def process_request(self, request, spider):
        url = 'http://127.0.0.1:5000/get'
        # Read the raw response bytes and decode them to unicode explicitly
        r = requests.get(url)
        data = r.content.decode("utf-8")
        request.meta['proxy'] = BeautifulSoup(data, "lxml").get_text()
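
Decoding alone will not clear the TypeError, though: as the traceback shows, parsed.hostname is None because the proxy string has no scheme. Here is a sketch of the middleware with two defensive additions of my own (the empty-body check and the 'http://' prefix are assumptions, not part of the original answer):

import requests
from bs4 import BeautifulSoup

class ProxyMiddleware(object):

    def process_request(self, request, spider):
        url = 'http://127.0.0.1:5000/get'
        r = requests.get(url)
        proxy = BeautifulSoup(r.text, "lxml").get_text().strip()
        if not proxy:
            # The pool returned nothing; leave the request without a proxy
            # rather than handing Scrapy an unparsable value.
            return
        if not proxy.startswith('http'):
            # A bare 'ip:port' string parses with hostname=None, which is
            # what raises the to_bytes TypeError; add a scheme first.
            proxy = 'http://' + proxy
        request.meta['proxy'] = proxy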
