Python 3.x AttributeError: Response content isn't text. What is the problem?
I am using Python 3.8.5 and Scrapy 2.4.0, together with scrapy-proxy-pool and scrapy-user-agents, and I get "AttributeError: Response content isn't text". I am running this code in a Python 3 venv. Could you help me understand and solve this problem? Here is my code:
import scrapy
import json


class BasisMembersSpider(scrapy.Spider):
    name = 'basis'
    allowed_domains = ['www.basis.org.bd']

    def start_requests(self):
        start_url = 'https://basis.org.bd/get-member-list?page=1&team='
        yield scrapy.Request(url=start_url, callback=self.get_membership_no)

    def get_membership_no(self, response):
        data_array = json.loads(response.body)['data']
        next_page = json.loads(response.body)['links']['next']
        for data in data_array:
            next_url = 'https://basis.org.bd/get-company-profile/{0}'.format(data['membership_no'])
            yield scrapy.Request(url=next_url, callback=self.parse)
        if next_page:
            yield scrapy.Request(url=next_page, callback=self.get_membership_no)

    def parse(self, response):
        print("Printing informations....................................................")
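As a side note, the JSON handling in get_membership_no can be pulled out into a plain function that decodes the body once instead of twice. This is only a minimal sketch: extract_links and the sample payload are hypothetical, shaped after the keys the spider reads ('data', 'membership_no', 'links', 'next'), not after the real API response. It does not address the AttributeError itself.

```python
import json
from typing import List, Optional, Tuple

# URL template taken from the spider above.
PROFILE_URL = 'https://basis.org.bd/get-company-profile/{0}'


def extract_links(body: bytes) -> Tuple[List[str], Optional[str]]:
    """Return (profile URLs, next page URL or None) for one JSON page.

    Hypothetical helper: the name and signature are assumptions.
    """
    payload = json.loads(body)  # decode once, reuse for both lookups
    profile_urls = [
        PROFILE_URL.format(item['membership_no'])
        for item in payload.get('data', [])
    ]
    # .get() avoids a KeyError when 'links' or 'next' is absent.
    next_page = (payload.get('links') or {}).get('next')
    return profile_urls, next_page


# Hypothetical payload, NOT a real response from the site.
sample = json.dumps({
    'data': [{'membership_no': 'A001'}, {'membership_no': 'A002'}],
    'links': {'next': None},
}).encode('utf-8')

urls, next_page = extract_links(sample)
print(urls)
print(next_page)
```

Inside the spider, the callback could then call extract_links(response.body) and yield one scrapy.Request per returned URL, plus one more for the next page when it is not None.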
Here is my settings.py file:
BOT_NAME = 'web_scraping'
SPIDER_MODULES = ['web_scraping.spiders']
NEWSPIDER_MODULE = 'web_scraping.spiders'
AUTOTHROTTLE_ENABLED = True
# Crawl responsibly by identifying yourself (and your website) on the user-agent
#USER_AGENT = 'web_scraping (+http://www.yourdomain.com)'
# Obey robots.txt rules
ROBOTSTXT_OBEY = False
PROXY_POOL_ENABLED = True
DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    'scrapy_user_agents.middlewares.RandomUserAgentMiddleware': 800,
    'scrapy_proxy_pool.middlewares.ProxyPoolMiddleware': 610,
    'scrapy_proxy_pool.middlewares.BanDetectionMiddleware': 620,
}
And this is the error message from the console output:
Thank you.