Scrapy meta / cb_kwargs not passed correctly between multiple methods (Python)
Tags: python, web-scraping, scrapy

I am scraping a fitness website. The spider has separate methods for the home page, the category pages, and the product details, and I am trying to pass the data collected at each level down the chain in a dictionary via meta / cb_kwargs.

Code:

Problem: I am monitoring how many times parse_by_category and get_product_info are called, using the counter variables category_counter and product_counter respectively. get_product_info is requested for every product (132 products in total), so self.product_counter should be 132, but the callback only fires 3 times.
I also hooked Scrapy signals to check the counters; the output is:
SPIDER CLOSED
Category Counter length
132
product counter length
3
self.category_counter works as expected (132 calls), but self.product_counter reaches only 3.
Execution log:
SPIDER CLOSED
Category Counter length
132
product counter length
3
2020-10-24 05:49:28 [fit] INFO: Spider closed: fit
2020-10-24 05:49:28 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 53795,
'downloader/request_count': 70,
'downloader/request_method_count/GET': 70,
'downloader/response_bytes': 1873858,
'downloader/response_count': 70,
'downloader/response_status_count/200': 70,
'dupefilter/filtered': 129,
'elapsed_time_seconds': 85.653364,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2020, 10, 24, 5, 49, 28, 413804),
'item_scraped_count': 3,
'log_count/DEBUG': 214,
'log_count/INFO': 13,
'log_count/WARNING': 1,
'memusage/max': 140091392,
'memusage/startup': 96174080,
'request_depth_max': 2,
'response_received_count': 70,
'robotstxt/request_count': 1,
'robotstxt/response_count': 1,
'robotstxt/response_status_count/200': 1,
'scheduler/dequeued': 69,
'scheduler/dequeued/memory': 69,
'scheduler/enqueued': 69,
'scheduler/enqueued/memory': 69,
'start_time': datetime.datetime(2020, 10, 24, 5, 48, 2, 760440)}
2020-10-24 05:49:28 [scrapy.core.engine] INFO: Spider closed (finished)
Not sure what I am missing, please help.

Comments:
- Can you share the execution log? I assume only 3 items were yielded as well?
- Updated the question with the log, please check. @renatodvc do you need anything else?
- You might want to try