Why doesn't my Python Scrapy spider class execute?
I am trying to use Scrapy as a function on IBM Cloud Functions. My __main__.py looks like this:
import scrapy
from scrapy.crawler import CrawlerProcess


class AutoscoutListSpider(scrapy.Spider):
    name = "vehicles list"

    def __init__(self, params, *args, **kwargs):
        super(AutoscoutListSpider, self).__init__(*args, **kwargs)
        make = params.get("make", None)
        model = params.get("model", None)
        mileage = params.get("mileage", None)
        init_url = "https://www.autoscout24.be/nl/resultaten?sort=standard&desc=0&ustate=N%2CU&size=20&page=1&cy=B&mmvmd0={0}&mmvmk0={1}&kmto={2}&atype=C&".format(
            model, make, mileage)
        self.start_urls = [init_url]

    def parse(self, response):
        # Get total result count on list load
        init_total_results = int(response.css('.cl-filters-summary-counter::text').extract_first().replace('.', ''))
        if init_total_results > 400:
            yield {"message": "There are MORE then 400 results"}
        else:
            yield {"message": "There are LESS then 400 results"}


def main(params):
    process = CrawlerProcess()
    try:
        process.crawl(AutoscoutListSpider, params)
        process.start()
        return {"Success ": "The crawler (make: {0}, model: {1}, mileage: {2}) is successfully executed.".format(
            params['make'], params['model'], params['mileage'])}
    except Exception as e:
        return {"Error ": e, "params ": params}
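As an aside, the counter parsing in parse strips the "." thousands separator the site uses (e.g. "1.234" means 1234 results) before comparing against the 400 threshold. A standalone sketch of just that step (classify_result_count is a hypothetical helper name, not part of the action):

```python
def classify_result_count(counter_text):
    # The site renders counts with "." as a thousands separator, e.g. "1.234".
    total = int(counter_text.replace('.', ''))
    if total > 400:
        return {"message": "There are MORE then 400 results"}
    return {"message": "There are LESS then 400 results"}

print(classify_result_count("1.234"))  # 1234 results -> MORE
print(classify_result_count("25"))     # 25 results -> LESS
```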
The whole process of adding this function was:
zip -r ascrawler.zip __main__.py common.py
// So I created a zip file to upload it. (There is also a common.py file; I removed it here for simplicity.)
ibmcloud wsk action create ascrawler --kind python:3 ascrawler.zip
// Create the function and add it to the cloud
ibmcloud wsk action invoke --blocking --result ascrawler --param make 9 --param model 1624 --param mileage 2500
// Invoke the function with parameters
So I get no errors, but it never enters the AutoscoutListSpider class at all. Why? It should also return {"message": "There are MORE then 400 results"}. Any ideas?
When I run it from the Python console, like this:
main({"make":"9", "model":"1624", "mileage":"2500"})
it returns the correct result:
{"message": "There are MORE then 400 results"}
{"Success ": "The crawler (make: 9, model: 1624, mileage: 2500) is successfully executed."}
The {"message": "There are MORE then 400 results"} output is available in the activation logs for the invocation, rather than in the action result. After running the ibmcloud wsk action invoke command, retrieve the activation identifier for the last invocation:
$ ibmcloud wsk activation list
activations
d13bd19b196d420dbbd19b196dc20d59 ascrawler
...
This activation identifier can then be used to retrieve all the console logs written to stdout and stderr during the invocation:
$ ibmcloud wsk activation logs d13bd19b196d420dbbd19b196dc20d59 | grep LESS
2018-06-29T08:27:11.094873294Z stderr: {'message': 'There are LESS then 400 results'}
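This stdlib-only sketch (an illustration, not the real OpenWhisk runtime) shows the split: only the dictionary returned by main becomes the action result, while anything printed during the call ends up in the log stream that ibmcloud wsk activation logs retrieves:

```python
import io
from contextlib import redirect_stdout

def main(params):
    # Stand-in for the action: the print simulates what Scrapy does with
    # yielded items -- they go to stdout/stderr, not into the return value.
    print({"message": "There are MORE then 400 results"})
    return {"Success": "The crawler is successfully executed."}

log = io.StringIO()
with redirect_stdout(log):
    result = main({"make": "9", "model": "1624", "mileage": "2500"})

# 'result' is what 'ibmcloud wsk action invoke --result' shows;
# the captured stdout corresponds to 'ibmcloud wsk activation logs'.
print(result)
print(log.getvalue())
```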
Thanks for the detailed answer, I will look at it tomorrow. @JamesThomas OK, thanks. Good news, everything works! The messages were written to the console logs rather than returned from the function. Do you know why this does not work:
zip -r ascrawler.zip venv/bin/activate_this.py venv/lib/python3.6/site-packages/raven venv/lib/python3.6/site-packages/raven-6.9.0.dist-info __main__.py common.py db.py
? I want to use raven to capture errors, and I did it like in your tutorial, but I get the error "error": "The action did not return a dictionary." In the logs: "2018-07-04T12:55:56.590718898Z stderr: from raven import Client", "2018-07-04T12:55:56.590724452Z stderr: ModuleNotFoundError: No module named 'raven'"
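The ModuleNotFoundError above suggests the runtime never activated the bundled packages. Per the IBM Cloud Functions documentation for zipped Python actions, dependencies have to live in a top-level directory literally named virtualenv (containing virtualenv/bin/activate_this.py), not venv. A packaging sketch, assuming raven is the only extra dependency:

```shell
# Build the dependencies into a directory named exactly "virtualenv",
# which is the name the Python action runtime looks for inside the zip.
virtualenv --python=python3.6 virtualenv
source virtualenv/bin/activate
pip install raven
deactivate
# Zip the whole virtualenv directory, not individual files picked out of it.
zip -r ascrawler.zip virtualenv __main__.py common.py db.py
```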