Why isn't the Python Scrapy spider class executing?


I am trying to run Scrapy as a function on IBM Cloud. My __main__.py looks like this:

import scrapy
from scrapy.crawler import CrawlerProcess

class AutoscoutListSpider(scrapy.Spider):
    name = "vehicles list"

    def __init__(self, params, *args, **kwargs):
        super(AutoscoutListSpider, self).__init__(*args, **kwargs)
        make = params.get("make", None)
        model = params.get("model", None)
        mileage = params.get("mileage", None)

        init_url = "https://www.autoscout24.be/nl/resultaten?sort=standard&desc=0&ustate=N%2CU&size=20&page=1&cy=B&mmvmd0={0}&mmvmk0={1}&kmto={2}&atype=C&".format(
            model, make, mileage)
        self.start_urls = [init_url]

    def parse(self, response):
        # Get total result on list load
        init_total_results = int(response.css('.cl-filters-summary-counter::text').extract_first().replace('.', ''))
        if init_total_results > 400:
            yield {"message": "There are MORE then 400 results"}
        else:
            yield {"message": "There are LESS then 400 results"}


def main(params):
    process = CrawlerProcess()
    try:
        process.crawl(AutoscoutListSpider, params)
        process.start()
        return {"Success ": "The crawler (make: {0}, model: {1}, mileage: {2}) is successfully executed.".format(
            params['make'], params['model'], params['mileage'])}
    except Exception as e:
        return {"Error ": e, "params ": params}
{"Success ": "The crawler (make: 9, model: 1624, mileage: 2500) is successfully executed."}
main({"make":"9", "model":"1624", "mileage":"2500"})
The whole process of adding this function is as follows:

  • zip -r ascrawler.zip __main__.py common.py
    // So I created a zip file to upload. (There is also a common.py file; I have removed it here for simplicity.)
  • ibmcloud wsk action create ascrawler --kind python:3 ascrawler.zip
    // Creates the action and adds it to the cloud.
  • ibmcloud wsk action invoke --blocking --result ascrawler --param make 9 --param model 1624 --param mileage 2500
    // Invokes the action with the given parameters.

After running the third step, I get the following result:

    {"Success ": "The crawler (make: 9, model: 1624, mileage: 2500) is successfully executed."}
    
    
So I don't get any errors, but it never goes into the AutoscoutListSpider class at all. Why?

It should also return {"message": "There are MORE then 400 results"}. Any ideas?

When I run it from the Python console like this:

    main({"make":"9", "model":"1624", "mileage":"2500"})
    
it returns the correct results:

    {"message": "There are MORE then 400 results"}
    {"Success ": "The crawler (make: 9, model: 1624, mileage: 2500) is successfully executed."}
    
{"message": "There are MORE then 400 results"} is available in the activation logs for the invocation, rather than in the action result.

After running the ibmcloud wsk action invoke command, retrieve the activation identifier for the most recent invocation:

    $ ibmcloud wsk activation list
    activations
    d13bd19b196d420dbbd19b196dc20d59 ascrawler
    ...
    
This activation identifier can then be used to retrieve all the console logs written to stdout and stderr during the invocation:

    
    $ ibmcloud wsk activation logs d13bd19b196d420dbbd19b196dc20d59 | grep LESS
    2018-06-29T08:27:11.094873294Z stderr: {'message': 'There are LESS then 400 results'}
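
If you need the scraped message in the action result itself rather than in the logs, one possible approach (a minimal sketch, untested on IBM Cloud Functions) is to collect the yielded items with Scrapy's item_scraped signal and merge them into the returned dictionary:

    from scrapy import signals
    from scrapy.crawler import CrawlerProcess

    def main(params):
        items = []  # filled as the spider yields items

        def collect_item(item, response, spider):
            items.append(item)

        process = CrawlerProcess()
        crawler = process.create_crawler(AutoscoutListSpider)
        # item_scraped fires once for every item the spider yields
        crawler.signals.connect(collect_item, signal=signals.item_scraped)
        process.crawl(crawler, params)
        process.start()  # blocks until the crawl finishes
        return {"results": items, "params": params}

Bear in mind that CrawlerProcess starts a Twisted reactor, which cannot be restarted within the same process; on a warm (reused) action container a second invocation may therefore fail, so treat this as a starting point rather than a production pattern.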
    

Thanks for the detailed comment, I will look at it tomorrow. @JamesThomas OK, thanks. Good news, everything works! The messages are written to the console logs rather than returned from the function. Do you know why this doesn't work:

zip -r ascrawler.zip venv/bin/activate_this.py venv/lib/python3.6/site-packages/raven venv/lib/python3.6/site-packages/raven-6.9.0.dist-info __main__.py common.py db.py

? I want to use raven to catch errors, and I did it exactly as in your tutorial, but I get the error "error": "The action did not return a dictionary." In the logs: "2018-07-04T12:55:56.590718898Z stderr: from raven import Client", "2018-07-04T12:55:56.590724452Z stderr: ModuleNotFoundError: No module named 'raven'"