Scrapy/Python - URL should be requested several times (thanks to a loop), but is requested only once (dont_filter not working)

python python-2.7 scrapy web-crawler

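I'm not sure how scrapy works. I built a crawler that works almost perfectly. I have a list of dicts (config.products). Each dict holds the data for a POST request that has to be sent from the function initial_search. So initial_search has to be called several times, but right now the POST sent by initial_search is made only once and then the crawler shuts itself down. I added dont_filter=True, but that didn't change anything. Does anyone know what is going on?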

def parse(self, response):
    return scrapy.FormRequest.from_response(
        response,
        meta={'product':config.products[0]},
        callback=self.initial_search
    )


def initial_search(self, response):
    config.actualProduct = response.meta['product']
    if config.products.index(config.actualProduct) == 0:
        config.savedResponse = response

    # The second time, the request is not made. (even with dont_filter=True)

    return scrapy.FormRequest(
        url=response.url,
        formdata=dictArgs,
        meta={'dictArgs': config.actualProduct},
        dont_filter = True,
        callback=self.other_function
    )

def other_function(self, response):
    return scrapy.FormRequest(
        url=response.url,
        formdata=dictArgs,
        meta={'dictArgs': config.actualProduct},
        callback=self.other_function2
    )

def other_function2(self, response):
        nextPosition = config.products.index(config.actualProduct) + 1

        # Checking if we have another dict to post

        if nextPosition < len(config.products):
            config.savedResponse.meta['product'] = config.products[nextPosition]
            self.initial_search(config.savedResponse)

Any help would be greatly appreciated.

Actually, you are not calling initial_search correctly in other_function2. Here is what it should look like:
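def other_function2(self, response):
        nextPosition = config.products.index(config.actualProduct) + 1

        # Checking if we have another dict to post

        if nextPosition < len(config.products):
            config.savedResponse.meta['product'] = config.products[nextPosition]
            # Yield the request so that Scrapy actually schedules it.
            yield scrapy.Request(
                # A Request takes a URL string, not a Response object.
                config.savedResponse.url,
                meta={'product':config.products[nextPosition]},
                # This URL was already fetched, so bypass the duplicate filter.
                dont_filter=True,
                callback=self.initial_search
                )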

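Is this line correct: meta={'product': config.products[0]}? You have a list of products, but you send the first record every time parse is called.

Hi Nima, thank you for your reply. But why should I use scrapy.Request at this level? I mean, I only want to call the function initial_search. I don't want to make an HTTP request when I call initial_search, by the way. It's just that the "return scrapy.FormRequest" is not made the second time... :/

First reason: your initial_search is a spider callback, and it receives a response after a request has been made. Second reason: your initial_search extracts a meta value that was passed to it. If you don't send the meta value through a request, how is it supposed to extract it? Third reason: you should use scrapy.Request because you don't have any form data to send to initial_search.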
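
To illustrate the point discussed above, here is a minimal pure-Python sketch of why the second POST never happens. It does not use Scrapy at all: FakeRequest, FakeResponse, run, make_request and PRODUCTS are hypothetical stand-ins invented for this example, and the toy engine is only an assumption about the general mechanism (a crawler schedules what a callback returns or yields), not Scrapy's real implementation. Calling a callback by hand and discarding its return value, as `self.initial_search(config.savedResponse)` does in the question, builds a request that nobody ever schedules.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable

URL = "http://example.com/search"  # hypothetical URL, for illustration only
PRODUCTS = ["a", "b", "c"]         # hypothetical stand-in for config.products


@dataclass
class FakeRequest:
    """Stand-in for a request: a URL, a callback and a meta dict."""
    url: str
    callback: Callable
    meta: dict = field(default_factory=dict)


@dataclass
class FakeResponse:
    """Stand-in for a response; it carries the meta of its request."""
    url: str
    meta: dict


def make_request(position, callback):
    """Build the request for the product at the given position."""
    return FakeRequest(URL, callback, {"product": PRODUCTS[position]})


def run(start_request):
    """Toy engine: only requests *returned* by a callback get scheduled."""
    queue = deque([start_request])
    visited = []
    while queue:
        request = queue.popleft()
        visited.append(request.meta["product"])
        result = request.callback(FakeResponse(request.url, request.meta))
        if isinstance(result, FakeRequest):
            queue.append(result)
    return visited


def broken_callback(response):
    """Mirrors the question: the next request is built, then thrown away."""
    position = PRODUCTS.index(response.meta["product"]) + 1
    if position < len(PRODUCTS):
        make_request(position, broken_callback)  # return value is discarded


def fixed_callback(response):
    """Hands the next request back to the engine so it gets scheduled."""
    position = PRODUCTS.index(response.meta["product"]) + 1
    if position < len(PRODUCTS):
        return make_request(position, fixed_callback)
    return None


broken_visited = run(make_request(0, broken_callback))
fixed_visited = run(make_request(0, fixed_callback))
print(broken_visited)  # ['a']
print(fixed_visited)   # ['a', 'b', 'c']
```

In a real spider, the equivalent change would be to return or yield the request object from other_function2 instead of only calling initial_search; whether dont_filter=True is also needed depends on whether the same URL and body were already requested.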