Python: Error when trying to crawl with my spider (NotImplementedError)

Tags: python, asp.net, scrapy, syntax-error

My code is broken. I am trying to scrape a forum, but I get an error. Here is my code:

import scrapy, time

class ForumSpiderSpider(scrapy.Spider):
    name = 'forum_spider'
    allowed_domains = ['visforvoltage.org/latest_tech/']
    start_urls = ['http://visforvoltage.org/latest_tech//']

    def parse_urls(self, response):
        for href in response.css(r"tbody a[href*='/forum/']::attr(href)").extract():
            url = response.urljoin(href)
            print(url)
            req = scrapy.Request(url, callback=self.parse_data)
            time.sleep(10)
            yield req

    def parse_data(self, response):
        for sel in response.css('html').extract():
            data = {}
            data['name'] = response.css(r"div[class='author-pane-line author-name'] span[class='username']::text").extract()
            data['date'] = response.css(r"div[class='forum-posted-on']:contains('-') ::text").extract()
            data['title'] = response.css(r"div[class='section'] h1[class='title']::text").extract()
            data['body'] = response.css(r"div[class='field-items'] p::text").extract()
            yield data

        next_page = response.css(r"li[class='pager-next'] a[href*='page=']::attr(href)").extract()
        if next_page:
            yield scrapy.Request(
                response.urljoin(next_page),
                callback=self.parse_urls)
Here is the error:

[scrapy.core.scraper] ERROR: Spider error processing <GET https://visforvoltage.org/latest_tech> (referer: None)
raise NotImplementedError('{}.parse callback is not defined'.format(self.__class__.__name__))
NotImplementedError: ForumSpiderSpider.parse callback is not defined

I would really appreciate it if someone could help me.

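The parent class scrapy.Spider has a method called start_requests. This method goes through start_urls and creates the spider's first requests.

Those requests use a method named parse as their callback by default. Your spider defines parse_urls but not parse, which is why the base class raises NotImplementedError. So the quickest way to fix the problem is to rename your parse_urls method to parse, like this: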

def parse(self, response):
   for href in response.css(r"tbody a[href*='/forum/']::attr(href)").extract():
       url = response.urljoin(href)
       print(url)
       req = scrapy.Request(url, callback=self.parse_data)
       time.sleep(10)
       yield req
If you want to change that behavior instead, override the start_requests method in your class so that you choose the callback yourself. For example:

def start_requests(self):
    # Build the initial requests explicitly and point them at parse_urls
    for url in self.start_urls:
        yield scrapy.Request(url, callback=self.parse_urls, dont_filter=True)
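To see why both fixes work, here is a minimal sketch of the dispatch logic described above. It does not use Scrapy at all: MiniRequest, MiniSpider, run, BrokenSpider and FixedSpider are hypothetical stand-ins written for this illustration, and the real library is considerably more involved. The sketch only assumes what the answer states, namely that initial requests carry no explicit callback and therefore fall back to parse.

```python
# Simplified stand-ins for illustration only; these are NOT Scrapy's real classes.
class MiniRequest:
    def __init__(self, url, callback=None):
        self.url = url
        self.callback = callback  # None means "fall back to the spider's parse"


class MiniSpider:
    start_urls = []

    def start_requests(self):
        # Default behavior: one request per start URL, no explicit callback.
        for url in self.start_urls:
            yield MiniRequest(url)

    def parse(self, response):
        # The base class has no useful default, so it raises.
        raise NotImplementedError(
            "{}.parse callback is not defined".format(self.__class__.__name__)
        )


def run(spider):
    """Send each initial request to its callback, falling back to parse."""
    results = []
    for request in spider.start_requests():
        callback = request.callback or spider.parse
        results.extend(callback("response for " + request.url))
    return results


class BrokenSpider(MiniSpider):
    # Same mistake as in the question: parse_urls exists, parse does not.
    start_urls = ["http://example.com/"]

    def parse_urls(self, response):
        yield response


class FixedSpider(BrokenSpider):
    # Second fix from the answer: name the callback explicitly.
    def start_requests(self):
        for url in self.start_urls:
            yield MiniRequest(url, callback=self.parse_urls)
```

Calling run(BrokenSpider()) raises NotImplementedError because nothing overrides parse, while run(FixedSpider()) reaches parse_urls. Renaming parse_urls to parse in BrokenSpider would work just as well in this model.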

Thank you very much!