Python 3.x: How to extract the src attribute from a web page in Python

python-3.x web-scraping scrapy

I need to extract the image src values from 'https://www.gizbot.com/mobile-brands-in-india/'. I tried doing this with Scrapy, in spider.py:

    def parse(self, response):
        page = response.url.split("/")[-2]
        filename = 'mobiles-%s.html' % page
        mob = response.xpath('.//div[has-class("all-brands-block-desc-brand")]/text()').getall()
       
        for mobile in mob:
            m = str(mobile).split()[0]
            with open(filename, 'a') as f:
                f.write("%s %s\n" % (mobile, response.xpath('.//a[contains(@href, m)]').xpath("@href").extract()))
            self.log('Saved file %s' % filename)

But it does not extract the correct data, and I can't tell what is going wrong. Any help would be greatly appreciated.

You need to use the following XPath:

mob = response.xpath('//div[contains(@class, "all-brands-block-desc-brand")]').getall()

You had a typo in your XPath. To extract the image src, loop over the brand blocks and read the lazy-load attribute:

    for i in response.css('div.all-brands-block'):
        print(i.css('img::attr(data-pagespeed-lazy-src)').get())
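The reason for reading `data-pagespeed-lazy-src` is that lazy-loaded images often keep a placeholder in `src` and the real URL in a separate attribute. The sketch below illustrates that lookup offline, using the standard library's `html.parser` as a stand-in for Scrapy's selectors so it runs without Scrapy or network access. The sample markup, the `example.com` URLs, and the names `ImageSrcCollector` and `extract_image_sources` are assumptions made for illustration; the real page's structure may differ.

```python
from html.parser import HTMLParser

# Hypothetical sample markup, loosely modelled on the class names in the question.
SAMPLE_HTML = """
<div class="all-brands-block">
  <img src="placeholder.gif" data-pagespeed-lazy-src="https://example.com/samsung.jpg">
  <div class="all-brands-block-desc-brand">Samsung Mobiles</div>
</div>
<div class="all-brands-block">
  <img src="https://example.com/nokia.jpg">
  <div class="all-brands-block-desc-brand">Nokia Mobiles</div>
</div>
"""


class ImageSrcCollector(HTMLParser):
    """Collect one image URL per <img> tag found in the document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # Prefer the lazy-load attribute; fall back to the plain src.
        src = attributes.get("data-pagespeed-lazy-src") or attributes.get("src")
        if src:
            self.sources.append(src)


def extract_image_sources(html):
    """Return the image URLs found in an HTML string, in document order."""
    collector = ImageSrcCollector()
    collector.feed(html)
    return collector.sources


if __name__ == "__main__":
    for src in extract_image_sources(SAMPLE_HTML):
        print(src)
```

Inside a Scrapy callback, the same fallback could probably be written as `i.css('img::attr(data-pagespeed-lazy-src)').get() or i.css('img::attr(src)').get()`, since `.get()` returns `None` when the attribute is absent.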