Scrapy/Python: how do I modify extracted data before saving?


python web-scraping scrapy


I'm trying to prepend a URL to a piece of extracted data, but for the life of me I can't figure out how to add it.

The selector I'm using looks like this:

'15_urlmod': response.url.split('=')[-1] + "_l_a1.jpg"

This line returns something like:

12306116_l_a1.jpg

I then want to prepend http:exampleurl.com/images/ to that value.

So the final URL that Scrapy extracts and saves would be:

http:exampleurl.com/images/12306116_l_a1.jpg

I'm new to Python and have been searching for days trying to figure this out. The full spider code I'm using is below:

import scrapy


class examplespiderscraper(scrapy.Spider):
    name = "examplespider"
    # Starting URL to scrape
    start_urls = ['https://www.exampleurl.com']

    def parse(self, response):
        for book_url in response.xpath(
                "//div[@class='s-producttext-top-wrapper']/a//@href").extract():
            yield scrapy.Request(response.urljoin(book_url), callback=self.parse_details)
        next_page = response.css('span.PageNumberInner > a.swipeNextClick::attr(href)').extract_first()
        if next_page:
            yield scrapy.Request(response.urljoin(next_page), callback=self.parse)

    def parse_details(self, response):
        yield {
            '01_brand': response.xpath("//span[@id='lblProductBrand']/text()").extract_first(),
            '15_urlmod': response.url.split('=')[-1] + "_l_a1.jpg",
        }

Problem solved. After playing around for a while, I came up with:

"http:exampleurl.com/images/" + response.url.split('=')[-1] + "_l_a8.jpg"

I didn't know you could do that.

Hope this helps someone else.
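
For anyone who wants to reuse this, the string concatenation above can be pulled into a small helper. This is only a sketch: the function name `build_image_url`, the constants, and the `?id=` query format in the sample URL are hypothetical, and the base is written as `http://exampleurl.com/images/` on the assumption that the address in the question was meant to include the `//`.

```python
# Assumed values based on the question; adjust them for the real site.
IMAGE_BASE = "http://exampleurl.com/images/"
IMAGE_SUFFIX = "_l_a1.jpg"


def build_image_url(page_url, base=IMAGE_BASE, suffix=IMAGE_SUFFIX):
    """Build the full image URL from a product page URL."""
    # Same extraction as in the spider: take everything after the last "=".
    product_id = page_url.split("=")[-1]
    # Plain string concatenation: prefix + extracted id + suffix.
    return base + product_id + suffix


if __name__ == "__main__":
    # Hypothetical product page URL ending in "=<id>".
    print(build_image_url("https://www.exampleurl.com/product?id=12306116"))
```

Inside `parse_details` the field would then become `'15_urlmod': build_image_url(response.url)`, which keeps the URL pattern in one place if the suffix or base ever changes.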