Missing information when scraping with Python, possibly due to a wrong CSS selector


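I am currently experimenting with scraping in Python (Scrapy), but I cannot solve this problem (I have tried many things and even asked a question on Stack Overflow, see the URL below).

I am trying to scrape two pages (their URLs are in my script), and I do get results. However, some information is missing and I do not know why.

The scraper itself works fine, but I cannot scrape the sponsored tag (in the code, see the part item['Sponsored_Tag']).

My question: how can I get the results I get now, but including the sponsored tag?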

What have I tried?

I tried a lot of things, for example changing the response.css() selector to 's-result-list s-search-results'.

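I think I have found a lead. If you look at the source of one of the pages and search for "s-result-item" (the selector used in response.css), you can see that the class list of some results contains the text "AdHolder". However, I cannot find it in my scraped results... (see the image below)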
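One way to turn that observation into a Yes/No value is to read the class attribute of each result container and test for the AdHolder token. The following is only a minimal sketch, assuming sponsored results carry an `AdHolder` class as described above. `sponsored_flag` is a hypothetical helper and the class strings are made-up samples; in the spider the class string would presumably come from something like `product.css('::attr(class)').get()`.

```python
def sponsored_flag(class_attr):
    """Return "Yes" if the class attribute contains the AdHolder token, else "No".

    class_attr is the raw value of an element's class attribute, or None
    when the element has no class attribute.
    """
    tokens = (class_attr or "").split()
    return "Yes" if "AdHolder" in tokens else "No"


# Sample class strings shaped like search-result containers (illustrative only).
ad_classes = "sg-col-4-of-24 s-result-item AdHolder sg-col"
organic_classes = "sg-col-4-of-24 s-result-item sg-col"

print(sponsored_flag(ad_classes))       # Yes
print(sponsored_flag(organic_classes))  # No
print(sponsored_flag(None))             # No
```

Splitting the attribute into tokens avoids false positives from class names that merely contain the substring "AdHolder".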

Desired result

A file containing the following information (at the moment I am writing a JSON file):

 - Sponsored Tag: Yes/No           **#This is what is missing!**
 - ASIN: XXXXXXXX                  #This works in the code below
 - Index: "0"                      #This works in the code below 
 - Link: "complete link"           #This works in the code below 
 - url_response: "response link"   #This works in the code below
 - tag: Bestsellertag etc.         #This works in the code below 
My code:

from twisted.internet import reactor
import scrapy
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
#import re

class AmazonProductSpider(scrapy.Spider):
    name = "AmazonDeals"
    allowed_domains = ["amazon.com"]

    DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware': 400,}
    #Use working product URL below
    start_urls = [
            "https://www.amazon.com/s?k=shaver&ref=nb_sb_noss_1",          # Shaver
            "https://www.amazon.com/s?k=electric+shaver&ref=nb_sb_noss_2"]#

    custom_settings = {
        'FEED_URI' : 'Asin_Titles.json',
        'FEED_FORMAT' : 'json'
        }

    def parse(self, response):
        for product in response.css('.s-result-item'):   # Do I scrape the wrong information? 
            item = AmazonItem()

            # I think that this part goes wrong (the item['Sponsored_Tag'] part)
            item['Sponsored_Tag'] = product.css('span:contains("Sponsored")') 
            #item['Sponsored_Tag'] = product.css('.s-result-item').get() #.css('contains("sponsored")').get()

            item['Prime_tag'] = product.css('.a-color-secondary').get()
            item['asin'] = product.css('::attr(data-asin)').get()
            item['index'] = product.css('::attr(data-index)').get()
            item['link'] = product.css('.a-text-normal::attr(href)').get() 
            item['url_Response'] = response.url
            item['tag'] = product.css('.a-badge-text').get()
            yield item

class AmazonItem(scrapy.Item):
    asin = scrapy.Field()
    index = scrapy.Field()
    link = scrapy.Field()
    url_Response = scrapy.Field()
    tag = scrapy.Field()
    Prime_tag = scrapy.Field() 
    Sponsored_Tag = scrapy.Field()
Edit 1

As pyguy suggested, I integrated the proposed solution. Unfortunately, every result for the "adholder" item is empty.

    def parse(self, response):
        item = AmazonItem()

        for result in response.css('.s-result-list div'):
            if result.css('.AdHolder').extract_first():
                item['adholder'] = True
            else:
                item['adholder'] = False

        for product in response.css('.s-result-item'):    #.s-result-item
            #item = AmazonItem()
            #item['Sponsored_Tag'] = product.css('span:contains("sponsored")').get()
            #item['Sponsored_Tag'] = product.css('.s-result-item').get() #.css('contains("sponsored")').get()

            item['Prime_tag'] = product.css('.a-color-secondary').get()
            item['asin'] = product.css('::attr(data-asin)').get()
            item['index'] = product.css('::attr(data-index)').get()
            item['link'] = product.css('.a-text-normal::attr(href)').get()
            item['url_Response'] = response.url
            item['tag'] = product.css('.a-badge-text').get()
            # And so on
            # ...
            yield item

class AmazonItem(scrapy.Item):
    asin = scrapy.Field()
    index = scrapy.Field()
    link = scrapy.Field()
    url_Response = scrapy.Field()
    tag = scrapy.Field()
    Prime_tag = scrapy.Field()
    #Sponsored_Tag = scrapy.Field()
    adholder = scrapy.Field()
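In the code above, a single item is created before both loops, so the first loop only leaves behind the flag of the last div it visited, and every yielded result shares that one object. A minimal sketch of the alternative, with one record per result and the flag computed inside the same loop, might look like this. The `results` list of plain dicts is made-up sample data standing in for the selectors (the second ASIN is a placeholder), and `build_records` is a hypothetical name.

```python
def build_records(results):
    """Build one independent record per result, with the ad flag included."""
    records = []
    for result in results:
        record = {  # a fresh dict on every iteration, never shared between results
            "asin": result["asin"],
            "adholder": "AdHolder" in result["classes"].split(),
        }
        records.append(record)
    return records


# Made-up sample data: each entry mimics the attributes of one result container.
results = [
    {"asin": "B07F7XYMNN", "classes": "s-result-item AdHolder sg-col"},
    {"asin": "B000000000", "classes": "s-result-item sg-col"},  # placeholder ASIN
]

for record in build_records(results):
    print(record)
```

In a spider, the same shape would mean creating the item inside the loop over the result containers and yielding it there.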
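Edit 2

As Pyguy suggested, everything is now in one loop. Two problems remain:

  • The AdHolder (or sponsored tag) is not scraped (every value is False, which cannot be right)
  • There are now far too many products, roughly 3095, whereas I expected about 80 to 100 (two pages with 40 to 50 products each)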
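The inflated count is consistent with how descendant selectors work: `.s-result-list div` matches every `div` nested anywhere under the list, not only the result containers. The sketch below uses the standard library's `html.parser` on a made-up fragment (the markup and the `DivCounter` class are illustrative assumptions, not Amazon's real page) to compare counting all nested `div` elements with counting only those whose class list includes `s-result-item`.

```python
from html.parser import HTMLParser


class DivCounter(HTMLParser):
    """Count every <div> and, separately, the <div>s with class s-result-item."""

    def __init__(self):
        super().__init__()
        self.all_divs = 0
        self.result_items = 0

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        self.all_divs += 1
        classes = (dict(attrs).get("class") or "").split()
        if "s-result-item" in classes:
            self.result_items += 1


# Made-up fragment: two results, each wrapping further nested divs.
sample = """
<div class="s-result-list">
  <div class="s-result-item AdHolder"><div class="sg-col-inner"><div class="a-section"></div></div></div>
  <div class="s-result-item"><div class="sg-col-inner"><div class="a-section"></div></div></div>
</div>
"""

counter = DivCounter()
counter.feed(sample)
print(counter.all_divs - 1)   # divs nested under the list container: 6
print(counter.result_items)   # actual result containers: 2
```

On a real results page each container nests many more elements, so iterating over every descendant `div` can plausibly yield thousands of rows for fewer than a hundred products.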

  • First of all, thank you very much for your help

    In [4]: print(response.css('.s-result-list .AdHolder').extract())                                                                                                                                                                                                              
    ['<div data-asin="B07F7XYMNN" data-index="0" class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 s-result-item sg-col-4-of-28 sg-col-4-of-16 AdHolder sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n    \n\n\n<div data-component-type="s-impression-logger" data-component-props=\'{"percentageShownToFire":"50","batchable":true,"requiredElementSelector":".s-image","url":"https://www.amazon.com/gp/sponsored-products/logging/log-action.html?qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf&amp;adId=200011353751711&amp;eventType=1&amp;adIndex=0"}\' class="rush-component s-expand-height">\n    \n\n\n<div data-component-type="sp-sponsored-result" class="rush-component s-expand-height">\n    \n\n\n\n\n\n\n\n\n<div class="s-expand-height s-include-content-margin s-border-bottom">\n<div class="a-section a-spacing-medium">\n\n\n<div class="sg-row">\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        <div class="a-section a-spacing-micro s-min-height-extra-large">\n            \n        </div>\n    </div></div>\n</div>\n\n<div class="sg-row">\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        <div class="a-section a-spacing-none">\n            \n\n\n\n\n\n<span data-component-type="s-product-image" class="rush-component">\n    \n    <a class="a-link-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0414635KJLKI9OVJ7G1&amp;url=%2FBraun-Electric-Integrated-Precision-Rechargeable%2Fdp%2FB07F7XYMNN%2Fref%3Dsr_1_1_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-1-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf">\n        <div class="a-section aok-relative s-image-square-aspect">\n            \n                \n    
            \n                    <img src="https://m.media-amazon.com/images/I/81Y8IzpMY2L._AC_UL320_.jpg" class="s-image" alt="Braun Series 9 Men\'s Electric Foil Shaver with Wet &amp; Dry Integrated Precision Trimmer &amp; Rechargeable and Cordless Razor with Clean&amp;Charge Station, 9296cc" srcset="https://m.media-amazon.com/images/I/81Y8IzpMY2L._AC_UL320_.jpg 1x, https://m.media-amazon.com/images/I/81Y8IzpMY2L._AC_UL480_QL65_.jpg 1.5x, https://m.media-amazon.com/images/I/81Y8IzpMY2L._AC_UL640_QL65_.jpg 2x, https://m.media-amazon.com/images/I/81Y8IzpMY2L._AC_UL800_QL65_.jpg 2.5x, https://m.media-amazon.com/images/I/81Y8IzpMY2L._AC_UL960_QL65_.jpg 3x" data-image-index="0" data-image-load="" data-image-latency="s-product-image" data-image-source-density="1">\n                \n            \n        </div>\n    </a>\n</span>\n\n        </div>\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        <div class="a-section a-spacing-none a-spacing-top-small">\n            <div class="a-row a-spacing-micro"><span class="a-size-base a-color-secondary">Sponsored</span>\n\n\n\n\n\n<span class="a-declarative" data-action="a-popover" data-a-popover=\'{"dataStrategy":"preload","name":"sp-info-popover-B07F7XYMNN","position":"triggerVertical"}\'>\n    \n        \n        \n            \n            <span class="aok-inline-block s-info-icon"></span>\n        \n        \n    \n</span>\n\n\n\n    <div class="a-popover-preload" id="a-popover-sp-info-popover-B07F7XYMNN">\n        <span>These are ads for products you\'ll find on Amazon.com. 
</span><div class="a-row"><span>Clicking an ad will take you to the product\'s page.</span>\n\n\n\n\n<a class="a-link-normal" href="https://advertising.amazon.com/products-self-serve?ref_=ext_amzn_wtsp">\n    \n        \n            \n                <span>Learn more about Sponsored Products.</span>\n            \n        \n        \n    \n</a>\n</div><div class="a-row a-spacing-top-small"><span></span>\n\n\n\n\n\n<span class="a-declarative" data-action="a-modal" data-a-modal=\'{"dataStrategy":"ajax","header":"Share your feedback","url":"/gp/sponsored-products/lazyLoad/handler/sp-feedback-handler.html?pl=lB0mnfW%2BgTIwXZ6x6%2FEkofSU5xhfxxnnc785QXRwbCnTJ%2BJA6GNDd0mNO0LWBiOz0%2BPUMqC%2BfpmF%0Amq1O8nxbZonbXZAJRS9av%2BV16idXdoOcIz11gXk310EIW6PcOMYdRA%2Bp7Z%2FEeOGfKzIeFtB9qQvN%0AjDcmdjwVXrg9HYDbd1wHIPKQtdWBZdOar0OZInRL0%2F7pFW12O6KO3SMD%2B1v35y5myOGZmF51DLTd%0Ar0Eot11Sc8HtuVRMXgD1s8WIwu%2F0zt6zF3tg3EMcdWFtwMCECLKa2xwQfYLDM6NKIeQvOsky9j19%0A0mXHU7i%2FXQ9fL70%2Bf7m0aTvN8LwIHzdNZM6f6qiuarbVWcVp%2B1BM7Q0NyT33bHOLdwKR7DmhKH03%0AjWCKWNQnpeVaWAm%2BuwDwrOBtCI4voFa%2BK4IX5hBAzvyzgBCtATDpdIllsxGZj7dvvaFrkCdPE7w%2B%0AZ%2BzM8Bhfm4SU3nL%2FTRgbAEwM1b%2BMMAW3uFvcDWFHmWmwwTkce6uiLD5d83valaN%2FEgY03Xx0DsSd%0AROuQ8XS7DCpmy7rFmNXOEFSlWSOBf68IZKQniiJid90PjI9TySMEuuA%3D"}\'>\n    \n        \n        \n        \n            \n\n\n\n\n<a class="a-link-normal" href="#">\n    \n        \n            \n                <span></span>\n            \n        \n        \n    \n</a>\n\n        \n    \n</span>\n\n\n\n</div>\n    </div>\n\n</div>\n\n\n\n\n<h2 class="a-size-mini a-spacing-none a-color-base s-line-clamp-4">\n    \n    \n        \n\n\n\n\n<a class="a-link-normal a-text-normal" 
href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0414635KJLKI9OVJ7G1&amp;url=%2FBraun-Electric-Integrated-Precision-Rechargeable%2Fdp%2FB07F7XYMNN%2Fref%3Dsr_1_1_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-1-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf">\n    \n        \n            \n                <span class="a-size-base-plus a-color-base a-text-normal">Braun Series 9 Men\'s Electric Foil Shaver with Wet &amp; Dry Integrated Precision Trimmer &amp; Rechargeable and Cordless Razor with Clean&amp;Charge Station, 9296cc</span>\n            \n        \n        \n    \n</a>\n\n    \n</h2>\n\n        </div>\n        \n            <div class="a-section a-spacing-none a-spacing-top-micro">\n                <div class="a-row a-size-small">\n\n\n<span aria-label="4.4 out of 5 stars">\n    \n\n\n\n\n\n\n    \n        <span class="a-declarative" data-action="a-popover" data-a-popover=\'{"max-width":"700","closeButton":false,"position":"triggerBottom","url":"/review/widgets/average-customer-review/popover/ref=acr_search__popover?ie=UTF8&amp;asin=B07F7XYMNN&amp;ref=acr_search__popover&amp;contextId=search"}\'>\n            \n            <a href="javascript:void(0)" class="a-popover-trigger a-declarative"><i class="a-icon a-icon-star-small a-star-small-4-5 aok-align-bottom"><span class="a-icon-alt">4.4 out of 5 stars</span></i><i class="a-icon a-icon-popover"></i></a>\n        </span>\n    \n    \n\n\n</span>\n\n\n\n<span aria-label="89">\n    \n\n\n\n\n<a class="a-link-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0414635KJLKI9OVJ7G1&amp;url=%2FBraun-Electric-Integrated-Precision-Rechargeable%2Fdp%2FB07F7XYMNN%2Fref%3Dsr_1_1_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-1-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf#customerReviews">\n    \n        \n            
\n                <span class="a-size-base">89</span>\n            \n        \n        \n    \n</a>\n\n</span>\n</div>\n            </div>\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        \n            <div class="a-section a-spacing-none a-spacing-top-small">\n                <div class="a-row a-size-base a-color-base"><div class="a-row">\n\n\n\n\n<a class="a-size-base a-link-normal s-no-hover a-text-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0414635KJLKI9OVJ7G1&amp;url=%2FBraun-Electric-Integrated-Precision-Rechargeable%2Fdp%2FB07F7XYMNN%2Fref%3Dsr_1_1_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-1-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf">\n    \n        \n            \n                <span class="a-price" data-a-size="l" data-a-color="base"><span class="a-offscreen">$309.99</span><span aria-hidden="true"><span class="a-price-symbol">$</span><span class="a-price-whole">309<span class="a-price-decimal">.</span></span><span class="a-price-fraction">99</span></span></span>\n            \n                <span class="a-size-base a-color-secondary">($146.91/Pound)</span>\n            \n        \n        \n    \n</a>\n</div></div>\n            </div>\n        \n        \n            <div class="a-section a-spacing-none a-spacing-top-micro">\n                <div class="a-row a-size-base a-color-secondary s-align-children-center"><div class="a-row s-align-children-center">\n\n\n\n\n<span class="aok-inline-block s-image-logo-view">\n  <span class="aok-relative s-icon-text-medium s-prime">\n    <i class="a-icon a-icon-prime a-icon-medium" role="img" aria-label="Amazon Prime"></i>\n  </span>\n  <span>\n    \n  </span>\n</span>\n\n\n\n<span aria-label="Get it as soon as Thu, Jul 18">\n    <span>Get it as 
soon as </span><span class="a-text-bold">Thu, Jul 18</span>\n</span>\n</div><div class="a-row">\n\n\n<span aria-label="FREE Shipping by Amazon">\n    <span>FREE Shipping by Amazon</span>\n</span>\n</div></div>\n            </div>\n        \n        \n        \n        \n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        \n  </div></div>\n</div>\n</div>\n</div>\n\n</div>\n\n</div>\n\n</div></div>', '<div data-asin="B003YJAZZ4" data-index="1" class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 s-result-item sg-col-4-of-28 sg-col-4-of-16 AdHolder sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n    \n\n\n<div data-component-type="s-impression-logger" data-component-props=\'{"percentageShownToFire":"50","batchable":true,"requiredElementSelector":".s-image","url":"https://www.amazon.com/gp/sponsored-products/logging/log-action.html?qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf&amp;adId=200005657625311&amp;eventType=1&amp;adIndex=1"}\' class="rush-component s-expand-height">\n    \n\n\n<div data-component-type="sp-sponsored-result" class="rush-component s-expand-height">\n    \n\n\n\n\n\n\n\n\n<div class="s-expand-height s-include-content-margin s-border-bottom">\n<div class="a-section a-spacing-medium">\n\n\n<div class="sg-row">\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        <div class="a-section a-spacing-micro s-min-height-extra-large">\n            \n        </div>\n    </div></div>\n</div>\n\n<div class="sg-row">\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 
sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        <div class="a-section a-spacing-none">\n            \n\n\n\n\n\n<span data-component-type="s-product-image" class="rush-component">\n    \n    <a class="a-link-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_2?ie=UTF8&amp;adId=A05912003D82ZR77VHH5H&amp;url=%2FBraun-Electric-Shaver-Station-Cordless%2Fdp%2FB003YJAZZ4%2Fref%3Dsr_1_2_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-2-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf">\n        <div class="a-section aok-relative s-image-square-aspect">\n            \n                \n                \n                    <img src="https://m.media-amazon.com/images/I/814fSnCl0eL._AC_UL320_.jpg" class="s-image" alt="Braun Series 7 790cc-4 Electric Foil Shaver with Clean&amp;Charge Station, 1 Count" srcset="https://m.media-amazon.com/images/I/814fSnCl0eL._AC_UL320_.jpg 1x, https://m.media-amazon.com/images/I/814fSnCl0eL._AC_UL480_QL65_.jpg 1.5x, https://m.media-amazon.com/images/I/814fSnCl0eL._AC_UL640_QL65_.jpg 2x, https://m.media-amazon.com/images/I/814fSnCl0eL._AC_UL800_QL65_.jpg 2.5x, https://m.media-amazon.com/images/I/814fSnCl0eL._AC_UL960_QL65_.jpg 3x" data-image-index="1" data-image-load="" data-image-latency="s-product-image" data-image-source-density="1">\n                \n            \n        </div>\n    </a>\n</span>\n\n        </div>\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        <div class="a-section a-spacing-none a-spacing-top-small">\n            <div class="a-row a-spacing-micro"><span class="a-size-base a-color-secondary">Sponsored</span>\n\n\n\n\n\n<span class="a-declarative" data-action="a-popover" 
data-a-popover=\'{"dataStrategy":"preload","name":"sp-info-popover-B003YJAZZ4","position":"triggerVertical"}\'>\n    \n        \n        \n            \n            <span class="aok-inline-block s-info-icon"></span>\n        \n        \n    \n</span>\n\n\n\n    <div class="a-popover-preload" id="a-popover-sp-info-popover-B003YJAZZ4">\n        <span>These are ads for products you\'ll find on Amazon.com. </span><div class="a-row"><span>Clicking an ad will take you to the product\'s page.</span>\n\n\n\n\n<a class="a-link-normal" href="https://advertising.amazon.com/products-self-serve?ref_=ext_amzn_wtsp">\n    \n        \n            \n                <span>Learn more about Sponsored Products.</span>\n            \n        \n        \n    \n</a>\n</div><div class="a-row a-spacing-top-small"><span></span>\n\n\n\n\n\n<span class="a-declarative" data-action="a-modal" data-a-modal=\'{"dataStrategy":"ajax","header":"Share your feedback","url":"/gp/sponsored-products/lazyLoad/handler/sp-feedback-handler.html?pl=KHcYnXwozHREynoNEPM3bbMnH%2FMI8s5SVP9nZyO9%2Bemx9E7m9uWGEQS3nBvHj%2BTpHlWCgtJxv65B%0A1fbbCIAb1ivlHLGe9bDi9XBrklhROSKeoWM3PpYPqCIFGwSusBYKhsKTSsijnU0hcE2kwEkX6dPm%0AYWYhITYXNoUJuWjUPV%2F%2F6IRqxkNKhbsOVZQoki56dZeC6ojhq78vV%2FZUUtSmf8LwehZjTMvF65xc%0A6jxI8nbjFJMvluPsl7BEX7ZfF08o13Ip%2BIY8y8%2BwZMH5SFcUbkqfJtQPYfy3WMrC3fT4zOAu3z5J%0AJau%2BZWLYs7GHgJnQj%2Ftw4VVjQjWJXdund5ND1rLuRP%2B5UnCqff0wXM%2BYZWrAdJKeLpWSavRDwfM2%0AIaPiuHxMS4F%2F0Y05HmeJp%2FPwPWN8YmGMKMoA4egrr2HhF8Yi9dIQLgWLgd%2FNM521RQprNbGbROEU%0ANuWR3XcKd53kwCC2WSt6ZQ5C8SuEjfhdDUxKtA9E4E%2FoKwyvW4OdOVBn1gS8o3cxME8l4IYhL23o%0AW4Z6tYRWSPdXW%2BvflMnJVqE%3D"}\'>\n    \n        \n        \n        \n            \n\n\n\n\n<a class="a-link-normal" href="#">\n    \n        \n            \n                <span></span>\n            \n        \n        \n    \n</a>\n\n        \n    \n</span>\n\n\n\n</div>\n    </div>\n\n</div>\n\n\n\n\n<h2 class="a-size-mini a-spacing-none a-color-base s-line-clamp-4">\n    \n    \n        
\n\n\n\n\n<a class="a-link-normal a-text-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_2?ie=UTF8&amp;adId=A05912003D82ZR77VHH5H&amp;url=%2FBraun-Electric-Shaver-Station-Cordless%2Fdp%2FB003YJAZZ4%2Fref%3Dsr_1_2_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-2-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf">\n    \n        \n            \n                <span class="a-size-base-plus a-color-base a-text-normal">Braun Series 7 790cc-4 Electric Foil Shaver with Clean&amp;Charge Station, 1 Count</span>\n            \n        \n        \n    \n</a>\n\n    \n</h2>\n\n        </div>\n        \n            <div class="a-section a-spacing-none a-spacing-top-micro">\n                <div class="a-row a-size-small">\n\n\n<span aria-label="4.3 out of 5 stars">\n    \n\n\n\n\n\n\n    \n        <span class="a-declarative" data-action="a-popover" data-a-popover=\'{"max-width":"700","closeButton":false,"position":"triggerBottom","url":"/review/widgets/average-customer-review/popover/ref=acr_search__popover?ie=UTF8&amp;asin=B003YJAZZ4&amp;ref=acr_search__popover&amp;contextId=search"}\'>\n            \n            <a href="javascript:void(0)" class="a-popover-trigger a-declarative"><i class="a-icon a-icon-star-small a-star-small-4-5 aok-align-bottom"><span class="a-icon-alt">4.3 out of 5 stars</span></i><i class="a-icon a-icon-popover"></i></a>\n        </span>\n    \n    \n\n\n</span>\n\n\n\n<span aria-label="7,761">\n    \n\n\n\n\n<a class="a-link-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_2?ie=UTF8&amp;adId=A05912003D82ZR77VHH5H&amp;url=%2FBraun-Electric-Shaver-Station-Cordless%2Fdp%2FB003YJAZZ4%2Fref%3Dsr_1_2_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-2-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf#customerReviews">\n    \n        \n            \n                <span 
class="a-size-base">7,761</span>\n            \n        \n        \n    \n</a>\n\n</span>\n</div>\n            </div>\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        \n            <div class="a-section a-spacing-none a-spacing-top-small">\n                <div class="a-row a-size-base a-color-base"><div class="a-row">\n\n\n\n\n<a class="a-size-base a-link-normal s-no-hover a-text-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_2?ie=UTF8&amp;adId=A05912003D82ZR77VHH5H&amp;url=%2FBraun-Electric-Shaver-Station-Cordless%2Fdp%2FB003YJAZZ4%2Fref%3Dsr_1_2_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-2-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_atf">\n    \n        \n            \n                <span class="a-price" data-a-size="l" data-a-color="base"><span class="a-offscreen">$199.94</span><span aria-hidden="true"><span class="a-price-symbol">$</span><span class="a-price-whole">199<span class="a-price-decimal">.</span></span><span class="a-price-fraction">94</span></span></span>\n            \n                <span class="a-price" data-a-size="b" data-a-strike="true" data-a-color="secondary"><span class="a-offscreen">$289.99</span><span aria-hidden="true"><span class="a-price-symbol">$</span><span class="a-price-whole">289<span class="a-price-decimal">.</span></span><span class="a-price-fraction">99</span></span></span>\n            \n        \n        \n    \n</a>\n</div></div><div class="a-row a-size-base a-color-secondary"><div class="a-row">\n\n\n\n\n\n<span data-component-type="s-coupon-component" data-component-props=\'{"asin":"B003YJAZZ4"}\' class="rush-component">\n    <span class="s-coupon-clipped aok-hidden">\n        <span class="a-color-base">$20.00 coupon applied.</span>\n    </span>\n    <span class="s-coupon-unclipped 
">\n        \n\n\n<span class="a-size-base s-coupon-highlight-color s-highlighted-text-padding aok-inline-block">\n    Save $20.00\n</span>\n\n        <span class="a-color-base"> with coupon</span>\n    </span>\n    \n</span>\n</div></div>\n            </div>\n        \n        \n            <div class="a-section a-spacing-none a-spacing-top-micro">\n                <div class="a-row a-size-base a-color-secondary s-align-children-center"><div class="a-row s-align-children-center">\n\n\n\n\n<span class="aok-inline-block s-image-logo-view">\n  <span class="aok-relative s-icon-text-medium s-prime">\n    <i class="a-icon a-icon-prime a-icon-medium" role="img" aria-label="Amazon Prime"></i>\n  </span>\n  <span>\n    \n  </span>\n</span>\n\n\n\n<span aria-label="Get it as soon as Thu, Jul 18">\n    <span>Get it as soon as </span><span class="a-text-bold">Thu, Jul 18</span>\n</span>\n</div><div class="a-row">\n\n\n<span aria-label="FREE Shipping by Amazon">\n    <span>FREE Shipping by Amazon</span>\n</span>\n</div></div>\n            </div>\n        \n        \n        \n        \n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        \n  </div></div>\n</div>\n</div>\n</div>\n\n</div>\n\n</div>\n\n</div></div>', '<div data-asin="B01M716CC2" data-index="19" class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 s-result-item sg-col-4-of-28 sg-col-4-of-16 AdHolder sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n    \n\n\n<div data-component-type="s-impression-logger" 
data-component-props=\'{"percentageShownToFire":"50","batchable":true,"requiredElementSelector":".s-image","url":"https://www.amazon.com/gp/sponsored-products/logging/log-action.html?qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_mtf&amp;adId=200003291706211&amp;eventType=1&amp;adIndex=2"}\' class="rush-component s-expand-height">\n    \n\n\n<div data-component-type="sp-sponsored-result" class="rush-component s-expand-height">\n    \n\n\n\n\n\n\n\n\n<div class="s-expand-height s-include-content-margin s-border-bottom">\n<div class="a-section a-spacing-medium">\n\n\n<div class="sg-row">\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        <div class="a-section a-spacing-micro s-min-height-extra-large">\n            \n        </div>\n    </div></div>\n</div>\n\n<div class="sg-row">\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        <div class="a-section a-spacing-none">\n            \n\n\n\n\n\n<span data-component-type="s-product-image" class="rush-component">\n    \n    <a class="a-link-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_mtf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0238051Q7V9JY6EMHYF&amp;url=%2FBraun-Electric-Shaver-9290cc-Travel%2Fdp%2FB01M716CC2%2Fref%3Dsr_1_19_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-19-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_mtf">\n        <div class="a-section aok-relative s-image-square-aspect">\n            \n                \n                \n                    <img src="https://m.media-amazon.com/images/I/81CmsUO2IzL._AC_UL320_.jpg" class="s-image" alt="Braun Series 9 9290cc Electric Razor for Men, Rechargeable and Cordless Electric Shaver, Foil Shaver, Silver, with Clean&amp;Charge Station and Travel Case" 
srcset="https://m.media-amazon.com/images/I/81CmsUO2IzL._AC_UL320_.jpg 1x, https://m.media-amazon.com/images/I/81CmsUO2IzL._AC_UL480_QL65_.jpg 1.5x, https://m.media-amazon.com/images/I/81CmsUO2IzL._AC_UL640_QL65_.jpg 2x, https://m.media-amazon.com/images/I/81CmsUO2IzL._AC_UL800_QL65_.jpg 2.5x, https://m.media-amazon.com/images/I/81CmsUO2IzL._AC_UL960_QL65_.jpg 3x" data-image-index="19" data-image-load="" data-image-latency="s-product-image" data-image-source-density="1">\n                \n            \n        </div>\n    </a>\n</span>\n\n        </div>\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        <div class="a-section a-spacing-none a-spacing-top-small">\n            <div class="a-row a-spacing-micro"><span class="a-size-base a-color-secondary">Sponsored</span>\n\n\n\n\n\n<span class="a-declarative" data-action="a-popover" data-a-popover=\'{"dataStrategy":"preload","name":"sp-info-popover-B01M716CC2","position":"triggerVertical"}\'>\n    \n        \n        \n            \n            <span class="aok-inline-block s-info-icon"></span>\n        \n        \n    \n</span>\n\n\n\n    <div class="a-popover-preload" id="a-popover-sp-info-popover-B01M716CC2">\n        <span>These are ads for products you\'ll find on Amazon.com. 
</span><div class="a-row"><span>Clicking an ad will take you to the product\'s page.</span>\n\n\n\n\n<a class="a-link-normal" href="https://advertising.amazon.com/products-self-serve?ref_=ext_amzn_wtsp">\n    \n        \n            \n                <span>Learn more about Sponsored Products.</span>\n            \n        \n        \n    \n</a>\n</div><div class="a-row a-spacing-top-small"><span></span>\n\n\n\n\n\n<span class="a-declarative" data-action="a-modal" data-a-modal=\'{"dataStrategy":"ajax","header":"Share your feedback","url":"/gp/sponsored-products/lazyLoad/handler/sp-feedback-handler.html?pl=0p%2BGEEYfSi3REr7K6Ac9Nqe9JjlS1kupaJ7MTBFbv7xen2TPg5Oc%2BN2Ae163fnU01qKZeDgPtGas%0A0Feh9ykvFdqBAZMnMfCHs4k%2Beht6HW%2FzanzhIRMebuDSFcmpXsxMIlEyihB6RIC2LnQNUhfy8i3x%0AWhSijAPnRknldtvl%2BiXb%2FomJhnuVZuVr0qxvvFXe4cjEudG1ABX946GnadxoboiHHfy9GwF6QF1b%0ATdmSjB%2BE7yy3HB3B6E9ImtbgoqBIk4aSkqRyXuahRoAp1brZO3Nn3qFPYXDIG2%2F%2BCDzJndYLL%2FCK%0AVvZ3R6lN42KA6oTI4CxoMs%2FmfiN7P85KWyTeS8YX6ICcjkaIjnvxjOCDp%2FX8%2FDrKYNWc8GrdY4Fb%0ABQSlas58beh5VfDSQ0Tiwe3TkLoIXzEGFfsIPEa2OP0AyJWut4dsSB%2FQ%2FHzx71c27lH0R6cIGGdV%0APA0iOLrWXAYZpM6VBSEcKJb4Zu%2F0bnzB%2Be6s9yF%2FduaUtfqsiXgXyypf3TA1wNADbj0mPJp1Fj5W%0AEdvQxCk8Z%2B6fIpGfW%2Bh6d%2BI4PP3QDZF6yjN5%2FFd1jf8O%2BXp5AefGs38%3D"}\'>\n    \n        \n        \n        \n            \n\n\n\n\n<a class="a-link-normal" href="#">\n    \n        \n            \n                <span></span>\n            \n        \n        \n    \n</a>\n\n        \n    \n</span>\n\n\n\n</div>\n    </div>\n\n</div>\n\n\n\n\n<h2 class="a-size-mini a-spacing-none a-color-base s-line-clamp-4">\n    \n    \n        \n\n\n\n\n<a class="a-link-normal a-text-normal" 
href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_mtf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0238051Q7V9JY6EMHYF&amp;url=%2FBraun-Electric-Shaver-9290cc-Travel%2Fdp%2FB01M716CC2%2Fref%3Dsr_1_19_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-19-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_mtf">\n    \n        \n            \n                <span class="a-size-base-plus a-color-base a-text-normal">Braun Series 9 9290cc Electric Razor for Men, Rechargeable and Cordless Electric Shaver, Foil Shaver, Silver, with Clean&amp;Charge Station and Travel Case</span>\n            \n        \n        \n    \n</a>\n\n    \n</h2>\n\n        </div>\n        \n            <div class="a-section a-spacing-none a-spacing-top-micro">\n                <div class="a-row a-size-small">\n\n\n<span aria-label="3.9 out of 5 stars">\n    \n\n\n\n\n\n\n    \n        <span class="a-declarative" data-action="a-popover" data-a-popover=\'{"max-width":"700","closeButton":false,"position":"triggerBottom","url":"/review/widgets/average-customer-review/popover/ref=acr_search__popover?ie=UTF8&amp;asin=B01M716CC2&amp;ref=acr_search__popover&amp;contextId=search"}\'>\n            \n            <a href="javascript:void(0)" class="a-popover-trigger a-declarative"><i class="a-icon a-icon-star-small a-star-small-4 aok-align-bottom"><span class="a-icon-alt">3.9 out of 5 stars</span></i><i class="a-icon a-icon-popover"></i></a>\n        </span>\n    \n    \n\n\n</span>\n\n\n\n<span aria-label="1,079">\n    \n\n\n\n\n<a class="a-link-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_mtf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0238051Q7V9JY6EMHYF&amp;url=%2FBraun-Electric-Shaver-9290cc-Travel%2Fdp%2FB01M716CC2%2Fref%3Dsr_1_19_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-19-spons%26psc%3D1&amp;qualifier=1563148367&amp;id=6965017493400297&amp;widgetName=sp_mtf#customerReviews">\n    \n        \n            \n                <span 
class="a-size-base">1,079</span>\n            \n        \n        \n    \n</a>\n\n</span>\n</div>\n            </div>\n        \n  </div></div>\n  <div class="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"><div class="sg-col-inner">\n        \n        \n            <div class="a-section a-spacing-none a-spacing-top-small">\n                <div class="a-row a-size-base a-color-base"><div class="a-row">\n\n\n\n\n<a class="a-size-base a-link-normal s-no-hover a-text-normal" href="/gp/slredirect/picassoRedirect.html/ref=pa_sp_mtf_aps_sr_pg1_1?ie=UTF8&amp;adId=A0238051Q7V9JY6EMHYF&amp;url=%2FBraun-Electric-Shaver-9290cc-Travel%2Fdp%2FB01M716CC2%2Fref%3Dsr_1_19_sspa%3Fkeywords%3Dshaver%26qid%3D1563148367%26s%3Dgateway%26sr%3D8-19-spons%26psc%3D1&amp;
    

    I'm not very happy with this approach, but I think it will work.

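    I wanted to ask whether you are using Crawlera? If so, are you currently seeing very slow crawl speeds on Amazon URLs? PS: check your response in the console to make sure you are actually receiving the sponsored tag.

    These ads are often served via JavaScript: they are typically loaded asynchronously or inserted after a delay. Your scraper may not be waiting long enough to get the data. Try adding an arbitrary timeout to your script, and make sure JavaScript is executed.

    @ThePyGuy Thanks! I'm not using Crawlera... I know it is possible to scrape them, but I can't get the whole integration to work (see example: )

    @BramVanroy Thanks. I will add some time.sleep() calls to the code to test. Is it wise to put that statement right after "for product in response.css('.s-result-item'):"? Also, I know it is possible to scrape it (see ). Thanks a lot for your help!

    Thanks @ThePyGuy. I added your solution to the post, but unfortunately all results are empty (which cannot be right, since the page shows real results). Did I integrate it correctly? Thanks a lot!

    Do everything in one loop. Take out the second for loop, set the item fields with result.css, and see if that works. It will step through each first div in the result list, check whether it is an ad holder, and set the field accordingly. Hopefully.

    Thanks! I updated the code (see the question, Edit 2), but unfortunately it does not give the expected result: 1) all results are False, and 2) we now get far too many results (about 3000 instead of 80-100). Is my thinking correct? Am I doing something wrong?

    Yes, that looks more correct... When I look up the divs, they show up in my console. Maybe for some reason they don't appear in the main list. Are the first two listings always sponsored?

    Yes, in both URLs the first two (data-index=0 and data-index=1) are always sponsored. Could you share your code? Is it exactly the same as what I posted in Edit 2? Thanks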
    for result in response.css('.s-result-list div'):
        if result.css('.AdHolder').extract_first():
            item['adholder'] = True
        else:
            item['adholder'] = False
        # ... rest of the item logic goes here (still inside the loop)