Getting an empty array when the class contains a space - Python, Scrapy, Scrapy Shell


Python 2.7

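I want to get the background image URL and title of each news item, but whenever I try to get the image URL with XPath I always get an empty array.

Here is what I tried: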

scrapy shell http://www.wownews.tw/fashion/movie

Then

response.body

I can see the HTML data in the terminal. But when I type

response.xpath('//div[@class="text ng-scope"]')

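I get an empty array, even though I think it should work.

Is the problem caused by the class containing a space?

How can I fix it? Any help would be appreciated.

I also tried this command and still got an empty array: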

response.xpath('//div[contains(concat(" ", normalize-space(@class), " "), "text ng-scope")]')

This is all you need:

import json
import scrapy


class ListingSpider(scrapy.Spider):
    name = 'listing'

    start_urls = ['http://api.wownews.tw/f/pages/site/558fd617913b0c11001d003d?category=5590a6a3f0a8bf110060914d&children=true&limit=48&page=1']

    def parse(self, response):
        # The endpoint returns JSON; the list of entries is under 'results'.
        items = json.loads(response.body)['results']

        for item in items:
            yield item
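
To yield only the title and image URL instead of each raw entry, the parsing step might look like the sketch below. Only the top-level `results` key comes from the spider above; the field names `title` and `image`, the sample payload, and the helper name `extract_items` are assumptions for illustration, so the real key names need to be checked against the actual API response.

```python
import json

# Hypothetical payload: only the 'results' key is taken from the spider
# above; 'title' and 'image' are assumed field names.
SAMPLE_BODY = b'''
{"results": [
    {"title": "First story", "image": "http://example.com/a.jpg"},
    {"title": "Second story"}
]}
'''


def extract_items(body):
    """Return a list of {'title', 'image_url'} dicts from a JSON body."""
    if isinstance(body, bytes):
        body = body.decode('utf-8')
    results = json.loads(body).get('results', [])
    # .get() leaves a missing field as None instead of raising KeyError.
    return [
        {'title': entry.get('title'), 'image_url': entry.get('image')}
        for entry in results
    ]


for item in extract_items(SAMPLE_BODY):
    print(item)
```

Inside a spider, the same helper could be called as `extract_items(response.body)` and each returned dict yielded in turn.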


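I see ng, which may mean that this page uses JavaScript to load its data. Open the page in a browser with JavaScript turned off to see what Scrapy can see.

I don't see a tag with class "text ng-scope" in the HTML. Maybe you can see it in the JavaScript inside response.body. Some tags have class "text" (response.xpath('//div[contains(@class, "text")]')), but none have class "ng-scope" (response.xpath('//div[contains(@class, "ng-scope")]')). To me, ng-scope may not be a class but an attribute.

I tried turning JavaScript off and loading the page. The site gets stuck.

The site gets stuck because it cannot work without JavaScript. As far as I know, Scrapy does not use Selenium (you have to create the project with Selenium and add some code yourself), so it cannot get data that is created with JavaScript.

Thanks for your help. I found that I can get the data from their AJAX requests.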
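
The point that the space in the class value is not the culprit can be checked offline. The sketch below uses the standard library's `xml.etree.ElementTree` as a dependency-free stand-in for Scrapy's selectors, and two made-up snippets: one as a server might send it, and one as a browser might show it after the page's JavaScript has run. Neither snippet is copied from the real site, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up markup for illustration only, not copied from the real site.
RAW_HTML = '<html><body><div class="text">Title</div></body></html>'
RENDERED_HTML = (
    '<html><body><div class="text ng-scope">Title</div></body></html>'
)


def divs_with_exact_class(markup, value):
    """Match <div> elements whose class attribute equals `value` exactly."""
    root = ET.fromstring(markup)
    return root.findall(".//div[@class='%s']" % value)


def divs_with_class_token(markup, token):
    """Match <div> elements that list `token` among their classes."""
    root = ET.fromstring(markup)
    return [div for div in root.iter('div')
            if token in div.get('class', '').split()]


# An exact match works despite the space when the markup really has it.
print(len(divs_with_exact_class(RENDERED_HTML, 'text ng-scope')))  # 1
# The same query finds nothing in markup where JavaScript never ran.
print(len(divs_with_exact_class(RAW_HTML, 'text ng-scope')))       # 0
# Matching a single class token still finds the element.
print(len(divs_with_class_token(RAW_HTML, 'text')))                # 1
```

If the second case matches what the downloaded page looks like, the empty result comes from the class being absent in the downloaded HTML, not from the space in the attribute value.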