Python 2.7: parsing an iframe URL with Scrapy
Tags: python-2.7, iframe, scrapy, centos6.5


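I am parsing a website for links, and then trying to parse each linked page for its iframe src.

According to the debug output, the first link appears to be parsed correctly, but my output file contains no data.

Is it also possible to strip everything after the ? in the URL? That part looks like embedded iframe information.

I am running CentOS 6.5 with Python 2.7.5.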

scrapy runspider new.py -o videos.csv

import scrapy
from scrapy.http.request import Request

class PdgaSpider(scrapy.Spider):
    name = "pdgavideos"
    start_urls = ["http://www.pdga.com/videos/"]

    def parse(self, response):
        # Follow every link found in the second column of the listing table.
        for link in response.xpath('//td[2]/a/@href').extract():
            yield Request('http://www.pdga.com'+link, callback=self.parse_page, meta={'link':link})

    def parse_page(self, response):
        # Select <player> elements on the video page and emit one item each.
        for frame in response.xpath("//player").extract():
            yield {
                'link': response.urljoin(frame)
            }
Debug output

Expected output


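Scrapy does not scrape the content of iframes on its own, but you can still get at it. First extract the iframe URL from its src attribute, then yield a request for that URL and parse the response in a callback.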

urls = response.css('iframe::attr(src)').extract()
for url in urls:
    yield scrapy.Request(url....)
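
The question also asks about dropping everything after the ? in the URL. One way to do that, sketched with the standard library only: split the URL and rebuild it without the query string and fragment. The helper name `strip_query` and the example.com URL are hypothetical.

```python
try:
    from urllib.parse import urlsplit, urlunsplit  # Python 3
except ImportError:
    from urlparse import urlsplit, urlunsplit  # Python 2.7


def strip_query(url):
    """Return url without its query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Hypothetical embed URL, used only to illustrate the call.
print(strip_query("http://example.com/embed/abc123?rel=0&autoplay=1"))
# prints http://example.com/embed/abc123
```

The helper could be applied to each iframe src before it is yielded, so that the CSV holds only the bare embed URL.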