Python: How to use selenium.PhantomJS() on a page that Scrapy has already downloaded


python selenium scrapy phantomjs


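Scrapy has already downloaded the page (it is available in response.body), but dr.get(response.url) downloads it a second time.

Is there a way to make Selenium use response.body directly?

How about saving the content of response.body to an HTML file and then loading that file, instead of doing the following: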

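from selenium import webdriver

def parseList(self, response):
    dr = webdriver.PhantomJS()
    dr.get(response.url)  # fetches the page again, even though Scrapy already has it
    pageSource = dr.page_source
    print(pageSource)
    dr.quit()  # shut down the PhantomJS process when done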

From the Scrapy documentation:

Regardless of the type of this argument, the final value stored will be a str (never unicode or None).

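Since you are using Scrapy, I assume you are using Selenium from Python. You can parse the response.body string with an HTML parsing library of your choice. What exactly do you mean by "make Selenium use response.body"?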
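As a rough illustration of the parsing route, here is a minimal sketch that pulls link targets out of the downloaded bytes using only the standard library's html.parser. The library the answer originally named did not survive on this page, so html.parser is a stand-in, and LinkCollector and extract_links are hypothetical names chosen for this example. Note that this only reads the static HTML and does not run any JavaScript.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag seen while parsing (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def extract_links(body: bytes, encoding: str = "utf-8") -> list:
    """Decode the downloaded bytes and return all link targets in document order."""
    parser = LinkCollector()
    # Assumption: the page is UTF-8; undecodable bytes are replaced, not raised.
    parser.feed(body.decode(encoding, errors="replace"))
    parser.close()
    return parser.links
```

Inside a Scrapy callback this could be called as extract_links(response.body).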

url = "file:///your/path/to/downloaded/file.html"
dr.get(url)
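The save-then-load idea above can be sketched roughly as follows: write response.body to a temporary .html file, build a file:// URL from its path, and hand that URL to the driver. The helper names body_to_file_url and render_downloaded_page are hypothetical, and dr is assumed to be an already created Selenium WebDriver; only dr.get() and dr.page_source are used.

```python
import tempfile
from pathlib import Path


def body_to_file_url(body: bytes) -> str:
    """Write the downloaded bytes to a temporary .html file and return its file:// URL."""
    # delete=False keeps the file on disk after the handle is closed,
    # so the browser can still open it; the caller should remove it later.
    with tempfile.NamedTemporaryFile(suffix=".html", delete=False) as f:
        f.write(body)
    return Path(f.name).resolve().as_uri()


def render_downloaded_page(dr, body: bytes) -> str:
    """Load the saved copy in the browser and return the rendered source.

    Assumption: `dr` is a Selenium WebDriver instance created elsewhere.
    """
    dr.get(body_to_file_url(body))
    return dr.page_source
```

In a Scrapy callback this might be called as render_downloaded_page(dr, response.body). One caveat: the page is now loaded from a file:// origin, so relative links and script requests resolve against the local path rather than the original site, and some JavaScript may behave differently than it would on response.url.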