Python/Scrapy: response.replace() method not working?
When I try to change response.url with response.replace before yielding a Request, I still get the same URL back. The syntax seems to be correct:
print(response.url)
response.replace(url='https://techcrunch.com/search/heartbleed#stq=heartbleed&stp=2')
print(response.url)
next = self.driver.find_element(By.XPATH,"//a[@class='page-link next']")
nextpage = next.get_attribute("href")
yield scrapy.Request(url=nextpage, dont_filter=False)
Notes:
1. I assign the URL twice (obviously not needed if this worked... grrr).
2. nextpage is exactly the same as the URL on line 2 of the code.
Output:
https://techcrunch.com/search/heartbleed
https://techcrunch.com/search/heartbleed
2017-06-15 15:09:55 [selenium.webdriver.remote.remote_connection] DEBUG: POST http://127.0.0.1:56740/wd/hub/session/e3ba0740-51cb-11e7-acb6-f1825cec3f42/element {"using": "xpath", "sessionId": "e3ba0740-51cb-11e7-acb6-f1825cec3f42", "value": "//a[@class='page-link next']"}
2017-06-15 15:09:55 [selenium.webdriver.remote.remote_connection] DEBUG: Finished Request
2017-06-15 15:09:55 [selenium.webdriver.remote.remote_connection] DEBUG: GET http://127.0.0.1:56740/wd/hub/session/e3ba0740-51cb-11e7-acb6-f1825cec3f42/element/:wdc:1497532195411/attribute/href {"sessionId": "e3ba0740-51cb-11e7-acb6-f1825cec3f42", "name": "href", "id": ":wdc:1497532195411"}
2017-06-15 15:09:55 [selenium.webdriver.remote.remote_connection] DEBUG: Finished Request
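I think this is why I can't get to the other links: the response always stays on the same page instead of following the new link.

I guess the replace method does not operate in place, but returns the result instead: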
replace([url, status, headers, body, request, flags, cls])
Returns a Response object with the same members, except for those members given new values by whichever keyword arguments are specified.
So I would try something like this:
new_response = response.replace(whatever=value)
Comment: Correct. In an interactive session, calling bla.replace(url=...) on its own returned a new object and left bla unchanged; bla only showed the new URL after reassigning it with bla = bla.replace(url=...).