I want to click a link on a website using Scrapy (Python)

python selenium selenium-webdriver web-scraping scrapy

I want to click a link.

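The script below fetches the required items and keeps clicking the next-page link until there are no more pages. You can't use response.follow() here, because there is no URL to follow: the only way to reach the next page is to click the link.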

I used a hardcoded wait in the script, which is not recommended at all. You should replace it with an explicit wait.

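We can help you best if you describe where the script fails and the specific exception message you get when it fails. Also, please fix the formatting of the question. The text and code I see right now are a mess.

The goal is to call a JavaScript function. Basically, my pagination element loads the next page's data on click, and I want to load the elements through the Next button. Tell me how to call the JavaScript function, because my href contains a JavaScript function and I don't know how to call it.

Look at the title of your post. In any case, the script above also populates the next page's content by clicking the Next Page button.

This works fine, but I don't want the page to open in a browser. I only want the response to be updated so that I can scrape data from it. Please help.

Then use Chrome options to make it headless. If you don't want to use any browser simulator, your post becomes confusing, because you already launch one in the script above.

Thank you. I know you are giving me what I asked for, but sir, I want to go into each link that your script collects. I am just asking how I can move on to the nested pages.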

from selenium import webdriver
from selenium.webdriver.common.by import By

# Spider methods from the question
def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)
    self.driver = webdriver.Firefox('C:/Program Files (x86)/Mozilla Firefox')

def parse(self, response):
    for title in response.css('div.tabledivinlineblock a.tablelink50::attr(href)').extract():
        yield {'title': title,
               'response': response.url
               }

    # I want to click this <a> tag
    next_link = self.driver.find_element(By.XPATH, '//*[@id="maincontent_DataPager"]/a[last()]')

    # follow pagination links
    # for href in response.css('span#maincontent_DataPager a:last-child'):
    #     yield response.follow(href, self.parse)

    # extract_first() returns None when nothing matches, so check before calling strip()
    next_page = response.css('span#maincontent_DataPager a:last-child::attr(href)').extract_first()
    if next_page is not None:
        yield response.follow(next_page.strip(), callback=self.parse)

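import time

import scrapy
from selenium import webdriver
from selenium.webdriver.common.by import By


class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = [
        'http://ozhat-turkiye.com/en/brands/a',
    ]

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # One browser session is reused for every page
        self.driver = webdriver.Firefox()

    def parse(self, response):
        # Load the page in the browser so its JavaScript pager can be clicked
        self.driver.get(response.url)
        while True:
            # Hardcoded pause so the page content can finish loading
            time.sleep(5)
            for title in self.driver.find_elements(By.CSS_SELECTOR, 'div.tabledivinlineblock a.tablelink50'):
                yield {'title': title.text, 'response': response.url}

            try:
                # Click the last pager link to load the next page
                self.driver.find_element(By.CSS_SELECTOR, 'span#maincontent_DataPager a:last-child').click()
            except Exception:
                # No clickable pager link is left, so stop paginating
                break

    def closed(self, reason):
        # Shut the browser down when the spider finishes
        self.driver.quit()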