Python 3.x: Retrieving an element with a custom HTML attribute


python-3.x selenium xpath scrapy


I have a website on which I want to retrieve the login link using Selenium, Scrapy, and Python. For the relevant function, I have the following code:

def start_requests(self):
    self.driver = webdriver.Chrome(executable_path=os.path.join(os.getcwd(), "Drivers", "chromedriver.exe"))
    self.driver.get(self.initial_url)
    # access_page_wait is not defined in this snippet; presumably a WebDriverWait with a 15-second timeout
    test = access_page_wait.until(expected_conditions.visibility_of_element_located((By.CSS_SELECTOR, 'a[data-ui-test-class="linkCard_toegangscode"]')))
    if test.is_displayed():
        print("+1")
    else:
        print("-1")

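However, this does not seem to work: it just waits 15 seconds and then stops. It never reaches +1 or -1.

Now my question is: how do we point Selenium at the correct element? Using XPath, find_elements_by_xpath("//a[@data-ui-test-class='linkCard_toegangscode']") does not seem to work either.

Should I use a different selection method? If so, which one?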
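As a quick sanity check on the locator itself, the XPath predicate from the question can be tried against a static document. The sketch below is only an illustration: it uses the standard library's `xml.etree.ElementTree` (not Selenium), and `SNIPPET` is a made-up, well-formed stand-in for the page markup, not the real site.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page markup (not the real site).
SNIPPET = """
<div>
  <a href="/login" data-ui-test-class="linkCard_toegangscode">Log in</a>
  <a href="/help" data-ui-test-class="linkCard_help">Help</a>
</div>
"""

root = ET.fromstring(SNIPPET)

# The attribute name keeps its hyphens inside the XPath predicate.
XPATH = ".//a[@data-ui-test-class='linkCard_toegangscode']"
matches = root.findall(XPATH)

# Collect the href of every anchor that carries the custom attribute value.
hrefs = [a.get("href") for a in matches]
print(hrefs)
```

If the expression matches here but Selenium still times out, the selector is probably not the problem, and it is worth checking whether the element sits in a different document (for example an iframe) or is rendered late. In Selenium 4 the equivalent calls would be `driver.find_elements(By.XPATH, "//a[@data-ui-test-class='linkCard_toegangscode']")` or `driver.find_elements(By.CSS_SELECTOR, 'a[data-ui-test-class="linkCard_toegangscode"]')`.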

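This happens because the element sits inside a frame, which prevents you from accessing it directly. Switch to the iframe first, then access the element.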

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
import os

# Start Chrome with the chromedriver binary stored under ./Drivers
driver = webdriver.Chrome(executable_path=os.path.join(os.getcwd(), "Drivers", "chromedriver.exe"))
driver.get("https://www.kvk.nl/handelsregister/publicaties/")

# Switch into the first frame on the page before looking for the link
driver.switch_to.frame(0)

# Wait up to 10 seconds for the link to become visible
test = WebDriverWait(driver, 10).until(expected_conditions.visibility_of_element_located((By.CSS_SELECTOR, 'a[data-ui-test-class="linkCard_toegangscode"]')))
if test.is_displayed():
    print("+1")
else:
    print("-1")

Try the code above. It should print what you are looking for.
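When it is not known in advance which frame holds the element, one option is to check the top-level document and then each iframe in turn. The helper below is a minimal sketch of that idea, not part of the original answer: `find_in_any_frame` is a hypothetical name, it only looks one level deep, and it does no waiting, so content that loads late would still need a `WebDriverWait` as in the answer. It uses the literal string `"css selector"`, which is the value of Selenium's `By.CSS_SELECTOR`, so the function itself has no imports.

```python
def find_in_any_frame(driver, css_selector):
    """Return the first match in the top document or a top-level iframe, else None.

    Hypothetical helper: `driver` is expected to behave like a Selenium WebDriver.
    """
    by_css = "css selector"  # same string as selenium's By.CSS_SELECTOR

    # Look in the top-level document first.
    driver.switch_to.default_content()
    matches = driver.find_elements(by_css, css_selector)
    if matches:
        return matches[0]

    # Then look inside each iframe of the top-level document.
    for frame in driver.find_elements(by_css, "iframe"):
        driver.switch_to.frame(frame)
        matches = driver.find_elements(by_css, css_selector)
        if matches:
            return matches[0]  # the driver stays inside this frame
        driver.switch_to.default_content()

    return None
```

On success the driver is left inside the frame that contains the element, so the returned element can be used right away; call `driver.switch_to.default_content()` afterwards to return to the main page.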

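Did you try my answer?

Consider using a mix of Scrapy and Selenium to prevent other problems you may run into in the future.

Actually, I only want Selenium to log in and fetch the page after authentication. I then pass the login authentication headers/session to my Scrapy spider and continue scraping from there. I believe Scrapy will be a bit faster, since it does not need a browser.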
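The hand-off described in the last comment could look roughly like this. Selenium's `driver.get_cookies()` returns a list of dictionaries with `name` and `value` keys (among others), while Scrapy's `Request` accepts a plain `{name: value}` mapping through its `cookies` argument. The helper name `selenium_cookies_to_dict` and the sample cookie values are made up for illustration.

```python
def selenium_cookies_to_dict(selenium_cookies):
    """Collapse Selenium's list of cookie dicts into a {name: value} mapping."""
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


# Shape mirrors what driver.get_cookies() returns; the values are invented.
sample = [
    {"name": "sessionid", "value": "abc123", "domain": "example.com", "path": "/"},
    {"name": "csrftoken", "value": "xyz789", "domain": "example.com", "path": "/"},
]

cookies = selenium_cookies_to_dict(sample)
print(cookies)
```

Inside the spider, after Selenium has logged in, the mapping could be passed along with something like `scrapy.Request(url, cookies=selenium_cookies_to_dict(self.driver.get_cookies()), callback=self.parse)`. Whether cookies alone are enough depends on how the site authenticates; some sites also check headers or tokens.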