Selenium can't find elements in Python

python selenium-webdriver web-scraping web-crawler

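I wrote some code with Selenium to extract the number of rounds in a football league. As far as I can see, the elements are the same on every page, but for some reason the code works for some links and not for others.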

from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from time import sleep

def pack_links(l):

    options = Options()
    options.headless = True
    driver = webdriver.Chrome()
    driver.get(l)

    rnds = driver.find_element_by_id('showRound')
    a_ = rnds.find_elements_by_xpath(".//td[@class='lsm2']")
    #a_ = driver.find_elements_by_class_name('lsm2')

    knt = 0
    for _ in a_:
        knt = knt+1

    print(knt)

    sleep(2)
    driver.close()
    return None

link = 'http://info.nowgoal.com/en/League/34.html'
pack_links(link)

This is a link that works; it returns the number of td tags with class lsm2.

Also a picture of the source page.

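This one returns 0; for some reason it cannot find the tags with class lsm2. Also a picture of the fragment of interest.

Even when I try to find them directly with the commented-out line, it still returns 0. I would greatly appreciate any help.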

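As far as I can tell, the inner HTML of the td with the id "showRound" is dynamic: it is loaded by the showRound() JS function, which in turn is called on page load by a script inside the page's head tag. So in your case it seems the content simply does not have enough time to load. I tried to solve this in two ways: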

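A kludgy one: use driver.implicitly_wait(number_of_seconds_to_wait). I would also recommend using it instead of sleep() in the future. However, this solution is rather clumsy: it sets one global timeout that applies to every element lookup, rather than waiting for the specific result you care about.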

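A better one: we can wait for the first element with the "lsm2" class to load; if it has not appeared after a reasonable timeout, we stop waiting and raise an exception (thanks to Zeinab Abbasimazar for the answer). This can be done with expected_conditions and WebDriverWait: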

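You can experiment and adjust the timeout length to get the result you need. I would also suggest using len(a_) instead of iterating with a for loop, but that is up to you.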
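To make the role of the timeout concrete, here is a minimal pure-Python sketch of the polling loop that an explicit wait performs conceptually. The names wait_until and find_cells are hypothetical helpers invented for this illustration, not part of Selenium, and the "page" is simulated with a timer so the snippet runs without a browser.

```python
import time


def wait_until(condition, timeout=5.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if nothing truthy comes back within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # stop as soon as the result is available
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Simulated page: the "lsm2" cells only show up 0.3 seconds after loading starts.
start = time.monotonic()


def find_cells():
    return ["td.lsm2"] * 3 if time.monotonic() - start >= 0.3 else []


cells = wait_until(find_cells, timeout=2.0, poll_interval=0.1)
print(len(cells))
```

Unlike a fixed sleep(2), the loop returns as soon as the cells exist and only fails once the whole timeout has passed. With Selenium, the same idea looks like this: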

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException

def pack_links(l):
    options = webdriver.ChromeOptions()  # I would also suggest to use this instead of Options()
    options.add_argument("--headless")
    options.add_argument("--enable-javascript")  # To be on the safe side, although it seems to be enabled by default
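    # Assumes chromedriver can be found on PATH; older Selenium releases also
    # accepted the path to the chromedriver binary as the first argument.
    driver = webdriver.Chrome(options=options)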
    driver.get(l)
    rnds = driver.find_element(By.ID, 'showRound')

    # Up to this point your code is almost unchanged. Now wait for the first
    # td element with class lsm2 to load, with a maximum timeout of 5 seconds:

    try:
        WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.CLASS_NAME, "lsm2")))
        print("All necessary tables have been loaded successfully")
    except TimeoutException:
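        driver.quit()  # do not leave a Chrome process running on failure
        raise TimeoutException("Timeout error: no element with class 'lsm2' appeared within 5 seconds")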


    # Then we proceed in case of success:

    a_ = rnds.find_elements(By.XPATH, ".//td[@class='lsm2']")
    knt = 0
    for _ in a_:
        knt = knt+1

    print(knt)

    driver.implicitly_wait(2)  # Not sure if it is needed here anymore
    driver.close()
    driver.quit()  # I would also recommend to make sure you quit the driver not only close it if you don't want to kill numerous RAM-greedy Chrome processes by hand 
    return None
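
For reference, the len(a_) suggestion and the advice about always quitting the driver could be combined roughly as follows. This is only a sketch under a few assumptions: Selenium 4 with chromedriver resolvable from PATH, a hypothetical helper count_round_cells, and a timeout parameter that is not in the original code. The Selenium imports sit inside pack_links so that the counting helper can be tried without a browser.

```python
ROUND_CELL_XPATH = ".//td[@class='lsm2']"


def count_round_cells(container):
    """Count the td cells with class lsm2 under `container` using len()."""
    # "xpath" is the string value of Selenium's By.XPATH constant.
    return len(container.find_elements("xpath", ROUND_CELL_XPATH))


def pack_links(url, timeout=5):
    """Open `url` headlessly and return the number of round cells on the page."""
    # Imported here so count_round_cells stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until at least one lsm2 element exists, or raise TimeoutException.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CLASS_NAME, "lsm2"))
        )
        rnds = driver.find_element(By.ID, "showRound")
        return count_round_cells(rnds)
    finally:
        driver.quit()  # runs on success and on failure alike
```

Calling pack_links('http://info.nowgoal.com/en/League/34.html') would then return the count instead of printing it, and the try/finally makes sure no Chrome process is left behind if the wait times out.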