Selenium search for ID not always working?

In the past I often ran into problems when a website was "lazy loading". Searching for an ID like this helped:

element = driver.find_element_by_id ("analyst-estimate")
driver.execute_script ("arguments[0].scrollIntoView();", element)
Now I have found that this does not work for every site.

On the following site everything works fine:

import os
import sys
import time
from sys import platform

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

link = "https://www.gurufocus.com/stock/AAPL/summary"
options = Options()
options.add_argument('--headless')
options.add_experimental_option('excludeSwitches', ['enable-logging'])
# The chromedriver binary is expected next to the script
path = os.path.abspath(os.path.dirname(sys.argv[0]))
if platform == "win32": cd = '/chromedriver.exe'
elif platform == "linux": cd = '/chromedriver_linux'
elif platform == "darwin": cd = '/chromedriver'
driver = webdriver.Chrome(path + cd, options=options)
driver.get(link)  # Read link
time.sleep(2)  # Wait till the full site is loaded
element = driver.find_element_by_id("analyst-estimate")
driver.execute_script("arguments[0].scrollIntoView();", element)
time.sleep(1)
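But on another site (which also has an id) it does not work at all.

Why does this not work for the second site? It is exactly the same code, so why can't it find the id when it exists on the page?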

time.sleep() is not a reliable way to wait for a page to load. Switch to WebDriverWait. Also, the page does not seem to need 2 seconds to load.

wait = WebDriverWait(driver, 5)
wait.until(EC.presence_of_element_located((By.ID, "YDC-Col1")))
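To illustrate why an explicit wait is more robust than a fixed time.sleep(), here is a minimal pure-Python sketch of the polling idea behind such waits. The names wait_until, poll and WaitTimeout are made up for illustration. This is not Selenium's implementation, only the general pattern: re-check a condition at short intervals, return as soon as it holds, and give up after a timeout.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=5.0, poll=0.1):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value as soon as it appears, so a fast page costs almost
    no waiting; raises WaitTimeout once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll)
```

With a fixed sleep you always pay the full delay and still fail if the element shows up later. With polling you wait only as long as needed, up to the timeout.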
Another problem could be running headless without setting a window size:

options.add_argument('--headless')
options.add_argument("--window-size=1920,1080")
Imports:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC

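Before the page loads, an "accept cookies" pop-up is shown, and you have to click it first:

WebDriverWait(driver, 5).until(
    EC.presence_of_element_located((By.NAME, "agree"))).click()
WebDriverWait(driver, 5).until(
    EC.presence_of_element_located((By.ID, "YDC-Col1")))
Before testing in headless mode, run in non-headless mode first to see the actual behavior. If it fails only in headless mode, take a screenshot to see the state of the site at the time of the failure.

You can take the screenshot like this: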

try:

    link = "https://finance.yahoo.com/quote/MSFT/analysis?p=MSFT"
    options = ChromeOptions()
    options.add_argument('--headless')
    options.add_experimental_option('excludeSwitches', ['enable-logging'])

    driver = webdriver.Chrome(options=options)
    driver.get(link)  # Read link
    time.sleep(2)  # Wait till the full site is loaded


    element = driver.find_element_by_id("YDC-Col1")
    # element = driver.find_element_by_id ("Col2-4-QuoteModule-Proxy")
    # element = driver.find_element_by_id ("app")
    driver.execute_script("arguments[0].scrollIntoView();", element)
    time.sleep(1)

except Exception:
    # Selenium writes PNG data, so use a .png file name
    driver.get_screenshot_as_file("a.png")
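
The screenshot-on-failure idea can also be wrapped in a small helper. This is a sketch under assumptions: run_with_screenshot and its parameters are hypothetical names, and the driver is only assumed to expose get_screenshot_as_file(path), as the Selenium driver above does. The original exception is re-raised so the failure is not silently swallowed.

```python
def run_with_screenshot(driver, action, path="failure.png"):
    """Run action(driver); on any error save a screenshot, then re-raise.

    `driver` only needs a get_screenshot_as_file(path) method, and
    `action` is any callable that takes the driver (hypothetical helper).
    """
    try:
        return action(driver)
    except Exception:
        # Capture the page state at the moment of failure for inspection.
        driver.get_screenshot_as_file(path)
        raise
```

For example, the lookup above could be passed in as run_with_screenshot(driver, lambda d: d.find_element_by_id("YDC-Col1")), which leaves a screenshot behind only when the lookup fails.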

Tried it this way, but it still does not work. When using wait.until(EC.presence_of_element_located((By.ID, "YDC-Col1"))) I get this error: selenium.common.exceptions.TimeoutException: Message:
Works great, thanks for the explanation!
Please accept my answer :)
Done