Python | Selenium | Keeps outputting the page source instead of what I see when inspecting elements manually

python selenium-webdriver xpath

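I want to print the content I see when I inspect the page manually. Instead, the output looks like the page source, because elements that are present when I inspect the page manually are missing from it.

I am trying to get the product names of the Bank of America credit cards.

I am using Selenium because the product names on the bankofamerica site are generated by JavaScript. Once I know I am parsing the right elements, I plan to search by class to find the card names and other related elements.

I believe the Firefox web driver is installed correctly, because the code opens a browser window pointing at the right page.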

import requests
from bs4 import BeautifulSoup
from selenium import webdriver

browser = webdriver.Firefox()
browser.get('https://www.bankofamerica.com/credit-cards/#filter')
html = browser.execute_script("return document.documentElement.outerHTML")

sel_soup = BeautifulSoup(html, 'html.parser')
print(sel_soup)

Try the code below, which uses Selenium on its own:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

browser = webdriver.Firefox()
browser.get('https://www.bankofamerica.com/credit-cards/#filter')
# Wait up to 5 seconds until an element matching `.small-12.medium-9.columns` is present in the DOM.
WebDriverWait(browser, 5).until(EC.presence_of_element_located((By.CSS_SELECTOR, '.small-12.medium-9.columns')))
# Get all elements with the matching classes.
# find_elements_by_css_selector was removed in recent Selenium 4 releases; find_elements(By..., ...) works in both 3 and 4.
creditCardOptions = browser.find_elements(By.CSS_SELECTOR, '.small-12.medium-9.columns')
# From here, add your own logic to iterate through the credit card options.
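
As a rough idea of what that iteration logic could look like, here is a minimal sketch. The helper names (`class_attr_to_css`, `element_texts`, `collect_card_texts`) are made up for illustration and are not part of Selenium, and the class string is the one from the question, which the site may have changed since. It assumes the card text is exposed through each element's `.text`.

```python
def class_attr_to_css(class_attr):
    """Turn a class attribute value such as 'a b c' into the CSS selector '.a.b.c'."""
    return "".join("." + name for name in class_attr.split())


def element_texts(elements):
    """Return the stripped, non-empty visible text of each element."""
    texts = []
    for element in elements:
        text = (element.text or "").strip()
        if text:
            texts.append(text)
    return texts


def collect_card_texts(driver, class_attr="small-12 medium-9 columns", timeout=10):
    """Wait for the matching elements to appear, then return their text (hypothetical helper)."""
    # Selenium is imported here so the two helpers above can be used without it installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    selector = class_attr_to_css(class_attr)
    # presence_of_all_elements_located returns the list of matching elements once any exist.
    elements = WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located((By.CSS_SELECTOR, selector))
    )
    return element_texts(elements)
```

With the `browser` object from the code above, calling `collect_card_texts(browser)` should return one string per matched block, which you can then print or split further. If the wait times out, Selenium raises `TimeoutException`, which usually means the selector no longer matches the page.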

Comments:

You are not filtering the parsed HTML with the XPath you need, which is why it shows the entire HTML.

@supputuri Thank you very much for your help. I apologize if I am still missing something, but at this stage I do expect it to output the entire HTML. The problem is that it does not output the same HTML I see when I inspect the page manually; it outputs the page's static source, without the dynamic elements generated by JavaScript. Does that help clarify?

Do you mean the metadata or dynamic data that gets loaded as you keep scrolling the page? I would suggest using Selenium on its own in your case to get the result you want, because JavaScript is loading the data and BS will get the HTML without the JS data.

The dynamic data that loads as you scroll the page. Specifically, if I inspect the page manually I can find class="small-12 medium-9 columns", which contains the information I need, but when I run the code I posted above, class="small-12 medium-9 columns" does not appear anywhere. Thanks :)

Ah, I just read the second part of your comment. I will work on replacing BeautifulSoup with something Selenium-related. Just need to figure out what that should be... :)

Just posted pseudocode, which should give you a good start.
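
Since the comments mention data that only loads as the page is scrolled, one common approach is to scroll to the bottom repeatedly until the page height stops growing, and only then look for the elements. The sketch below is an assumption about how that could be done, not something taken from the answer: `scroll_until_stable` and its parameters are hypothetical names, and it relies only on the driver's `execute_script` method.

```python
import time


def scroll_until_stable(driver, pause_seconds=1.0, max_rounds=10):
    """Scroll to the bottom until the document height stops changing.

    Returns the number of scroll rounds performed (hypothetical helper).
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for rounds in range(1, max_rounds + 1):
        # Jump to the bottom so that lazy-loaded content gets requested.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Give the page's JavaScript some time to append new content.
        time.sleep(pause_seconds)
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return rounds
        last_height = new_height
    return max_rounds
```

After `scroll_until_stable(browser)` returns, `browser.page_source` or `find_elements` should see whatever was loaded during scrolling. The fixed pause is a simplification; an explicit wait on a specific element is usually more reliable when you know what you are waiting for.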