How do I extract a text element with Selenium in Python?

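I am using Selenium to scrape content from the App Store.

I am trying to extract the review text that begins "As subject matter experts, our team is very engaging".

I tried finding the elements by class name: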

review_ratings = driver.find_elements_by_class_name('we-truncate we-truncate--multi-line we-truncate--interactive ember-view we-customer-review__body')
review_ratingsList = []
for e in review_ratings:
    review_ratingsList.append(e.get_attribute('innerHTML'))
review_ratings

But it returns an empty list [].

Is there something wrong with the code, or is there a better solution? Any help is appreciated.

You can use WebDriverWait to wait until the element is visible and then read its text.

May I suggest combining Selenium with BeautifulSoup? Using webdriver:

from bs4 import BeautifulSoup
from selenium import webdriver
browser=webdriver.Chrome()
url = "https://apps.apple.com/us/app/bank-of-america-private-bank/id1096813830"
browser.get(url)
innerHTML = browser.execute_script("return document.body.innerHTML")

bs = BeautifulSoup(innerHTML, 'html.parser')

bs.blockquote.p.text
Output:

Out[22]: 'As subject matter experts, our team is very engaging and focused on our near and long term financial health!'

Let me know if anything needs explaining.
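To make clearer what `bs.blockquote.p.text` selects, here is a small sketch that runs on a static HTML string instead of a live page. `SAMPLE_HTML` is a hypothetical, simplified stand-in for the rendered review markup, not the real App Store page, and placing the class on the `<blockquote>` is an assumption.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the rendered review markup.
SAMPLE_HTML = """
<div>
  <blockquote class="we-customer-review__body"><p dir="ltr">First review text.</p></blockquote>
  <blockquote class="we-customer-review__body"><p dir="ltr">Second review text.</p></blockquote>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Attribute-style access returns only the first <blockquote> and its first <p>.
first_review = soup.blockquote.p.text

# select() takes a CSS selector and returns every matching element.
all_reviews = [p.get_text(strip=True) for p in soup.select(".we-customer-review__body p")]

print(first_review)
print(all_reviews)
```

Attribute-style access is convenient for a single value, while `select()` is likely the better fit when every review on the page is wanted.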

Use WebDriverWait to wait for the presence of all matching elements, with the following CSS selector:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://apps.apple.com/us/app/bank-of-america-private-bank/id1096813830")
# Wait up to 20 seconds for the review paragraphs to be present in the DOM
review_ratings = WebDriverWait(driver, 20).until(
    EC.presence_of_all_elements_located(
        (By.CSS_SELECTOR, '.we-customer-review__body p[dir="ltr"]')
    )
)
review_ratingsList = []
for e in review_ratings:
    review_ratingsList.append(e.get_attribute('innerHTML'))
print(review_ratingsList)
Using requests and BeautifulSoup:

Output:

Out[22]: 'As subject matter experts, our team is very engaging and focused on our near and long term financial health!'

@Juan C What does return document.body.innerHTML do? Thanks.

It lets you get the HTML of a dynamic web page after everything has loaded. If you tried to build the BeautifulSoup object from the original link, it would not contain the data from the JavaScript objects that have to load first, and so on. I don't have a software engineering background, so my wording may not be precise.

@ArthurMorgan If you find this to be the best answer, could you mark it as accepted so that your question does not show up on the "Unanswered questions" page?

I love this answer. If you have a list of URLs, this is the fastest way I can imagine to go to specific pages and scrape them. If you need to navigate between pages, this won't work.

Good to know. Thanks.

May I ask what .we-customer-review__body does? @KunduK

It is the class name. In a CSS selector, a class name is written with a leading dot, as in .classname.
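The last comment also suggests why the lookup in the question came back empty: a class-name lookup expects a single class, while the string passed there is a whole space-separated class attribute. As a sketch under that reading, the hypothetical helper `classes_to_css_selector` below (not part of Selenium) turns such an attribute value into a compound CSS selector. It assumes plain class names that need no CSS escaping.

```python
def classes_to_css_selector(class_attr: str) -> str:
    """Turn a space-separated class attribute into a compound CSS selector."""
    # Each class gets a leading dot; joining without spaces requires all of them
    # to be present on the same element.
    return "".join("." + name for name in class_attr.split())


selector = classes_to_css_selector("we-truncate we-customer-review__body")
print(selector)  # .we-truncate.we-customer-review__body
```

The resulting string can be passed to `driver.find_elements(By.CSS_SELECTOR, selector)`. In practice a single distinctive class such as `.we-customer-review__body` is often enough and less brittle than matching every class on the element.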