Python: unable to scrape the main book's title and the books customers also viewed from a web page
I've been trying to scrape the title of the book on the landing page as well as the titles of the books that customers also viewed. To get all of those titles, you have to keep clicking the right arrow button, as shown in the image.

I tried:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

links = [
    "https://www.amazon.com/Keto-Meal-Prep-Cookbook-Beginners/dp/1673455980/",
    "https://www.amazon.com/Keto-Diet-Cookbook-Beginners-Recipes/dp/1792145454/"
]

def fetch_content(link):
    driver.get(link)
    title = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR,'h1#title > span#productTitle'))).text
    page_count = wait.until(EC.presence_of_element_located((By.XPATH,'//*[contains(@class,"a-carousel-header-row")][.//h2[contains(@class,"a-carousel-heading")][contains(.,"Customers who")]]//span[@class="a-carousel-page-max"]'))).text
    title_list = []
    for i in range(int(page_count)+1):
        wait.until(EC.presence_of_element_located((By.XPATH,'//*[contains(@class,"a-carousel-header-row")][.//h2[contains(@class,"a-carousel-heading")][contains(.,"Customers who")]]/following-sibling::*[contains(@class,"a-carousel-row")]//a[contains(@class,"a-carousel-goto-nextpage")]'))).click()
        for item in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR,"li.a-carousel-card > a.a-link-normal > div[data-rows]"))):
            title_list.append(item.text)
    return title,title_list

if __name__ == '__main__':
    with webdriver.Chrome() as driver:
        wait = WebDriverWait(driver,15)
        for link in links:
            print(fetch_content(link))
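When I run the script above, I notice (if I manually scroll down a bit while it is running) that it grabs the first two titles from the "Customers who viewed" container and then throws a stale element reference error pointing at the line title_list.append(item.text).

How can I scrape the main book's title as well as the books customers also viewed from the web page?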
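I think you need to wait again after you click the arrow and then look the elements up again. I'm having a hard time figuring this out, but I do know that your a-autoid-13, a-autoid-17 is clicking the left arrow of the sponsored related items carousel. I don't think that's what you want to do. You want the right arrow of the "customers also viewed" carousel, don't you? That would be a-autoid-16. Beyond that, I'm confused.

Sorry, I misread your post. You want the "customers also bought" carousel, which is a-autoid-32.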
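Ids such as a-autoid-16 look auto-generated and differ from one carousel to the next, so one option is to scope the locator by the carousel's heading text instead, the way the XPath in the question already does. The following is only a sketch: the helper name `next_arrow_xpath` is hypothetical, and the class names are copied from the question's code, so they may not match the live page.

```python
def next_arrow_xpath(heading_text):
    """Build an XPath for the next-page arrow of the carousel
    whose heading contains heading_text (hypothetical helper)."""
    # Header row of the carousel, selected by its visible heading text.
    header = (
        '//*[contains(@class,"a-carousel-header-row")]'
        '[.//h2[contains(@class,"a-carousel-heading")]'
        f'[contains(.,"{heading_text}")]]'
    )
    # The arrow lives in the row that follows that header.
    return (
        header
        + '/following-sibling::*[contains(@class,"a-carousel-row")]'
        + '//a[contains(@class,"a-carousel-goto-nextpage")]'
    )


print(next_arrow_xpath("Customers who"))
```

Passing a different heading fragment would target a different carousel without depending on which a-autoid number the page happened to assign.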
Does this answer your question?

Thanks for the link, @baduker. I'd prefer to stick with Selenium to solve this.
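Staying with Selenium, here is a rough sketch of the "wait again after clicking" idea: read the current carousel page first, click the arrow only between pages, wait for the old cards to go stale, and look the cards up again rather than reusing old references. The names `read_with_retry`, `collect_titles` and `scrape_carousel` are hypothetical. The sketch also assumes that a click replaces the card elements, and that `cards_css` is scoped to the one carousel you care about. None of this has been checked against the live page.

```python
import time


def read_with_retry(read, retry_on, attempts=3, delay=0.5):
    """Call read() and retry when it raises retry_on (e.g. a stale element error)."""
    for attempt in range(attempts):
        try:
            return read()
        except retry_on:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)  # give the page time to finish re-rendering


def collect_titles(read_page, go_next, page_count, retry_on):
    """Read every carousel page, clicking "next" only between pages."""
    seen = []
    for page in range(page_count):
        for title in read_with_retry(read_page, retry_on):
            if title and title not in seen:  # skip blanks and repeats
                seen.append(title)
        if page < page_count - 1:
            go_next()
    return seen


def scrape_carousel(driver, wait, cards_css, next_xpath, page_count):
    """Selenium wiring with assumed locators; needs a live browser session."""
    from selenium.common.exceptions import StaleElementReferenceException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC

    def read_page():
        # Re-locate the cards on every call instead of reusing old references.
        cards = wait.until(
            EC.presence_of_all_elements_located((By.CSS_SELECTOR, cards_css))
        )
        return [card.text for card in cards]

    def go_next():
        old_card = driver.find_element(By.CSS_SELECTOR, cards_css)
        wait.until(EC.element_to_be_clickable((By.XPATH, next_xpath))).click()
        # Assumption: the click swaps out the cards, so the old one goes stale.
        wait.until(EC.staleness_of(old_card))

    return collect_titles(
        read_page, go_next, page_count, StaleElementReferenceException
    )
```

Inside `fetch_content`, `scrape_carousel(driver, wait, ..., int(page_count))` would take the place of the nested loop. Unlike the original loop, which clicks before reading and runs `int(page_count)+1` times, this reads the first page before any click and stops after the last page.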