Python: Unable to scrape the title of the main book along with the books customers viewed from a webpage

Tags: python, python-3.x, selenium, selenium-webdriver, web-scraping

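I have been trying to scrape the title of the book on the landing page of a webpage, as well as the titles of the books in its "Customers who viewed" carousel. To get the titles of all the books, you have to keep clicking the carousel's right-arrow button, as the screenshot in the original post showed.

I've tried: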

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

links = [
    "https://www.amazon.com/Keto-Meal-Prep-Cookbook-Beginners/dp/1673455980/",
    "https://www.amazon.com/Keto-Diet-Cookbook-Beginners-Recipes/dp/1792145454/"
]

def fetch_content(link):
    driver.get(link)
    title = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR,'h1#title > span#productTitle'))).text
    page_count = wait.until(EC.presence_of_element_located((By.XPATH,'//*[contains(@class,"a-carousel-header-row")][.//h2[contains(@class,"a-carousel-heading")][contains(.,"Customers who")]]//span[@class="a-carousel-page-max"]'))).text

    title_list = []
    for i in range(int(page_count)+1):
        wait.until(EC.presence_of_element_located((By.XPATH,'//*[contains(@class,"a-carousel-header-row")][.//h2[contains(@class,"a-carousel-heading")][contains(.,"Customers who")]]/following-sibling::*[contains(@class,"a-carousel-row")]//a[contains(@class,"a-carousel-goto-nextpage")]'))).click()
        for item in wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR,"li.a-carousel-card > a.a-link-normal > div[data-rows]"))):
            title_list.append(item.text)
    return title,title_list

if __name__ == '__main__':
    with webdriver.Chrome() as driver:
        wait = WebDriverWait(driver,15)
        for link in links:
            print(fetch_content(link))
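When I run the script above, I notice (if I manually scroll down a little while it is running) that it grabs the first couple of titles from the "Customers who viewed" container and then throws a stale element reference error pointing at title_list.append(item.text).

How can I scrape the title of the main book along with the books customers viewed from the webpage?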
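
A stale element reference usually means the element handle was obtained before the page re-rendered that part of the DOM, which is plausible here because the carousel swaps its cards after each click. One common workaround, sketched below, is to re-locate the elements and read their text again when that exception occurs. The names `read_texts` and `locate` are hypothetical, and the fallback exception class exists only so the helper can be imported where Selenium is not installed; this is an illustration, not a verified fix for this particular page.

```python
try:
    from selenium.common.exceptions import StaleElementReferenceException
except ImportError:
    # Assumption: stand-in class so the sketch imports without Selenium.
    class StaleElementReferenceException(Exception):
        """Placeholder for Selenium's exception of the same name."""


def read_texts(locate, attempts=3):
    """Return the .text of every element produced by locate().

    `locate` is a zero-argument callable that looks the elements up
    afresh each time it is called. If any element goes stale while its
    text is being read, the whole lookup is repeated, up to `attempts`
    times, after which the exception is re-raised.
    """
    for attempt in range(attempts):
        try:
            return [element.text for element in locate()]
        except StaleElementReferenceException:
            if attempt == attempts - 1:
                raise
```

With the script above, the lookup could be passed in as `lambda: driver.find_elements(By.CSS_SELECTOR, "li.a-carousel-card > a.a-link-normal > div[data-rows]")`, so that every retry works on freshly located elements rather than on handles from before the click.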


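I think you need to wait again after you click the arrow, and then check the elements.

I'm having a hard time figuring this out, but I do know that your a-autoid-13 and a-autoid-17 are clicking the arrows of the sponsored related items carousel. I don't think that is what you want to do. You want the right arrow of the "customers also viewed" carousel, don't you? That one is a-autoid-16. Beyond that, I'm confused.

Sorry, I misread your post. You want the "customers also bought" carousel, which is a-autoid-32.

Does this answer your question?

Thanks for the link, @baduker. I would like to stick with Selenium to solve this.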
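
The first comment's suggestion, waiting again after each click before reading the cards, can be sketched as a small paging loop. It reads the visible page first, clicks only between pages, and after every click polls until the visible titles differ from the previous page. The function names, the callbacks `read_titles` and `click_next`, and the timeout values are all hypothetical; the sketch also assumes that the visible titles actually change once the carousel has advanced, which has not been checked against the real page.

```python
import time


def wait_until_changed(read_state, previous, timeout=10.0, poll=0.25):
    """Poll read_state() until it returns something other than `previous`."""
    deadline = time.monotonic() + timeout
    while True:
        current = read_state()
        if current != previous:
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError("carousel did not change after the click")
        time.sleep(poll)


def collect_carousel_titles(read_titles, click_next, page_count):
    """Collect unique, non-empty titles across `page_count` carousel pages.

    read_titles: zero-argument callable returning the titles currently shown.
    click_next:  zero-argument callable that clicks the right-arrow button.
    """
    titles = []
    current = read_titles()
    for page in range(page_count):
        for title in current:
            if title and title not in titles:
                titles.append(title)
        if page < page_count - 1:  # no click needed after the last page
            click_next()
            current = wait_until_changed(read_titles, current)
    return titles
```

In the script above, `click_next` could wrap the existing `wait.until(...).click()` call on the next-page arrow, `read_titles` could return `[el.text for el in driver.find_elements(...)]` with the card selector from the question, and `page_count` would be `int(page_count)` from the `a-carousel-page-max` span. Clicking only between pages also avoids the extra clicks made by looping `int(page_count)+1` times.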