Python web scraping with Beautiful Soup is not working (tags: python, web-scraping)


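I am trying to scrape some data from the Walmart website for research.

I want to scrape all of the product categories. Each product category has this HTML container: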

  <div class="TempoCategoryTileV2-tile"><img alt="" aria-hidden="true" tabindex="-1" itemprop="image" src="//i5.walmartimages.com/dfw/4ff9c6c9-deda/k2-_c3162a27-dbb6-46df-8b9f-b5b52ea657b2.v1.jpg?odnWidth=168&amp;odnHeight=210&amp;odnBg=ffffff" class="TempoCategoryTileV2-tile-img display-block">
<div class="TempoCategoryTileV2-tile-content-one text-center">
    <div class="TempoCategoryTileV2-tile-linkText">
        <div style="overflow: hidden;">
            <div>Toyland</div>
        </div>
    </div>
</div><a class="TempoCategoryTileV2-tile-overlay" id="HomePage-contentZone12-FeaturedCategoriesCuratedV2-tileLink-1" aria-label="Toyland" href="/cp/toys/4171?povid=14503+%257C+contentZone12+%257C+2017-11-01+%257C+1+%257C+HP+FC+Toys" data-uid="zir3SFhh" tabindex="" data-tl-id="HomePage-contentZone12-FeaturedCategoriesCuratedV2-categoryTile-1-link" style="background-image: url(&quot;about:blank&quot;);"></a></div>
But when I run my code, all I get is this:

 "Relax we are getting the data..." 
 []

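For some reason it cannot get the content from the page. What am I doing wrong, and how can I fix it?

The items on that page are generated dynamically, so the HTML that a plain HTTP request returns does not contain them. You need a browser simulator (for example Selenium) to render the page before parsing it. Try this: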

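import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
Walmarthome = 'https://www.walmart.com/?povid=14503+%7C+contentZone1+%7C+2017-10-27+%7C+1+%7C+header+logo'
driver.get(Walmarthome)

# find_element_by_tag_name was removed in Selenium 4; find_element(By.TAG_NAME, ...) replaces it
page = driver.find_element(By.TAG_NAME, 'body')

# Scroll down a few times so more of the dynamically generated content is rendered
for i in range(3):
    page.send_keys(Keys.PAGE_DOWN)
    time.sleep(2)

# Parse the DOM as rendered by the browser, then close the browser
soup = BeautifulSoup(driver.page_source, "lxml")
driver.quit()

# Each category tile carries its title in the overlay link and its picture in the img tag
for item in soup.select(".TempoCategoryTileV2-tile"):
    title = item.select(".TempoCategoryTileV2-tile-overlay")[0]['aria-label']
    image = item.select("[itemprop='image']")[0]['src']
    print(title, image)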
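
To see why parsing the unrendered page gives an empty list, the parsing step can be separated from the browser step. The following is only a sketch under stated assumptions: the helper names `extract_tiles` and `fetch_rendered_html` are hypothetical, the `unrendered` string is a simplified stand-in for what the server sends before JavaScript runs, and the image URL in `rendered` is a placeholder. It also swaps the fixed `time.sleep` calls for Selenium's `WebDriverWait`, which may not be enough on its own if the tiles only appear after scrolling.

```python
from bs4 import BeautifulSoup

TILE_SELECTOR = ".TempoCategoryTileV2-tile"  # class taken from the question's markup


def extract_tiles(html):
    """Return (title, image_src) pairs for every category tile found in html."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for tile in soup.select(TILE_SELECTOR):
        overlay = tile.select_one(".TempoCategoryTileV2-tile-overlay")
        image = tile.select_one("[itemprop='image']")
        if overlay is None or image is None:
            continue  # skip incomplete tiles instead of raising IndexError
        results.append((overlay.get("aria-label"), image.get("src")))
    return results


def fetch_rendered_html(url, timeout=15):
    """Load url in Chrome and return the page source once a tile is present."""
    # Imported lazily so extract_tiles() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Wait until at least one tile exists in the DOM (raises on timeout).
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, TILE_SELECTOR))
        )
        return driver.page_source
    finally:
        driver.quit()


# Hypothetical, simplified markup as served before any JavaScript runs.
unrendered = "<html><body><div id='root'></div></body></html>"

# Markup after rendering, trimmed from the tile in the question (placeholder image URL).
rendered = """
<div class="TempoCategoryTileV2-tile">
  <img itemprop="image" src="//i5.walmartimages.com/example.jpg">
  <a class="TempoCategoryTileV2-tile-overlay" aria-label="Toyland" href="/cp/toys/4171"></a>
</div>
"""

print(extract_tiles(unrendered))  # []
print(extract_tiles(rendered))    # [('Toyland', '//i5.walmartimages.com/example.jpg')]
```

With a working Chrome driver, `extract_tiles(fetch_rendered_html(Walmarthome))` would combine the two steps. Keeping the parser a pure function makes it easy to check against saved HTML without opening a browser.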

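Thanks, this works, but it is getting the computers. I want to download the product categories from computers (video games, food, electronics).

See the edited code. If it serves your purpose, please accept it as the answer. Thanks.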