Python: Why does Selenium only pick up the first 12 items?
I'm trying to build a web scraper for a website that copies a list of images and saves them in a directory. Everything seems to work, except that of the 800+ items I expect it to pick up, it only picks up 12. I tried using Selenium's implicit wait, but that doesn't seem to help. I want it to scrape every image on the page.
Here is my code:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
import shutil
import os
import requests

def spritescrape(driver):
    sprites_list = driver.find_elements_by_tag_name('img')
    sprite_srcs = [sprite.get_attribute('src') for sprite in sprites_list]
    return sprite_srcs

def download_images(srcs, dirname):
    for index, src in enumerate(srcs):
        response = requests.get(src, stream=True)
        save_image(response, dirname, index)
        del response

def save_image(image, dirname, suffix):
    with open('{dirname}/img_{suffix}.jpg'.format(dirname=dirname, suffix=suffix), 'wb') as out_file:
        shutil.copyfileobj(image.raw, out_file)

def make_dir(dirname):
    current_path = os.getcwd()
    path = os.path.join(current_path, dirname)
    if not os.path.exists(path):
        os.makedirs(path)

if __name__ == '__main__':
    chromeexe_path = r'C:\code\Learning Python\Scrapers\chromedriver.exe'
    driver = webdriver.Chrome(executable_path=chromeexe_path)
    driver.get(r'https://pokemondb.net/pokedex/national')
    driver.implicitly_wait(10)
    sprite_links = spritescrape(driver)
    dirname = 'sprites'
    make_dir(dirname)
    download_images(sprite_links, dirname)
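I've heard that some websites can be built in ways that prevent scraping, and I wonder whether this site is one of them. I'm very new to coding, so any help getting all of the images would be greatly appreciated.

When the page first opens, not all of the elements are loaded. They seem to load only as you scroll down the page. What I did in this case was scroll to the bottom of the page first and then look for the elements. That did what I needed: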
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
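You need to scroll the page to the bottom. However, if you jump straight to scrollHeight, you will again miss most of the elements. You need to use an infinite loop that scrolls down the page slowly and collects each element's src attribute as it goes, so that none are lost. With this approach I got 890 elements.
Try the code below: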
from selenium import webdriver
import time  # needed for time.sleep() below

driver = webdriver.Chrome()
driver.get("https://pokemondb.net/pokedex/national")
sprite_srcs = []
height = 1000
itemsnobefore = len(sprite_srcs)
while True:
    # Scroll down a little further on each pass
    driver.execute_script("window.scrollTo(0," + str(height) + ");")
    sprites_list = driver.find_elements_by_tag_name('img')
    for sprite in sprites_list:
        if sprite.get_attribute('src') not in sprite_srcs:
            sprite_srcs.append(sprite.get_attribute('src'))
    itemsnoafter = len(sprite_srcs)
    # Break the loop when a pass finds no new image src
    if itemsnobefore == itemsnoafter:
        break
    itemsnobefore = itemsnoafter
    height = height + 500
    time.sleep(0.25)
print(len(sprites_list))
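The same idea can be packaged as a reusable function. What follows is only a minimal sketch under stated assumptions, not the answerer's exact code: the name `collect_image_srcs` and its `step`, `pause`, and `max_scrolls` parameters are hypothetical, and the string "tag name" is the locator strategy that Selenium's `By.TAG_NAME` constant stands for. Unlike the loop above, it tracks seen values in a set and stops once the scroll offset has passed the page height rather than on the first pass that finds nothing new.

```python
import time


def collect_image_srcs(driver, step=500, pause=0.25, max_scrolls=1000):
    """Scroll down in small steps and collect every distinct <img> src seen.

    `driver` is assumed to be a Selenium WebDriver already on the target
    page. The function name, parameters, and defaults are illustrative.
    """
    seen = set()    # fast membership test for de-duplication
    ordered = []    # srcs in the order they were first seen
    offset = 0
    for _ in range(max_scrolls):  # hard cap so the loop always terminates
        driver.execute_script("window.scrollTo(0, arguments[0]);", offset)
        time.sleep(pause)  # give lazily loaded images time to appear
        for img in driver.find_elements("tag name", "img"):
            src = img.get_attribute("src")
            if src and src not in seen:
                seen.add(src)
                ordered.append(src)
        page_height = driver.execute_script("return document.body.scrollHeight;")
        if offset >= page_height:
            break  # the last scroll already reached the bottom
        offset += step
    return ordered
```

With a real browser session this would be called as `collect_image_srcs(driver)` right after `driver.get(...)`; the `step` and `pause` values likely need tuning for the page and connection speed.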
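The image elements on the website are loaded lazily, that is, only as they scroll into view. So, to extract the list of src attributes of the images, you have to scroll down to the end of the page, which you can do as follows:
- Code block: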
from selenium import webdriver
from selenium.common.exceptions import TimeoutException  # caught in the loop below
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
driver.get("https://pokemondb.net/pokedex/national")
myLength = len(WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//img[@class]"))))
while True:
    try:
        # Scroll one step, then wait until more images exist than before
        driver.execute_script("window.scrollBy(0,1500)", "")
        WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//img[@class]")))
        WebDriverWait(driver, 20).until(lambda driver: len(driver.find_elements_by_xpath("//img[@class]")) > myLength)
        elements = driver.find_elements_by_xpath("//img[@class]")
        myLength = len(elements)
    except TimeoutException:
        # No new images appeared within 20 seconds: the end of the page was reached
        break
print(myLength)
for element in elements:
    print(element.get_attribute("src"))
driver.quit()
- Console output:
890
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/bulbasaur.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/ivysaur.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/venusaur.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/charmander.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/charmeleon.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/charizard.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/squirtle.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/wartortle.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/blastoise.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/caterpie.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/metapod.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/butterfree.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/weedle.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/kakuna.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/beedrill.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/pidgey.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/pidgeotto.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/pidgeot.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/rattata.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/raticate.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/spearow.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/fearow.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/ekans.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/arbok.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/pikachu.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/raichu.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/sandshrew.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/sandslash.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/nidoran-f.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/nidorina.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/nidoqueen.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/nidoran-m.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/nidorino.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/nidoking.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/clefairy.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/clefable.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/vulpix.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/ninetales.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/jigglypuff.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/wigglytuff.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/zubat.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/golbat.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/oddish.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/gloom.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/vileplume.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/paras.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/parasect.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/venonat.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/venomoth.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/diglett.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/dugtrio.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/meowth.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/persian.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/psyduck.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/golduck.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/mankey.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/primeape.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/growlithe.png
https://img.pokemondb.net/sprites/omega-ruby-alpha-sapphire/dex/normal/arcanine.png
.
.
.
https://img.pokemondb.net/sprites/sword-shield/pixel/dreepy.png
https://img.pokemondb.net/sprites/sword-shield/pixel/drakloak.png
https://img.pokemondb.net/sprites/sword-shield/pixel/dragapult.png
https://img.pokemondb.net/sprites/sword-shield/pixel/zacian-crowned.png
https://img.pokemondb.net/sprites/sword-shield/pixel/zamazenta-crowned.png
https://img.pokemondb.net/sprites/sword-shield/pixel/eternatus.png
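The URLs in this output end in .png, whereas the save_image function in the question writes every file as img_<index>.jpg. As a small sketch, a file name could instead be derived from each URL. The helper `filename_from_url` and its `fallback` parameter are hypothetical names introduced here for illustration; only the standard library is used.

```python
import posixpath
from urllib.parse import urlsplit, unquote


def filename_from_url(url, fallback="image"):
    """Return the last path segment of `url`, or `fallback` if there is none.

    The query string and fragment are ignored, so only the path decides
    the file name. Both the function and `fallback` are illustrative.
    """
    path = unquote(urlsplit(url).path)  # decode %xx escapes in the path
    name = posixpath.basename(path)     # URL paths always use "/"
    return name or fallback
```

Note that sprites from different sets may share a base name, so a downloader built on this would probably still need to guard against overwriting files.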