Unable to scrape image URLs correctly using Selenium, BeautifulSoup and Python
The link I am scraping:
Using the code above, I scrape the image URLs from the link above.
Output:
21
['https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/crest_world_elite.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/indulge_credit_card.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/celesta-credit-card.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/indulge_credit_card.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/crest_world_elite.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/celesta-credit-card.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/th-pioneer-heritage-credit-card.png', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/pioneer_legacy_world_card.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/pinnacle_master-card.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/Legend_card-image_396x257px.png', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/nexxt-credit-card.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/platinum_select.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/payback-visa-credit-card.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/iconia_-card_american_express.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/iconia-visa-credit-card.jpg', 
'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/intermiles-card_Odyssey_amex_front.png', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/intermiles-odyssey-visa-credit-card.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/intermiles-card_Voyage_amex_front.png', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/intermiles-voyage-visa-credit-card.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/signature_card.jpg', 'https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/platinum_aura_master.jpg']
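The output is correct to some extent, but many image URLs in the list are duplicated, and the links do not match the cards on the website. There are 22 cards in total, so I expect the output list to contain 22 image URLs in the same order as on the website.
I would like to know an alternative code that fixes all of these problems in the output.
Any help is much appreciated.
The duplicated image source URLs come from the recommendations section. So you need to skip the first three items, which leaves all image links in the correct order. Here is how: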
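from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

options = Options()
options.add_argument("--headless")  # run Chrome without a visible window
driver = webdriver.Chrome(options=options)
driver.get("https://www.indusind.com/in/en/personal/cards/credit-card.html")
# find_elements(By.CSS_SELECTOR, ...) replaces the older find_elements_by_css_selector
elements = driver.find_elements(By.CSS_SELECTOR, '.chkboxcard .cat-card-header img')
# The first three matches come from the recommendations section, so skip them
cards = [element.get_attribute("src") for element in elements][3:]
print(len(cards))
for card in cards:
    print(card)
driver.quit()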
Output:
22
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/indulge_credit_card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/crest_world_elite.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/celesta-credit-card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/th-pioneer-heritage-credit-card.png
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/pioneer_legacy_world_card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/pinnacle_master-card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/Legend_card-image_396x257px.png
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/nexxt-credit-card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/product/creditcardimages/Platinum396x257px.png
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/platinum_select.jpg
https://www.indusind.com/content/dam/indusind-platform-images/product/creditcardimages/AuraEdge396x257px.png
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/payback-visa-credit-card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/iconia_-card_american_express.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/iconia-visa-credit-card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/intermiles-card_Odyssey_amex_front.png
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/intermiles-odyssey-visa-credit-card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/intermiles-card_Voyage_amex_front.png
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/intermiles-voyage-visa-credit-card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/signature_card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/platinum_aura_master.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/debitCard/duo-card_Visa_front.jpg
https://www.indusind.com/content/dam/indusind-platform-images/banner-images/iconia-amex-visa/card-stack.png
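The hard-coded `[3:]` slice above depends on there being exactly three recommendation images. As a hedged alternative, the sketch below drops earlier duplicates and keeps each URL at its last position, which gives the same result as the slice whenever the repeated items all sit in front of the main listing. The helper name `keep_last_occurrence` and the toy file names are illustrative assumptions, not part of the original answer.

```python
def keep_last_occurrence(urls):
    """Drop earlier duplicates, keeping each URL at its last position."""
    # dict.fromkeys keeps the first occurrence in insertion order, so the list
    # is walked backwards to make the last copy on the page the one that is kept.
    deduped_reversed = list(dict.fromkeys(reversed(urls)))
    return deduped_reversed[::-1]


# Toy data: a "recommended" block (b, a) followed by the main listing (a, b, c).
scraped = ["b.jpg", "a.jpg", "a.jpg", "b.jpg", "c.png"]
print(keep_last_occurrence(scraped))  # ['a.jpg', 'b.jpg', 'c.png']
```

With the Selenium code above, this would be applied to the full list of `src` values in place of the `[3:]` slice.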
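The problem is that you are only searching for jpg, but there are also some png images.
In addition, there are three duplicates because of the three images in the main section. You can skip the first three (n) by slicing, based on the difference between the counts of the list and the set.
You can also use the selector rounded-1 w-100 to get the images directly and eliminate the if condition: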
import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

IndusInd_url = "https://www.indusind.com/in/en/personal/cards/credit-card.html"
# Selenium 4 style; on Selenium 3, pass executable_path=... to webdriver.Chrome instead
driver = webdriver.Chrome(service=Service("C:\\Users\\I346696\\Downloads\\chromedriver.exe"))
driver.get(IndusInd_url)
time.sleep(3)  # give the page time to render
soup = BeautifulSoup(driver.page_source, 'lxml')
img = []
# The class selector matches the card images directly, whatever their extension
for x in soup.find_all('img', class_='rounded-1 w-100'):
    img.append("https://www.indusind.com" + x.get('src'))
# The number of duplicates equals the number of leading items to skip
start = len(img) - len(set(img))
img = img[start:]
print(len(img))
for im in img:
    print(im)
driver.quit()
Output:
22
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/indulge_credit_card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/crest_world_elite.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/celesta-credit-card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/th-pioneer-heritage-credit-card.png
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/pioneer_legacy_world_card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/pinnacle_master-card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/Legend_card-image_396x257px.png
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/nexxt-credit-card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/product/creditcardimages/Platinum396x257px.png
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/platinum_select.jpg
https://www.indusind.com/content/dam/indusind-platform-images/product/creditcardimages/AuraEdge396x257px.png
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/payback-visa-credit-card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/iconia_-card_american_express.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/iconia-visa-credit-card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/intermiles-card_Odyssey_amex_front.png
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/intermiles-odyssey-visa-credit-card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/intermiles-card_Voyage_amex_front.png
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/intermiles-voyage-visa-credit-card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/signature_card.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/creditCard/platinum_aura_master.jpg
https://www.indusind.com/content/dam/indusind-platform-images/productCategory/desktopImage/debitCard/duo-card_Visa_front.jpg
https://www.indusind.com/content/dam/indusind-platform-images/banner-images/iconia-amex-visa/card-stack.png
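The two ideas in this answer (matching png as well as jpg, and skipping the duplicated leading items by comparing the list length with the set length) can be tried on toy data without a browser. This is only a sketch: the helper names, the extension tuple, and the sample paths are assumptions for illustration. Note that the length arithmetic only works when each repeated URL appears exactly twice and its extra copy sits at the front of the list.

```python
from urllib.parse import urlparse

# Assumed set of extensions to accept; extend as needed.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def is_image_url(url):
    """Return True if the URL path ends with one of the accepted extensions."""
    # str.endswith accepts a tuple, so jpg and png are checked in one call.
    path = urlparse(url).path.lower()
    return path.endswith(IMAGE_EXTENSIONS)


def skip_duplicated_prefix(urls):
    """Skip n leading items, where n = len(list) - len(set)."""
    n = len(urls) - len(set(urls))
    return urls[n:]


# Toy data: two "recommended" images repeated later in the main listing.
scraped = ["/img/b.jpg", "/img/a.png", "/img/a.png", "/img/b.jpg",
           "/img/c.png", "/js/app.js"]
images = [u for u in scraped if is_image_url(u)]
print(skip_duplicated_prefix(images))  # ['/img/a.png', '/img/b.jpg', '/img/c.png']
```

Here the filtered list has 5 entries and 3 distinct values, so `n` is 2 and the first two entries are dropped.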
If you do not need to preserve the order, you can also convert the list to a set.
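To make the trade-off concrete, here is a small sketch on made-up file names: a `set` removes duplicates but gives no ordering guarantee, whereas `dict.fromkeys` removes duplicates and keeps each item at its first-seen position.

```python
# Toy data standing in for scraped image URLs.
urls = ["b.jpg", "a.jpg", "b.jpg", "c.png"]

# A set removes duplicates; its iteration order should not be relied on.
unique_unordered = set(urls)

# dict.fromkeys removes duplicates and keeps first-seen order.
unique_ordered = list(dict.fromkeys(urls))

print(sorted(unique_unordered))  # ['a.jpg', 'b.jpg', 'c.png']
print(unique_ordered)            # ['b.jpg', 'a.jpg', 'c.png']
```

For this page, first-seen order would follow the leading duplicated items rather than the main listing, so the slicing approach is still needed when the website order matters.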