Python: substitute text when an element can't be found
Sometimes, on the page I'm scraping, the "price" XPath can't be found. When the "price" XPath element is not found, I want to substitute the text "No pricing info available" instead of ending with an error. I'm sure it has something to do with try and except, but I'm not sure how to write it. Thanks. Updated in the last code block.
Tags: python, selenium
#finds titles
deal_title = browser.find_elements_by_xpath("//a[@id='dealTitle']/span")
titles = []
for title in deal_title:
    titles.append(title.text)
#finds links
deal_link = browser.find_elements_by_xpath("//div[@class='a-row dealDetailContainer']/div/a[@id='dealTitle']")
links = []
for link in deal_link:
    links.append(link.get_attribute('href'))
#finds images
deal_image = browser.find_elements_by_xpath("//a[@id='dealImage']/div/div/div/img")
images = []
for image in deal_image:
    images.append(image.get_attribute('src'))
#finds prices (if present)
deal_price = browser.find_elements_by_xpath("//div[@class='a-row priceBlock unitLineHeight']/span")
prices = []
for price in deal_price:
    prices.append(price.text)
#writes to html
for title, link, image, price in zip(titles, links, images, prices):
    f.write("<tr class='border'><td class='image'>" + "<img src=" + image + "></td>" + "<td class='title'><a href=" + link + ">" + title + "</a></td><td class='price'>" + price + "</td></tr>")
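As an aside on the f.write line: the sketch below builds the same row with quoted attributes and escaped values using the standard library's html.escape, so that titles or URLs containing &, < or quotes do not break the markup. The function name table_row is hypothetical and the sample values are made up.

```python
from html import escape


def table_row(title, link, image, price):
    """Build one table row in the same layout as the f.write call."""
    return (
        "<tr class='border'>"
        "<td class='image'><img src=\"{image}\"></td>"
        "<td class='title'><a href=\"{link}\">{title}</a></td>"
        "<td class='price'>{price}</td>"
        "</tr>"
    ).format(
        image=escape(image, quote=True),
        link=escape(link, quote=True),
        title=escape(title),
        price=escape(price),
    )


# Made-up sample values, not taken from the scraped page.
row = table_row(
    "Mugs & Cups",
    "https://example.com/deal?id=1&ref=x",
    "https://example.com/a.jpg",
    "$19.60",
)
print(row)
```

Inside the zip loop this could be used as f.write(table_row(title, link, image, price)).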
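Update: I changed my code to the following so that price gets a placeholder value instead of being skipped, since skipping it causes a mismatch between (titles, links, images, prices) when writing to the file. Any ideas on how to do this correctly? When a price is missing, the text "PRINT/WRITE THIS TEXT INSTEAD OF PASSING" should be written to the file.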
#finds titles
deal_title = browser.find_elements_by_xpath("//a[@id='dealTitle']/span")
titles = []
for title in deal_title:
    titles.append(title.text)
#finds links
deal_link = browser.find_elements_by_xpath("//div[@class='a-row dealDetailContainer']/div/a[@id='dealTitle']")
links = []
for link in deal_link:
    links.append(link.get_attribute('href'))
#finds images
deal_image = browser.find_elements_by_xpath("//a[@id='dealImage']/div/div/div/img")
images = []
for image in deal_image:
    images.append(image.get_attribute('src'))
try:
    deal_price = browser.find_elements_by_xpath("//div[@class='a-row priceBlock unitLineHeight']/span")
    prices = []
    for price in deal_price:
        prices.append(price.text)
except NoSuchElementException:
    price = ("PRINT/WRITE THIS TEXT INSTEAD OF PASSING")
#writes to html
for title, link, image, price in zip(titles, links, images, prices):
    f.write("<tr class='border'><td class='image'>" + "<img src=" + image + "></td>" + "<td class='title'><a href=" + link + ">" + title + "</a></td><td class='price'>" + price + "</td></tr>")
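The mismatch described in the update can be reproduced without a browser. The sketch below uses made-up sample lists (not data from the page) to show that zip pairs items purely by position and stops at the shortest list, so a single missing price shifts every later price up by one row and drops the last deal.

```python
# Made-up sample data: the second deal has no price on the page,
# so a page-wide price query returns only two values.
titles = ["Deal A", "Deal B", "Deal C"]
prices = ["$19.60", "$41.24"]  # these belong to Deal A and Deal C

# zip pairs by position and stops at the shortest list.
rows = list(zip(titles, prices))
print(rows)
```

Here "Deal B" ends up paired with the price that belongs to "Deal C", and "Deal C" is not written at all. A placeholder therefore has to be inserted at the position of the deal that lacks a price, not appended once at the end.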
Check the value of deal_price before trying to iterate over it:
deal_price = browser.find_elements_by_xpath("//div[@class='a-row priceBlock unitLineHeight']/span")
# did we find any prices?
if deal_price:
    prices = []
    for price in deal_price:
        prices.append(price.text)
else:
    # handle missing prices...
    pass
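find_elements_by_xpath returns an empty list when nothing matches rather than raising an exception, which is why a plain truthiness check works here. Below is a minimal sketch of one way the else branch could be filled in. The helper name texts_or_placeholder and the stand-in elements are hypothetical, and it only covers the case where no price is found anywhere on the page.

```python
from types import SimpleNamespace


def texts_or_placeholder(elements, placeholder="No pricing info available"):
    """Return the .text of each element, or a one-item list holding the
    placeholder when the list is empty."""
    if elements:
        return [element.text for element in elements]
    return [placeholder]


# Stand-ins for WebElement objects; only the .text attribute is used.
found = [SimpleNamespace(text="$19.60"), SimpleNamespace(text="$41.24")]
print(texts_or_placeholder(found))
print(texts_or_placeholder([]))
```

With a real browser, the call would presumably be texts_or_placeholder(deal_price).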
If you want to substitute the text Price Unavailable when a price can't be found, you can use the following solution:
- Code Block:
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument('disable-infobars')
browser = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
browser.get("https://www.amazon.com/gp/goldbox/ref=gbps_ftr_s-4_bedf_page_10?gb_f_deals1=enforcedCategories:2972638011,dealStates:AVAILABLE%252CWAITLIST%252CWAITLISTFULL,includedAccessTypes:,page:10,sortOrder:BY_SCORE,dealsPerPage:32&pf_rd_p=afc45143-5c9c-4b30-8d5c-d838e760bedf&pf_rd_s=slot-4&pf_rd_t=701&pf_rd_i=gb_main&pf_rd_m=ATVPDKIKX0DER&pf_rd_r=ZDV4YBQJFDVR3PAY4ZBS&ie=UTF8")
#finds images
deal_image = browser.find_elements_by_xpath("//div[@class='a-row dealContainer dealTile']//div[@class='a-row layer']/img")
images = []
prices = []
for image in deal_image:
    images.append(image.get_attribute('src'))
    #finds relevant prices
    try:
        deal_price = image.find_element_by_xpath("./following::div[2]//div[@class='a-row a-spacing-mini'][2]/div[@class='a-row priceBlock unitLineHeight']/span")
        prices.append(deal_price.get_attribute("innerHTML"))
    except NoSuchElementException:
        prices.append("Price Unavailable")
#print the information
for image, price in zip(images, prices):
    print(image, price)
- Console Output:
https://images-na.ssl-images-amazon.com/images/I/5102g01O1nL._AA210_.jpg $19.60
https://images-na.ssl-images-amazon.com/images/I/51N2rdMSh0L._AA210_.jpg $22.99 - $24.99
https://images-na.ssl-images-amazon.com/images/I/31XsztnbNGL._AA210_.jpg $123.20 - $133.63
https://images-na.ssl-images-amazon.com/images/I/31fNbfTW35L._AA210_.jpg $241.21 - $267.00
https://images-na.ssl-images-amazon.com/images/I/41fuZZwdruL._AA210_.jpg $41.24
https://images-na.ssl-images-amazon.com/images/I/51hC7rJT-VL._AA210_.jpg $39.95
https://images-na.ssl-images-amazon.com/images/I/41nziezVczL._AA210_.jpg Price Unavailable
https://images-na.ssl-images-amazon.com/images/I/41OeZ0KTE8L._AA210_.jpg Price Unavailable
https://images-na.ssl-images-amazon.com/images/I/41QVEcJWLeL._AA210_.jpg Price Unavailable
https://images-na.ssl-images-amazon.com/images/I/51XqV4DKV%2BL._AA210_.jpg $92.24
https://images-na.ssl-images-amazon.com/images/I/31-QDRkNbhL._AA210_.jpg $15.80
https://images-na.ssl-images-amazon.com/images/I/51MQu5v%2BQJL._AA210_.jpg $17.41
https://images-na.ssl-images-amazon.com/images/I/316KunRLRZL._AA210_.jpg Price Unavailable
https://images-na.ssl-images-amazon.com/images/I/512s5ZrjoFL._AA210_.jpg $51.99
https://images-na.ssl-images-amazon.com/images/I/51A8Nfvf8eL._AA210_.jpg $8.30
https://images-na.ssl-images-amazon.com/images/I/51aDac6YN5L._AA210_.jpg $18.53
https://images-na.ssl-images-amazon.com/images/I/418aN7ErLJL._AA210_.jpg $179.30
https://images-na.ssl-images-amazon.com/images/I/31SQON%2BiOBL._AA210_.jpg $9.75
https://images-na.ssl-images-amazon.com/images/I/519hxsMZlTL._AA210_.jpg $38.99
https://images-na.ssl-images-amazon.com/images/I/515keAWwYYL._AA210_.jpg $7.97
https://images-na.ssl-images-amazon.com/images/I/31aNppsuZEL._AA210_.jpg $9.49
https://images-na.ssl-images-amazon.com/images/I/4104Jm-d3IL._AA210_.jpg $15.35
https://images-na.ssl-images-amazon.com/images/I/31abuAZRuqL._AA210_.jpg Price Unavailable
https://images-na.ssl-images-amazon.com/images/I/51en-FhtpbL._AA210_.jpg Price Unavailable
https://images-na.ssl-images-amazon.com/images/I/41qLIdqYXjL._AA210_.jpg Price Unavailable
https://images-na.ssl-images-amazon.com/images/I/41bbblEeCWL._AA210_.jpg Price Unavailable
https://images-na.ssl-images-amazon.com/images/I/316fGtNzmZL._AA210_.jpg $12.71
https://images-na.ssl-images-amazon.com/images/I/51OtBDMnPtL._AA210_.jpg $55.86
https://images-na.ssl-images-amazon.com/images/I/51IwkmtXKhL._AA210_.jpg $22.53
https://images-na.ssl-images-amazon.com/images/I/41fsdPNN71L._AA210_.jpg $23.31
https://images-na.ssl-images-amazon.com/images/I/41vgtpZ5H3L._AA210_.jpg $21.01
https://images-na.ssl-images-amazon.com/images/I/41vXJ9fvcIL._AA210_.jpg $88.99
https://images-na.ssl-images-amazon.com/images/I/41qzYct%2BPNL._AA210_.jpg Price Unavailable
https://images-na.ssl-images-amazon.com/images/I/51K4hWy5wBL._AA210_.jpg Price Unavailable
https://images-na.ssl-images-amazon.com/images/I/51oxPj3Fz8L._AA210_.jpg Price Unavailable
https://images-na.ssl-images-amazon.com/images/I/51rCyK90S2L._AA210_.jpg Price Unavailable
- Link:
https://www.amazon.com/gp/goldbox/ref=gbps_ftr_s-4_bedf_page_10?gb_f_deals1=enforcedCategories:2972638011,dealStates:AVAILABLE%252CWAITLIST%252CWAITLISTFULL,includedAccessTypes:,page:10,sortOrder:BY_SCORE,dealsPerPage:32&pf_rd_p=afc45143-5c9c-4b30-8d5c-d838e760bedf&pf_rd_s=slot-4&pf_rd_t=701&pf_rd_i=gb_main&pf_rd_m=ATVPDKIKX0DER&pf_rd_r=ZDV4YBQJFDVR3PAY4ZBS&ie=UTF8
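This approach keeps images and prices aligned because each price is looked up relative to its own deal, so a missing price produces a placeholder in that deal's slot. The browser-free sketch below shows the same pattern. MissingElement, FakeDeal and collect_rows are hypothetical stand-ins for NoSuchElementException, a deal's WebElement and the scraping loop, and the sample values are made up.

```python
class MissingElement(Exception):
    """Hypothetical stand-in for selenium's NoSuchElementException."""


class FakeDeal:
    """Hypothetical stand-in for one deal tile on the page."""

    def __init__(self, image, price=None):
        self.image = image
        self._price = price

    def find_price(self):
        # Mimics a single-element lookup: raises when the element is absent.
        if self._price is None:
            raise MissingElement("price not found")
        return self._price


def collect_rows(deals, placeholder="Price Unavailable"):
    """Return one (image, price) pair per deal, in page order."""
    rows = []
    for deal in deals:
        try:
            price = deal.find_price()
        except MissingElement:
            price = placeholder
        rows.append((deal.image, price))
    return rows


deals = [FakeDeal("a.jpg", "$19.60"), FakeDeal("b.jpg"), FakeDeal("c.jpg", "$41.24")]
for image, price in collect_rows(deals):
    print(image, price)
```

Because every deal contributes exactly one row, no later zip over separately collected lists is needed, and the same loop could also collect the title and link for each deal.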
Thanks John. If it can't find a value for the "deal_price" XPath, the script errors out. I need to check whether deal_price wasn't found and do something else.