Python: getting multiple image URLs per item in the shell, but only one URL per item is written to the CSV
Hello, I get multiple image URLs for each item in the Python shell, but only one image URL per item ends up in the CSV. Here is the output from my Python shell:
product_link: https://www.amazon.com/gp/slredirect/picassoRedirect.html/ref=pa_sp_btf_aps_sr_pg1_1?ie=UTF8&adId=A002532917E3JT34GS1DE&url=%2FWireless-Vssoplor-Portable-Computer-Computer-Black%2Fdp%2FB07RLYJJBX%2Fref%3Dsr_1_22_sspa%3Fcrid%3D22TI4BA3RLK5J%26dchild%3D1%26keywords%3Dwireless%2Bmouse%26qid%3D1599517835%26sprefix%3Dw%252Caps%252C528%26sr%3D8-22-spons%26psc%3D1&qualifier=1600050591&id=4126203954910776&widgetName=sp_btf
product_title: Wireless Mouse, Vssoplor 2.4G Slim Portable Computer Mice with Nano Receiver for Notebook, PC, Laptop, Computer-Black and Sapphire Blue
product_price: $10.99
product_rating: 2,262 ratings
image link:https://images-na.ssl-images-amazon.com/images/I/41oJQTxCbZL._AC_US40_.jpg
image link:https://images-na.ssl-images-amazon.com/images/I/4152DCmmGFL._AC_US40_.jpg
image link:https://images-na.ssl-images-amazon.com/images/I/41ayV4UraXL._AC_US40_.jpg
image link:https://images-na.ssl-images-amazon.com/images/I/310z8LQ%2BoYL._AC_US40_.jpg
image link:https://images-na.ssl-images-amazon.com/images/G/01/x-locale/common/transparent-pixel._V192234675_.gif
Here is my code:
import csv
import io

import requests
from bs4 import BeautifulSoup

# headers, proxies, and auth are defined elsewhere (not shown in the question)
for page_num in range(1):
    url = "https://www.amazon.com/s?k=wireless+mouse&page={}&crid=22TI4BA3RLK5J&qid=1599517835&sprefix=w%2Caps%2C528&ref=sr_pg_2".format(page_num)
    r = requests.get(url, headers=headers, proxies=proxies, auth=auth).text
    soup = BeautifulSoup(r, 'lxml')
    container = soup.find_all('h2', {'class': 'a-size-mini a-spacing-none a-color-base s-line-clamp-2'})
    for containers in container:
        product_link = f"https://www.amazon.com{containers.find('a')['href']}"
        print(f"page_number:{url}\n\nproduct_link:{product_link}")
        # scrape the details page of each product
        details_page = requests.get(product_link, headers=headers, proxies=proxies, auth=auth).text
        dpsoup = BeautifulSoup(details_page, 'lxml')
        title = dpsoup.find('span', id='productTitle')
        if title is not None:
            title = title.text.strip()
        else:
            title = None
        rating = dpsoup.find('span', id='acrCustomerReviewText')
        if rating is not None:
            rating = rating.text
        else:
            rating = None
        price = dpsoup.find('span', class_='a-size-mini twisterSwatchPrice')
        if price is not None:
            price = price.text
        else:
            price = None
        print(f'\nproduct_link: {product_link}\n\nproduct_title: {title}\n\nproduct_price: {price}\n\nproduct_rating: {rating}\n\n')
        # scrape the src of every gallery image
        for url in dpsoup.select('span.a-button-text > img')[3:10]:
            print(f"image link:{url['src']}")
        with io.open("amazon.csv", "a", encoding="utf-8") as f:
            writeFile = csv.writer(f)
            writeFile.writerow([url['src'], product_link, title, rating, price])
I also tried this, but it did not work:
# scrape the src of every gallery image
for url in dpsoup.select('span.a-button-text > img')[3:10]:
    with io.open("amazon.csv", "a", encoding="utf-8") as f:
        writeFile = csv.writer(f)
        writeFile.writerow([url['src'], product_link, title, rating, price])
How can I write all the image URLs for each item to the CSV? Could someone please help?
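One possible approach, offered as a sketch rather than a definitive answer: collect every image URL for a product into a list first, then write a single row per product with the URLs joined into one cell. The indentation of the code above was lost, so the cause is an assumption: if `writerow` runs after the image loop has finished, `url` holds only the last image, which would explain the single URL per item. The helper names `join_image_urls` and `append_product_row` and the `" | "` separator are hypothetical choices for illustration, not part of the original code.

```python
import csv


def join_image_urls(image_urls, separator=" | "):
    """Combine all image URLs of one product into a single CSV cell."""
    return separator.join(image_urls)


def append_product_row(csv_path, image_urls, product_link, title, rating, price):
    """Append exactly one row per product, with every image URL in the first cell."""
    # newline="" is what the csv module documentation recommends for files
    # passed to csv.writer, to avoid extra blank lines on some platforms.
    with open(csv_path, "a", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(
            [join_image_urls(image_urls), product_link, title, rating, price]
        )


# Intended use inside the product loop (names taken from the question):
#
#   image_urls = [img["src"] for img in
#                 dpsoup.select("span.a-button-text > img")[3:10]]
#   append_product_row("amazon.csv", image_urls,
#                      product_link, title, rating, price)
```

Using a separate loop variable such as `img` also avoids overwriting the outer `url` that holds the search page address. If one row per image is preferred instead, the `writerow` call could be moved inside the image loop, at the cost of repeating the product fields on each row.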