Python: How do I get Selenium data into Excel or CSV?
This is my full code. I want the output data, such as title and price, written to a CSV file, with each field in its own column of the CSV or Excel spreadsheet. My code visits the details page of each product and collects the necessary information, such as product name and price.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from bs4 import BeautifulSoup

# Argument for incognito Chrome
option = Options()
option.add_argument("--incognito")

browser = webdriver.Chrome(options=option)
browser.get("https://www.daraz.com.bd/consumer-electronics/?spm=a2a0e.pdp.breadcrumb.1.4d20110bzkC0bn")

# Wait up to 20 seconds for the page to load
timeout = 20
try:
    WebDriverWait(browser, timeout).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='c16H9d']")))
except TimeoutException:
    print("Timed out waiting for page to load")
    browser.quit()
    raise SystemExit(1)  # the browser is closed, so stop instead of continuing

# Get the link of each product
soup = BeautifulSoup(browser.page_source, "html.parser")
product_items = soup.find_all("div", attrs={"data-qa-locator": "product-item"})
for item in product_items:
    item_url = f"https:{item.find('a')['href']}"
    print(item_url)

    browser.get(item_url)
    # Scrape the details page information
    itm_soup = BeautifulSoup(browser.page_source, "html.parser")
    container_box = itm_soup.find_all("div", {"id": "container"})

    # Use the itm_soup to find details about the item from its url.
    for itm in container_box:
        product_title_element = itm.find("span", class_="pdp-mod-product-badge-title")
        product_title = product_title_element.get_text() if product_title_element else "No title"
        print(product_title)

# Close the browser once every product has been visited
browser.quit()
How do I get the product title into a CSV or Excel spreadsheet?

You can do this with the csv writer module:
from csv import writer

def AddToCSV(List):
    # Append one row to Output.csv; newline='' prevents blank lines between rows on Windows
    with open("Output.csv", "a+", newline='') as output_file:
        csv_writer = writer(output_file)
        csv_writer.writerow(List)

# This can be used within your for loop
row_list = [item_url, product_title, price]  # etc.
AddToCSV(row_list)
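To see how each value ends up in its own column, here is a minimal sketch of the same idea without the browser. The names `append_row` and `save_products`, and the `(item_url, product_title, price)` tuple layout, are assumptions made for illustration; in the real script the tuples would come from the scraping loop.

```python
import csv

def append_row(path, row):
    """Append a single row to the CSV file at `path`."""
    # newline="" lets the csv module control line endings itself
    with open(path, "a", newline="", encoding="utf-8") as output_file:
        csv.writer(output_file).writerow(row)

def save_products(path, products):
    """Write one CSV row per (item_url, product_title, price) tuple."""
    for item_url, product_title, price in products:
        # Each list element becomes one column in the spreadsheet
        append_row(path, [item_url, product_title, price])
```

Values that contain a comma, such as a title like "Speaker, Black", are quoted by the csv module automatically, so they still land in a single column when the file is opened in Excel.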
Thanks, Naveen, that worked. Please tell me how to specify column names. For example, for item_url I want the column name item_url, and for product_title I want the column name product_title.

You can use the same function to add the column names before the loop starts. Just pass a list such as AddToCSV(['item_url', 'title', etc.]). Note, however, that the file is opened in append mode, so every run of the script adds rows to it. On a second run the column names are written again, and if you reuse the same output file, the new header row ends up in the middle of the file.
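One way to avoid a repeated header row is to write the column names only when the output file does not exist yet or is still empty. The sketch below assumes the column names `item_url`, `product_title` and `price`; `append_record` is a hypothetical helper, not part of the original answer, and it swaps the plain writer for `csv.DictWriter` so that values are matched to columns by name.

```python
import csv
import os

# Assumed column names; adjust them to the fields you actually scrape
FIELDNAMES = ["item_url", "product_title", "price"]

def append_record(path, record):
    """Append `record` (a dict) to `path`, writing the header only once."""
    # A header is needed only if the file is missing or has no content yet
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as output_file:
        dict_writer = csv.DictWriter(output_file, fieldnames=FIELDNAMES)
        if needs_header:
            dict_writer.writeheader()
        dict_writer.writerow(record)
```

Inside the scraping loop this could be called as `append_record("Output.csv", {"item_url": item_url, "product_title": product_title, "price": price})`. Running the script again then appends new rows below the existing ones without inserting a second header.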