Converting an HTML table to CSV in Python using BeautifulSoup


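I am trying to scrape table data from a website. The page lists a number of products; I have extracted the links to those products and want to visit each one, grab its table, and merge everything into one big table. The code below only returns the information for one product. Can someone help me extend it to all products? That would be a great help, thanks.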

import csv

import requests
import xlsxwriter
from bs4 import BeautifulSoup


def cpap_spider(max_pages):
    global row_i
    page = 1
    while page <= max_pages:
        url = "https://dmesupplyusa.com/bath-aids.html?p=" + str(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text, 'html.parser')
        # Every product on the listing page is linked through an <a class="product-item-link">.
        for link in soup.find_all("a", {"class": "product-item-link"}):
            href = link.get("href")
            title = link.string
            #worksheet.write(row_i, 0, title)
            each_item(href)
            print(href)
            #print(title)
        page += 1


def each_item(item_url):
    source_code = requests.get(item_url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, 'html.parser')

    table = soup.find("table", {"class": "data table additional-attributes"})
    rows = table.find_all("tr")

    # Note: "wt+" truncates editors.csv on every call, so only the last product's rows remain.
    with open("editors.csv", "wt+", newline="") as f:
        writer = csv.writer(f)
        for row in rows:
            csv_row = []
            for cell in row.find_all(["td", "th"]):
                csv_row.append(cell.get_text())
            writer.writerow(csv_row)


cpap_spider(1)
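
A likely reason only one product ends up in the file is that each_item reopens editors.csv in "wt+" mode on every call, and that mode truncates the file each time. The sketch below is one possible way around this, not the only one: it opens the CSV once and writes every product's rows through the same writer. The helper names (product_links, table_rows, scrape), the shortened class selector, the timeout value, and the extra product-URL column are assumptions made for illustration; the listing URL and class names are taken from the question.

```python
import csv

import requests
from bs4 import BeautifulSoup

# Listing URL taken from the question; the page number is appended to it.
BASE_URL = "https://dmesupplyusa.com/bath-aids.html?p="


def product_links(listing_html):
    """Return the href of every product link found in a listing page."""
    soup = BeautifulSoup(listing_html, "html.parser")
    return [a.get("href") for a in soup.find_all("a", class_="product-item-link")]


def table_rows(product_html):
    """Return the attribute table as a list of rows (lists of cell text)."""
    soup = BeautifulSoup(product_html, "html.parser")
    # Assumption: matching on the single class "additional-attributes" is enough.
    table = soup.find("table", class_="additional-attributes")
    if table is None:  # a product page may have no such table
        return []
    return [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in table.find_all("tr")
    ]


def scrape(max_pages, out_path="editors.csv"):
    """Crawl the listing pages and write every product's rows to one CSV."""
    # Open the file once, outside the loops, so earlier rows are not overwritten.
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for page in range(1, max_pages + 1):
            listing = requests.get(BASE_URL + str(page), timeout=30)
            for href in product_links(listing.text):
                product = requests.get(href, timeout=30)
                for row in table_rows(product.text):
                    # Prefix each row with the product URL so rows stay traceable.
                    writer.writerow([href] + row)


# scrape(1)  # uncomment to run against the live site
```

Because table_rows only looks inside the matched table element, the rest of the product page is ignored. Keeping the parsing in functions that take an HTML string also makes it possible to check them against a saved page without touching the network.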

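If you have a table, it is best to use […] and dump it to CSV using […].

Can I do this for multiple pages?

Yes, you can keep appending with […] while crawling.

But the page also contains a lot of other information. How do I make sure only the table is read?

read_html will only read …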