Python: writing to CSV after parsing with BeautifulSoup produces split values or an empty output file

python html csv beautifulsoup


Here is my code:

soup = BeautifulSoup(browser.page_source, 'html.parser')

table = soup.find('table', attrs={'class':'table_grid'})

with open('test1.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    for body in table.findAll('tr'):
        rows = body.getText()
        writer.writerow(rows)

After looking at similar questions here, I tried to solve this problem by changing the last line to writer.writerow([rows]). However, that produced a blank CSV file. Any idea what I am doing wrong?
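The "split values" symptom can be reproduced without any scraping: csv.writer.writerow iterates over whatever it is given, so a bare string ends up as one character per column. Below is a minimal sketch of that behaviour. The helper write_row, the in-memory buffer, and the sample text are illustrative assumptions, not part of the question's code.

```python
import csv
import io


def write_row(value):
    """Write a single row to an in-memory CSV and return the resulting text."""
    buffer = io.StringIO(newline='')
    csv.writer(buffer).writerow(value)
    return buffer.getvalue()


row_text = "Ann30"  # hypothetical text, standing in for body.getText()

# A bare string is iterated character by character: one column per character.
as_string = write_row(row_text)

# A one-element list keeps the whole text together in a single column.
as_list = write_row([row_text])

# A list of cell texts gives one column per cell, which is usually the goal.
as_cells = write_row(["Ann", "30"])

print(repr(as_string))
print(repr(as_list))
print(repr(as_cells))
```

So wrapping the row text in a list stops the character splitting, but it still puts the entire row into one column; getting one column per cell requires a list with one entry per cell.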

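I think you want to write each cell into its own column; the last code block on this page does that.

Note that I am using a shortcut here: table('tr') is a concise way of writing table.find_all('tr').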
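To illustrate the shortcut: in Beautiful Soup, calling a tag object like a function is equivalent to calling find_all on it. The sketch below checks this on a small made-up table; the HTML and the variable names are assumptions chosen for illustration only.

```python
from bs4 import BeautifulSoup

# Made-up table, standing in for the page the question scrapes.
html = """
<table class="table_grid">
  <tr><th>Name</th><th>Age</th></tr>
  <tr><td>Ann</td><td>30</td></tr>
</table>
"""

soup = BeautifulSoup(html, 'html.parser')
table = soup.find('table', attrs={'class': 'table_grid'})

rows_short = table('tr')           # calling the tag directly ...
rows_long = table.find_all('tr')   # ... returns the same rows as find_all

same_rows = rows_short == rows_long
print(same_rows, len(rows_short))
```

Both forms return the same result; the explicit find_all spelling may be easier to read for people new to the library.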

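Also, another way to dump an HTML table into CSV is to use the pandas library, in particular its methods for reading HTML tables and writing CSV files.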
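A rough sketch of the pandas route follows. It assumes the relevant calls are pandas.read_html and DataFrame.to_csv (my assumption; the text above does not name them), that pandas is installed together with an HTML parser it supports such as lxml, and it uses a made-up table in place of browser.page_source.

```python
import io

import pandas as pd

# Made-up table, standing in for browser.page_source.
html = """
<table class="table_grid">
  <tr><th>Name</th><th>Age</th></tr>
  <tr><td>Ann</td><td>30</td></tr>
</table>
"""

# read_html returns a list of DataFrames, one per <table> it finds.
tables = pd.read_html(io.StringIO(html))
df = tables[0]

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

To write a file instead, a path such as 'test1.csv' can be passed as the first argument to to_csv.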

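This definitely writes each cell into its own column, thanks. However, it now misses the headers that it was pulling before. Could I add: for row in table('th'): as well?

@nvachhan OK, one idea is to look for both td and th elements in each row; please see the updated answer.

The new answer works and makes sense, thanks. One last question: what is the purpose of (strip=True)? I will try pandas next time.

@nvachan strip=True helps remove the extra spaces and newlines around the text.

Thanks.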
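The two points from these comments can be seen on a small example: passing a list of tag names picks up both td and th cells, and strip=True trims the whitespace around each cell's text. This is only a sketch; the HTML below is made up, with deliberately messy whitespace.

```python
from bs4 import BeautifulSoup

# Made-up table with extra spaces and newlines inside the cells.
html = """
<table>
  <tr><th> Name
  </th><th> Age </th></tr>
  <tr><td>
    Ann
  </td><td> 30 </td></tr>
</table>
"""

soup = BeautifulSoup(html, 'html.parser')
header_row, data_row = soup('tr')

# Without strip=True the surrounding whitespace stays in the text.
raw = [cell.get_text() for cell in data_row(['td', 'th'])]

# With strip=True each cell's text is trimmed.
clean = [cell.get_text(strip=True) for cell in data_row(['td', 'th'])]

# Searching only for td finds nothing in the header row, which is why the
# headers went missing; searching for both tag names picks them up.
only_td = header_row('td')
headers = [cell.get_text(strip=True) for cell in header_row(['td', 'th'])]

print(raw)
print(clean)
print(headers)
```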


Change tried in the question, replacing the last line of the code above:

writer.writerow([rows])

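Code from the answer, writing each cell into its own column:

# Assumes `import csv` and the `table` object found with soup.find(...) above
with open('test1.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    # table('tr') is shorthand for table.find_all('tr')
    for row in table('tr'):
        # Collect both data (td) and header (th) cells so header rows are kept
        writer.writerow([cell.get_text(strip=True) for cell in row(['td', 'th'])])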