Converting HTML to Excel in Python


python html excel csv


I am trying to convert the table on the following site into an xls sheet:

Here is the code I came up with after some research:

from bs4 import BeautifulSoup
import pandas as pd
from urllib2 import urlopen
import requests
import csv

url='http://www.dekel.co.il/madad-lazarchan'
table = pd.read_html(requests.get(url).text, attrs={"class" : "medadimborder"})

print table

How can I get it to show the headers correctly and write the result to a csv or xls file?

If I add the following:

table.to_csv('test.csv')

then instead of the printed rows I get the following error:

'list' object has no attribute 'to_csv'

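Thanks in advance.

OK, based on the comments, maybe I should not be using pandas or read_html, since what I want is a table and not a list. I wrote the following code instead, but now the printed output contains delimiters and it looks like I have lost the header row. I am also still not sure how to export it to a csv file.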


from bs4 import BeautifulSoup
import urllib2
import csv

soup = BeautifulSoup(urllib2.urlopen('http://www.dekel.co.il/madad-lazarchan').read(), 'html')
data = []
table = soup.find("table", attrs={"class": "medadimborder"})
table_body = table.find('tbody')
rows = table_body.findAll('tr')
for row in rows:
    cols = row.findAll('td')
    cols = [ele.text.strip() for ele in cols]
    print cols

[u'01/16', u'130.7915', u'122.4640', u'117.9807', u'112.2557', u'105.8017', u'100.5720', u'98.6']
[u'12/15', u'131.4547', u'123.0850', u'118.5790', u'112.8249', u'106.3383', u'101.0820', u'99.1']
[u'11/15', u'131.5874', u'123.2092', u'118.6986', u'112.9387', u'106.4456', u'101.1840', u'99.2']
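One likely reason the header row disappears is that header cells are usually marked up as th rather than td, so findAll('td') returns an empty list for that row. The sketch below is only an illustration under that assumption: SAMPLE_HTML, its column names, and the helper names table_to_rows and rows_to_csv are made up here, and the real page may be structured differently. It uses Python 3 and the built-in html.parser backend.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup may differ.
SAMPLE_HTML = """
<table class="medadimborder">
  <tr><th>Month</th><th>Index A</th><th>Index B</th></tr>
  <tr><td>01/16</td><td>130.7915</td><td>98.6</td></tr>
  <tr><td>12/15</td><td>131.4547</td><td>99.1</td></tr>
</table>
"""


def table_to_rows(html, css_class):
    """Return every row of the first matching table as a list of strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", attrs={"class": css_class})
    rows = []
    for tr in table.find_all("tr"):
        # Match header cells (<th>) as well as data cells (<td>).
        cells = tr.find_all(["th", "td"])
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


def rows_to_csv(rows, fileobj):
    """Write the rows to an open text file object as comma-separated values."""
    csv.writer(fileobj).writerows(rows)


rows = table_to_rows(SAMPLE_HTML, "medadimborder")
buffer = io.StringIO()  # in-memory file, so the sketch touches no disk
rows_to_csv(rows, buffer)
csv_text = buffer.getvalue()
```

To write a real file, open it with open('test.csv', 'w', newline='') and pass that object to rows_to_csv instead of the in-memory buffer. The "delimiters" in the printed output are just the way Python prints a list; they do not end up in the csv file.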

You can work with Excel files using existing Python packages. Here is one.

Your "table" variable is not a pandas DataFrame but a list whose first and only element is a pandas DataFrame. Calling a pandas method on a Python list cannot work and raises AttributeError. Python's built-in type() and dir() reveal this:

>>> type(table)
<class 'list'>

>>> type(table[0])
<class 'pandas.core.frame.DataFrame'>

# no error 
>>> table[0].to_csv('test.csv')
>>> 

# replace the list with the DataFrame it contains
>>> table = table[0]
>>> table.to_csv('test.csv')
>>>
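The same behaviour can be reproduced without touching the network. This is a small sketch, not the asker's actual data: SAMPLE_HTML is invented for illustration, and it assumes pandas plus one of its HTML parser backends (lxml, or bs4 with html5lib) is installed. The string is wrapped in io.StringIO because recent pandas versions deprecate passing literal HTML directly.

```python
import io

import pandas as pd

# Hypothetical markup with two tables, just to show the return type.
SAMPLE_HTML = """
<table class="medadimborder">
  <tr><th>Month</th><th>Value</th></tr>
  <tr><td>01/16</td><td>98.6</td></tr>
</table>
<table class="other">
  <tr><th>x</th></tr>
  <tr><td>1</td></tr>
</table>
"""

# read_html collects every table it finds, so the result is always a list.
tables = pd.read_html(io.StringIO(SAMPLE_HTML))

# Filtering by attrs narrows the list down, but it is still a list.
filtered = pd.read_html(io.StringIO(SAMPLE_HTML), attrs={"class": "medadimborder"})

# Indexing into the list gives the DataFrame, which does have to_csv.
frame = filtered[0]
```

Here len(tables) is 2 and len(filtered) is 1, which is why the attrs filter alone does not make to_csv available: the list wrapper remains until you index into it.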


See the updated initial question, which now shows new code without pandas, because I want a table and not a list.

That link has openpyxl at the top of the list. If you want to get an HTML table into openpyxl easily, you can use tablepyxl.

pandas.read_html returns a list of DataFrames, not a single DataFrame. You have to pick the DataFrame you want out of the returned list by index (in this case, index 0):

# the result of read_html is now named 'tables', which is a list of DataFrames
tables = pd.read_html(requests.get(url).text, attrs={"class" : "medadimborder"})
# take the first element of the list 'tables' as the DataFrame 'table'
table = tables[0]
# write it out as csv
table.to_csv('test.csv')
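For the other half of the question (headers, and xls output), something along these lines may help. It is a sketch under assumptions: SAMPLE_HTML and the helper names read_first_table and export are hypothetical, header=0 assumes the first table row really holds the column names, and the Excel branch assumes an engine such as openpyxl is installed.

```python
import io

import pandas as pd

# Hypothetical stand-in for the page; the real table may be laid out differently.
SAMPLE_HTML = """
<table class="medadimborder">
  <tr><th>Month</th><th>Value</th></tr>
  <tr><td>01/16</td><td>98.6</td></tr>
  <tr><td>12/15</td><td>99.1</td></tr>
</table>
"""


def read_first_table(html, css_class):
    """Parse the first table with the given class, using row 0 as the header."""
    tables = pd.read_html(io.StringIO(html), attrs={"class": css_class}, header=0)
    return tables[0]


def export(frame, csv_target, xlsx_path=None):
    """Write the frame as csv and, if a path is given, as an Excel workbook."""
    # index=False leaves out the 0, 1, 2, ... row labels pandas adds.
    frame.to_csv(csv_target, index=False)
    if xlsx_path is not None:
        # Writing .xlsx needs an Excel engine such as openpyxl to be installed.
        frame.to_excel(xlsx_path, index=False)


frame = read_first_table(SAMPLE_HTML, "medadimborder")
buffer = io.StringIO()  # in-memory target; a file name works the same way
export(frame, buffer)
csv_text = buffer.getvalue()
```

Calling export(frame, 'test.csv', 'test.xlsx') would write both files. The xlsx format is used here rather than the legacy xls format because that is what openpyxl produces.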