Python pandas - read HTML

python pandas

I am trying to convert a table into a pandas DataFrame. So far I have done the following:

import pandas as pd

url = 'http://www.scb.se/sv_/Hitta-statistik/Statistik-efter-amne/Befolkning/Befolkningens-sammansattning/Befolkningsstatistik/25788/25795/Helarsstatistik---Riket/26046/'

# read_html returns a list of DataFrames, one per table found on the page
df = pd.read_html(url, thousands=' ')
df2 = df[0]

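My problem is that pandas does not recognize that the first row (index 0) holds the column headers. I would also like the values of the År column to become the index.

Finally, I want to plot the Folkmängd column on the y-axis against the År column on the x-axis.

Thanks in advance.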

This should be close to what you want:

import pandas as pd
import matplotlib.pyplot as plt

plt.style.use('ggplot')

url = 'http://www.scb.se/sv_/Hitta-statistik/Statistik-efter-amne/Befolkning/Befolkningens-sammansattning/Befolkningsstatistik/25788/25795/Helarsstatistik---Riket/26046/'

# header=0 uses the first row as column names; index_col=0 makes År the index
table = pd.read_html(url, thousands=' ', header=0, index_col=0)[0]
table["Folkmängd"].plot(color='k')
plt.show()

This should give you a line plot of Folkmängd with År on the x-axis.
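To try the same read_html arguments without depending on the live page, here is a minimal offline sketch. The HTML snippet and its numbers are made up for illustration (they are not SCB statistics), and it assumes an HTML parser backend such as lxml is installed.

```python
from io import StringIO

import pandas as pd

# Made-up stand-in for the SCB page; the figures are illustrative only.
html = """
<table>
  <tr><th>År</th><th>Folkmängd</th></tr>
  <tr><td>2001</td><td>1 000</td></tr>
  <tr><td>2002</td><td>1 250</td></tr>
  <tr><td>2003</td><td>1 500</td></tr>
</table>
"""

# thousands=' ' treats the space as a digit-group separator,
# header=0 takes the first row as column names,
# index_col=0 turns the first column (År) into the index.
tables = pd.read_html(StringIO(html), thousands=' ', header=0, index_col=0)
table = tables[0]  # read_html always returns a list of DataFrames

print(table)
print(table.dtypes)
```

With the index set this way, table["Folkmängd"].plot() puts År on the x-axis without any further arguments.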

Use index_col=0 if you want År as the index: pd.read_html(url, thousands=' ', index_col=0)

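That does almost what I wanted with the table and the thousands separators. I also added header=0, i.e. df = pd.read_html(url, thousands=' ', index_col=0, header=0), which makes the table exactly what I want. Is there a way to tell pandas that the "År" column holds years rather than plain numbers?

You probably want pd.read_html(url, thousands=' ', index_col=0, header=0) if you want to use the column names. As for the years, I don't think it matters: you don't have dates, so you only have numbers.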

parse_dates=0 would turn each year into a date on 01-01 of that year, but I can't see any advantage.
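For anyone who does want the years treated as dates after the table has been read, one possible approach is to convert the integer index with pd.to_datetime. This is a sketch on a hypothetical frame with made-up values, shaped like the parsed table; it is a different technique from parse_dates=0 but ends up with the same 1 January timestamps.

```python
import pandas as pd

# Hypothetical frame shaped like the parsed table; the values are made up.
table = pd.DataFrame(
    {"Folkmängd": [1000, 1250, 1500]},
    index=pd.Index([2001, 2002, 2003], name="År"),
)

# Interpret each integer year as a timestamp at 1 January of that year.
as_dates = table.copy()
as_dates.index = pd.to_datetime(as_dates.index.astype(str), format="%Y")

print(as_dates)
```

Whether this is worth doing depends on the use case: a datetime index mainly helps with date-aware plotting or resampling, and plain integer years plot just as well.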