Python: How to read a URL with a ".data" suffix
I am trying to read the data at this URL into a DataFrame. I tried the following:
import pandas as pd
park_df = pd.read_html('https://archive.ics.uci.edu/ml/machine-learning-databases/parkinsons/parkinsons.data', header=0, flavor='bs4')
But I get the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-18-804373f977ab> in <module>()
----> 1 park_df = pd.read_html('https://archive.ics.uci.edu/ml/machine-
learning-databases/parkinsons/parkinsons.data', header=0, flavor='bs4')
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\html.py in
read_html(io, match, flavor, header, index_col, skiprows, attrs,
parse_dates, tupleize_cols, thousands, encoding, decimal, converters,
na_values, keep_default_na, displayed_only)
985 decimal=decimal, converters=converters,
na_values=na_values,
986 keep_default_na=keep_default_na,
--> 987 displayed_only=displayed_only)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\html.py in
_parse(flavor, io, match, attrs, encoding, displayed_only, **kwargs)
813 break
814 else:
--> 815 raise_with_traceback(retained)
816
817 ret = []
~\AppData\Local\Continuum\anaconda3\lib\site-
packages\pandas\compat\__init__.py in raise_with_traceback(exc, traceback)
402 if traceback == Ellipsis:
403 _, _, traceback = sys.exc_info()
--> 404 raise exc.with_traceback(traceback)
405 else:
406 # this version of raise is a syntax error in Python 3
ValueError: No tables found
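Can you tell me what I am doing wrong, and whether there is a better alternative? Please open the URL to see what the data looks like: the header is in the first row (it contains the column names), and the data follows.

pd.read_html converts HTML tables into pandas DataFrames, and this file contains no HTML table, which is why it raises "No tables found". The file is in CSV format, so use pd.read_csv instead: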
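import pandas as pd

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/parkinsons/parkinsons.data'
# The file is comma-separated with the column names in the first row,
# so the read_csv defaults (sep=',', header=0) are enough.
df = pd.read_csv(url)
print(df.head())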
name MDVP:Fo(Hz) MDVP:Fhi(Hz) MDVP:Flo(Hz) MDVP:Jitter(%) \
0 phon_R01_S01_1 119.992 157.302 74.997 0.00784
1 phon_R01_S01_2 122.400 148.650 113.819 0.00968
2 phon_R01_S01_3 116.682 131.111 111.555 0.01050
3 phon_R01_S01_4 116.676 137.871 111.366 0.00997
4 phon_R01_S01_5 116.014 141.781 110.655 0.01284
MDVP:Jitter(Abs) MDVP:RAP MDVP:PPQ Jitter:DDP MDVP:Shimmer ... \
0 0.00007 0.00370 0.00554 0.01109 0.04374 ...
1 0.00008 0.00465 0.00696 0.01394 0.06134 ...
2 0.00009 0.00544 0.00781 0.01633 0.05233 ...
3 0.00009 0.00502 0.00698 0.01505 0.05492 ...
4 0.00011 0.00655 0.00908 0.01966 0.06425 ...
Shimmer:DDA NHR HNR status RPDE DFA spread1 \
0 0.06545 0.02211 21.033 1 0.414783 0.815285 -4.813031
1 0.09403 0.01929 19.085 1 0.458359 0.819521 -4.075192
2 0.08270 0.01309 20.651 1 0.429895 0.825288 -4.443179
3 0.08771 0.01353 20.644 1 0.434969 0.819235 -4.117501
4 0.10470 0.01767 19.649 1 0.417356 0.823484 -3.747787
spread2 D2 PPE
0 0.266482 2.301442 0.284654
1 0.335590 2.486855 0.368674
2 0.311173 2.342259 0.332634
3 0.334147 2.405554 0.368975
4 0.234513 2.332180 0.410335
[5 rows x 24 columns]
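To check the same idea without a network connection, here is a minimal sketch that parses a small in-memory excerpt of the file. The helper name load_data_file is hypothetical, and the sample keeps only four of the 24 columns (values copied from the output above), so treat it as an illustration of the file layout rather than the full dataset.

```python
import io

import pandas as pd

# Excerpt of the file's layout: a header row, then comma-separated records.
# Only four of the 24 columns are kept here, purely for illustration.
SAMPLE = """name,MDVP:Fo(Hz),MDVP:Fhi(Hz),status
phon_R01_S01_1,119.992,157.302,1
phon_R01_S01_2,122.400,148.650,1
phon_R01_S01_3,116.682,131.111,1
"""


def load_data_file(source):
    """Read a comma-separated .data file (hypothetical helper name)."""
    # sep and header are spelled out for clarity; both are read_csv defaults.
    return pd.read_csv(source, sep=",", header=0)


# read_html looks for <table> elements; plain CSV text contains none.
has_html_table = "<table" in SAMPLE.lower()

df = load_data_file(io.StringIO(SAMPLE))
print("contains an HTML table:", has_html_table)
print(df.shape)
print(df.dtypes)
```

The file extension does not matter to read_csv. It only needs delimited text, so the same helper should accept the URL string from the answer in place of the StringIO object.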