Python: parsing an HTML file with pandas to extract specific tables

Tags: python, python-3.x, pandas, dataframe

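I am using the following code to try to count the tables in an HTML file and read the first two of them. I am new to Python and do not know how to handle the error below.

I am trying to pick the tables out by their order, because the data file has no fixed format; I am hoping the tables I need will at least appear in the same order every time.

Code:

Error: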

C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\python.exe C:/Users/Ahmed_Abdelmuniem/PycharmProjects/PandaHTML/main.py
tables found: 201
Traceback (most recent call last):
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas\_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 4554, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 4562, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 0

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Ahmed_Abdelmuniem\PycharmProjects\PandaHTML\main.py", line 7, in <module>
    df1 = table[0]
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\frame.py", line 3023, in __getitem__
    return self._getitem_multilevel(key)
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\frame.py", line 3074, in _getitem_multilevel
    loc = self.columns.get_loc(key)
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\multi.py", line 2876, in get_loc
    loc = self._get_level_indexer(key, level=0)
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\multi.py", line 3158, in _get_level_indexer
    idx = self._get_loc_single_level_index(level_index, key)
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\multi.py", line 2809, in _get_loc_single_level_index
    return level_index.get_loc(key)
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\base.py", line 3082, in get_loc
    raise KeyError(key) from err
KeyError: 0

Process finished with exit code 1

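Remove [0] from table = pd.read_html(file)[0]
Thanks a million, that was it. May I ask what the trailing zero means, from a code point of view?
[0] means that you want to get the first element of a list. pd.read_html(file) returns a list of tables, so table = pd.read_html(file)[0] means that table is the first table found in the HTML document. Indexing that single table with [0] again, as in df1 = table[0], looks for a column labelled 0, which is what raises KeyError: 0.
I see, thank you very much.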
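
To make the difference concrete, here is a minimal sketch of list indexing versus DataFrame indexing. The question's original code is not shown, so this is not a reconstruction of it: the `tables` list is a hand-built stand-in for what `pd.read_html(file)` returns (one DataFrame per table), and the column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the list returned by pd.read_html(file):
# one DataFrame per <table> element in the document.
tables = [
    # Two-level column header, similar to what a table with a
    # multi-row header tends to produce.
    pd.DataFrame({("Summary", "Name"): ["a", "b"],
                  ("Summary", "Value"): [1, 2]}),
    pd.DataFrame({"Item": ["x"], "Count": [3]}),
]

print(f"tables found: {len(tables)}")

# Indexing the *list* with an integer selects a whole table by position.
df1 = tables[0]
df2 = tables[1]

# Indexing a *DataFrame* with [0] asks for a column labelled 0 instead.
# Neither table has such a column, so pandas raises KeyError.
try:
    df1[0]
except KeyError as exc:
    print(f"KeyError: {exc}")
```

With the real file, the same pattern would presumably be `tables = pd.read_html(file)` followed by `tables[0]` and `tables[1]`, keeping the full list around rather than taking `[0]` at the call.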