Python Pandas read_html raises a TypeError

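I am using bs4 to parse an HTML page and extract a table (sample below), which I am trying to load into pandas. When I call
pddataframe = pd.read_html(LOTable, skiprows=2, flavor=['bs4'])
I get the error listed below, even though I can print the bs4 prettified table.

Any suggestions on how I can resolve this without having to get each td and read them one by one?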

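Sample table

Learning Outcomes
On successful completion of this module the learner will be able to:
LO1    Demonstrate an awareness of the important role of Financial Accounting information as an input into the decision making process.
LO2    Display an understanding of the fundamental accounting concepts, principles and conventions that underpin the preparation of Financial statements.
LO3    Understand the various formats in which information in relation to transactions or events is recorded and classified.
LO4    Apply a knowledge of accounting concepts, conventions and techniques such as double entry to the posting of recorded information to the T accounts in the Nominal Ledger.
LO5    Prepare and present the financial statements of a Sole Trader in prescribed format from a Trial Balance accompanied by notes with additional information.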
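
Error

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
 in ()
     10         # read table into pandas
     11         if first:
---> 12             pddataframe = pd.read_html(LOTable, skiprows=2, flavor=['bs4'])
     13             first = False
     14         dataframe

C:\Program Files\Anaconda3\envs\learningoutcouts\lib\site-packages\pandas\io\html.py in read_html(io, match, flavor, header, index_col, skiprows, attrs, parse_dates, tupleize_cols, thousands, encoding)
    872     _validate_header_arg(header)
    873     return _parse(flavor, io, match, header, index_col, skiprows,
--> 874                   parse_dates, tupleize_cols, thousands, attrs, encoding)

C:\Program Files\Anaconda3\envs\learningoutcouts\lib\site-packages\pandas\io\html.py in _parse(flavor, io, match, header, index_col, skiprows, parse_dates, tupleize_cols, thousands, attrs, encoding)
    734             break
    735     else:
--> 736         raise_with_traceback(retained)
    737
    738     ret = []

C:\Program Files\Anaconda3\envs\learningoutcouts\lib\site-packages\pandas\compat\__init__.py in raise_with_traceback(exc, traceback)
    331         if traceback == Ellipsis:
    332             _, _, traceback = sys.exc_info()
--> 333         raise exc.with_traceback(traceback)
    334     else:
    335         # this version of raise is a syntax error in Python 3

TypeError: 'NoneType' object is not callable
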
pandas can guess.

HTML = '''\
<table cellpadding="5" cellspacing="0" class="borders" width="100%">
    <tr>
     <th colspan="2">
      Learning Outcomes
     </th>


... omitting most of what you had here


      Prepare and present the financial statements of a Sole Trader  in prescribed format from a Trial Balance  accompanies by notes with additional information.
     </td>
    </tr>
   </table>'''

from io import StringIO
import pandas as pd

df = pd.read_html(StringIO(HTML))
print(df)

This exact code works for me:

import pandas as pd

htm = """<table cellpadding="5" cellspacing="0" class="borders" width="100%">
    <tr>
     <th colspan="2">
      Learning Outcomes
     </th>
    </tr>
    <tr>
     <td class="info" colspan="2">
      On successful completion of this module the learner will be able to:
     </td>
    </tr>
    <tr>
     <td style="width:10%;">
      LO1
     </td>
     <td>
      Demonstrate an awareness of the important role of Financial Accounting information as an input into the decision making process.
     </td>
    </tr>
    <tr>
     <td style="width:10%;">
      LO2
     </td>
     <td>
      Display an understanding of the fundamental accounting concepts, principles and conventions that underpin the preparation of Financial statements.
     </td>
    </tr>
    <tr>
     <td style="width:10%;">
      LO3
     </td>
     <td>
      Understand the various formats in which  information in relation to transactions or events is recorded and classified.
     </td>
    </tr>
    <tr>
     <td style="width:10%;">
      LO4
     </td>
     <td>
      Apply a knowledge of accounting concepts,conventions and techniques such as double entry to the  posting of  recorded information to the T accounts in the Nominal Ledger.
     </td>
    </tr>
    <tr>
     <td style="width:10%;">
      LO5
     </td>
     <td>
      Prepare and present the financial statements of a Sole Trader  in prescribed format from a Trial Balance  accompanies by notes with additional information.
     </td>
    </tr>
   </table> 
"""

pd.read_html(htm, skiprows=2, flavor='bs4')[0]

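Thanks for all the hints in the suggested answers and comments. My newbie mistake was that, after extracting the table with bs4, I had put it into a variable, so LOTable was a bs4 object rather than an HTML string. I was running
pd.read_html(LOTable, skiprows=2, flavor='bs4')
when I needed to run
pd.read_html(LOTable.prettify(), skiprows=2, flavor='bs4')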
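
One plausible reading of why the bs4 object fails while its prettified string works is sketched below in plain Python. It rests on two assumptions that the thread does not confirm: that a bs4 tag resolves unknown attribute names to a child-tag search (giving None when nothing matches), and that the reader checks its input for a `.read` method before treating it as text. `FakeTag` and `sketch_read` are hypothetical stand-ins written for this illustration, not bs4 or pandas code.

```python
from io import StringIO


class FakeTag:
    """Hypothetical stand-in for a parsed tag: unknown attribute names are
    treated as child-tag lookups, which yield None when nothing matches."""

    def __init__(self, markup):
        self.markup = markup

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("__"):
            raise AttributeError(name)
        return None  # "no child tag with that name"

    def __str__(self):
        return self.markup


def sketch_read(source):
    """Assumed, simplified input handling: file-like first, then plain text."""
    if hasattr(source, "read"):  # True for FakeTag, because the lookup returns None
        return source.read()     # None() raises TypeError
    if isinstance(source, str):
        return source
    raise TypeError("cannot read object of type %r" % type(source).__name__)


table = FakeTag("<table><tr><td>LO1</td></tr></table>")

try:
    sketch_read(table)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Converting the object to text first (and wrapping it in a file-like
# object, as the first answer does) takes the working path instead.
html_text = sketch_read(StringIO(str(table)))
print(html_text)
```

With the real objects this corresponds to handing pandas the markup text, for example LOTable.prettify() or str(LOTable), optionally wrapped in StringIO, instead of the bs4 object itself.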
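
flavor : str or None, container of strings
I tried it without flavor='bs4': pd.read_html(LOTable, skiprows=2) gives the same error. Right before trying to read it I can call print(LOTable.prettify()) and it outputs the table HTML, and then the error occurs.
@AKS I don't understand your comment. Could you explain a bit more, please?
Is the problem with your sample or with the real data?
Thanks for letting us know. I suspected as much. From my reading of the recent pandas docs, there is no need to pass flavor='bs4', because pandas will default to lxml, IIRC. That is why I suggested letting it do the parsing itself, or at least it handles the parsing without the user needing to do so.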

Output printed by the pd.read_html(StringIO(HTML)) example:
[                                                   0  \
0                                  Learning Outcomes   
1  On successful completion of this module the le...   
2                                                LO1   
3                                                LO2   
4                                                LO3   
5                                                LO4   
6                                                LO5   

                                                   1  
0                                                NaN  
1                                                NaN  
2  Demonstrate an awareness of the important role...  
3  Display an understanding of the fundamental ac...  
4  Understand the various formats in which inform...  
5  Apply a knowledge of accounting concepts,conve...  
6  Prepare and present the financial statements o...  ]