Python: Tabula vs. Camelot for extracting tables from PDF

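I need to extract tables from PDFs. The tables can be of any type: multiple headers, vertical headers, horizontal headers, and so on.

I have implemented the basic use case with both libraries and found that tabula does a little better than camelot. It still cannot detect every table perfectly, and I am not sure whether it will work for all table types.

So I am seeking suggestions from experts who have implemented a similar use case.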

Sample PDF:

Tabula implementation:

import tabula

# read_pdf returns a list of pandas DataFrames, one per detected table
tab = tabula.read_pdf('pdfs/PDF1.pdf', pages='all')
for t in tab:
    print(t, "\n=========================\n")

Camelot implementation:

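import camelot

# split_text=True splits strings that span multiple cells
tables = camelot.read_pdf('pdfs/PDF1.pdf', pages='all', split_text=True)
for tabs in tables:
    # each Table exposes its contents as a pandas DataFrame via .df
    print(tabs.df, "\n=================================\n")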

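Please read:

Camelot's main advantage is that the library offers a rich set of parameters through which you can improve the extraction.

Obviously, applying these parameters takes some study and a number of attempts.

You can find a comparison of Camelot with other PDF table extraction libraries.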
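
As a rough illustration of the parameter tuning mentioned above, the sketch below runs Camelot with both of its flavors (`lattice` and `stream`) and keeps whichever result reports the higher mean accuracy. The helper names (`mean_accuracy`, `pick_flavor`, `extract_best`) and the parameter values (`line_scale=40`, `row_tol=10`) are assumptions chosen for illustration, not recommendations. Suitable values depend on the PDF.

```python
from statistics import mean


def mean_accuracy(reports):
    """Average the 'accuracy' field over a list of Camelot parsing reports."""
    return mean(r["accuracy"] for r in reports) if reports else 0.0


def pick_flavor(reports_by_flavor):
    """Return the flavor name whose tables have the highest mean accuracy."""
    return max(reports_by_flavor, key=lambda f: mean_accuracy(reports_by_flavor[f]))


def extract_best(path, pages="all"):
    """Try both Camelot flavors (hypothetical helper) and keep the better result."""
    import camelot  # imported here so the helpers above work without Camelot installed

    # Illustrative starting points only; tune them per document.
    attempts = {
        "lattice": {"line_scale": 40},  # pick up shorter ruling lines
        "stream": {"row_tol": 10},      # group vertically close text into one row
    }
    results = {
        flavor: camelot.read_pdf(path, pages=pages, flavor=flavor,
                                 split_text=True, **kwargs)
        for flavor, kwargs in attempts.items()
    }
    # Each Table carries a parsing_report dict that includes an 'accuracy' value.
    reports = {f: [t.parsing_report for t in tables] for f, tables in results.items()}
    return results[pick_flavor(reports)]
```

Calling `extract_best('pdfs/PDF1.pdf')` would return a Camelot table list whose items can be printed via `.df` as in the question. The accuracy figure is Camelot's own estimate, so a high score does not guarantee a correct table. Inspecting the output by eye is still advisable.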

"Still cannot detect every table perfectly": it is extremely unlikely that any software will detect all tables perfectly.