tabula-py cannot read a PDF file

java python

My code:

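import os

import tabula

# Build the path to the PDF that sits next to this script. os.path.join
# avoids the invalid '\A' escape sequence in a hand-written '\ALPINE_' literal.
dir_path = os.path.dirname(os.path.realpath(__file__))
file_path = os.path.join(dir_path, 'ALPINE_' + str(20191107) + '.pdf')
print(file_path)

# Read every table on every page. The relative file name is passed here,
# matching the call shown in the traceback.
df = tabula.read_pdf('ALPINE_20191107.pdf', multiple_tables=True, pages="all")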

Result:

runfile('C:/Users/Admin/Documents/lucas/testTabula.py.py', wdir='C:/Users/Admin/Documents/lucas')
Traceback (most recent call last):

  File "<ipython-input-29-a6b390aef3cf>", line 1, in <module>
    runfile('C:/Users/Admin/Documents/lucas/sem título0.py', wdir='C:/Users/Admin/Documents/lucas')

  File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)

  File "C:\ProgramData\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/Admin/Documents/lucas/sem título0.py", line 12, in <module>
    df = tabula.read_pdf('ALPINE_20191107.pdf',multiple_tables=True, pages="all")

  File "C:\ProgramData\Anaconda3\lib\site-packages\tabula\io.py", line 332, in read_pdf
    return _extract_from(raw_json, pandas_options)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tabula\io.py", line 664, in _extract_from
    df[c] = pd.to_numeric(df[c], errors="ignore")

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\numeric.py", line 138, in to_numeric
    raise TypeError("arg must be a list, tuple, 1-d array, or Series")

TypeError: arg must be a list, tuple, 1-d array, or Series

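The function does not seem to work. I could type the path in directly to make things simpler, but that does not work either. There may be something wrong with the PDF file, but I have seen the same script work with the same file in another environment.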

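I have added Java to both possible PATH settings ("C:\Program Files\Java\jre1.8.0_231\bin"), but it makes no difference: the error occurs whether or not the path is set. I also tried adding the JDK, and that did not fix it either.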
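One way to rule out the Java setup is to check what the Python process itself can see. The snippet below is a minimal sketch using only the standard library; the helper names `find_java` and `java_version_text` are made up for this illustration and are not part of tabula-py.

```python
import shutil
import subprocess


def find_java():
    """Return the full path of the `java` executable found on PATH, or None."""
    return shutil.which("java")


def java_version_text(java_path):
    """Return the version banner printed by `java -version`."""
    # `java -version` normally writes its banner to stderr, not stdout.
    result = subprocess.run(
        [java_path, "-version"], capture_output=True, text=True
    )
    return result.stderr.strip() or result.stdout.strip()


if __name__ == "__main__":
    java_path = find_java()
    if java_path is None:
        print("java was not found on PATH for this Python process")
    else:
        print(java_path)
        print(java_version_text(java_path))
```

If this prints a path and a version banner, Java is reachable from Python, which would be consistent with the observation that changing the PATH has no effect on the error.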

I noticed that the error mentions pandas, so it may be a conflict with my pandas version (the latest), but I am not sure.
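For reference, the same TypeError can be reproduced with pandas alone. The sketch below shows one way to trigger it: when two columns share a label, `df[label]` returns a DataFrame rather than a Series, and `pd.to_numeric` rejects 2-D input. Whether this PDF actually yields duplicate column names is an assumption, and the sample data is invented for the illustration.

```python
import pandas as pd

# Hypothetical table with a repeated header, as a PDF extractor might produce.
df = pd.DataFrame([["1", "2", "x"]], columns=["amount", "amount", "note"])

# Two columns share the label, so this selection is a DataFrame, not a Series.
selected = df["amount"]
print(type(selected).__name__)

try:
    pd.to_numeric(selected)
    error_message = None
except TypeError as exc:
    # pandas refuses 2-D input to to_numeric and raises TypeError.
    error_message = str(exc)

print(error_message)
```

This only shows that the message comes from passing something other than a 1-D object to `pd.to_numeric`; it does not establish which tabula-py or pandas version is at fault.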

Python is 3.7.4, and Java is the latest version available at the time of writing.

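I ran into the same problem. I was using the version installed with pip, which was tabula-py 2.0.0. I uninstalled it and installed from Anaconda with conda install -c conda-forge tabula-py, which currently gives tabula-py 1.4.1, and that solved the problem.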
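To confirm which versions are actually active after switching between pip and conda, the installed package metadata can be queried. This is a small sketch, assuming Python 3.8 or newer for `importlib.metadata` (the asker's Python 3.7.4 would need the `importlib_metadata` backport instead); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata  # standard library from Python 3.8


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names as published on PyPI and conda-forge.
    for name in ("tabula-py", "pandas"):
        print(name, installed_version(name))
```

Comparing this output between the environment where the script works and the one where it fails is a quick way to see whether the tabula-py or pandas versions differ.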

I hit the same problem when moving from a Windows environment to a Linux (Docker) environment, and I have not found a solution either.