Python 3.x: "from .pdftypes import PDFObjectNotFound" raises ImportError: cannot import name 'PDFObjectNotFound'
I am trying to convert a PDF to text, but I have a problem with the PDFPage class. I have searched for a fix and found nothing; importing it gives me the error below. I also installed pdfminer.six for Python 3.5, but that did not solve it. Please help.

Code:
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from pdfminer.pdfpage import PDFPage
import io  # needed for io.StringIO() below
import os
import sys, getopt

# Converts a PDF and yields the text content of each page as a string
def extract_text_from_pdf(pdf_path):
    with open(pdf_path, 'rb') as fh:
        for page in PDFPage.get_pages(fh,
                                      caching=True,
                                      check_extractable=True):
            resource_manager = PDFResourceManager()
            fake_file_handle = io.StringIO()
            converter = TextConverter(resource_manager, fake_file_handle, codec='utf-8', laparams=LAParams())
            page_interpreter = PDFPageInterpreter(resource_manager, converter)
            page_interpreter.process_page(page)
            text = fake_file_handle.getvalue()
            yield text
            # close open handles
            converter.close()
            fake_file_handle.close()
Error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/system/anaconda3/lib/python3.6/site-packages/pdfminer/pdfpage.py", line 5, in <module>
    from .pdftypes import PDFObjectNotFound
ImportError: cannot import name 'PDFObjectNotFound'
Add the following line at the beginning of your code and give it a try:
from io import StringIO
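As a small illustration of the import suggested above: both spellings refer to the same standard-library class, but code that calls io.StringIO() needs "import io", while "from io import StringIO" only makes the bare name StringIO available. The variable names below are made up for the example.

```python
import io
from io import StringIO

# Both spellings name the same class from the standard library.
same_class = StringIO is io.StringIO

# An in-memory text buffer, used the way the question uses fake_file_handle.
buffer = io.StringIO()
buffer.write("text from one page")
page_text = buffer.getvalue()  # read the contents before closing
buffer.close()

print(same_class, page_text)
```

Note that this only addresses the missing io name; it does not touch the ImportError raised inside pdfminer/pdfpage.py.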
Uninstall pdfminer3k (if it is installed), then install pdfminer.six with the command below:
$ python -m pip install pdfminer.six
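To see whether the reinstall took effect, a rough diagnostic such as the one below may help. It is only a sketch: check_name is a hypothetical helper, not part of pdfminer. It also rests on an assumption that the thread does not confirm, namely that the error comes from a pdfminer/pdftypes.py belonging to a different distribution than pdfpage.py.

```python
import importlib


def check_name(module_name, attr_name):
    """Return (found, location).

    found is None when the module cannot be imported at all,
    otherwise True/False depending on whether attr_name exists in it.
    location is the file the module was loaded from, if known.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None, None
    return hasattr(module, attr_name), getattr(module, "__file__", None)


found, location = check_name("pdfminer.pdftypes", "PDFObjectNotFound")
if found is None:
    print("pdfminer.pdftypes could not be imported")
else:
    print("PDFObjectNotFound present:", found)
    print("loaded from:", location)
```

If the name is reported as missing, the printed path shows which installed copy of pdfminer Python is actually picking up, which is the directory to clean out before reinstalling.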
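Comments:

It gives me an error, a SyntaxError, @Neal Titus Thomas: File "", line 1, from io import String.IO ^ SyntaxError: invalid syntax

Sorry, it is StringIO, not String.IO.

My actual problem is with PDFPage (from pdfminer.pdfpage import PDFPage). It gives the error message I posted. What is StringIO needed for here?

I have looked at that, but it still does not work. It gives me the same error. I have been trying for three days and still have not got it working. Please help, @Neal Titus Thomas.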