Python 3.x: How to fix "TypeError: int() argument must be a string, a bytes-like object or a number, not 'PSKeyword'"?

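I am trying to extract text from PDF files using pdfminer, and I run into this problem, but only with some files. The code works fine on some PDFs and returns this error message for others. Here is my code (copied from another thread on this forum):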

This is the error I get:

Traceback (most recent call last):
  File "pdf.py", line 28, in <module>
    print(extract_text_from_pdf('test.pdf'))
  File "pdf.py", line 13, in extract_text_from_pdf
    for page in PDFPage.get_pages(fh,
  File "C:\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pdfminer\pdfpage.py", line 129, in get_pages
    doc = PDFDocument(parser, password=password, caching=caching)
  File "C:\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pdfminer\pdfdocument.py", line 566, in __init__
    xref.load(parser)
  File "C:\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pdfminer\pdfdocument.py", line 195, in load
    (_, obj) = parser.nextobject()
  File "C:\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pdfminer\psparser.py", line 616, in nextobject
    self.do_keyword(pos, token)
  File "C:\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pdfminer\pdfparser.py", line 79, in do_keyword
    (objid, genno) = (int(objid), int(genno))
TypeError: int() argument must be a string, a bytes-like object or a number, not 'PSKeyword'
I have been trying to find a solution to this problem, but without success so far. Any help is appreciated. Thank you!
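Since the error only shows up for some files, one way to narrow it down is to run the extractor over every PDF and record which ones raise the TypeError. The following is only a minimal sketch, not a fix for the parser itself: `split_by_parse_result` is a hypothetical helper name, and the extractor is passed in as a parameter so that the `extract_text_from_pdf` function named in the traceback (whose body is not shown here) can be plugged in unchanged.

```python
from typing import Callable, Dict, Iterable, Tuple


def split_by_parse_result(
    paths: Iterable[str],
    extract: Callable[[str], str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Run `extract` on each path and separate successes from failures.

    `split_by_parse_result` is a hypothetical helper, not part of pdfminer.
    Returns (texts, failures): extracted text keyed by path, and the
    error message keyed by path for files that raised TypeError.
    """
    texts: Dict[str, str] = {}
    failures: Dict[str, str] = {}
    for path in paths:
        try:
            texts[path] = extract(path)
        except TypeError as exc:
            # Same exception type as in the traceback; keep the message
            # so the failing files can be inspected afterwards.
            failures[path] = str(exc)
    return texts, failures
```

It could be called as `texts, failures = split_by_parse_result(['test.pdf'], extract_text_from_pdf)`. The files listed in `failures` are the ones to inspect more closely, for example by opening them in another PDF tool to check whether they are malformed.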
