Python (PyPDF2): error when trying to merge PDFs


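I have been trying to add a watermark to a PDF, following an existing example, but I keep getting an error about the PDF data from reportlab. Is there something wrong with the input PDF?

Setup: Python 3.3, Anaconda distribution, Windows 7

I get the following error: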

Traceback (most recent call last):
  File "D:\IBP_Scripts\bsouthga\PDF Merge\merge.py", line 73, in <module>
    pageSelectionPDF("./merged_pdfs/FB1_report.pdf", [44,52])
  File "D:\IBP_Scripts\bsouthga\PDF Merge\merge.py", line 64, in pageSelectionPDF
    page0.mergePage(overlay.getPage(0))
  File "D:\Users\bsouthga\AppData\Local\Continuum\Anaconda\envs\py33\lib\site-packages\PyPDF2\pdf.py", line 1996, in mergePage
    self._mergePage(page2)
  File "D:\Users\bsouthga\AppData\Local\Continuum\Anaconda\envs\py33\lib\site-packages\PyPDF2\pdf.py", line 2042, in _mergePage
    page2Content = PageObject._pushPopGS(page2Content, self.pdf)
  File "D:\Users\bsouthga\AppData\Local\Continuum\Anaconda\envs\py33\lib\site-packages\PyPDF2\pdf.py", line 1956, in _pushPopGS
    stream = ContentStream(contents, pdf)
  File "D:\Users\bsouthga\AppData\Local\Continuum\Anaconda\envs\py33\lib\site-packages\PyPDF2\pdf.py", line 2428, in __init__
    stream = BytesIO(b_(stream.getData()))
  File "D:\Users\bsouthga\AppData\Local\Continuum\Anaconda\envs\py33\lib\site-packages\PyPDF2\generic.py", line 831, in getData
    decoded._data = filters.decodeStreamData(self)
  File "D:\Users\bsouthga\AppData\Local\Continuum\Anaconda\envs\py33\lib\site-packages\PyPDF2\filters.py", line 317, in decodeStreamData
    data = ASCII85Decode.decode(data)
  File "D:\Users\bsouthga\AppData\Local\Continuum\Anaconda\envs\py33\lib\site-packages\PyPDF2\filters.py", line 256, in decode
    data = [y for y in data if not (y in ' \n\r\t')]
  File "D:\Users\bsouthga\AppData\Local\Continuum\Anaconda\envs\py33\lib\site-packages\PyPDF2\filters.py", line 256, in <listcomp>
    data = [y for y in data if not (y in ' \n\r\t')]
TypeError: 'in <string>' requires string as left operand, not int

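I switched back to Python 2.7 (Anaconda distribution) and it seems to work, so this must be a problem in the Python 3 version of the PyPDF2 library.

If you want to use Python 3, you need to patch the ASCII85Decode class in filters.py. I ran into the same problem. I borrowed the decoding code from ascii85.py in pdfminer3k, a port of pdfminer to Python 3, and pasted it into the decode def in filters.py, which fixed the problem. The cause is that in Python 3 the decoder has to work on and return bytes, which the old Python 2 code does not do. There is a pull request on GitHub to merge this change. I thought I would answer here just in case.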
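The TypeError at the bottom of the traceback can be reproduced without PyPDF2. The following is a minimal sketch, assuming the stream data reaches the decoder as bytes, which is what the traceback suggests. The sample data and the helper name `strip_whitespace_py2_style` are made up for illustration; only the list comprehension is taken from the traceback.

```python
# Hypothetical sample of ASCII85 text containing whitespace, as bytes.
data = b"87cUR D]i,\n"

# Iterating over bytes in Python 3 yields ints, not 1-character strings.
first = next(iter(data))
print(type(first))  # <class 'int'>


def strip_whitespace_py2_style(data):
    # Same expression as the failing line in the traceback (filters.py, line 256).
    return [y for y in data if not (y in ' \n\r\t')]


error_message = None
try:
    strip_whitespace_py2_style(data)
except TypeError as exc:
    # `int in str` is not allowed, which is the error reported above.
    error_message = str(exc)
    print(error_message)

# Testing membership against a bytes literal works, because `int in bytes` is allowed.
cleaned = bytes(y for y in data if y not in b' \n\r\t')
print(cleaned)
```

In Python 2 the same comprehension works, because iterating over a str there yields 1-character strings.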

Replace the code in the decode def of the ASCII85Decode class in PyPDF2's filters.py with the code from pdfminer3k:

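# Body of the ASCII85Decode decode def; make sure filters.py has `import struct` at the top.
if isinstance(data, str):
    data = data.encode('ascii')
n = b = 0  # n: characters collected in the current group, b: accumulated base-85 value
out = bytearray()
for c in data:  # iterating over bytes yields ints in Python 3
    if ord('!') <= c and c <= ord('u'):
        n += 1
        b = b*85+(c-33)
        if n == 5:
            # five characters decode to four bytes (big-endian unsigned 32-bit)
            out += struct.pack(b'>L',b)
            n = b = 0
    elif c == ord('z'):
        # 'z' is shorthand for four zero bytes
        assert n == 0
        out += b'\0\0\0\0'
    elif c == ord('~'):
        # start of the '~>' end marker: pad the last partial group with 'u' (value 84)
        if n:
            for _ in range(5-n):
                b = b*85+84
            out += struct.pack(b'>L',b)[:n-1]
        break
return bytes(out)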
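To try the decoding logic outside PyPDF2, the same body can be wrapped in a standalone function. This is only a sketch: the function name `ascii85_decode` is an assumption chosen for illustration, not part of PyPDF2 or pdfminer3k, and the check uses `base64.a85encode` from the standard library, which is only available from Python 3.4 onwards.

```python
import base64
import struct


def ascii85_decode(data):
    """Decode ASCII85 data given as str or bytes and return bytes."""
    if isinstance(data, str):
        data = data.encode('ascii')
    n = b = 0
    out = bytearray()
    for c in data:
        if ord('!') <= c <= ord('u'):
            n += 1
            b = b * 85 + (c - 33)
            if n == 5:
                out += struct.pack('>L', b)
                n = b = 0
        elif c == ord('z'):
            assert n == 0
            out += b'\0\0\0\0'
        elif c == ord('~'):
            if n:
                # Pad the final partial group, then drop the padding bytes.
                for _ in range(5 - n):
                    b = b * 85 + 84
                out += struct.pack('>L', b)[:n - 1]
            break
    return bytes(out)


# Round trip: encode with the standard library, append the '~>' end marker
# that the decoder relies on to flush the last partial group.
encoded = base64.a85encode(b"Hello World") + b"~>"
decoded = ascii85_decode(encoded)
print(decoded)  # b'Hello World'
```

Note that the final partial group is only written out when the `~` of the end marker is reached, so input without a trailing `~>` may come back truncated.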