Python PyPDF2 compression


I'm struggling to compress a merged PDF using the PyPDF2 module. This is my attempt, based on …

The error I get is

TypeError: must be string or read-only buffer, not file

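I have also tried compressing the PDF after the merge is complete. I judge the compression to have failed by comparing against the file size I get when compressing with PDFSAM.

Any ideas? Thanks.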

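PyPDF2 doesn't have a reliable compression method. That said, there is a compressContentStreams() method with the following description:

Compresses the size of this page by joining all content streams and applying a FlateDecode filter.

However, it is possible that this function will perform no operation if content stream compression becomes "automatic" for some reason.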
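To illustrate why a second round of compression may change nothing: FlateDecode is the PDF name for zlib/deflate compression, so the effect can be sketched with Python's standard zlib module alone. The sample bytes below are a made-up stand-in for a page's content stream, not real PyPDF2 output, and the exact sizes will vary.

```python
import zlib

# Hypothetical stand-in for an uncompressed page content stream:
# 200 lines of PDF text-drawing operators with varying coordinates.
raw_stream = b"".join(
    b"BT /F1 12 Tf 72 %d Td (line %d) Tj ET\n" % (720 - i, i)
    for i in range(200)
)

# First pass: roughly what a Flate filter does to an uncompressed stream.
once = zlib.compress(raw_stream)

# Second pass: deflating data that is already deflated usually gains
# little or nothing, which is why re-compressing may look like a no-op.
twice = zlib.compress(once)

print(len(raw_stream), len(once), len(twice))
```

If the PDFs being merged already store their content streams compressed, the first pass has effectively been done, and only the second-pass behaviour is left to observe.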

Again, this will not make any difference in most cases, but you can try this code:

import PyPDF2

path = 'path/to/hello.pdf'
path2 = 'path/to/another.pdf'
pdfs = [path, path2]

writer = PyPDF2.PdfFileWriter()

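for pdf in pdfs:
    reader = PyPDF2.PdfFileReader(pdf)
    # Visit every page of the current input file
    for i in range(reader.numPages):
        page = reader.getPage(i)
        # Merge the page's content streams and apply the Flate filter
        page.compressContentStreams()
        writer.addPage(page)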

with open('test_out2.pdf', 'wb') as f:
    writer.write(f)
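Since the question judges compression by file size, one way to check whether the code above made any difference is to compare sizes on disk. This is a minimal sketch using only the standard library; size_report is a hypothetical helper name, not part of PyPDF2, and the paths are whatever your own script read and wrote.

```python
import os

def size_report(input_paths, output_path):
    """Return (total input bytes, output bytes, output/input ratio)."""
    total_in = sum(os.path.getsize(p) for p in input_paths)
    out = os.path.getsize(output_path)
    # Guard against division by zero when all inputs are empty files
    ratio = out / total_in if total_in else float("nan")
    return total_in, out, ratio
```

For example, size_report(pdfs, 'test_out2.pdf') after running the merge; a ratio close to 1.0 would suggest that compressContentStreams() had little effect on these particular files.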

Your error says it must be a string or read-only buffer, not a file.

So it is better to write the merge to bytes or a string:

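import PyPDF2
from io import BytesIO

tmp = BytesIO()
# Open the source PDFs in binary mode
path = open('path/to/hello.pdf', 'rb')
path2 = open('path/to/another.pdf', 'rb')
merger = PyPDF2.PdfFileMerger()
merger.append(fileobj=path2)
merger.append(fileobj=path)
# Write the merged PDF to an in-memory buffer instead of a file
merger.write(tmp)
# Pass bytes rather than a file object; the return value is not used below
PyPDF2.filters.compress(tmp.getvalue())
# Write the merged PDF to disk, then close the inputs
with open("test_out2.pdf", 'wb') as f:
    merger.write(f)
path.close()
path2.close()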
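The distinction this answer draws, bytes versus a file-like object, can be seen with the standard library alone. The sketch below assumes the failing call ends up in zlib.compress, which is an assumption about the question's code since it is not shown in full. On Python 3 the wording of the message differs from the Python 2 one quoted in the question.

```python
import zlib
from io import BytesIO

# Placeholder bytes for illustration; this is not a real PDF
buf = BytesIO(b"%PDF-1.4 placeholder bytes, not a real PDF")

rejected = False
try:
    zlib.compress(buf)  # a file-like object is not accepted
except TypeError as exc:
    rejected = True
    print("TypeError:", exc)

# The buffer's contents are bytes, which zlib accepts
compressed = zlib.compress(buf.getvalue())
print(zlib.decompress(compressed) == buf.getvalue())
```

Note that the compressed bytes are plain zlib data, not a PDF, so writing them out would not produce a readable document.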

pdf is not defined anywhere in your code sample. What is it? Also, please provide the full traceback so we can see which line is causing the problem.
