Problem closing files when writing with Python pyPdf: getting ValueError: I/O operation on closed file

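I can't figure this out. This function (part of a class that scrapes internet sites to PDF) is meant to merge the PDF files generated from web pages, using pyPdf.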

Here is the method code:

def mergePdf(self,mainname,inputlist=0):
    """merging the pdf pages
    getting an inputlist to merge or defaults to the class instance self.pdftomerge list"""
    from pyPdf import PdfFileWriter, PdfFileReader
    self._mergelist = inputlist or self.pdftomerge
    self.pdfoutput = PdfFileWriter()

    for name in self._mergelist:
        print "merging %s into main pdf file: %s" % (name,mainname)
        self._filestream = file(name,"rb")
        self.pdfinput = PdfFileReader(self._filestream)
        for p in self.pdfinput.pages:
            self.pdfoutput.addPage(p)
        self._filestream.close()

    self._pdfstream = file(mainname,"wb")
    self._pdfstream.open()
    self.pdfoutput.write(self._pdfstream)
    self._pdfstream.close()
I keep getting this error:

  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 264, in write
    self._sweepIndirectReferences(externalReferenceMap, self._root)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 339, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 315, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 339, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 315, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 324, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, data[i])
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 339, in _sweepIndirectReferences
    self._sweepIndirectReferences(externMap, realdata)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 315, in _sweepIndirectReferences
    value = self._sweepIndirectReferences(externMap, value)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 345, in _sweepIndirectReferences
    newobj = data.pdf.getObject(data)
  File "c:\tmp\easy_install-iik9vj\pyPdf-1.13-py2.7-win32.egg.tmp\pyPdf\pdf.py", line 645, in getObject
    self.stream.seek(start, 0)
ValueError: I/O operation on closed file
But when I check the state of self._pdfstream, I get:

<open file 'c:\python27\learn\dive.pdf', mode 'wb' at 0x013B2020>

What am I doing wrong?

I'd be glad of any help.

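OK, I found your problem. Calling file() is correct. Don't try to call open() at all.

Your problem is that the input files still need to be open when you call self.pdfoutput.write(self._pdfstream), so you need to remove the line self._filestream.close() from the loop. pyPdf reads page data from the input stream only at write time, which is why the traceback ends in self.stream.seek(start, 0) inside getObject.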
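Why closing the input early fails can be illustrated without pyPdf at all. The sketch below uses a hypothetical LazyReader class (not part of pyPdf) that, like the getObject call in the traceback, only seeks into its stream when data is requested. It assumes Python 3 and is an analogy, not pyPdf's actual implementation.

```python
import io


class LazyReader:
    """Hypothetical stand-in for a reader that defers I/O until asked."""

    def __init__(self, stream):
        # Only a reference to the stream is kept; nothing is read yet.
        self.stream = stream

    def get_object(self, offset, length):
        # The real read happens here, so the stream must still be open.
        self.stream.seek(offset, 0)
        return self.stream.read(length)


stream = io.BytesIO(b"%PDF-1.4 example bytes")
reader = LazyReader(stream)
first = reader.get_object(0, 8)  # works: the stream is open

stream.close()  # same mistake as closing the input before writing
try:
    reader.get_object(0, 8)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

After this runs, error_message holds the same kind of "closed file" message that ends the traceback in the question.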

Edit: This script will trigger the problem. The first write will succeed, the second will fail.

from pyPdf import PdfFileReader as PfR, PdfFileWriter as PfW

input_filename = 'in.PDF' # replace with a real file
output_filename = 'out.PDF' # something that doesn't exist

infile = file(input_filename, 'rb')
reader = PfR(infile)
writer = PfW()

writer.addPage(reader.getPage(0))
outfile = file(output_filename, 'wb')
writer.write(outfile)
print "First Write Successful!"
infile.close()
outfile.close()

infile = file(input_filename, 'rb')
reader = PfR(infile)
writer = PfW()

writer.addPage(reader.getPage(0))
outfile = file(output_filename, 'wb')
infile.close() # BAD!

writer.write(outfile)
print "You'll get a ValueError before this line"
outfile.close()
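The fix can be sketched as a merge function that keeps every input open until the output has been written. This is a minimal sketch, assuming Python 3 and the modern pypdf package (PdfReader, PdfWriter, add_page) rather than the pyPdf 1.13 API used in this thread; merge_pdfs is a hypothetical name. Newer releases may copy page data earlier than pyPdf 1.13 did, but holding the inputs open until the write is harmless either way.

```python
from contextlib import ExitStack


def merge_pdfs(input_paths, output_path):
    """Merge PDFs, keeping every input open until the output is written."""
    # Assumption: the modern pypdf package. pyPdf 1.13 uses
    # PdfFileReader, PdfFileWriter and addPage instead.
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    with ExitStack() as stack:
        for path in input_paths:
            # ExitStack closes these files only when the with-block ends.
            stream = stack.enter_context(open(path, "rb"))
            for page in PdfReader(stream).pages:
                writer.add_page(page)
        # The inputs are still open here, so the writer can read page data.
        with open(output_path, "wb") as out:
            writer.write(out)
```

A call would look like merge_pdfs(["a.pdf", "b.pdf"], "merged.pdf"), where the file names are placeholders.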

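Hey agf, as I wrote, my problem is with self._pdfstream. I changed it to open but that didn't help. I still get the error when I try to write from pypdf, and when I check the object. wtf

@alonisser You're right, calling open() is wrong! But your problem isn't with self._pdfstream, it's with the input streams. I've edited my answer.

That seems to solve the problem, thanks a lot! But now there's another problem! I get the same long error string with a different ending: line 693, in readObjectHeader return int(idnum), int(generation) ValueError: invalid literal for int() with base 10: ''. Any ideas?

It sounds like a field in the PDF that should be an integer isn't one. Beyond that, you may need to dig into the pyPdf source code to figure it out.

The problem seems to have been with getting pypdf to add pages to a file that already existed. Changing the output file's name to something like "output.pdf" solved it. Thanks again @agf for all the help.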
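One possible reading of that last comment, which the thread does not confirm, is that the output path was also among the input files: opening a file with mode "wb" truncates it, so a reader pointed at the same path would find no data. Under that assumption, a small guard like the hypothetical check_output_not_input below could be called before merging.

```python
import os


def check_output_not_input(input_paths, output_path):
    """Raise ValueError if output_path is also one of the inputs."""
    # normcase makes the comparison case-insensitive on Windows.
    target = os.path.normcase(os.path.abspath(output_path))
    for path in input_paths:
        if os.path.normcase(os.path.abspath(path)) == target:
            raise ValueError("output file %r is also an input" % output_path)
```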