Replace text in a docx and save the changed file with python-docx

python ms-word

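I am trying to use python-docx to replace a word in a file and save the result as a new file, with the caveat that the new file must have exactly the same formatting as the old one, only with the word replaced. How should I do this?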

The docx module has a savedocx function that takes 7 inputs:

document
coreprops
appprops
contenttypes
websettings
wordrelationships
output

How do I keep everything in the original file unchanged, apart from the replaced word?

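Are you using the docx module from the linked source?

If so, the docx module already exposes methods such as replace and advReplace that can help you with this task. For more details on the exposed methods, see the module's source.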

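It looks like docx for Python is not designed to store a complete docx with its images, headers and so on; it only holds the inner content of the document. So there is no simple way to do this.

However, here is how you could do it.

First, have a look at the linked page, which explains how a docx file can be unzipped. Here is what a typical file looks like: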

+--docProps
|  +  app.xml
|  \  core.xml
+  res.log
+--word //this folder contains most of the files that control the content of the document
|  +  document.xml //Is the actual content of the document
|  +  endnotes.xml
|  +  fontTable.xml
|  +  footer1.xml //Contains the elements in the footer of the document
|  +  footnotes.xml
|  +--media //This folder contains all images embedded in the word
|  |  \  image1.jpeg
|  +  settings.xml
|  +  styles.xml
|  +  stylesWithEffects.xml
|  +--theme
|  |  \  theme1.xml
|  +  webSettings.xml
|  \--_rels
|     \  document.xml.rels //this document tells word where the images are situated
+  [Content_Types].xml
\--_rels
   \  .rels
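Because a .docx file is an ordinary zip archive, a listing like the one above can be produced with the standard library alone. A minimal sketch; the helper name list_docx_parts is made up for illustration, and 'sample.docx' in the comment is a placeholder path:

```python
import zipfile


def list_docx_parts(path):
    """Return the names of all members stored in a .docx (zip) archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


# Hypothetical usage:
#     for name in list_docx_parts('sample.docx'):
#         print(name)
```

The names come back in the order they are stored in the archive, so word/document.xml is easy to spot among them.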

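The docx module only reads one part of the document, in its opendocx method:

import zipfile
from lxml import etree  # assumption: etree here is lxml's, as the module's parser

def opendocx(file):
    '''Open a docx file, return a document XML tree'''
    mydoc = zipfile.ZipFile(file)
    xmlcontent = mydoc.read('word/document.xml')
    document = etree.fromstring(xmlcontent)
    return document

It only reads the word/document.xml file; every other part of the archive is ignored.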
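To see what that XML tree holds, the text can be pulled out of word/document.xml without the docx module. This is a read-only sketch using only the standard library; document_text is a hypothetical helper, and it assumes the usual WordprocessingML namespace for the w: prefix:

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace bound to the w: prefix in document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def document_text(path):
    """Return the text of each paragraph in word/document.xml as a list."""
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter("{%s}p" % W_NS):
        # A paragraph's text is spread over one or more <w:t> elements.
        parts = [t.text or "" for t in para.iter("{%s}t" % W_NS)]
        paragraphs.append("".join(parts))
    return paragraphs
```

Note that a single visible word may be split over several w:t elements, which is one reason a plain string search in the raw XML can miss a match.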

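What I suggest you do is:

1) Use opendocx to get the content of the document.

2) Replace the text in that document.xml tree with the advReplace method.

3) Open the docx as a zip archive and replace the content of word/document.xml with the new XML.

4) Close the archive and write it out (renaming it to output.docx).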

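If you have node.js installed, note that I have developed a templating engine for docx documents. The library is under active development and will soon be released as a node module.

I have set up a repo of python-docx that preserves all pre-existing data in a docx file, including formatting. Hopefully this is what you are looking for.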

This worked for me:

import zipfile

def docx_replace(old_file, new_file, rep):
    """Copy old_file to new_file, applying the replacements in rep to word/document.xml."""
    with zipfile.ZipFile(old_file, 'r') as zin, zipfile.ZipFile(new_file, 'w') as zout:
        for item in zin.infolist():
            buffer = zin.read(item.filename)
            if item.filename == 'word/document.xml':
                # Only the main document part is edited; all other parts are copied as-is.
                res = buffer.decode("utf-8")
                for r in rep:
                    res = res.replace(r, rep[r])
                buffer = res.encode("utf-8")
            zout.writestr(item, buffer)
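One way to check that a rewrite like the one above touched nothing but word/document.xml is to compare the two archives member by member. A small sketch using only zipfile; changed_members is a hypothetical helper name, not part of any docx library:

```python
import zipfile


def changed_members(old_file, new_file):
    """Return the sorted names of archive members that differ between two zips."""
    with zipfile.ZipFile(old_file) as old, zipfile.ZipFile(new_file) as new:
        old_names = set(old.namelist())
        new_names = set(new.namelist())
        # Members present in only one of the archives count as changed.
        changed = old_names ^ new_names
        for name in old_names & new_names:
            if old.read(name) != new.read(name):
                changed.add(name)
    return sorted(changed)
```

If the replacement matched anything, the result should be just ['word/document.xml']; an empty list means no text was replaced.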

In addition to @ramil's answer: some characters have to be escaped before they are placed into the XML as string values, so this is what worked for me:

def escape(escapee):
    # Escape "&" first so the entities added below are not escaped a second time.
    escapee = escapee.replace("&", "&amp;")
    escapee = escapee.replace("<", "&lt;")
    escapee = escapee.replace(">", "&gt;")
    escapee = escapee.replace("\"", "&quot;")
    escapee = escapee.replace("'", "&apos;")
    return escapee
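The same escaping is also available in the standard library through xml.sax.saxutils.escape, which handles &, < and > by default and accepts extra entities for quotes. A sketch of applying it to every replacement value before the values are written into the XML; escape_replacements and QUOTES are names made up for this example:

```python
from xml.sax.saxutils import escape as xml_escape

# Extra entities for quotes; "&", "<" and ">" are escaped by default.
QUOTES = {'"': "&quot;", "'": "&apos;"}


def escape_replacements(rep):
    """Return a copy of rep whose replacement values are XML-escaped."""
    return {old: xml_escape(new, QUOTES) for old, new in rep.items()}
```

Only the replacement values are escaped here; the search keys are left as given.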

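The problem with the methods above is that they lose the existing formatting. See my answer, which performs the replacement and retains the formatting.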

There is also python-docx-template, which allows jinja2-style templating in docx templates.

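Images in a docx can be kept when using python-docx. python-docx sees an image as a paragraph whose text is empty, so you can skip those paragraphs like this:

paragraphs = document.paragraphs
for paragraph in paragraphs:
    if paragraph.text == '':
        # An empty paragraph may hold an image; leave it untouched.
        continue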

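What have you tried? You would also get more help by providing a link to the module.

Almost all of these parameters have good defaults. You can ignore pretty much everything except output and document.

Yes, I am using the module from there. I used replace to make the change, but my problem is that I do not know how to save the change to the file. This is what I have: 1) document = opendocx('sample.docx') 2) replace(document, "todo", "tada") 3) How should I then save the change to sample.docx while keeping the document's original formatting?

Thanks. I searched again for how to do the process you outlined and found another question that shows how to do it:

Wow, this is great. I have been struggling to find a function like this for a while. Thank you very much.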