Python 3.x: Converting doc/docx files to PDF in Python


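Is there a good library for converting doc/docx files to PDF? There are paid options such as cloudconvert and convertApi, but I am looking for a free one. My Python application is hosted on an EC2 machine.
I also looked at the python-docx library, which lets me read the contents of a document, but I think writing those contents out to a PDF file would break the styling.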

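Here is a VBA solution you can use (I don't know how to do this in Python).

If you need to convert multiple Word files to other formats such as TXT, RTF, HTML, or PDF, run the script below.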

Option Explicit

Sub ChangeDocsToTxtOrRTFOrHTML()
    'with export to PDF in Word 2007
    Dim fs As Object
    Dim oFolder As Object
    Dim tFolder As Object
    Dim oFile As Object
    Dim strDocName As String
    Dim intPos As Integer
    Dim locFolder As String
    Dim fileType As String
    On Error Resume Next

    locFolder = InputBox("Enter the folder path to DOCs", "File Conversion", "C:\Users\your_path_here\")
    Select Case Application.Version
        Case Is < 12
            Do
                fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML", "File Conversion", "TXT"))
            Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML")
        Case Is >= 12
            Do
                fileType = UCase(InputBox("Change DOC to TXT, RTF, HTML or PDF(2007+ only)", "File Conversion", "TXT"))
            Loop Until (fileType = "TXT" Or fileType = "RTF" Or fileType = "HTML" Or fileType = "PDF")
    End Select

    Application.ScreenUpdating = False
    Set fs = CreateObject("Scripting.FileSystemObject")
    Set oFolder = fs.GetFolder(locFolder)
    Set tFolder = fs.CreateFolder(locFolder & "Converted")
    Set tFolder = fs.GetFolder(locFolder & "Converted")

    For Each oFile In oFolder.Files
        Dim d As Document
        Set d = Application.Documents.Open(oFile.Path)
        'Strip the extension from the document name
        strDocName = d.Name
        intPos = InStrRev(strDocName, ".")
        strDocName = Left(strDocName, intPos - 1)
        'Save the converted file into the "Converted" folder
        ChangeFileOpenDirectory tFolder.Path
        Select Case fileType
            Case Is = "TXT"
                strDocName = strDocName & ".txt"
                d.SaveAs FileName:=strDocName, FileFormat:=wdFormatText
            Case Is = "RTF"
                strDocName = strDocName & ".rtf"
                d.SaveAs FileName:=strDocName, FileFormat:=wdFormatRTF
            Case Is = "HTML"
                strDocName = strDocName & ".html"
                d.SaveAs FileName:=strDocName, FileFormat:=wdFormatFilteredHTML
            Case Is = "PDF"
                strDocName = strDocName & ".pdf"
                d.ExportAsFixedFormat OutputFileName:=strDocName, ExportFormat:=wdExportFormatPDF
        End Select
        d.Close
        'Switch back to the source folder for the next file
        ChangeFileOpenDirectory oFolder.Path
    Next oFile
    Application.ScreenUpdating = True

End Sub
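
For readers who want the PDF branch of the macro above in Python, the following is a minimal sketch, not a tested port. It assumes a Windows machine with Microsoft Word installed and the pywin32 package available, so it will not run on a Linux EC2 instance. The names `plan_conversions` and `convert_folder_to_pdf` are hypothetical, and 17 is the numeric value of Word's `wdFormatPDF` constant.

```python
from pathlib import Path

WD_FORMAT_PDF = 17  # numeric value of Word's wdFormatPDF constant


def plan_conversions(folder, out_dir_name="Converted"):
    """Return (source, target) path pairs for every .doc/.docx file in folder."""
    folder = Path(folder)
    out_dir = folder / out_dir_name
    pairs = []
    for src in sorted(folder.iterdir()):
        # Skip Word's "~$" lock files and anything that is not a Word document
        if not src.is_file() or src.name.startswith("~$"):
            continue
        if src.suffix.lower() in {".doc", ".docx"}:
            pairs.append((src, out_dir / (src.stem + ".pdf")))
    return pairs


def convert_folder_to_pdf(folder):
    """Convert every Word file in folder to PDF through Word COM automation."""
    # Assumption: pywin32 is installed and Word is available on this machine
    import win32com.client

    pairs = plan_conversions(folder)
    if not pairs:
        return []
    pairs[0][1].parent.mkdir(exist_ok=True)

    word = win32com.client.Dispatch("Word.Application")
    word.Visible = False
    try:
        for src, dst in pairs:
            doc = word.Documents.Open(str(src.resolve()))
            try:
                doc.SaveAs(str(dst.resolve()), FileFormat=WD_FORMAT_PDF)
            finally:
                doc.Close(False)  # close without saving changes to the source
    finally:
        word.Quit()
    return [dst for _, dst in pairs]
```

Calling `convert_folder_to_pdf(r"C:\Users\your_path_here")` would write the PDFs into a `Converted` subfolder, mirroring the macro. The path planning is kept in its own function so it can be checked without Word being present.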

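You can use the Aspose.Words Cloud SDK for Python. It supports DOC/DOCX to PDF conversion and preserves formatting and styles. Its free trial plan provides 150 API calls per month.

P.S. I am a developer evangelist at Aspose.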

# For complete examples and data files, please go to https://github.com/aspose-words-cloud/aspose-words-cloud-python
# Import module
import asposewordscloud
import asposewordscloud.models.requests
from shutil import copyfile

# Please get your Client ID and Secret from https://dashboard.aspose.cloud.
client_id='xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx'
client_secret='xxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

words_api = asposewordscloud.WordsApi(client_id,client_secret)
words_api.api_client.configuration.host='https://api.aspose.cloud'

filename = 'C:/Temp/02_pages.docx'
dest_name = 'C:/Temp/02_pages.pdf'
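# Convert DOCX to PDF; the with-block closes the input file once the request is done
with open(filename, 'rb') as document:
    request = asposewordscloud.models.requests.ConvertDocumentRequest(document=document, format='pdf')
    result = words_api.convert_document(request)
# copyfile treats result as the path of the converted file and copies it to dest_name
copyfile(result, dest_name)
print("Result {}".format(result))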

