How to convert .docx to .txt in Python


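I want to convert a large number of MS Word files to plain text, and I don't know how to do it in Python. I found some code online. My path is a local path, and all of the file names follow the pattern cx-xxx (e.g. c1-000, c1-001, c2-000, c2-001, and so on).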


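Use pypandoc to convert docx to txt:

import pypandoc

# Example file:
docxFilename = 'somefile.docx'
# 'plain' is pandoc's name for plain-text output; 'txt' is not a valid output format
output = pypandoc.convert_file(docxFilename, 'plain', outputfile="somefile.txt")
# With outputfile set, the text is written to disk and an empty string is returned
assert output == ""

See the official pypandoc documentation for details.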
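Since the question is about many files rather than one, the single-file call can be wrapped in a loop. The following is only a sketch under some assumptions: the function name `convert_folder`, its `convert` parameter, and the folder names in the usage note are hypothetical, and the default converter needs both the pypandoc package and a pandoc binary to be installed.

```python
from pathlib import Path


def convert_folder(src_dir, dst_dir, convert=None):
    """Convert every .docx file in src_dir to a .txt file in dst_dir.

    `convert` is a callable taking (source_path, output_path) as strings.
    If omitted, pypandoc.convert_file is used with the 'plain' writer.
    Returns the list of output paths, in sorted order of the inputs.
    """
    if convert is None:
        # Imported lazily so the function can be used with another converter
        import pypandoc

        def convert(source, target):
            pypandoc.convert_file(source, "plain", outputfile=target)

    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)

    written = []
    for docx in sorted(src.glob("*.docx")):
        # Word creates "~$name.docx" lock files while a document is open
        if docx.name.startswith("~$"):
            continue
        target = dst / (docx.stem + ".txt")
        convert(str(docx), str(target))
        written.append(target)
    return written
```

Calling `convert_folder("docx_in", "txt_out")` would then turn `docx_in/c1-000.docx` into `txt_out/c1-000.txt`, and so on for each file. Passing a different `convert` callable makes it possible to swap in another backend without changing the loop.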


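The GroupDocs.Conversion Cloud SDK for Python supports conversion between more than 50 file formats. Its free plan provides 150 free API calls per month.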

# Import module
import groupdocs_conversion_cloud
from shutil import copyfile

# Get your client_id and client_key at https://dashboard.groupdocs.cloud (free registration is required).
client_id = "xxxxx-xxxx-xxxx-xxxx-xxxxxxxx"
client_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxx"

# Create instance of the API
convert_api = groupdocs_conversion_cloud.ConvertApi.from_keys(client_id, client_key)

try:
    # Convert DOCX to txt
    # Prepare request
    request = groupdocs_conversion_cloud.ConvertDocumentDirectRequest("txt", "C:/Temp/sample.docx")

    # Convert; the result is the path of a temporary file holding the output
    result = convert_api.convert_document_direct(request)
    copyfile(result, 'C:/Temp/sample.txt')

except groupdocs_conversion_cloud.ApiException as e:
    print("Exception when calling convert_document_direct: {0}".format(e.message))
A note on the pypandoc answer: pandoc's plain-text output format is named "plain", and "txt" is not a valid format name.