Converting docx to pdf in Java

I am trying to convert a docx file that contains tables and images to a pdf file.

I have searched everywhere without finding a proper solution, so I am asking for a proper, correct one.

Here is what I tried:

Exception:

Exception in thread "main" java.lang.IllegalAccessError: tried to access method org.apache.poi.util.POILogger.log(ILjava/lang/Object;)V from class org.apache.poi.openxml4j.opc.PackageRelationshipCollection
at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.parseRelationshipsPart(PackageRelationshipCollection.java:313)
at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.<init>(PackageRelationshipCollection.java:162)
at org.apache.poi.openxml4j.opc.PackageRelationshipCollection.<init>(PackageRelationshipCollection.java:130)
at org.apache.poi.openxml4j.opc.PackagePart.loadRelationships(PackagePart.java:559)
at org.apache.poi.openxml4j.opc.PackagePart.<init>(PackagePart.java:112)
at org.apache.poi.openxml4j.opc.PackagePart.<init>(PackagePart.java:83)
at org.apache.poi.openxml4j.opc.PackagePart.<init>(PackagePart.java:128)
at org.apache.poi.openxml4j.opc.ZipPackagePart.<init>(ZipPackagePart.java:78)
at org.apache.poi.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:239)
at org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:665)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:274)
at org.apache.poi.util.PackageHelper.open(PackageHelper.java:39)
at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:121)
at test.TestCon.ConvertToPDF(TestCon.java:31)
at test.TestCon.main(TestCon.java:25)

My requirement is to write Java code that converts an existing docx to pdf with proper formatting and alignment.

Please suggest.

Jars used:


You are missing some libraries.

I was able to run your code after adding the following libraries:

Apache POI 3.15
org.apache.poi.xwpf.converter.core-1.0.6.jar
org.apache.poi.xwpf.converter.pdf-1.0.6.jar
fr.opensagres.xdocreport.itext.extension-2.0.0.jar
itext-2.1.7.jar
ooxml-schemas-1.3.jar

With these I successfully converted a 6-page Word document (.docx) containing tables, images, and various formatting.
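
An IllegalAccessError between two POI classes, like the one in the question, often means that the POI jars on the classpath come from different versions. That is an assumption here, since the question's jar list is not shown. A minimal diagnostic sketch using only the JDK follows; the class name `JarLocator` and its `locate` method are hypothetical helpers, not part of POI. It prints where each class was loaded from so that mismatched jars are easy to spot.

```java
import java.security.CodeSource;

/** Prints where selected classes were loaded from, to help spot mixed jar versions. */
final class JarLocator {

    /** Returns the jar or directory a class was loaded from, or a short note if unavailable. */
    static String locate(String className) {
        try {
            // initialize=false: only the class location is needed, not its static initializers
            Class<?> type = Class.forName(className, false, JarLocator.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return "JDK / bootstrap class loader";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not found or not loadable";
        }
    }

    public static void main(String[] args) {
        // The class names below are taken from the stack trace in the question
        String[] names = {
            "org.apache.poi.util.POILogger",
            "org.apache.poi.openxml4j.opc.PackageRelationshipCollection",
            "org.apache.poi.xwpf.usermodel.XWPFDocument"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If the printed locations point at jars with different version numbers (for example a `poi` jar and a `poi-ooxml` jar from different releases), aligning them is a reasonable first thing to try.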

In addition to VivekRatanSinha's answer, I want to post the complete code and the required JARs for anyone who needs them in the future.

Code:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.poi.xwpf.converter.pdf.PdfConverter;
import org.apache.poi.xwpf.converter.pdf.PdfOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class WordConvertPDF {
    public static void main(String[] args) {
        WordConvertPDF cwoWord = new WordConvertPDF();
        cwoWord.ConvertToPDF("D:/Test.docx", "D:/Test.pdf");
    }

    public void ConvertToPDF(String docPath, String pdfPath) {
        // try-with-resources closes both streams even if the conversion fails
        try (InputStream doc = new FileInputStream(new File(docPath));
             OutputStream out = new FileOutputStream(new File(pdfPath))) {
            XWPFDocument document = new XWPFDocument(doc);
            PdfOptions options = PdfOptions.create();
            PdfConverter.getInstance().convert(document, out, options);
        } catch (IOException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
And the JARs:

Enjoy :)

I use this code:

private byte[] toPdf(ByteArrayOutputStream docx) throws IOException {
    InputStream isFromFirstData = new ByteArrayInputStream(docx.toByteArray());

    XWPFDocument document = new XWPFDocument(isFromFirstData);
    PdfOptions options = PdfOptions.create();

    // write a copy of the PDF to c:\tmp\ (the stream is closed automatically)
    try (OutputStream out = new FileOutputStream(new File("c:\\tmp\\HelloWord.pdf"))) {
        PdfConverter.getInstance().convert(document, out, options);
    }

    // convert again into a byte array to return in the HTTP response
    ByteArrayOutputStream pdf = new ByteArrayOutputStream();
    PdfConverter.getInstance().convert(document, pdf, options);

    // document.write(pdf) is not called here: it would append the docx bytes after the PDF
    document.close();
    return pdf.toByteArray();
}
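
When the converted bytes are returned in an HTTP response, a cheap sanity check can catch the case where something other than a PDF ended up in the buffer (for example, the original docx, which is a ZIP archive). The sketch below uses only the JDK; `PdfSanityCheck` and `looksLikePdf` are hypothetical names. It only checks that the data starts with the `%PDF-` header, so it is not a full validation.

```java
import java.nio.charset.StandardCharsets;

/** Minimal header check for PDF data; not a full validation. */
final class PdfSanityCheck {

    // Every PDF file starts with the ASCII bytes "%PDF-"
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the data begins with the PDF header bytes. */
    static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] pdfLike = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        byte[] zipLike = {'P', 'K', 3, 4}; // .docx files are ZIP archives and start with "PK"
        System.out.println(looksLikePdf(pdfLike)); // true
        System.out.println(looksLikePdf(zipLike)); // false
    }
}
```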

You need to add these Maven dependencies:

  <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi</artifactId>
        <version>4.0.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml</artifactId>
        <version>4.0.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-scratchpad</artifactId>
        <version>4.0.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-ooxml-schemas</artifactId>
        <version>4.0.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-excelant</artifactId>
        <version>4.0.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.poi</groupId>
        <artifactId>poi-examples</artifactId>
        <version>4.0.1</version>
    </dependency>

    <dependency>
        <groupId>fr.opensagres.xdocreport</groupId>
        <artifactId>org.apache.poi.xwpf.converter.core</artifactId>
        <version>1.0.6</version>
    </dependency>

    <dependency>
        <groupId>fr.opensagres.xdocreport</groupId>
        <artifactId>org.apache.poi.xwpf.converter.pdf</artifactId>
        <version>1.0.6</version>
    </dependency>
<dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>fr.opensagres.poi.xwpf.converter.pdf</artifactId>
    <version>2.0.2</version>
</dependency>

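I did a lot of research and found documents4j to be the best free API for converting docx to pdf. It handles alignment, fonts, and everything else well.

Maven dependencies: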

<dependency>
    <groupId>com.documents4j</groupId>
    <artifactId>documents4j-local</artifactId>
    <version>1.0.3</version>
</dependency>
<dependency>
    <groupId>com.documents4j</groupId>
    <artifactId>documents4j-transformer-msoffice-word</artifactId>
    <version>1.0.3</version>
</dependency>

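You only need to add one dependency manually (the others should be pulled in automatically). At the time of writing, the latest version is 2.0.2.

Gradle:

dependencies {
    // what you should already have
    implementation "org.apache.poi:poi-ooxml:latest.release"
    // add the following line to the dependencies block in build.gradle
    implementation "fr.opensagres.xdocreport:fr.opensagres.poi.xwpf.converter.pdf:2.0.2"
}

Maven: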

<dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>fr.opensagres.poi.xwpf.converter.pdf</artifactId>
    <version>2.0.2</version>
</dependency>

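I'll provide 3 ways to convert docx to pdf:

  • Using itext, opensagres and apache poi
  • Code:

    Dependencies: use Maven to resolve the dependencies.

    The new version 2.0.2 of fr.opensagres.poi.xwpf.converter.core runs with apache poi 4.0.1 and itext 2.1.7. You only need to add the following dependency in Maven, and Maven will download all of its dependencies automatically. (Update the Maven project so that it downloads these libraries and their dependencies.)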

    
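    If your document is very rich and you choose to convert on Linux/Unix, this can be "a bit" painful to implement.

    The solution I suggest is Gotenberg: a Docker-powered stateless API for converting HTML, Markdown and Office documents to PDF.

    • Start the container:
      $ docker run --rm -p 3000:3000 thecodingmachine/gotenberg:6
    • Make a request to the container. Here is how to do it with curl:
      $ curl --request POST \
          --url http://localhost:3000/convert/office \
          --header 'Content-Type: multipart/form-data' \
          --form files=@document.docx \
          --form files=@document2.docx \
          -o result.pdf

    Deploy it to your infrastructure (for example, as a separate microservice) and make simple HTTP requests to it from your Java service. You get the PDF file in the response and can do whatever you need with it.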
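
The HTTP request from a Java service could look roughly like the sketch below. It assumes Java 11 or later (for `java.net.http.HttpClient`) and a Gotenberg 6 container listening on `http://localhost:3000`, and it sends a single file in the `files` form field as the curl call does. The class name `GotenbergClient` and its methods are hypothetical, and the multipart body is assembled by hand because the JDK has no built-in multipart publisher.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Sketch of a client for the /convert/office endpoint used in the curl example. */
final class GotenbergClient {

    /** Builds a multipart/form-data body with one file part named "files". */
    static byte[] multipartBody(String boundary, String fileName, byte[] content) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        byte[] head = ("--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"files\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n\r\n").getBytes(StandardCharsets.UTF_8);
        byte[] tail = ("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8);
        body.write(head, 0, head.length);
        body.write(content, 0, content.length);
        body.write(tail, 0, tail.length);
        return body.toByteArray();
    }

    /** Posts the docx to the service and writes the response to the pdf path on HTTP 200. */
    static void convert(Path docx, Path pdf) throws IOException, InterruptedException {
        String boundary = "----docx2pdf" + System.nanoTime();
        byte[] body = multipartBody(boundary, docx.getFileName().toString(), Files.readAllBytes(docx));

        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:3000/convert/office"))
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();

        HttpResponse<byte[]> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofByteArray());
        if (response.statusCode() != 200) {
            throw new IOException("Conversion failed with HTTP status " + response.statusCode());
        }
        Files.write(pdf, response.body());
    }

    public static void main(String[] args) throws Exception {
        convert(Paths.get("document.docx"), Paths.get("result.pdf"));
    }
}
```

Reading the response into memory first means a failed conversion never leaves an error page saved under the `.pdf` name; for very large documents a streaming body handler may be preferable.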


    Tested, and it works well.

    Docx4j is open source and is the best API for converting docx to pdf, with no alignment or font issues.

    Maven dependencies:

    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-JAXB-Internal</artifactId>
        <version>8.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-JAXB-ReferenceImpl</artifactId>
        <version>8.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-JAXB-MOXy</artifactId>
        <version>8.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.docx4j</groupId>
        <artifactId>docx4j-export-fo</artifactId>
        <version>8.0.0</version>
    </dependency>
    
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.InputStream;
    import java.io.OutputStream;

    import org.docx4j.Docx4J;
    import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

    public class DocToPDF {

        public static void main(String[] args) {
            String inputfilepath = "D:\\Workspace\\New\\Sample.docx";
            String outputfilepath = "D:\\Workspace\\New\\Sample.pdf";

            // try-with-resources closes both streams even if the conversion fails
            try (InputStream templateInputStream = new FileInputStream(inputfilepath);
                 OutputStream os = new FileOutputStream(outputfilepath)) {
                WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(templateInputStream);
                Docx4J.toPDF(wordMLPackage, os);
                os.flush();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

    }

    

    I have a very complex document, and the Microsoft Graph API helped with it.

    Example using documents4j (code, then Maven dependencies):

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.InputStream;
    import java.io.OutputStream;
    
    import com.documents4j.api.DocumentType;
    import com.documents4j.api.IConverter;
    import com.documents4j.job.LocalConverter;
    
    public class Document4jApp {
    
      public static void main(String[] args) {

          File inputWord = new File("C:/Users/avijit.shaw/Desktop/testing/docx/Account Opening Prototype Details.docx");
          File outputFile = new File("Test_out.pdf");
          // try-with-resources closes both streams even if the conversion fails
          try (InputStream docxInputStream = new FileInputStream(inputWord);
               OutputStream outputStream = new FileOutputStream(outputFile)) {
              IConverter converter = LocalConverter.builder().build();
              try {
                  converter.convert(docxInputStream).as(DocumentType.DOCX).to(outputStream).as(DocumentType.PDF).execute();
              } finally {
                  // release the resources held by the converter
                  converter.shutDown();
              }
              System.out.println("success");
          } catch (Exception e) {
              e.printStackTrace();
          }
      }
    }
    
    <dependency>
        <groupId>com.documents4j</groupId>
        <artifactId>documents4j-local</artifactId>
        <version>1.0.3</version>
    </dependency>
    <dependency>
        <groupId>com.documents4j</groupId>
        <artifactId>documents4j-transformer-msoffice-word</artifactId>
        <version>1.0.3</version>
    </dependency>

    Example using the OpenOffice UNO API (code, then Maven dependencies):
    
    
    import java.io.File;
    import com.sun.star.beans.PropertyValue;
    import com.sun.star.comp.helper.BootstrapException;
    import com.sun.star.frame.XComponentLoader;
    import com.sun.star.frame.XDesktop;
    import com.sun.star.frame.XStorable;
    import com.sun.star.lang.XComponent;
    import com.sun.star.lang.XMultiComponentFactory;
    import com.sun.star.uno.Exception;
    import com.sun.star.uno.UnoRuntime;
    import com.sun.star.uno.XComponentContext;
    
    import ooo.connector.BootstrapSocketConnector;
    
    public class App {
      public static void main(String[] args) throws Exception, BootstrapException {
          System.out.println("Starting conversion!!!");
          // Initialise
          String oooExeFolder = "C:\\Program Files (x86)\\OpenOffice 4\\program"; //Provide path on which OpenOffice is installed
          XComponentContext xContext = BootstrapSocketConnector.bootstrap(oooExeFolder);
          XMultiComponentFactory xMCF = xContext.getServiceManager();
      
          Object oDesktop = xMCF.createInstanceWithContext("com.sun.star.frame.Desktop", xContext);
      
          XDesktop xDesktop = (XDesktop) UnoRuntime.queryInterface(XDesktop.class, oDesktop);
      
          // Load the Document
          String workingDir = "C:/Users/avijit.shaw/Desktop/testing/docx/"; //Provide directory path of docx file to be converted
          String myTemplate = workingDir + "Account Opening Prototype Details.docx"; // Name of docx file to be converted
      
          if (!new File(myTemplate).canRead()) {
            throw new RuntimeException("Cannot load template:" + new File(myTemplate));
          }
      
          XComponentLoader xCompLoader = (XComponentLoader) UnoRuntime
              .queryInterface(com.sun.star.frame.XComponentLoader.class, xDesktop);
      
          String sUrl = "file:///" + myTemplate;
      
          // open the document hidden, without showing a window
          PropertyValue[] propertyValues = new PropertyValue[1];
          propertyValues[0] = new PropertyValue();
          propertyValues[0].Name = "Hidden";
          propertyValues[0].Value = Boolean.TRUE;
      
          XComponent xComp = xCompLoader.loadComponentFromURL(sUrl, "_blank", 0, propertyValues);
      
          // save as a PDF
          XStorable xStorable = (XStorable) UnoRuntime.queryInterface(XStorable.class, xComp);
      
          propertyValues = new PropertyValue[2];
          // Setting the flag for overwriting
          propertyValues[0] = new PropertyValue();
          propertyValues[0].Name = "Overwrite";
          propertyValues[0].Value = Boolean.TRUE;
          // Setting the filter name
          propertyValues[1] = new PropertyValue();
          propertyValues[1].Name = "FilterName";
          propertyValues[1].Value = "writer_pdf_Export";
      
          // Appending the favoured extension to the origin document name
          String myResult = workingDir + "letterOutput.pdf"; // Name of pdf file to be output
          xStorable.storeToURL("file:///" + myResult, propertyValues);
      
          System.out.println("Saved " + myResult);
      
          // shutdown
          xDesktop.terminate();
      }
    }
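
The OpenOffice example builds its URLs by prepending `file:///` to a path that contains spaces. Whether OpenOffice tolerates unencoded spaces there is not something the answer states, so as a precaution the URL can be derived from the path with the JDK, which percent-encodes such characters. A minimal sketch; `FileUrls` and `toFileUrl` are hypothetical names:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Builds file URLs from local paths with special characters percent-encoded. */
final class FileUrls {

    /** Converts a local path (relative or absolute) to an absolute file URL string. */
    static String toFileUrl(String path) {
        Path absolute = Paths.get(path).toAbsolutePath().normalize();
        // Path.toUri() percent-encodes characters such as spaces
        return absolute.toUri().toString();
    }

    public static void main(String[] args) {
        // The file name is the one used in the example above; the directory is illustrative
        System.out.println(toFileUrl("testing/docx/Account Opening Prototype Details.docx"));
    }
}
```

The returned string could then be passed to `loadComponentFromURL` and `storeToURL` in place of the concatenated `"file:///" + ...` values.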
    
    <!-- https://mvnrepository.com/artifact/org.openoffice/unoil -->
        <dependency>
            <groupId>org.openoffice</groupId>
            <artifactId>unoil</artifactId>
            <version>3.2.1</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.openoffice/juh -->
        <dependency>
            <groupId>org.openoffice</groupId>
            <artifactId>juh</artifactId>
            <version>3.2.1</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.openoffice/bootstrap-connector -->
        <dependency>
            <groupId>org.openoffice</groupId>
            <artifactId>bootstrap-connector</artifactId>
            <version>0.1.1</version>
        </dependency>
    