Java OpenHTMLToPDF: embedding a custom font in a PDF created from HTML
I use Jsoup and OpenHTMLToPDF to create a PDF from HTML. I have to use a different font in my PDF to cover non-Latin glyphs. How do I embed the font correctly?
A simplified program that reproduces the problem:
src/main/resources/test.html
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8" />
<title>Font Test</title>
<style>
@font-face {
font-family: 'source-sans';
font-style: normal;
font-weight: 400;
src: url(fonts/SourceSansPro-Regular.ttf);
}
</style>
</head>
<body>
<p style="font-family: 'source-sans',serif">Latin Script</p>
<p style="font-family: 'source-sans',serif">Είμαι ελληνικό κείμενο.</p>
</body>
</html>
pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>paf</groupId>
<artifactId>test</artifactId>
<version>1.0-SNAPSHOT</version>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>7</source>
<target>7</target>
</configuration>
</plugin>
</plugins>
</build>
<dependencies>
<dependency>
<groupId>com.openhtmltopdf</groupId>
<artifactId>openhtmltopdf-pdfbox</artifactId>
<version>0.0.1-RC18</version>
</dependency>
<dependency>
<groupId>org.jsoup</groupId>
<artifactId>jsoup</artifactId>
<version>1.11.2</version>
</dependency>
</dependencies>
</project>
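- Don't worry about the second function in main.java (readFile); it only reads the HTML file and is included only so that the program is complete.
- Download here: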
Resulting PDF:
- rendered with the serif font
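Edit 1: I made various changes based on the page linked in the comments and updated to RC18. The output is different now, but the font in the PDF is still wrong.
Edit 2: I tried the fast renderer.
OK. Thanks to @Tilman Hausherr's comment, I asked about this in the openhtmltopdf GitHub issue tracker. In case anyone is interested, these changes made it work:
src/main/java/main.java (the changed part is the main method):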
import com.openhtmltopdf.extend.FSSupplier;
import com.openhtmltopdf.pdfboxout.PdfRendererBuilder;
import org.jsoup.Jsoup;
import org.jsoup.helper.W3CDom;
import org.w3c.dom.Document;

import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.Objects;

public class main {

    public static void main(String[] args) {
        System.out.println("Starting");
        // try-with-resources closes the output stream even if rendering fails
        try (OutputStream outStream = new FileOutputStream("test.pdf")) {
            // Parse the HTML with Jsoup and convert it to a W3C DOM for the renderer
            final W3CDom w3cDom = new W3CDom();
            final Document w3cDoc = w3cDom.fromJsoup(Jsoup.parse(readFile()));
            final PdfRendererBuilder pdfBuilder = new PdfRendererBuilder();
            pdfBuilder.useFastMode();
            pdfBuilder.withW3cDocument(w3cDoc, "/");
            // Register the font under the family name used in the CSS.
            // Note: getFile() returns a URL-encoded path, so this only works while the
            // resource is a plain file on disk whose path has no spaces or special characters.
            pdfBuilder.useFont(new File(main.class.getClassLoader().getResource("fonts/SourceSansPro-Regular.ttf").getFile()), "source-sans");
            pdfBuilder.toStream(outStream);
            pdfBuilder.run();
        } catch (Exception e) {
            System.out.println("PDF could not be created: " + e.getMessage());
        }
        System.out.println("Finish.");
    }

    // Reads test.html from the classpath as a UTF-8 string
    private static String readFile() throws IOException {
        final ClassLoader classLoader = main.class.getClassLoader();
        final InputStream inputStream = classLoader.getResourceAsStream("test.html");
        final StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(Objects.requireNonNull(inputStream), StandardCharsets.UTF_8)) {
            char[] buf = new char[1024];
            int amt = r.read(buf);
            while (amt > 0) {
                sb.append(buf, 0, amt);
                amt = r.read(buf);
            }
        }
        return sb.toString();
    }
}
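The useFont call in main.java builds a File from getResource(...).getFile(). That works while the font is a plain file on disk, but getFile() returns a URL-encoded path and cannot point into a JAR. A minimal sketch of one way around this, using only the JDK: copy the classpath resource to a temporary file first. The class ResourceFiles and its method names are hypothetical helpers, not part of OpenHTMLToPDF.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: turns a classpath resource into a real file on disk.
final class ResourceFiles {
    private ResourceFiles() {}

    // Copies the stream into a new temp file; the caller closes the stream.
    static File copyToTempFile(InputStream in, String suffix) throws IOException {
        Path tmp = Files.createTempFile("font-", suffix);
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        File file = tmp.toFile();
        file.deleteOnExit(); // best-effort cleanup when the JVM exits
        return file;
    }

    // Looks the resource up on the classpath and copies it to a temp file.
    static File classpathResourceToTempFile(String resource, String suffix) throws IOException {
        InputStream in = ResourceFiles.class.getClassLoader().getResourceAsStream(resource);
        if (in == null) {
            throw new FileNotFoundException("Not on classpath: " + resource);
        }
        try {
            return copyToTempFile(in, suffix);
        } finally {
            in.close();
        }
    }
}

// Assumed usage inside main(), replacing the getResource(...).getFile() expression:
//
//   File fontFile = ResourceFiles.classpathResourceToTempFile("fonts/SourceSansPro-Regular.ttf", ".ttf");
//   pdfBuilder.useFont(fontFile, "source-sans");
```

The copy costs one extra file write per run, but the resulting File behaves the same whether the program runs from an IDE or from a packaged JAR.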
src/main/resources/fonts/SourceSansPro-Regular.ttf
- Downloaded a newer version here:
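- I think you should ask this in the openhtmltopdf issue tracker (unless they directed you here). This isn't really a PDFBox question; PDFBox itself has been able to use Unicode fonts since 2.0.0.
- Also have a look here, maybe it helps? And use the latest version, i.e. 0.0.1-RC18 rather than 0.0.1-RC12. Consider using the Maven versions plugin.
- Thank you both. I changed the way the font is loaded and, following the linked page, switched to the TTF version instead of OTF. I also updated to RC18. The output is different now, but it still doesn't work. I think I really should post this in openhtmltopdf's GitHub issue tracker.
- Is there any change in the main method? Apparently not.
- @Paflow How can I pass the font as an InputStream to this useFont method?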
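On the last comment's question about passing the font as an InputStream: PdfRendererBuilder also has a useFont overload that takes an FSSupplier<InputStream> (the type already imported in main.java) instead of a File. The sketch below is one possible approach, not a tested recipe for RC18: it buffers the font bytes once so that every call can return a fresh stream. The class FontBytes is a hypothetical helper, and the wiring in the trailing comment assumes the variable names from main.java.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of OpenHTMLToPDF): keeps a font's bytes in
// memory so that a fresh InputStream can be handed out on every request.
final class FontBytes {
    private final byte[] data;

    private FontBytes(byte[] data) {
        this.data = data;
    }

    // Wraps a defensive copy of the given bytes.
    static FontBytes of(byte[] data) {
        return new FontBytes(data.clone());
    }

    // Reads the whole classpath resource into memory.
    static FontBytes fromClasspath(String resource) throws IOException {
        InputStream in = FontBytes.class.getClassLoader().getResourceAsStream(resource);
        if (in == null) {
            throw new FileNotFoundException("Not on classpath: " + resource);
        }
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new FontBytes(out.toByteArray());
        } finally {
            in.close();
        }
    }

    // Returns a new, independent stream positioned at the first byte.
    InputStream open() {
        return new ByteArrayInputStream(data);
    }

    int length() {
        return data.length;
    }
}

// Assumed wiring inside main(), next to the other pdfBuilder calls:
//
//   final FontBytes font = FontBytes.fromClasspath("fonts/SourceSansPro-Regular.ttf");
//   pdfBuilder.useFont(new FSSupplier<InputStream>() {
//       @Override
//       public InputStream supply() {
//           return font.open();
//       }
//   }, "source-sans");
```

An anonymous class is used rather than a lambda because the pom.xml above targets Java 7.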
src/main/resources/test.html (the @font-face rule, now with -fs-font-subset: complete-font):
@font-face {
    font-family: 'source-sans';
    font-style: normal;
    font-weight: 400;
    src: url(fonts/SourceSansPro-Regular.ttf);
    -fs-font-subset: complete-font;
}