Java PDFBox does not support multiple languages

Tags: java, pdfbox, liferay-7

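I am trying to generate a PDF report that contains sentences in multiple languages. For this I am using the Google Noto fonts, but the Noto CJK font does not support some special Latin characters, so PDFBox either fails to generate the report or sometimes shows strange characters.

Does anyone have a suitable solution? I have tried several approaches, but I could not find a single TTF file that supports all of Unicode. I also tried falling back to different font files, but that would be too much work.

Languages I need to support: Japanese, German, Spanish, Portuguese, English.

Note: because of licensing issues, I do not want to use the arialuni.ttf file.

Any suggestions are welcome.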

Here is the code from the examples subproject in version 2.0.14:

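import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.fontbox.ttf.TrueTypeCollection;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.pdmodel.font.PDType0Font;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.font.encoding.GlyphList;
import org.apache.pdfbox.pdmodel.font.encoding.WinAnsiEncoding;

/**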
 * Output a text without knowing which font is the right one. One use case is a worldwide
 * address list. Only LTR languages are supported, RTL (e.g. Hebrew, Arabic) are not 
 * supported so they would appear in the wrong direction.
 * Complex scripts (Thai, Arabic, some Indian languages) are also not supported, any output
 * will look weird. There is an (unfinished) effort here:
 * https://issues.apache.org/jira/browse/PDFBOX-4189
 * 
 * @author Tilman Hausherr
 */
public class EmbeddedMultipleFonts
{
    public static void main(String[] args) throws IOException
    {
        try (PDDocument document = new PDDocument())
        {
            PDPage page = new PDPage(PDRectangle.A4);
            document.addPage(page);

            PDFont font1 = PDType1Font.HELVETICA; // always have a simple font as first one
            TrueTypeCollection ttc2 = new TrueTypeCollection(new File("c:/windows/fonts/batang.ttc"));
            PDType0Font font2 = PDType0Font.load(document, ttc2.getFontByName("Batang"), true); // Korean
            TrueTypeCollection ttc3 = new TrueTypeCollection(new File("c:/windows/fonts/mingliu.ttc"));
            PDType0Font font3 = PDType0Font.load(document, ttc3.getFontByName("MingLiU"), true); // Chinese
            PDType0Font font4 = PDType0Font.load(document, new File("c:/windows/fonts/mangal.ttf")); // Indian
            PDType0Font font5 = PDType0Font.load(document, new File("c:/windows/fonts/ArialUni.ttf")); // Fallback

            try (PDPageContentStream cs = new PDPageContentStream(document, page))
            {
                cs.beginText();
                List<PDFont> fonts = new ArrayList<>();
                fonts.add(font1);
                fonts.add(font2);
                fonts.add(font3);
                fonts.add(font4);
                fonts.add(font5);
                cs.newLineAtOffset(20, 700);
                showTextMultiple(cs, "abc 한국 中国 भारत 日本 abc", fonts, 20);
                cs.endText();
            }

            document.save("example.pdf");
        }
    }

    static void showTextMultiple(PDPageContentStream cs, String text, List<PDFont> fonts, float size)
            throws IOException
    {
        try
        {
            // first try all at once
            fonts.get(0).encode(text);
            cs.setFont(fonts.get(0), size);
            cs.showText(text);
            return;
        }
        catch (IllegalArgumentException ex)
        {
            // do nothing
        }
        // now try separately
        int i = 0;
        while (i < text.length())
        {
            boolean found = false;
            for (PDFont font : fonts)
            {
                try
                {
                    String s = text.substring(i, i + 1);
                    font.encode(s);
                    // it works! Try more with this font
                    int j = i + 1;
                    for (; j < text.length(); ++j)
                    {
                        String s2 = text.substring(j, j + 1);

                        if (isWinAnsiEncoding(s2.codePointAt(0)) && font != fonts.get(0))
                        {
                            // Without this segment, the example would have a flaw:
                            // This code tries to keep the current font, so
                            // the second "abc" would appear in a different font
                            // than the first one, which would be weird.
                            // This segment assumes that the first font has WinAnsiEncoding.
                            // (all static PDType1Font Times / Helvetica / Courier fonts)
                            break;
                        }
                        try
                        {
                            font.encode(s2);
                        }
                        catch (IllegalArgumentException ex)
                        {
                            // it's over
                            break;
                        }
                    }
                    s = text.substring(i, j);
                    cs.setFont(font, size);
                    cs.showText(s);
                    i = j;
                    found = true;
                    break;
                }
                catch (IllegalArgumentException ex)
                {
                    // didn't work, will try next font
                }
            }
            if (!found)
            {
                throw new IllegalArgumentException("Could not show '" + text.substring(i, i + 1) +
                        "' with the fonts provided");
            }
        }
    }

    static boolean isWinAnsiEncoding(int unicode)
    {
        String name = GlyphList.getAdobeGlyphList().codePointToName(unicode);
        if (".notdef".equals(name))
        {
            return false;
        }
        return WinAnsiEncoding.INSTANCE.contains(name);
    }
}
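
The run-splitting idea behind showTextMultiple can be tried out without PDFBox or any font files. The following is only a minimal sketch, not the code from the PDFBox examples: the class FontRunSplitter, its Run type, and the coverage rules in main are hypothetical names invented for illustration. Each "font" is modelled as an IntPredicate that says whether it can show a code point. Unlike the example above, this simplified variant always picks the first font in the list that accepts a code point (so no separate WinAnsiEncoding check is needed), and it iterates by code point rather than by char.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

public class FontRunSplitter
{
    /** A stretch of text together with the index of the font chosen for it. */
    public static final class Run
    {
        public final int fontIndex;
        public final String text;

        Run(int fontIndex, String text)
        {
            this.fontIndex = fontIndex;
            this.text = text;
        }
    }

    /**
     * Splits text into runs. Every code point is assigned to the first
     * entry of "fonts" that accepts it; adjacent code points with the
     * same font are merged into one run.
     */
    public static List<Run> split(String text, List<IntPredicate> fonts)
    {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentFont = -1;
        int i = 0;
        while (i < text.length())
        {
            int cp = text.codePointAt(i);
            int chosen = -1;
            for (int f = 0; f < fonts.size(); f++)
            {
                if (fonts.get(f).test(cp))
                {
                    chosen = f;
                    break;
                }
            }
            if (chosen < 0)
            {
                throw new IllegalArgumentException("Could not show '"
                        + new String(Character.toChars(cp)) + "' with the fonts provided");
            }
            if (chosen != currentFont && current.length() > 0)
            {
                // font changes here: close the run collected so far
                runs.add(new Run(currentFont, current.toString()));
                current.setLength(0);
            }
            currentFont = chosen;
            current.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        if (current.length() > 0)
        {
            runs.add(new Run(currentFont, current.toString()));
        }
        return runs;
    }

    public static void main(String[] args)
    {
        // Hypothetical coverage rules standing in for real fonts:
        // font 0 covers basic and extended Latin, font 1 covers Japanese scripts.
        IntPredicate latin = cp -> cp < 0x0250;
        IntPredicate japanese = cp ->
        {
            Character.UnicodeScript script = Character.UnicodeScript.of(cp);
            return script == Character.UnicodeScript.HAN
                    || script == Character.UnicodeScript.HIRAGANA
                    || script == Character.UnicodeScript.KATAKANA;
        };
        for (Run run : split("Grüße 日本語 ação", Arrays.asList(latin, japanese)))
        {
            System.out.println("font " + run.fontIndex + ": [" + run.text + "]");
        }
    }
}
```

With real fonts, each predicate could wrap the same check the example above uses: call font.encode on the single code point and treat an IllegalArgumentException as "not supported". Each resulting run would then be written with cs.setFont followed by cs.showText. Putting the Latin font first in the list keeps German, Spanish, Portuguese and English text in one font even when the CJK font also contains some Latin glyphs.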


Alternatives to arialuni can be found here: