Apache FOP: problem with Cyrillic letters
I'm using the Apache FOP library to generate some PDF files in a Java 8 project. English content renders without any problems, but Russian characters come out garbled. They look like this: ŦŦ¾Ð̧н

This looks like an encoding problem, but how do I fix it? Here is the class I use to generate the PDF:
public class PdfGenerationTools implements StreamResource.StreamSource
{
    String content;

    public PdfGenerationTools(String content) {
        this.content = content;
    }

    @Override
    public InputStream getStream()
    {
        ByteArrayInputStream foStream =
            new ByteArrayInputStream(content.getBytes(StringTools.UTF8));
        // Basic FOP configuration. You could create this object
        // just once and keep it.
        FopFactory fopFactory = FopFactory.newInstance();
        fopFactory.setStrictValidation(false); // For an example
        // Configuration for this PDF document - mainly metadata
        FOUserAgent userAgent = getFOUserAgent(fopFactory);
        // Transform to PDF
        ByteArrayOutputStream fopOut = new ByteArrayOutputStream();
        try {
            Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF,
                userAgent, fopOut);
            TransformerFactory factory =
                TransformerFactory.newInstance();
            Transformer transformer = factory.newTransformer();
            Source src = new
                javax.xml.transform.stream.StreamSource(foStream);
            Result res = new SAXResult(fop.getDefaultHandler());
            transformer.transform(src, res);
            fopOut.close();
            return new ByteArrayInputStream(fopOut.toByteArray());
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    private FOUserAgent getFOUserAgent(FopFactory factory)
    {
        FOUserAgent userAgent = factory.newFOUserAgent();
        userAgent.setProducer("Company");
        userAgent.setCreationDate(new Date());
        userAgent.setTitle("Printing jobs");
        userAgent.setTargetResolution(300); // DPI
        return userAgent;
    }

    public static String initDoc()
    {
        return "<?xml version='1.0' encoding='ISO-8859-1'?>"+
            "<fo:root xmlns:fo='http://www.w3.org/1999/XSL/Format'>"+
            "<fo:layout-master-set>"+
            "<fo:simple-page-master master-name='A4' margin='2cm'>"+
            "<fo:region-body />"+
            "</fo:simple-page-master>"+
            "</fo:layout-master-set>"+
            "<fo:page-sequence master-reference='A4'>"+
            "<fo:flow flow-name='xsl-region-body'>";
    }

    public static String closeDoc()
    {
        return "</fo:flow>"+
            "</fo:page-sequence>"+
            "</fo:root>";
    }

    public static String initTable()
    {
        return "<fo:block space-before.optimum=\"10pt\"></fo:block>" +
            "<fo:table table-layout=\"fixed\" border-width=\"1mm\" border-style=\"solid\">" +
            "<fo:table-column column-number=\"1\" column-width=\"50%\"/>" +
            "<fo:table-column column-number=\"2\" column-width=\"50%\"/>" +
            "<fo:table-body>";
    }

    public static String closeTable()
    {
        return "</fo:table-body>" +
            "</fo:table>";
    }

    public static String initTableRow()
    {
        return "<fo:table-row keep-together.within-page=\"always\">";
    }

    public static String closeTableRow()
    {
        return "</fo:table-row>";
    }

    public static String getCell(String ... args)
    {
        final StringBuilder sb = new StringBuilder();
        sb.append("<fo:table-cell padding=\"1mm\" border-width=\"1mm\" border-style=\"double\">");
        for (String arg : args)
        {
            sb.append("<fo:block font-family=\"SansSerif\">")
                .append(arg)
                .append("</fo:block>");
        }
        sb.append("</fo:table-cell>");
        return sb.toString();
    }
}
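When I change the encoding from "ISO-8859-1" to "UTF-8", my Cyrillic substrings look like this: "#####".

It looks like a font is missing here. You have to use a FOP configuration file that specifies the fonts to embed in the PDF document, for example: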
<?xml version="1.0" encoding="UTF-8"?>
<fop version='1.0'>
  <renderers>
    <renderer mime='application/pdf'>
      <fonts>
        <!-- TTF fonts -->
        <font kerning='yes' embed-url='c:\windows\fonts\arial.ttf'>
          <font-triplet name='Arial' style='normal' weight='normal' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\arialbd.ttf'>
          <font-triplet name='Arial' style='normal' weight='bold' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\ariali.ttf'>
          <font-triplet name='Arial' style='italic' weight='normal' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\arialbi.ttf'>
          <font-triplet name='Arial' style='italic' weight='bold' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\times.ttf'>
          <font-triplet name='TimesNewRoman' style='normal' weight='normal' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\timesbd.ttf'>
          <font-triplet name='TimesNewRoman' style='normal' weight='bold' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\timesi.ttf'>
          <font-triplet name='TimesNewRoman' style='italic' weight='normal' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\timesbi.ttf'>
          <font-triplet name='TimesNewRoman' style='italic' weight='bold' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\cour.ttf'>
          <font-triplet name='CourierNew' style='normal' weight='normal' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\courbd.ttf'>
          <font-triplet name='CourierNew' style='normal' weight='bold' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\couri.ttf'>
          <font-triplet name='CourierNew' style='italic' weight='normal' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\courbi.ttf'>
          <font-triplet name='CourierNew' style='italic' weight='bold' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\verdana.ttf'>
          <font-triplet name='Verdana' style='normal' weight='normal' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\verdanab.ttf'>
          <font-triplet name='Verdana' style='normal' weight='bold' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\verdanai.ttf'>
          <font-triplet name='Verdana' style='italic' weight='normal' />
        </font>
        <font kerning='yes' embed-url='c:\windows\fonts\verdanaz.ttf'>
          <font-triplet name='Verdana' style='italic' weight='bold' />
        </font>
      </fonts>
    </renderer>
  </renderers>
</fop>
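
Registering a font is only part of the fix: the FO blocks also have to request a font-family whose name matches one of the configured font-triplet names. The question's getCell hard-codes font-family="SansSerif", which is not among the triplets above, so it would most likely still resolve to a built-in font without Cyrillic glyphs. The following is a minimal sketch, not the asker's actual code: the class name FoCellSketch and the extra fontFamily parameter are hypothetical, and "Arial" is assumed to be registered as in the configuration above.

```java
public class FoCellSketch {

    /**
     * Hypothetical variant of the question's getCell that takes the
     * font family as a parameter instead of hard-coding "SansSerif".
     */
    static String getCell(String fontFamily, String... args) {
        StringBuilder sb = new StringBuilder();
        sb.append("<fo:table-cell padding=\"1mm\" border-width=\"1mm\" border-style=\"double\">");
        for (String arg : args) {
            // The font-family value must match a font-triplet name from the FOP config.
            sb.append("<fo:block font-family=\"").append(fontFamily).append("\">")
              .append(arg)
              .append("</fo:block>");
        }
        sb.append("</fo:table-cell>");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: a font-triplet named "Arial" is configured.
        // The escapes spell a short Russian word, to keep this source file ASCII-only.
        System.out.println(getCell("Arial", "\u041F\u0440\u0438\u0432\u0435\u0442"));
    }
}
```

Passing the family name in keeps the FO markup and the configuration file in sync when the font changes, for example when moving from Windows to Linux.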
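Comments:

This looks like multi-byte UTF-8 being interpreted as some single-byte ISO/Windows encoding. For the rest, run a few small tests, since this could be either a font configuration problem or an encoding problem.

Adding a small FO snippet with Cyrillic characters would help you get an answer; otherwise there is no way to try to reproduce your problem.

I've added a snippet above to show how I generate the PDF content.

I finally got back to this issue. The problem is that I'm working under Ubuntu 14, so there are no MS fonts here.

You can use any font that contains Cyrillic letters. You can also install the MS fonts on Ubuntu: open the Ubuntu Software Center and search for "ttf-mscorefonts-installer". This installs Microsoft's core fonts.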
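
The encoding diagnosis in the first comment can be reproduced without FOP. The question's code encodes the FO document with UTF-8 (content.getBytes(StringTools.UTF8)) while the XML declaration in initDoc announces ISO-8859-1, so each two-byte Cyrillic character is read as two Latin-1 characters. A minimal sketch of that effect (the class and method names are hypothetical):

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    /** Encodes text as UTF-8, then (wrongly) decodes those bytes as ISO-8859-1. */
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String cyrillic = "\u043D"; // CYRILLIC SMALL LETTER EN
        String garbled = misdecode(cyrillic);
        // One Cyrillic letter is two bytes in UTF-8 (0xD0 0xBD),
        // so the wrong decoding yields two Latin-1 characters.
        System.out.println(cyrillic.length() + " char -> " + garbled.length() + " chars");
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

Making the declared encoding match the bytes actually written (encoding='UTF-8' in the XML declaration) removes this mismatch. As the question notes, the text then shows up as "#" until a font with Cyrillic glyphs is configured.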
// Configure fopFactory as desired; here, load the font configuration file
// (FOP 1.x API, as used in the question)
FopFactory fopFactory = FopFactory.newInstance();
FOUserAgent foUserAgent = fopFactory.newFOUserAgent();
fopFactory.setUserConfig(new File("fop.xml"));
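
On a Linux machine without the Windows font paths, the same configuration structure can point at any installed TrueType font that covers Cyrillic. The sketch below only builds such a configuration as a string. The class and method names are hypothetical, and both the DejaVu Sans path and the triplet name "DejaVuSans" are assumptions, so check that the file exists on your system before using it.

```java
public class FopFontConfigSketch {

    /** Builds a minimal FOP user configuration that embeds a single TrueType font. */
    static String buildConfig(String fontPath, String familyName) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<fop version='1.0'>\n"
             + "  <renderers>\n"
             + "    <renderer mime='application/pdf'>\n"
             + "      <fonts>\n"
             + "        <font kerning='yes' embed-url='" + fontPath + "'>\n"
             + "          <font-triplet name='" + familyName + "' style='normal' weight='normal' />\n"
             + "        </font>\n"
             + "      </fonts>\n"
             + "    </renderer>\n"
             + "  </renderers>\n"
             + "</fop>\n";
    }

    public static void main(String[] args) {
        // Assumed location of DejaVu Sans on Ubuntu; adjust to your system.
        String path = "/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf";
        System.out.print(buildConfig(path, "DejaVuSans"));
    }
}
```

The output can be saved as fop.xml and loaded as in the snippet above; the FO blocks then need font-family="DejaVuSans" (or whatever triplet name was chosen).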