Java StringEscapeUtils.unescapeHtml doesn't work on strings read from a file
I am trying to read a file that contains Unicode escape sequences, convert them to the corresponding symbols, and print the resulting text to a new file. I am trying to use StringEscapeUtils.unescapeHtml for this, but the lines are just printed as they are, with the Unicode code points still intact. As an exercise, I copied a line from the file, made a String from it, and called StringEscapeUtils.unescapeHtml on that, and it worked fine. My code is as follows:
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
// Assumption: Apache Commons Lang 3, which provides unescapeHtml3
import org.apache.commons.lang3.StringEscapeUtils;

class FileWrite
{
    public static void main(String[] args)
    {
        try {
            String testString = " \"text\":\"Dude With Knit Hat At Party Calls Beer \u2018Libations\u2019 http://t.co/rop8NSnRFu\" ";
            FileReader instream = new FileReader("Home Timeline.txt");
            BufferedReader b = new BufferedReader(instream);
            FileWriter fstream = new FileWriter("out.txt");
            BufferedWriter out = new BufferedWriter(fstream);
            // This gives the desired output, with unicode points converted
            out.write(StringEscapeUtils.unescapeHtml3(testString) + "\n");
            // readLine() returns null at end of file, so do not call toString() on it
            String line = b.readLine();
            while (line != null) {
                out.write(StringEscapeUtils.unescapeHtml3(line) + "\n");
                line = b.readLine();
            }
            // Close the input and output streams
            b.close();
            out.close();
        }
        catch (Exception e) { // Catch exception if any
            System.err.println("Error: " + e.getMessage());
        }
    }
}
Your strings are being read and written using your platform's default encoding. To specify the charset explicitly as "UTF-8":

Input stream:
BufferedReader b = new BufferedReader(new InputStreamReader(
        new FileInputStream("Home Timeline.txt"),
        Charset.forName("UTF-8")));
Output stream:
BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
        new FileOutputStream("out.txt"),
        Charset.forName("UTF-8")));
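For comparison, here is a minimal sketch of the same idea using java.nio.file.Files and StandardCharsets.UTF_8 from the standard library instead of the stream constructors. The class name Utf8Copy and the helper copyLines are made up for illustration, and main works on temporary files rather than the file names from the question.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class Utf8Copy {
    // Hypothetical helper: copies a text file line by line,
    // decoding and encoding explicitly as UTF-8.
    public static void copyLines(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.newLine();
            }
        } // both streams are closed here, even if an exception is thrown
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the real input and output files.
        Path in = Files.createTempFile("timeline", ".txt");
        Path out = Files.createTempFile("out", ".txt");
        Files.write(in, "\u2018Libations\u2019\n".getBytes(StandardCharsets.UTF_8));

        copyLines(in, out);

        List<String> lines = Files.readAllLines(out, StandardCharsets.UTF_8);
        System.out.println(lines.get(0).equals("\u2018Libations\u2019")); // prints true
    }
}
```

Note that this only controls how bytes are decoded into characters. It does not, by itself, turn escape sequences that are stored as plain text in the file into the characters they name.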
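You are mistaken. Java unescapes string literals of this form at compile time, as it builds them into the class file:
"\u2018Libations\u2019"
By the time this code runs, the string contains no escapes. The method you chose is designed to unescape HTML entities such as &lsquo;.
You probably want this method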
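To make the difference concrete, the sketch below compares a compiled string literal with a string that holds the escape sequence as literal text, which is what a line read from a file would contain. The helper unescapeUnicode is hypothetical and hand-rolled so that the example has no dependencies. It is not the method the answer refers to, and it handles only the four-hex-digit form. Apache Commons Lang's StringEscapeUtils.unescapeJava is one library method that performs this kind of conversion, though whether that is the method meant here is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeEscapeDemo {
    // Matches a literal backslash, the letter u, and four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Hypothetical helper (not a library API): replaces each textual
    // escape sequence with the character it names.
    public static String unescapeUnicode(String s) {
        Matcher m = ESCAPE.matcher(s);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(sb, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // The compiler converts these escapes, so the string holds two quote characters.
        String compiled = "\u2018Libations\u2019";
        // Doubled backslashes keep the escapes as text, like a line read from a file.
        String fromFile = "\\u2018Libations\\u2019";

        System.out.println(compiled.length());                          // prints 11
        System.out.println(fromFile.length());                          // prints 21
        System.out.println(unescapeUnicode(fromFile).equals(compiled)); // prints true
    }
}
```

The two lengths show why the test string in the question appeared to work: it never contained an escape at run time, so there was nothing for the HTML unescaper to do.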