Java: UTF-8 with the standard openStream and DocumentBuilder
I need to convert the output to UTF-8, because the output does not handle special characters.
Does anyone know how to do this?
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
URL u = new URL("http://www.aredacao.com.br/tv-saude");
Document doc = builder.parse(u.openStream());
NodeList nodes = doc.getElementsByTagName("item");
The problem is that the site returns [...], but it should return [...].
One solution is to translate each element's text yourself:
static void readData()
        throws IOException,
               ParserConfigurationException,
               SAXException {

    DocumentBuilder builder =
        DocumentBuilderFactory.newInstance().newDocumentBuilder();
    URL u = new URL("http://www.aredacao.com.br/tv-saude");
    // parse(String) takes a URI; the parser fetches and decodes the document itself.
    Document doc = builder.parse(u.toString());

    NodeList nodes = doc.getElementsByTagName("item");
    for (int i = 0; i < nodes.getLength(); i++) {
        Node node = nodes.item(i);
        Element el = (Element) node;

        // Re-decode each text value, since the parser decoded it with the wrong charset.
        String title =
            el.getElementsByTagName("title").item(0).getTextContent();
        title = treatCharsAsUtf8Bytes(title);

        String description =
            el.getElementsByTagName("description").item(0).getTextContent();
        description = treatCharsAsUtf8Bytes(description);

        System.out.println("title=" + title);
        System.out.println("description=" + description);
        System.out.println();
    }
}

// Turns each char back into the single byte it came from (ISO-8859-1),
// then decodes those bytes as the UTF-8 they really are.
private static String treatCharsAsUtf8Bytes(String s) {
    byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
    return new String(bytes, StandardCharsets.UTF_8);
}
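To see why the helper above repairs the text, here is a small self-contained sketch. It simulates the faulty decoding by reading the UTF-8 bytes of a string as ISO-8859-1, then applies the same re-decoding step. The class name MojibakeDemo, the misdecode helper, and the sample word are assumptions made for illustration; they are not part of the original code.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Same logic as treatCharsAsUtf8Bytes in the answer:
    // chars -> original bytes (ISO-8859-1) -> correct text (UTF-8).
    public static String treatCharsAsUtf8Bytes(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: simulates a parser that decodes UTF-8 bytes
    // as if they were ISO-8859-1, producing one char per byte.
    public static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "Sa\u00fade";           // 5 chars, 6 bytes in UTF-8
        String garbled = misdecode(original);      // 6 chars, one per byte
        String repaired = treatCharsAsUtf8Bytes(garbled);

        System.out.println("garbled length = " + garbled.length());
        System.out.println("repaired equals original: " + repaired.equals(original));
    }
}
```

The round trip is lossless only because ISO-8859-1 maps every byte value to exactly one character. If the text had been decoded with a different charset (for example windows-1252), this repair would not be reliable.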
Another possibility is to write a subclass of FilterInputStream that replaces the incorrect [...]
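A lighter-weight option, which is a different technique from the FilterInputStream subclass mentioned above, is to hand the parser a character stream instead of a byte stream. When an InputSource wraps a Reader, the SAX parser reads the characters directly and disregards the encoding declared inside the document. The sketch below assumes the feed's bytes really are UTF-8 and that only the declared encoding is wrong. The names ForceUtf8Parse, parseAsUtf8, and firstTitle are hypothetical, and the sample XML is invented for the demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class ForceUtf8Parse {

    // Decodes the bytes as UTF-8 ourselves, so the parser never consults
    // the (assumed wrong) encoding in the XML declaration.
    public static Document parseAsUtf8(InputStream in)
            throws IOException, ParserConfigurationException, SAXException {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        return builder.parse(new InputSource(reader));
    }

    // Hypothetical demo helper: returns the text of the first <title> element.
    public static String firstTitle(byte[] xmlBytes) {
        try {
            Document doc = parseAsUtf8(new ByteArrayInputStream(xmlBytes));
            return doc.getElementsByTagName("title").item(0).getTextContent();
        } catch (IOException | ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        // Invented sample: declares ISO-8859-1 but is actually encoded as UTF-8.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
            + "<rss><channel><item><title>Sa\u00fade</title></item></channel></rss>";
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);

        System.out.println("title decoded correctly: "
            + firstTitle(bytes).equals("Sa\u00fade"));
    }
}
```

With the feed from the question, the call would look like parseAsUtf8(u.openStream()). No per-element repair is needed afterwards, because the text is decoded correctly before the parser sees it.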