Java XML and dom4j: how do I retrieve values using an iterator?
I am iterating over all the data in the sample XML below and am confused about how to get the values I need.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet title="XSL_formatting" type="text/xsl" href="/i/xml/xsl_formatting_rss.xml"?>
<rss xmlns:blogChannel="http://backend.userland.com/blogChannelModule" version="2.0">
<channel>
<title>Ariana Resources News</title>
<link>http://www.iii.co.uk/investment/detail?code=cotn:AAU.L&amp;display=news</link>
<description />
<item>
<title>Ariana Resources PLC - Environmental Impact Assessment Submitted for Kiziltepe</title>
<link>http://www.iii.co.uk/investment/detail?code=cotn:AAU.L&amp;display=news&amp;action=article&amp;articleid=9084833&amp;from=rss</link>
<description>Some Article information</description>
<pubDate>Fri, 30 Aug 2013 07:00:00 GMT</pubDate>
</item>
<item>
<title>Ariana Resources PLC - Directors' Dealings and Holding in Company</title>
<link>http://www.iii.co.uk/investment/detail?code=cotn:AAU.L&amp;display=news&amp;action=article&amp;articleid=9053338&amp;from=rss</link>
<description>Some Article information</description>
<pubDate>Wed, 31 Jul 2013 07:00:00 GMT</pubDate>
</item>
<item>
<title>Ariana Resources PLC - Directorship Changes</title>
<link>http://www.iii.co.uk/investment/detail?code=cotn:AAU.L&amp;display=news&amp;action=article&amp;articleid=9046582&amp;from=rss</link>
<description>Some Article information</description>
<pubDate>Wed, 24 Jul 2013 09:31:00 GMT</pubDate>
</item>
<item>
<title>Ariana Resources PLC - Ariana Resources plc : Capital Reorganisation</title>
<link>http://www.iii.co.uk/investment/detail?code=cotn:AAU.L&amp;display=news&amp;action=article&amp;articleid=9038706&amp;from=rss</link>
<description>Some Article information</description>
<pubDate>Wed, 24 Jul 2013 09:31:00 GMT</pubDate>
</item>
<item>
</channel>
</rss>
Use dom4j's XPath support:
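import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.List;
import java.util.Locale;
import org.dom4j.Node;
import org.joda.time.DateTime;

// Place the root element of the parsed document into the variable theXML first.
// Note: XPath queries in dom4j need the jaxen dependency on the classpath.
List<? extends Node> items =
    (List<? extends Node>) theXML.selectNodes("//rss/channel/item");

// RFC-dictated date format used with RSS
DateFormat dateFormatterRssPubDate =
    new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss Z", Locale.ENGLISH);

// today started at this time (Joda-Time)
DateTime timeTodayStartedAt = new DateTime().withTimeAtStartOfDay();

for (Node node : items) {
    // valueOf evaluates the XPath relative to the current <item> node
    String pubDate = node.valueOf("pubDate");
    DateTime date = new DateTime(dateFormatterRssPubDate.parse(pubDate));
    if (date.isAfter(timeTodayStartedAt)) {
        // it's today, do something!
        System.out.println("Today: " + date);
    } else {
        System.out.println("Not today: " + date);
    }
}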
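The date check above relies on Joda-Time. As a rough alternative sketch, assuming Java 8 or later, the same pubDate strings can probably be handled with the JDK's own java.time classes. The class name PubDateCheck and the method names are made up for this illustration and are not part of dom4j or of the original answer.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of dom4j or the original answer.
class PubDateCheck {

    // Parses an RSS pubDate such as "Fri, 30 Aug 2013 07:00:00 GMT"
    // using the JDK's built-in RFC 1123 formatter.
    static ZonedDateTime parsePubDate(String text) {
        return ZonedDateTime.parse(text.trim(), DateTimeFormatter.RFC_1123_DATE_TIME);
    }

    // True if the pubDate falls on the given calendar day when viewed in the given zone.
    static boolean isOnDay(String pubDate, LocalDate day, ZoneId zone) {
        return parsePubDate(pubDate)
                .withZoneSameInstant(zone)
                .toLocalDate()
                .equals(day);
    }

    public static void main(String[] args) {
        LocalDate wanted = LocalDate.of(2013, 7, 31);
        String[] samples = {
            "Fri, 30 Aug 2013 07:00:00 GMT",
            "Wed, 31 Jul 2013 07:00:00 GMT"
        };
        for (String sample : samples) {
            System.out.println(sample + " on " + wanted + "? "
                    + isOnDay(sample, wanted, ZoneOffset.UTC));
        }
    }
}
```

In the loop above, the string returned by node.valueOf("pubDate") would be what gets passed to isOnDay. Passing the zone explicitly keeps the "is it today" decision independent of the machine's default time zone.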
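Your second loop is fine. You have to navigate through the element hierarchy to find the elements you are interested in, so you are already on the right track. Here is how you could continue: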
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.Iterator;
import java.util.Locale;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class Dom4JRssParser {

    private void parse(Date day) throws DocumentException, ParseException {
        Date dayOnly = removeTime(day);
        // pubDate format in the feed, e.g. "Fri, 30 Aug 2013 07:00:00 GMT"
        SimpleDateFormat sdfXml = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss z", Locale.ENGLISH);
        System.out.println("Day: " + sdfXml.format(dayOnly));

        SAXReader reader = new SAXReader();
        Document doc = reader.read(getClass().getResourceAsStream("/com/so/dom4j/parser/rss/example_01.xml"));
        Element root = doc.getRootElement(); // rss

        // Walk rss -> channel -> item with nested element iterators
        for (Iterator rootIt = root.elementIterator("channel"); rootIt.hasNext(); ) {
            Element channel = (Element) rootIt.next();
            for (Iterator itemIt = channel.elementIterator("item"); itemIt.hasNext(); ) {
                Element item = (Element) itemIt.next();
                Element pubDate = item.element("pubDate");
                if (pubDate != null) {
                    // Compare on the calendar day only, ignoring the time of day
                    if (removeTime(sdfXml.parse(pubDate.getTextTrim())).equals(dayOnly)) {
                        // Rns is not defined in this snippet; it is assumed to be a
                        // value class that accepts the four child elements.
                        Rns rns = new Rns(item.element("title"),
                                          item.element("link"),
                                          item.element("description"),
                                          item.element("constituent"));
                        System.out.println(rns.toString());
                        System.out.println();
                    }
                }
            }
        }
    }

    // Returns the given date with the time of day reset to 00:00:00.000 (default time zone)
    private Date removeTime(Date day) {
        Calendar c = Calendar.getInstance(Locale.ENGLISH);
        c.setTime(day);
        c.set(Calendar.HOUR_OF_DAY, 0);
        c.set(Calendar.MINUTE, 0);
        c.set(Calendar.SECOND, 0);
        c.set(Calendar.MILLISECOND, 0);
        return c.getTime();
    }

    public static void main(String... args) throws ParseException, DocumentException {
        Dom4JRssParser o = new Dom4JRssParser();
        if (args.length == 0) {
            // No argument: filter for today's items
            o.parse(new Date());
        } else {
            // Each argument is a day in yyyyMMdd form, e.g. 20130731
            SimpleDateFormat sdfInput = new SimpleDateFormat("yyyyMMdd");
            for (String arg : args) {
                o.parse(sdfInput.parse(arg));
            }
        }
    }
}
Test run with the argument
20130731
Output
Day: Wed, 31 Jul 2013 00:00:00 CEST
Rns [rnsHeadline=Ariana Resources PLC - Directors' Dealings and Holding in Company
rnsLink=http://www.iii.co.uk/investment/detail?code=cotn:AAU.L&display=news&action=article&articleid=9053338&from=rss
rnsFullText=Some Article information
rnsConstituentName=]
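You could also consider using the XPath API (section "Powerful Navigation with XPath" in the quick start link you posted), since it is more convenient; see eis's answer.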
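dom4j's XPath calls such as selectNodes depend on the Jaxen library being on the classpath. If adding that dependency is not possible, one option is to run the same kind of query with the JDK's built-in javax.xml.xpath package instead of dom4j. The following is only a minimal sketch of that swapped-in approach; the class name JdkXPathRss and the method itemTitles are invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper using only JDK classes (no dom4j, no Jaxen).
class JdkXPathRss {

    // Returns the text of every /rss/channel/item/title in the given RSS document.
    static List<String> itemTitles(String rssXml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // Select all <item> nodes in one query
            NodeList items = (NodeList) xpath.evaluate(
                    "/rss/channel/item",
                    new InputSource(new StringReader(rssXml)),
                    XPathConstants.NODESET);
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                // Relative XPath, evaluated against the current <item> node
                titles.add(xpath.evaluate("title", items.item(i)));
            }
            return titles;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate XPath on the given XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<rss version=\"2.0\"><channel>"
                + "<item><title>First</title><pubDate>Fri, 30 Aug 2013 07:00:00 GMT</pubDate></item>"
                + "<item><title>Second</title><pubDate>Wed, 31 Jul 2013 07:00:00 GMT</pubDate></item>"
                + "</channel></rss>";
        for (String title : itemTitles(xml)) {
            System.out.println(title);
        }
    }
}
```

The relative expression "title" plays the same role as node.valueOf("title") in dom4j; "pubDate", "link" and "description" could be read the same way.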
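Where exactly is the problem?
It's in getting the values beneath the element. That is what has me stumped.
This is great, it worked. I was looking for all the values in there and managed to use your code. Thanks.
I know this is old. However, the declaration of List... I resolved the issue. It needed the Jaxen library, which I did not have installed. All done. Thanks for your help.
@TheMightyLlama Yes, as I wrote in my answer, XPath queries need the jaxen dependency. Glad it worked!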