Retrieving values from XML tags in Java

java, xml, nlp

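I have a set of XML string outputs from a natural language tool and need to retrieve values from them. I also need to supply null values for tags that do not appear in the output string. I tried the Java code from a linked example, but it does not seem to work.

The current list of sample tags is:

<TimeStamp>, <Role>, <SpeakerId>, <Person>, <Location>, <Organization>

To use the Java code from that link (the updated version), I wrapped the output in <Dummy> and </Dummy>, as shown below:

<Dummy><TimeStamp>00.00.00</TimeStamp><Role>Speaker1</Role><SpeakerId>1234</SpeakerId>Blah, blah, blah.</Dummy>

However, it only returns dummy and null. I am still new to Java, so a detailed explanation would be much appreciated.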

Try it like this; I hope it helps:

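import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

File fXmlFile = new File("yourfile.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);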
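You can get the list of matching elements like this ("staff" is the tag name in this example; use your own, such as "Dummy"):

NodeList nList = doc.getElementsByTagName("staff");

Then get an individual item like this, where temp is an index from 0 to nList.getLength() - 1:

Node nNode = nList.item(temp);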
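
Since the question deals with XML held in a string rather than in a file, the same steps can be sketched with an InputSource wrapped around a StringReader. The class name StringParseSketch and the helpers parse and firstText are hypothetical names introduced only for this illustration. The sketch also shows one plausible reason for seeing "dummy and null": in the DOM, getNodeValue() returns null for element nodes, so the text has to be read from the element's content instead.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
public class StringParseSketch {

    // Parses XML held in a String instead of a file.
    public static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns the text of the first element named tag, or null if no such element exists.
    public static String firstText(Document doc, String tag) {
        NodeList list = doc.getElementsByTagName(tag);
        if (list.getLength() == 0) {
            return null;
        }
        return list.item(0).getTextContent();
    }

    public static void main(String[] args) {
        String xml = "<Dummy><TimeStamp>00.00.00</TimeStamp><Role>Speaker1</Role>"
                + "<SpeakerId>1234</SpeakerId>Blah, blah, blah.</Dummy>";
        Document doc = parse(xml);
        Element root = doc.getDocumentElement();

        System.out.println(root.getNodeName());          // prints "Dummy"
        System.out.println(root.getNodeValue());         // prints "null": elements have no node value
        System.out.println(firstText(doc, "TimeStamp")); // prints "00.00.00"
        System.out.println(firstText(doc, "Person"));    // prints "null": tag is absent
    }
}
```

Note that tag names are case-sensitive, so "Dummy" and "dummy" are different elements as far as getElementsByTagName is concerned.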

Show the code you are using, as well as the actual XML you are using as input.

Glad I could help :D

This is what I ended up doing for the Java wrapper (only TimeStamp is shown):

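  import java.io.IOException;
  import java.io.StringReader;
  import javax.xml.parsers.DocumentBuilder;
  import javax.xml.parsers.DocumentBuilderFactory;
  import javax.xml.parsers.ParserConfigurationException;
  import org.w3c.dom.Document;
  import org.w3c.dom.Element;
  import org.w3c.dom.Node;
  import org.w3c.dom.NodeList;
  import org.xml.sax.InputSource;
  import org.xml.sax.SAXException;

  public class NERPost {

      public String convertXML(String input) {
          String nerOutput = input;
          StringBuilder result = new StringBuilder();
          try {
              DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
              DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
              InputSource is = new InputSource();
              is.setCharacterStream(new StringReader(nerOutput));
              Document doc = docBuilder.parse(is);

              // normalize text representation
              doc.getDocumentElement().normalize();
              // tag names are case-sensitive, so this must match the <Dummy> wrapper exactly
              NodeList listOfDummies = doc.getElementsByTagName("Dummy");

              for (int s = 0; s < listOfDummies.getLength(); s++) {
                  Node dummyNode = listOfDummies.item(s);
                  if (dummyNode.getNodeType() == Node.ELEMENT_NODE) {
                      Element dummyElement = (Element) dummyNode;

                      // Convert each entity label --------------------------------

                      // TimeStamp: an absent tag yields an empty list, so the loop is simply skipped
                      NodeList timeStampList = dummyElement.getElementsByTagName("TimeStamp");
                      for (int i = 0; i < timeStampList.getLength(); i++) {
                          Element timeStampElement = (Element) timeStampList.item(i);
                          // getTextContent() also copes with an empty <TimeStamp/> element
                          String timeStampOutput = timeStampElement.getTextContent().trim();
                          String line = "<TimeStamp>" + timeStampOutput + "</TimeStamp>\n";
                          System.out.println(line);
                          result.append(line);
                      } // end for

                      // other XML tags
                      // .....
                  } // end if
              } // end for
          } catch (ParserConfigurationException | SAXException | IOException e) {
              // the original snippet elided its handler; printing the stack trace is an assumption
              e.printStackTrace();
          } // end try
          // the original snippet omitted the return value; returning the converted text is an assumption
          return result.toString();
      }
  }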
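
The wrapper above handles only TimeStamp. As a rough sketch of how the same lookup might be extended to every tag in the question's sample list, with null supplied for tags that are absent from the output, the values can be collected into a map. The class name TagValueSketch, the TAGS array, and the extract method are hypothetical names introduced for this illustration, and the sketch assumes that only the first occurrence of each tag matters.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
public class TagValueSketch {

    // Tag names taken from the sample list in the question.
    static final String[] TAGS = {
        "TimeStamp", "Role", "SpeakerId", "Person", "Location", "Organization"
    };

    // Maps every expected tag to its first value, or to null when the tag is absent.
    public static Map<String, String> extract(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        // LinkedHashMap keeps the tags in the order listed above and allows null values.
        Map<String, String> values = new LinkedHashMap<>();
        for (String tag : TAGS) {
            NodeList hits = doc.getElementsByTagName(tag);
            values.put(tag, hits.getLength() > 0 ? hits.item(0).getTextContent().trim() : null);
        }
        return values;
    }

    public static void main(String[] args) {
        String xml = "<Dummy><TimeStamp>00.00.00</TimeStamp><Role>Speaker1</Role>"
                + "<SpeakerId>1234</SpeakerId>Blah, blah, blah.</Dummy>";
        // prints {TimeStamp=00.00.00, Role=Speaker1, SpeakerId=1234, Person=null, Location=null, Organization=null}
        System.out.println(extract(xml));
    }
}
```

If a tag can occur several times in one output string, the map value could instead be a list built by looping over every item in the NodeList.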