Java: how do I access the text content of the next tag in XML?
Tags: java, xml
I have the following code:
public String depRel() throws SAXException, IOException,
        ParserConfigurationException, ClassNotFoundException,
        ClassCastException {
    String xmlString = Features.dependencyGraph();
    String result = "";
    String dependent = "";
    String governor = "";
    String type = "";
    // System.out.println("A value is :" + xmlString);
    // convert the string here so that it can be read as XML
    Document document = convertStringToDocument(xmlString);
    document.getDocumentElement().normalize();
    Element root = document.getDocumentElement();
    NodeList nList = document.getElementsByTagName("dependencies");
    for (int temp = 0; temp < nList.getLength(); temp++) {
        Node node = nList.item(temp);
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element eElement1 = (Element) node;
        }
        NodeList nodesDocPart = node.getChildNodes();
        for (int temp2 = 0; temp2 < nodesDocPart.getLength(); temp2++) {
            Node n = nodesDocPart.item(temp2);
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                Element el1 = (Element) n;
                type = el1.getAttribute("type");
            }
            // sentence
            NodeList nodesSentencePart = n.getChildNodes();
            for (int temp3 = 0; temp3 < nodesSentencePart.getLength(); temp3++) {
                Node sentence = nodesSentencePart.item(temp3);
                if (sentence.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement4 = (Element) sentence;
                    if (eElement4.getTagName().equals("dependent")) {
                        dependent = eElement4.getTextContent();
                    }
                    if (eElement4.getTagName().equals("governor")) {
                        governor = eElement4.getTextContent();
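The XML below describes the dependency graph of one sentence.
The sentence is: The production of human immunodeficiency virus type 1 (HIV-1) progeny was followed in the U937 promonocytic cell line after stimulation either with retinoic acid or PMA, and in purified human monocytes and macrophages.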
<dependencies style="typed">
<dep type="det">
<governor idx="2">production</governor>
<dependent idx="1">The</dependent>
</dep>
<dep type="nsubjpass">
<governor idx="14">followed</governor>
<dependent idx="2">production</dependent>
</dep>
<dep type="case">
<governor idx="7">type</governor>
<dependent idx="3">of</dependent>
</dep>
<dep type="amod">
<governor idx="7">type</governor>
<dependent idx="4">human</dependent>
</dep>
<dep type="compound">
<governor idx="7">type</governor>
<dependent idx="5">immunodeficiency</dependent>
</dep>
<dep type="compound">
<governor idx="7">type</governor>
<dependent idx="6">virus</dependent>
</dep>
<dep type="nmod:of">
<governor idx="2">production</governor>
<dependent idx="7">type</dependent>
</dep>
<dep type="nummod">
<governor idx="7">type</governor>
<dependent idx="8">1</dependent>
</dep>
<dep type="punct">
<governor idx="10">HIV-1</governor>
<dependent idx="9">-LRB-</dependent>
</dep>
<dep type="appos">
<governor idx="7">type</governor>
<dependent idx="10">HIV-1</dependent>
</dep>
<dep type="punct">
<governor idx="10">HIV-1</governor>
<dependent idx="11">-RRB-</dependent>
</dep>
<dep type="dep">
<governor idx="7">type</governor>
<dependent idx="12">progeny</dependent>
</dep>
<dep type="auxpass">
<governor idx="14">followed</governor>
<dependent idx="13">was</dependent>
</dep>
<dep type="case">
<governor idx="20">line</governor>
<dependent idx="15">in</dependent>
</dep>
<dep type="det">
<governor idx="20">line</governor>
<dependent idx="16">the</dependent>
</dep>
<dep type="compound">
<governor idx="20">line</governor>
<dependent idx="17">U937</dependent>
</dep>
<dep type="amod">
<governor idx="20">line</governor>
<dependent idx="18">promonocytic</dependent>
</dep>
<dep type="compound">
<governor idx="20">line</governor>
<dependent idx="19">cell</dependent>
</dep>
<dep type="nmod:in">
<governor idx="14">followed</governor>
<dependent idx="20">line</dependent>
</dep>
<dep type="case">
<governor idx="22">stimulation</governor>
<dependent idx="21">after</dependent>
</dep>
<dep type="nmod:after">
<governor idx="14">followed</governor>
<dependent idx="22">stimulation</dependent>
</dep>
<dep type="dep">
<governor idx="26">acid</governor>
<dependent idx="23">either</dependent>
</dep>
<dep type="case">
<governor idx="26">acid</governor>
<dependent idx="24">with</dependent>
</dep>
<dep type="amod">
<governor idx="26">acid</governor>
<dependent idx="25">retinoic</dependent>
</dep>
<dep type="nmod:with">
<governor idx="22">stimulation</governor>
<dependent idx="26">acid</dependent>
</dep>
<dep type="cc">
<governor idx="26">acid</governor>
<dependent idx="27">or</dependent>
</dep>
<dep type="nmod:with">
<governor idx="22">stimulation</governor>
<dependent idx="28">PMA</dependent>
</dep>
<dep type="conj:or">
<governor idx="26">acid</governor>
<dependent idx="28">PMA</dependent>
</dep>
<dep type="punct">
<governor idx="14">followed</governor>
<dependent idx="29">,</dependent>
</dep>
<dep type="cc">
<governor idx="14">followed</governor>
<dependent idx="30">and</dependent>
</dep>
<dep type="case">
<governor idx="34">monocytes</governor>
<dependent idx="31">in</dependent>
</dep>
<dep type="amod">
<governor idx="34">monocytes</governor>
<dependent idx="32">purified</dependent>
</dep>
<dep type="amod">
<governor idx="34">monocytes</governor>
<dependent idx="33">human</dependent>
</dep>
<dep type="conj:and">
<governor idx="14">followed</governor>
<dependent idx="34">monocytes</dependent>
</dep>
<dep type="cc">
<governor idx="34">monocytes</governor>
<dependent idx="35">and</dependent>
</dep>
<dep type="conj:and">
<governor idx="14">followed</governor>
<dependent idx="36">macrophages</dependent>
</dep>
<dep type="conj:and">
<governor idx="34">monocytes</governor>
<dependent idx="36">macrophages</dependent>
</dep>
<dep type="punct">
<governor idx="14">followed</governor>
<dependent idx="37">.</dependent>
</dep>
</dependencies>
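If I am at the "governor" tag, how can I access the "dependent" tag? I want to get all the governors and all the dependents of a word. How can I do that?

It looks like you want a collection of governor/dependent/word triples.
You can use the code below to build a collection of such objects, using a class I call GovernorDependentNode: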
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class GovernorDependentNode
{
    Node governor;
    Node dependent;
    String word;
}

List<GovernorDependentNode> getNodes( String word, InputSource is )
{
    List<GovernorDependentNode> gdNodes = new ArrayList<GovernorDependentNode>();
    try
    {
        // Select every <governor> whose text equals the word.
        // Note: the expression breaks if word contains a single quote.
        Object govs = XPathFactory.newInstance().newXPath().evaluate("//dep/governor[.='" + word + "']", is, XPathConstants.NODESET );
        if ( govs != null )
        {
            NodeList gNodes = (NodeList)govs;
            for ( int i = 0; i < gNodes.getLength(); i++ )
            {
                GovernorDependentNode gdNode = new GovernorDependentNode();
                Node gNode = gNodes.item(i);
                gdNode.governor = gNode;
                gdNode.word = word;
                // The matching <dependent> is a sibling under the same <dep> parent.
                NodeList childNodes = gNode.getParentNode().getChildNodes();
                for ( int j = 0; j < childNodes.getLength(); j++ )
                {
                    Node n = childNodes.item(j);
                    if ( n.getNodeName().equals( "dependent" ) )
                    {
                        gdNode.dependent = n;
                        break;
                    }
                }
                gdNodes.add( gdNode );
            }
        }
    }
    catch ( Exception e )
    {
        e.printStackTrace();
    }
    return gdNodes;
}
The output is:
Word : followed
Governor : followed
Dependent : production
Word : followed
Governor : followed
Dependent : was
Word : followed
Governor : followed
Dependent : line
Word : followed
Governor : followed
Dependent : stimulation
Word : followed
Governor : followed
Dependent : ,
Word : followed
Governor : followed
Dependent : and
Word : followed
Governor : followed
Dependent : monocytes
Word : followed
Governor : followed
Dependent : macrophages
Word : followed
Governor : followed
Dependent : .
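For the literal question (being at a governor element and needing the matching dependent), one option is to step up to the parent dep element and scan its children. The sketch below assumes every governor sits directly inside a dep element, as in the XML above. The class name SiblingLookup and its helper methods are made up for illustration and are not part of the original answer.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here only for illustration.
final class SiblingLookup {

    // Returns the first child element of parent with the given tag name, or null.
    static Element childElement(Element parent, String tagName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            // Whitespace between tags shows up as text nodes, so check the node type.
            if (n.getNodeType() == Node.ELEMENT_NODE && tagName.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    // From a <governor> element: go up to the enclosing <dep>, then down to <dependent>.
    // Assumes the parent of governor is an element (true for the XML in the question).
    static String dependentOf(Element governor) {
        Element dep = (Element) governor.getParentNode();
        Element dependent = childElement(dep, "dependent");
        return dependent == null ? null : dependent.getTextContent();
    }

    // Parses an in-memory XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<dependencies style=\"typed\">"
                + "<dep type=\"det\"><governor idx=\"2\">production</governor>"
                + "<dependent idx=\"1\">The</dependent></dep>"
                + "</dependencies>";
        Element governor = (Element) parse(xml).getElementsByTagName("governor").item(0);
        System.out.println(governor.getTextContent() + " -> " + dependentOf(governor));
    }
}
```

Scanning the parent's children rather than calling getNextSibling() once avoids depending on element order and on whitespace text nodes between the tags.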
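Comments:
"I want to get all the governors and all the dependents of a word": what is the word here? Is it the text of the governor node?
The word is the current word of the sentence I am parsing. I have to keep the sentence and find the governors and dependents for every word in it.
Shouldn't I parse the xmlString from an XML file? When I call the method with an InputSource, how does the compiler know that xmlString is XML?
It is just a string in XML format; the compiler does not know whether it is XML or not. If the string is not XML, the line XPathFactory.newInstance().newXPath().evaluate will throw an exception. You can check whether the string is XML before passing it to the method.
When I run this code, an exception is thrown at the following line: Object govs = XPathFactory.newInstance().newXPath().evaluate("//dep/governor[.='" + word + "']", is, XPathConstants.NODESET);
What is the exception? Can you post at least part of the stack trace?
Oh, OK, now I see your comment. Thank you! I will try to verify whether my string is in XML format.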
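The check suggested in the comments could look roughly like the sketch below. It wraps an in-memory string in an InputSource and tests whether the string parses as well-formed XML. The class name XmlStrings and both method names are assumptions made for this example. One thing worth noting: an InputSource backed by a Reader is consumed by the first parse, so calling getNodes repeatedly with the same InputSource instance is one possible cause of an exception on the evaluate line. Creating a fresh InputSource per call sidesteps that.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical utility class, named here only for illustration.
final class XmlStrings {

    // Wraps an XML string so it can be passed where an InputSource is expected.
    // Call this again for every parse or XPath evaluation; the reader is single-use.
    static InputSource toInputSource(String xml) {
        return new InputSource(new StringReader(xml));
    }

    // True if the string parses as well-formed XML. No schema validation is done.
    static boolean isWellFormed(String xml) {
        if (xml == null) {
            return false;
        }
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(toInputSource(xml));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }
}
```

With this in place, a call such as getNodes("followed", XmlStrings.toInputSource(xmlString)) could be guarded by XmlStrings.isWellFormed(xmlString).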
Usage (this is the call that produces the output shown above):
List<GovernorDependentNode> nodes = getNodes( "followed", inputSource );
for ( GovernorDependentNode node : nodes )
{
    System.out.println( "Word : " + node.word );
    System.out.println( "Governor : " + node.governor.getTextContent() );
    System.out.println( "Dependent : " + node.dependent.getTextContent());
}
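Since the goal stated in the question is to find the governors and dependents for every word of the sentence, a single pass that indexes all dep elements may be simpler than one XPath query per word. The following is only a sketch under that assumption. DependencyIndex, governorsOf and dependentsOf are hypothetical names. It keys the maps by word text, which merges repeated words (for example "human" at idx 4 and idx 33), so keying by the idx attribute may be preferable in practice.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical index class, named here only for illustration.
final class DependencyIndex {
    // word -> the words that govern it (the word appears as <dependent>)
    final Map<String, List<String>> governorsOf = new LinkedHashMap<>();
    // word -> the words that depend on it (the word appears as <governor>)
    final Map<String, List<String>> dependentsOf = new LinkedHashMap<>();

    // Builds both maps from the dependency XML in one pass over the <dep> elements.
    static DependencyIndex fromXml(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        DependencyIndex index = new DependencyIndex();
        NodeList deps = doc.getElementsByTagName("dep");
        for (int i = 0; i < deps.getLength(); i++) {
            Element dep = (Element) deps.item(i);
            String governor = text(dep, "governor");
            String dependent = text(dep, "dependent");
            if (governor == null || dependent == null) {
                continue; // skip incomplete <dep> entries
            }
            index.dependentsOf.computeIfAbsent(governor, k -> new ArrayList<>()).add(dependent);
            index.governorsOf.computeIfAbsent(dependent, k -> new ArrayList<>()).add(governor);
        }
        return index;
    }

    // Text of the first descendant element with the given tag, or null if absent.
    private static String text(Element dep, String tag) {
        NodeList list = dep.getElementsByTagName(tag);
        return list.getLength() == 0 ? null : list.item(0).getTextContent();
    }
}
```

After DependencyIndex.fromXml(xmlString), index.governorsOf.get(word) and index.dependentsOf.get(word) give the two lists for a word, or null when the word never appears in that role.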