Java: parsing XML with jsoup (while avoiding the <p> tags)


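This question is very similar in nature to an earlier one, but for Java rather than Python. Given XML like the following, I want the text without the <p> tags: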

<body.content>
  <block class="lead_paragraph">
    <p>LEAD: Two police officers responding to a reported robbery at a Brooklyn tavern early yesterday were themselves held up by the robbers, who took their revolvers and herded them into a back room with patrons, the police said.</p>
  </block>
  <block class="full_text">
    <p>LEAD: Two police officers responding to a reported robbery at a Brooklyn tavern early yesterday were themselves held up by the robbers, who took their revolvers and herded them into a back room with patrons, the police said.</p>
  </block>

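Update

In fact, my situation is a bit different: my XML contains some additional markup that I want to preserve (the inline <PERSON> tags in the updated sample below).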


You can use CSS selectors in jsoup.
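The snippet that originally accompanied this sentence is not preserved here, so the following is only a minimal sketch of the idea, not the answerer's code. It assumes the goal is the plain text of the paragraphs inside the `full_text` block; the class name `FullTextExample` and the helper `fullText` are hypothetical names introduced for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class FullTextExample {

    // Hypothetical helper: returns the text of every <p> that is a direct
    // child of <block class="full_text">, with the <p> tags themselves dropped.
    static String fullText(String markup) {
        // Jsoup.parse uses the lenient HTML parser, which accepts unknown tags
        // such as <block> and <body.content>.
        Document doc = Jsoup.parse(markup);
        // "block.full_text > p" = <p> children of <block> elements whose
        // class attribute contains "full_text".
        return doc.select("block.full_text > p").text();
    }

    public static void main(String[] args) {
        String markup = "<body.content>"
                + "<block class=\"lead_paragraph\"><p>LEAD: sample text.</p></block>"
                + "<block class=\"full_text\"><p>LEAD: sample text.</p></block>"
                + "</body.content>";
        System.out.println(fullText(markup));
    }
}
```

Selecting on the `full_text` class rather than on every `<p>` avoids printing the lead paragraph twice, since the sample repeats it in both blocks.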

Output for the sample in the question:

LEAD: Two police officers responding to a reported robbery at a Brooklyn tavern early yesterday were themselves held up by the robbers, who took their revolvers and herded them into a back room with patrons, the police said.

Update:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

String html = "<block class=\"full_text\">\n"
        + "    <p>SCHEINMAN</PERSON>--<PERSON>Alan</PERSON>. Happy Birthday. Thirteen years, many tears. Loving memories of your smile, humor, and laughter comfort us. You are always in our hearts. Love, <PERSON>Roni</PERSON>, <PERSON>Sandy</PERSON>, <PERSON>Jarret</PERSON>, <PERSON>Greg</PERSON>, <PERSON>Kate</PERSON>, and <PERSON>Auden Gray</PERSON></p></block></body.content></body></nitf>";
// Jsoup.parse uses the HTML parser, which lower-cases tag names (<PERSON> becomes <person>).
Document doc = Jsoup.parse(html);
// html() returns the inner markup of the matched block, nested tags included.
String link = doc.select("block.full_text").html();
System.out.println(link);

Output:
<p>SCHEINMAN--
 <person>
  Alan
 </person>. Happy Birthday. Thirteen years, many tears. Loving memories of your smile, humor, and laughter comfort us. You are always in our hearts. Love, 
 <person>
  Roni
 </person>, 
 <person>
  Sandy
 </person>, 
 <person>
  Jarret
 </person>, 
 <person>
  Greg
 </person>, 
 <person>
  Kate
 </person>, and 
 <person>
  Auden Gray
 </person></p>
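The output above still contains the `<p>` wrapper, and the HTML parser has lower-cased `<PERSON>` to `<person>`. As a hedged variation (not part of the original answer), jsoup's XML parser can be used instead: it keeps tag case, and calling `html()` on the `<p>` element itself returns its content without the wrapper. The class name `PreserveTagsExample`, the helper `innerMarkup`, and the shortened sample string are assumptions made for this sketch.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class PreserveTagsExample {

    // Hypothetical helper: returns the inner markup of the first <p> inside
    // <block class="full_text">, keeping inline tags such as <PERSON>.
    static String innerMarkup(String xml) {
        // The XML parser preserves tag case, unlike the default HTML parser.
        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
        // Turn off pretty-printing so inline tags are not split across lines.
        doc.outputSettings().prettyPrint(false);
        Element p = doc.selectFirst("block.full_text > p");
        // selectFirst returns null when nothing matches.
        return p == null ? "" : p.html();
    }

    public static void main(String[] args) {
        // Shortened, well-formed stand-in for the sample in the answer.
        String xml = "<block class=\"full_text\">"
                + "<p>Love, <PERSON>Roni</PERSON> and <PERSON>Sandy</PERSON></p>"
                + "</block>";
        System.out.println(innerMarkup(xml));
    }
}
```

If only the text is needed and the inline tags can go as well, `p.text()` on the same element would return the content with all markup removed.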


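Comments:

I added some updates. It seems my application is a bit different in a way I didn't think would matter, but apparently it does, because the solution you provided doesn't seem to work for my data. Please forgive me for not stating the difference more clearly in my original question.

What is your expected result? For me it works with your updated example. The output produced is: SCHEINMAN--Alan. Happy Birthday. Thirteen years, many tears. Loving memories of your smile, humor, and laughter comfort us. You are always in our hearts. Love, Roni, Sandy, Jarret, Greg, Kate, and Auden Gray

That's strange. I put in the code I'm using and it outputs nothing. Is there a way to keep those tags?

It still doesn't output anything in my program. Do you know why that might be?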
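The thread does not say why the asker's program printed nothing. One possible cause, offered here only as a guess, is a selector that matches no elements: `Elements.html()` on an empty selection returns an empty string rather than failing. A minimal diagnostic sketch, with the hypothetical names `SelectorCheck` and `describe`:

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class SelectorCheck {

    // Hypothetical helper: reports explicitly when a CSS query matches nothing,
    // instead of silently returning an empty string.
    static String describe(String markup, String cssQuery) {
        Document doc = Jsoup.parse(markup);
        Elements matches = doc.select(cssQuery);
        if (matches.isEmpty()) {
            return "no match for: " + cssQuery;
        }
        return matches.html();
    }

    public static void main(String[] args) {
        String markup = "<block class=\"full_text\"><p>Some text.</p></block>";
        // Correct class name: prints the inner markup of the block.
        System.out.println(describe(markup, "block.full_text"));
        // Misspelled class name: prints the "no match" message.
        System.out.println(describe(markup, "block.fulltext"));
    }
}
```

Checking `isEmpty()` first separates "the selector is wrong for this data" from "the element is there but has no content".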