Java: wrapping plain HTML text in a tag

Tags: java, regex, jsoup, text-parsing, tag-soup

I have the following structure in my HTML document:

<p>
"<em>You</em> began the evening well, Charlotte," said Mrs.&nbsp;Bennet with civil          self–command to Miss Lucas. "<em>You</em> were Mr.&nbsp;Bingley's first choice."
</p>

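But I need to wrap my "plain text" in a tag so that I can process it :)

<p>
<text>"</text><em><text>You</text></em><text> began the evening well, Charlotte," said Mrs.&nbsp;Bennet with civil self–command to Miss Lucas. "</text><em><text>You</text></em><text> were Mr.&nbsp;Bingley's first choice."</text>
</p>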

Is there a way to accomplish this? I have looked at TagSoup and jsoup, but neither seems to offer an easy way to solve this. Maybe with some fancy regexp?

Thanks

Here is a suggestion:

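import java.util.ArrayList;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;
import org.jsoup.parser.Tag;

// Build a <text> element containing the given string.
public static Node toTextElement(String str) {
    Element e = new Element(Tag.valueOf("text"), "");
    e.appendText(str);
    return e;
}

// Recursively replace every TextNode under root with a <text> element.
public static void replaceTextNodes(Node root) {
    if (root instanceof TextNode)
        root.replaceWith(toTextElement(((TextNode) root).text()));
    else
        // Iterate over a copy, since replaceWith changes the parent's child list.
        for (Node child : new ArrayList<>(root.childNodes()))
            replaceTextNodes(child);
}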
Test code:

String html = "<p>\"<em>You</em> began the evening well, Charlotte,\" " +
         "said Mrs.&nbsp;Bennet with civil self–command to Miss Lucas." +
         " \"<em>You</em> were Mr.&nbsp;Bingley's first choice.\"</p>";

Document doc = Jsoup.parse(html);

for (Node n : doc.body().children())
    replaceTextNodes(n);

System.out.println(doc);
Output:

<html>
 <head></head>
 <body>
  <p>
   <text>
    &quot;
   </text><em>
    <text>
     You
    </text></em>
   <text>
     began the evening well, Charlotte,&quot; said Mrs.&nbsp;Bennet with civil self–command to Miss Lucas. &quot;
   </text><em>
    <text>
     You
    </text></em>
   <text>
     were Mr.&nbsp;Bingley's first choice.&quot;
   </text></p>
 </body>
</html>

Works fine! Thanks. Actually, I'm trying to render HTML onto a canvas using the paint and draw-text methods. Is this a good start? :)
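
On the "fancy regexp" idea raised in the question: a minimal sketch, assuming simple, well-formed markup with no comments, scripts, CDATA sections, or `>` characters inside attribute values, might wrap every run of characters that sits between two tags. The names TextWrapSketch and wrapTextRuns are made up for illustration and are not part of jsoup or TagSoup. It uses only java.util.regex, and it is offered for comparison only; the DOM-based approach in the answer is the more robust route for real-world HTML.

```java
import java.util.regex.Pattern;

// Hypothetical helper, standard library only; not part of jsoup or TagSoup.
final class TextWrapSketch {

    // A run of non-tag characters directly preceded by '>' and followed by '<'.
    private static final Pattern TEXT_RUN = Pattern.compile("(?<=>)[^<>]+(?=<)");

    // Wraps each text run between two tags in <text>...</text>.
    // Assumption: simple, well-formed markup (no comments, scripts,
    // or '>' inside attribute values).
    static String wrapTextRuns(String html) {
        return TEXT_RUN.matcher(html).replaceAll("<text>$0</text>");
    }

    public static void main(String[] args) {
        String html = "<p>\"<em>You</em> began the evening well.</p>";
        System.out.println(wrapTextRuns(html));
    }
}
```

Note that whitespace-only runs between tags are wrapped as well, and text that is not enclosed by any tag (before the first tag or after the last one) is left untouched.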