Java jsoup: Element isn't removed from Elements


I'll start from the beginning. I have HTML with the following pattern:

<div id="post_message_(some numeric id)">
    <div style="some style things">
        <div class="smallfont" style="some style">useless text</div>
        <table cellpading="6" cellspaceing=.......> a lot of text inside i dont need</table>
    </div>
    Text i need
</div>
I added print statements to check whether it finds them, and yes, it does find the right ones. But later, when I convert it to a String:

String message = Jsoup.parse(divsInside.html().replaceAll("(?i)<br[^>]*>", "br2n")).text()
            .replaceAll("br2n", "\n");
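For some reason, the string again contains all the content I removed.

I tried removing them with an iterator, and also collecting the indexes and removing the elements by index, but the result is the same.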

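So you want to get the "Text i need" text. Use the Element ownText() method, which gets only the text owned by this element; it does not get the combined text of all its children.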

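import java.io.File;
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Entities.EscapeMode;

private static void test(String htmlFile) {
    try {
        // Parse the file as ASCII; the empty base URI means relative links are not resolved.
        Document doc = Jsoup.parse(new File(htmlFile), "ASCII", "");
        doc.outputSettings().charset("ASCII");
        doc.outputSettings().escapeMode(EscapeMode.base);

        // Look up the post container, here id = post_message_1.
        Element specificIdDiv = doc.getElementById("post_message_1");

        if (specificIdDiv != null) {
            // ownText() returns only this element's direct text, not its children's.
            System.out.println("content: " + specificIdDiv.ownText());
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}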
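
To make the difference concrete, here is a minimal, self-contained sketch that parses the HTML from a string instead of a file. The sample markup, the id value "post_message_1", the class name OwnTextDemo, and both helper methods are assumptions made for illustration; they are not from the question. It compares ownText() with the alternative of detaching the child elements from the DOM (Elements.remove() with no arguments) before calling text().

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class OwnTextDemo {
    // Assumed sample markup modeled on the pattern in the question.
    static final String HTML =
        "<div id=\"post_message_1\">"
      + "<div style=\"margin:5px\">"
      + "<div class=\"smallfont\">useless text</div>"
      + "<table><tr><td>a lot of text inside i dont need</td></tr></table>"
      + "</div>"
      + "Text i need"
      + "</div>";

    // Option 1: read only the text nodes that are direct children of the element.
    static String ownTextOf(String html, String id) {
        Document doc = Jsoup.parse(html);
        Element div = doc.getElementById(id);
        return div == null ? "" : div.ownText();
    }

    // Option 2: detach every child element from the DOM, then read the remaining text.
    static String textAfterRemovingChildren(String html, String id) {
        Document doc = Jsoup.parse(html);
        Element div = doc.getElementById(id);
        if (div == null) {
            return "";
        }
        div.children().remove(); // removes each matched element from the document
        return div.text();
    }

    public static void main(String[] args) {
        System.out.println(ownTextOf(HTML, "post_message_1"));                 // prints "Text i need"
        System.out.println(textAfterRemovingChildren(HTML, "post_message_1")); // prints "Text i need"
    }
}
```

Both calls give the same result here. ownText() is probably the simpler choice when nothing in the document needs to be modified.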

By the way, is it possible to extract line breaks like this?

By "extract", do you mean remove?
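
If the goal is to keep the line breaks rather than drop them, one possible approach is to walk the element's direct child nodes and turn each <br> into a newline. This is only a sketch under that assumption: the class OwnTextWithBreaks, its helper method, and the sample markup are hypothetical names for illustration, and it is an alternative to the "br2n" placeholder replacement from the question, not a confirmed answer to it.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class OwnTextWithBreaks {
    // Assumed sample markup: the wanted text contains a <br> line break.
    static final String HTML =
        "<div id=\"post_message_1\">"
      + "<div class=\"smallfont\">useless text</div>"
      + "first line<br>second line"
      + "</div>";

    // Collects the element's own text, mapping direct <br> children to '\n'.
    static String ownTextWithBreaks(String html, String id) {
        Element el = Jsoup.parse(html).getElementById(id);
        if (el == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (Node child : el.childNodes()) {
            if (child instanceof TextNode) {
                // Direct text node: append its whitespace-normalized text.
                sb.append(((TextNode) child).text());
            } else if (child instanceof Element
                    && "br".equals(((Element) child).tagName())) {
                // Direct <br> child: record a line break.
                sb.append('\n');
            }
            // Any other child element (nested div, table, ...) is skipped.
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(ownTextWithBreaks(HTML, "post_message_1"));
    }
}
```

Unlike the regex replacement, this never passes the nested markup through text(), so the content of child elements cannot leak back into the result.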