Java: How to get isolated text using Jsoup?

I have the following HTML:

<span>This is the first text</span>
More text here 
Another line of text
<span>Text in the span</span>
<span>Another text in span</span>
This is another line

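I would use a recursive method that takes a starting element and iterates over its child nodes. For each TextNode, it prints the text. For each Element, it recurses into that element's child nodes.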

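import java.io.File;
import java.io.IOException;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;
import org.jsoup.select.Elements;

// The class name is arbitrary; it only wraps the two methods so they compile
public class PrintText
{
    public static void main(String[] args) throws IOException
    {
        // I put your HTML in the body tag in a local file
        Document doc = Jsoup.parse(new File("input/20160505.html"), "UTF-8");
        Elements elements = doc.getElementsByTag("body");
        Element rootTag = elements.get(0);
        printTextOfTag(rootTag);
    }

    // Recursively prints every text node below currentTag, in document order
    public static void printTextOfTag(Element currentTag)
    {
        List<Node> nodes = currentTag.childNodes();
        for (Node n : nodes)
        {
            if (n instanceof TextNode)
            {
                System.out.println(((TextNode) n).text());
            }
            else if (n instanceof Element)
            {
                printTextOfTag((Element) n);
            }
        }
    }
}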
Output:

This is the first text

 More text here Another line of text 

Text in the span



Another text in span

 This is another line
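
The recursive version prints every text node, including the text inside the span elements and the whitespace-only nodes between them. If only the isolated text is wanted (the text that sits directly under the parent and is not wrapped in any tag), a minimal sketch could use Element.textNodes(), which returns just the direct TextNode children, and skip blank nodes with TextNode.isBlank(). The class name IsolatedText and the method name isolatedText are made up for illustration, and the HTML is inlined as a string here instead of being read from a file. It assumes jsoup is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

// Hypothetical helper class, not part of the original answer
public class IsolatedText
{
    // Collects the non-blank text nodes that are direct children of parent,
    // i.e. the text that is not wrapped in any child element
    static List<String> isolatedText(Element parent)
    {
        List<String> result = new ArrayList<>();
        for (TextNode node : parent.textNodes())
        {
            if (!node.isBlank())
            {
                // text() normalises runs of whitespace; trim() drops the edges
                result.add(node.text().trim());
            }
        }
        return result;
    }

    public static void main(String[] args)
    {
        // Same HTML as in the question, inlined for the sake of the example
        String html = "<span>This is the first text</span>\n"
                + "More text here \n"
                + "Another line of text\n"
                + "<span>Text in the span</span>\n"
                + "<span>Another text in span</span>\n"
                + "This is another line";
        Document doc = Jsoup.parse(html);
        for (String line : isolatedText(doc.body()))
        {
            System.out.println(line);
        }
    }
}
```

With the HTML from the question this should print only "More text here Another line of text" and "This is another line". Because textNodes() looks at direct children only, text nested deeper (inside the spans) is left out by design.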