Close-tag events when iterating the DOM in JavaScript


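I'm writing a Chrome extension that converts HTML pages into a different format.

If I call document.getElementsByTagName("*") and iterate over the resulting collection, I can see every tag. However, that is a flat representation. I need to detect open and close "events", as a SAX parser would report them, so that the translated output keeps the proper containment and nesting.

What is the right way to do this in JavaScript? Doing it by hand seems awkward. Is there another way?

To illustrate what I mean: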

   <html>
       <body>
           <h1>Header</h1>
           <div>
               <p>some text and a missing closing tag
               <p>some more text</p>
           </div>
           <p>some more dirty HTML
        </body>
    <html>

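It looks to me as though tracking SAX-parser-like events during the iteration is something I have to do myself. Do I have any other options? If not, could you point me to some sample code?

Thanks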

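I don't think there is a ready-made tool for this, so you should write a recursive function in which you get the first child node, get the next sibling node, get the parent node where needed, and so on.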
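The navigation-based approach described above can also be written without recursion. The following is a minimal sketch, not a definitive implementation: the function name walkEvents and the handler object with its open, text and close callbacks are hypothetical names introduced for illustration. Only nodeType, tagName, data, firstChild, nextSibling and parentNode are standard DOM properties, and the sketch assumes that root is an element node.

```javascript
// Hypothetical helper: emits SAX-like events while walking a DOM subtree
// using only firstChild, nextSibling and parentNode (no recursion).
// Assumes root is an element node.
function walkEvents(root, handler) {
    var ELEMENT = 1, TEXT = 3; // values of Node.ELEMENT_NODE and Node.TEXT_NODE
    var node = root;

    while (node) {
        if (node.nodeType === ELEMENT) {
            handler.open(node.tagName.toLowerCase());
            if (node.firstChild) {
                node = node.firstChild; // descend before visiting siblings
                continue;
            }
            // element without children: it closes immediately
            handler.close(node.tagName.toLowerCase());
        } else if (node.nodeType === TEXT) {
            handler.text(node.data);
        }
        // comments and other node types are skipped

        // climb until a next sibling exists, closing each parent we leave
        while (node !== root && !node.nextSibling) {
            node = node.parentNode;
            handler.close(node.tagName.toLowerCase());
        }
        if (node === root) {
            break; // the whole subtree has been reported
        }
        node = node.nextSibling;
    }
}

// Possible use in a page (the handler bodies are placeholders):
// walkEvents(document.documentElement, {
//     open:  function (tag)  { console.log(tag + " open"); },
//     text:  function (data) { console.log("text"); },
//     close: function (tag)  { console.log(tag + " close"); }
// });
```

Because the browser has already parsed the page, missing closing tags in the source are repaired in the DOM, so every open event reported here gets a matching close event.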

Just walk each node and all of its child nodes. When the children at a given level are exhausted, that tag is closed.

function parseChildren(node) {

    // a text node has no children and no open/close tags
    if (node.nodeType === Node.TEXT_NODE) {
        console.log("text");
        return;
    }

    // comments, doctypes and similar nodes have no tagName, so skip them
    if (node.nodeType !== Node.ELEMENT_NODE) {
        return;
    }

    var tag = node.tagName.toLowerCase();
    console.log(tag + " open");

    // parse the child nodes of this node
    for (var i = 0; i < node.childNodes.length; ++i) {
        parseChildren(node.childNodes[i]);
    }

    // all the children are used up, so this tag is done
    console.log(tag + " close");
}

Exactly what I needed. Thanks a lot.

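To avoid reporting text nodes that contain only whitespace, the text-node check in parseChildren can be written as:

    if (node.nodeType === Node.TEXT_NODE) {
        // if this node is all whitespace, don't report it
        if (/^\s*$/.test(node.data)) { return; }

        // otherwise, report it
        console.log("text");
        return;
    }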
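For the conversion use case in the question, the same open and close points can drive the output directly instead of logging. The sketch below is only an illustration under stated assumptions: toOutline is a hypothetical function name, and the indented brace format it produces is an invented target format, not one mentioned in the question. It relies only on the standard nodeType, tagName, data and childNodes properties.

```javascript
// Hypothetical converter: turns a DOM subtree into an indented outline,
// using the open/close points of the recursive walk to track nesting depth.
function toOutline(node, depth, lines) {
    depth = depth || 0;
    lines = lines || [];
    var indent = "  ".repeat(depth);

    if (node.nodeType === 3) { // text node
        // collapse runs of whitespace and drop whitespace-only text
        var text = node.data.replace(/\s+/g, " ").trim();
        if (text !== "") {
            lines.push(indent + '"' + text + '"');
        }
    } else if (node.nodeType === 1) { // element node
        lines.push(indent + node.tagName.toLowerCase() + " {"); // "open" event
        for (var i = 0; i < node.childNodes.length; ++i) {
            toOutline(node.childNodes[i], depth + 1, lines);
        }
        lines.push(indent + "}"); // "close" event
    }
    // other node types (comments, doctypes) are ignored

    return lines;
}

// Possible use in a page:
// console.log(toOutline(document.body).join("\n"));
```

Passing the depth down through the recursion is what preserves the nesting that the flat getElementsByTagName("*") collection loses.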