PHP: Is there a way to optimize finding text terms on a page (without using regular expressions)?


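After seeing several threads rubbish the regexp approach to finding matches in an HTML document, I used the Simple HTML DOM PHP parser to get the bits of text I'm looking for, but I'd like to know whether my code is optimal. It feels like I'm looping too many times. Is there a way to optimize the following loop?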

// Get the HTML and look at the text nodes
$html = str_get_html($buffer);
// First we match the <body> tag as we don't want to change the <head> items
foreach ($html->find('body') as $body) {
    // Then we get the text nodes, rather than any HTML
    foreach ($body->find('text') as $text) {
        // Then we match each term
        foreach ($terms as $term) {
            // Match to the terms within the text nodes
            $text->outertext = str_replace($term, '<span class="highlight">'.$term.'</span>', $text->outertext);
        }
    }
}

For example, would it make a difference to check whether there are any matches before starting the loop?

You don't need the outer foreach loop; a well-formed document generally has only one body tag. Instead, just use $body = $html->find('body', 0);

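However, since a loop with only one iteration is essentially equivalent at run time to not looping at all, this probably won't make much difference to performance. So in practice, even the original code has only 2 nested loops, not 3.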
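Of the two loops that remain, the inner one over $terms can be handed to PHP itself, because str_replace() accepts arrays for both the search and the replacement arguments. The following is only a minimal sketch of that idea: highlightTerms() is a hypothetical helper, not part of Simple HTML DOM, and it assumes the terms are plain strings that are safe to insert into HTML as they are.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): wraps every
// occurrence of each term in a highlight span.
function highlightTerms(string $text, array $terms): string
{
    if ($terms === []) {
        return $text; // nothing to highlight
    }

    // Build one replacement per term, in the same order as $terms.
    $replacements = array_map(
        static function (string $term): string {
            return '<span class="highlight">' . $term . '</span>';
        },
        $terms
    );

    // str_replace() applies the search/replace pairs one after another,
    // which mirrors what the original inner foreach loop did.
    return str_replace($terms, $replacements, $text);
}

echo highlightTerms('The quick brown fox', ['quick', 'fox']), PHP_EOL;
// prints: The <span class="highlight">quick</span> brown <span class="highlight">fox</span>
```

Inside the text-node loop this would be used as $text->outertext = highlightTerms($text->outertext, $terms);. Because the pairs are applied in order, a later term can still match inside markup inserted for an earlier one (a term such as "span", for instance), exactly as in the original loop.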

Asking out of ignorance: does find() accept arbitrary XPath expressions? If it does, you could collapse the two outer loops into one:

foreach($html->find('body/text') as $text) {
    ...
}

Not sure. It follows jQuery/CSS-style selector matching. Does that help?
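On the original question of whether checking for matches before looping helps: one cheap option is to test each term against the raw HTML with strpos() before parsing anything. This is a rough sketch under assumptions: termsPresentIn() is a hypothetical helper, and it relies on the assumption that a term missing from the raw $buffer cannot appear in any text node either.

```php
<?php
// Hypothetical helper: keeps only the terms that occur somewhere
// in the raw HTML string.
function termsPresentIn(string $buffer, array $terms): array
{
    return array_values(array_filter(
        $terms,
        static function (string $term) use ($buffer): bool {
            // Skip empty terms; strpos() is case-sensitive, like str_replace().
            return $term !== '' && strpos($buffer, $term) !== false;
        }
    ));
}

$buffer = '<html><head><title>Fox</title></head>'
        . '<body><p>The quick brown fox</p></body></html>';

$present = termsPresentIn($buffer, ['quick', 'fox', 'zebra']);
print_r($present); // contains 'quick' and 'fox', but not 'zebra'
```

If the returned array is empty, the call to str_get_html() and both loops can be skipped entirely. The check is only a coarse filter: a term may occur solely in the head or inside a tag name or attribute, so the text-node loop is still needed for the terms that pass.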