PHP regular expressions

Tags: php, html, regex, string, str-replace

I wrote a small piece of code that converts tables into divs for a mobile site.

Here is an excerpt of the code:

function replaceTables($table, $html) {

            $tempTable = preg_replace('/<table[^>]*>(.*?)<\/table>/is', '<div style="width: 90%; margin: auto;">$1</div><div style="clear: both;"></div>', $table);
            $html = str_replace($table, $tempTable, $html);

            preg_match_all('/(?!<table[^>]*>).*?<tr[^>]*>.*?<\/tr>.*?(?<!<\/table>)/is', $tempTable, $rows, PREG_OFFSET_CAPTURE);

            for ($i = 0; $i < count($rows[0]); $i++) {
                $tempRow = $rows[0][$i][0];

                preg_match_all('/(?!<table[^>]*>).*?<td[^>]*>.*?<\/td>.*?(?<!<\/table>)/is', $tempRow, $cols, PREG_OFFSET_CAPTURE);

                $numCols = count($cols[0]);
                $colWidth = 100/$numCols;

                for ($x = 0; $x < $numCols; $x++) {
                    $tempCol = $cols[0][$x][0];
                    $cols[0][$x][0] = preg_replace('/<td[^>]*>(.*?)<\/td>/is', '<div style="width: ' . $colWidth . '%; float: left;">$1</div>', $cols[0][$x][0]);
                    $tempRow = str_replace($tempCol, $cols[0][$x][0], $tempRow);
                }

                $tempRow = preg_replace('/<tr[^>]*>(.*?)<\/tr>/is', '<div style="clear: both;">$1</div>', $tempRow);
                $tempTable = str_replace($rows[0][$i][0], $tempRow, $tempTable);
            }

            $html = str_replace($table, $tempTable, $html);

            return $html;
        }

        if ($mobile && $page->type_id != 16) {
            // replace tables with divs for better mobile support

            preg_match_all('/<table[^>]*>.*?<\/table>/is', $this->html, $tables, PREG_OFFSET_CAPTURE);

            for ($y = 0; $y < count($tables[0]); $y++) {
                preg_match_all('/<table[^>]*>.*?<\/table>/is', $tables[0][$y][0], $nestedTables, PREG_OFFSET_CAPTURE);

                if (count($nestedTables[0]) > 0) {
                    //echo count($nestedTables[0]) . "<br />";
                    //print_r($nestedTables[0][0][0]);
                    // use a separate counter so the outer $y loop is not clobbered
                    for ($z = 0; $z < count($nestedTables[0]); $z++) {
                        $this->html = replaceTables($nestedTables[0][$z][0], $this->html);
                    }
                }
                $this->html = replaceTables($tables[0][$y][0], $this->html);
            }
            //$this->html = preg_replace('/<table[^>]*>(.*?)<\/table>/is', '<div style="width: 90%; margin: auto;">$1<div style="clear: both;"></div></div>', $this->html);
        }
        return $this->html;


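I have a problem with nested tables: the regular expression finds the first closing table tag that occurs, rather than the one I need it to find.

If anyone could point me to a better regular expression, or to a different solution for replacing tables with divs, that would be great. The solution has to work by manipulating the string, so that the templating system does not need a complete overhaul.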

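As many people have said, parsing HTML with regular expressions is unlikely to be the ideal approach. Still, I did some research to try to help, assuming you are committed to this approach for some reason.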

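It sounds like you may be running into an issue with how PHP interprets the greediness of regex patterns. I see that you use a lot of ? quantifiers, which probably makes these searches non-greedy (at least based on what I have read). You can change this by using the U modifier on some or all of your regex patterns. It reverses greediness, which will probably make your ? quantifiers greedy again.

That said, you have a complicated set of regex checks here, so it is certainly possible that this will cause some unexpected behavior as well. I suggest you try it and see.

For reference, you invoke the U modifier by placing it after the closing / of the regex, just as you have done in some places with i and s.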
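To make the greediness point concrete, here is a minimal sketch of what the U modifier does to the table pattern from the question. The two sample strings are made up for illustration. It suggests that flipping greediness trades one failure for another: the lazy pattern stops too early on nested tables, while the greedy one runs too far on sibling tables.

```php
<?php
// Hypothetical sample inputs, not taken from the original site.
$nested   = '<table>outer<table>inner</table>tail</table>';
$siblings = '<table>A</table><p>x</p><table>B</table>';

// Pattern from the question: ".*?" is lazy by default.
$lazy = '/<table[^>]*>.*?<\/table>/is';
// Same pattern with U appended: ".*?" now behaves greedily.
$flipped = '/<table[^>]*>.*?<\/table>/isU';

preg_match($lazy, $nested, $m1);      // stops at the FIRST </table>
preg_match($flipped, $nested, $m2);   // runs to the LAST </table>
preg_match($flipped, $siblings, $m3); // also runs to the last one, swallowing both tables

echo $m1[0], "\n"; // <table>outer<table>inner</table>
echo $m2[0], "\n"; // <table>outer<table>inner</table>tail</table>
echo $m3[0], "\n"; // <table>A</table><p>x</p><table>B</table>
```

Neither setting pairs an opening tag with its own closing tag, which is why the remaining answers in this thread lean toward a parser.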

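What has worked best for me is to process HTML content using the following steps:

1. Convert the content to UTF-8 with utf8_encode($s) (if it is not already UTF-8).
2. Convert it to XHTML with $tidy->repairFile($file, array('output-xhtml' => true), 'utf8').
3. Build the DOM with $sx = simplexml_load_file($file, 'SimpleXMLElement', LIBXML_NOENT).
4. Query the DOM with $sx->xpath($xpath).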
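The last two steps can be sketched as follows. This is only an illustration under stated assumptions: the markup in $xhtml is a made-up sample that is already well-formed, so the encoding and tidy steps are skipped, and simplexml_load_string() stands in for simplexml_load_file() to keep the example self-contained. It computes the per-column width that the question's code needs, and nested tables are handled by the parser instead of by a pattern.

```php
<?php
// Assumption: already well-formed XHTML (normally the output of the tidy step).
$xhtml = '<div>'
       . '<table><tr><td>a</td><td>'
       . '<table><tr><td>inner</td></tr></table>'
       . '</td></tr></table>'
       . '</div>';

$sx = simplexml_load_string($xhtml);

// "//table" finds every table, nested or not, in document order.
$tables = $sx->xpath('//table');

$widths = [];
foreach ($tables as $table) {
    // "./tr" and "./td" only look at direct children, so the cells of a
    // nested table are not counted as columns of the outer table.
    foreach ($table->xpath('./tr') as $row) {
        $numCols  = count($row->xpath('./td'));
        $widths[] = 100 / $numCols; // column width in percent
    }
}

echo count($tables), "\n";        // 2
echo implode(', ', $widths), "\n"; // 50, 100
```

Real pages often wrap rows in tbody, in which case the row query would need to be ./tr | ./tbody/tr or similar.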

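I hope this helps.

You are trying to parse HTML with regular expressions, and you say it is painful? Seriously? Some other users of this site have a different definition of that. I even have a T-shirt with the poem on it, the one inspired by people like you. Use a parser instead.

Wouldn't it be better to write the site with divs instead of tables? That way you wouldn't need a function to do this for you.

@Catfish: The OP is obviously working with some kind of framework, or converting old pages. (At least I hope so.)

The U modifier is never a good idea. It is unnecessary, and its only effect is to cause confusion.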