PHP: Replace newlines with BR tags, but only inside PRE tags

php regex

In stock PHP5, what is a good preg_replace expression for making this transformation: replace newlines with <br />, but only within <pre> blocks?

(Feel free to make simplifying assumptions and to ignore corner cases. For example, we can assume that each tag sits on one line, rather than something pathological such as a tag broken across lines.)

Input text:

```html
<div><pre class='some class'>1
2
3
</pre>
<pre>line 1
line 2
line 3
</pre>
</div>
```

Output:

```html
<div><pre>1<br />2<br />3<br /></pre>
<pre>line 1<br />line 2<br />line 3<br /></pre>
</div>
```
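(Motivating context: I'm trying to close bug 20760 in the Wikimedia SyntaxHighlight_GeSHI extension, and finding that my PHP skills (I mostly use Python) aren't up to snuff.)

I'm open to solutions other than regexen, but small ones are preferred (for example, building HTML-parsing machinery would be overkill).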
A solution along these lines:
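The comments below mention saveHtml and html_entity_decode, which suggests a DOM-based approach. A minimal sketch of that idea, assuming the input can be handed to DOMDocument::loadHTML as-is, might look like the following. The function name nl2br_in_pre_dom is hypothetical, and this is not the code that was originally posted.

```php
<?php
// Hypothetical helper: a sketch of a DOM-based approach, not the original answer's code.
function nl2br_in_pre_dom($html)
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Copy the matches into an array first, since the loop modifies the tree.
    $textNodes = iterator_to_array($xpath->query('//pre//text()'), false);

    foreach ($textNodes as $text) {
        // Rebuild the text node as: line, <br>, line, <br>, ...
        $fragment = $doc->createDocumentFragment();
        foreach (explode("\n", $text->nodeValue) as $i => $line) {
            if ($i > 0) {
                $fragment->appendChild($doc->createElement('br'));
            }
            if ($line !== '') {
                $fragment->appendChild($doc->createTextNode($line));
            }
        }
        $text->parentNode->replaceChild($fragment, $text);
    }

    // Serializes the whole document, not just the original fragment.
    return $doc->saveHTML();
}
```

Note that loadHTML wraps a fragment in html and body elements, and saveHTML serializes the whole document with a doctype, writes br elements as <br> rather than <br />, and may add line breaks of its own. That makes the result awkward to splice back into existing markup.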

Updated the answer with html_entity_decode; if you don't need it, just remove it. I just threw in a quick regex for the newlines; let me know if you see any problems, Perl regex wizards :)

This fails for me, because html_entity_decode adds newlines between elements. Don't blame me, blame the Wikimedia parser class :)

Correction: it is saveHtml that adds the newlines. I do like this approach in general, though; it just isn't suitable for my application.

Based on what SilentGhost said (which, for some reason, isn't showing up here):

```php
<?php
$str = "<div><pre class='some class' >1
2
3
< / pre>
<pre>line 1
line 2
line 3
</pre>
</div>";

$out = "<div><pre class='some class' >1<br />2<br />3<br />< / pre>
<pre>line 1<br />line 2<br />line 3<br /></pre>
</div>";

function protect_newlines($str)
{
    // \n -> <br />, but only if it's in a pre block
    // protects newlines from Parser::doBlockLevels()

    /* split on <pre ... /pre>, basically. probably good enough */
    $str = " ".$str; // guarantee split will be in even positions
    //$parts = preg_split('/(<pre .* pre>)/Umsxu',$str,-1,PREG_SPLIT_DELIM_CAPTURE);
    $parts = preg_split("/(< \s* pre .* \/ \s* pre \s* >)/Umsxu",$str,-1,PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $idx=>$part) {
        // odd indices hold the captured <pre>...</pre> delimiters
        if ($idx % 2) {
            $parts[$idx] = preg_replace("/\n/", "<br />", $part);
        }
    }
    $str = implode('',$parts);
    /* chop off the first space, that we had added */
    return substr($str,1);
}

assert(protect_newlines($str) === $out);
?>
```
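The split-and-replace idea in protect_newlines can also be written as a single pass with preg_replace_callback, which hands each matched <pre> element to a callback. The following is only a sketch under the question's simplifying assumptions (no nested <pre> elements, plain "\n" line endings). The names nl2br_in_pre and nl2br_in_pre_callback are hypothetical. Unlike the pattern above, this one does not tolerate spaces inside the closing tag, such as < / pre>.

```php
<?php
// Hypothetical helper names; not from the original post.
function nl2br_in_pre($html)
{
    // Match one whole <pre ...>...</pre> element.
    //   s : let "." match newlines
    //   i : ignore tag case
    //   .*? : lazy, so the match stops at the first closing tag
    $pattern = '#<pre\b[^>]*>.*?</pre\s*>#si';

    return preg_replace_callback($pattern, 'nl2br_in_pre_callback', $html);
}

function nl2br_in_pre_callback($match)
{
    // $match[0] is the full text of the matched <pre> element.
    return str_replace("\n", '<br />', $match[0]);
}

$in = "<div><pre class='some class'>1\n2\n3\n</pre>\n<pre>line 1\nline 2\nline 3\n</pre>\n</div>";
echo nl2br_in_pre($in), "\n";
```

A named callback is used rather than an anonymous function so that the sketch does not depend on closures, which need PHP 5.3 or later.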