PHP - Complex regex extraction

php regex

I have some strings to parse, and it is getting a bit complicated.

<?php
$notecomments = '
This is the first of the notes, and so whatever comes later is appended.<br>
(<b>John Smith</b>) at <b class="datetimeGMT">2012-02-07 00:00:20 GMT</b><hr>This is a comment posted<br><br>(<b>Alex Boom</b>) at <b class="datetimeGMT">2013-02-07 00:08:06 GMT</b><hr>And let\'s put some more in here<br />with a new line.';

if (preg_match_all('/\(<b>(?:(?!\(<b>).)*/s', $notecomments, $matches)) {
    print_r($matches);
}

/* result of code:
Array
(
    [0] => Array
        (
            [0] => (<b>John Smith</b>) at <b class="datetimeGMT">2012-02-07 00:00:20 GMT</b><hr>This is a comment posted<br><br>
            [1] => (<b>Alex Boom</b>) at <b class="datetimeGMT">2013-02-07 00:08:06 GMT</b><hr>And let's put some more in here<br />with a new line.
        )

)
*/
?>
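
The pattern above returns each note as one undivided string. If the goal is to split every note into its parts, the same "match until the next (<b>" idea can be combined with named capture groups. The sketch below is only an assumption about what is wanted; the variable $pattern and the group names author, when and text are illustrative and do not come from the original post.

```php
<?php
// Same sample text as in the question, written as a nowdoc so the
// apostrophe in "let's" needs no escaping.
$notecomments = <<<'TXT'
This is the first of the notes, and so whatever comes later is appended.<br>
(<b>John Smith</b>) at <b class="datetimeGMT">2012-02-07 00:00:20 GMT</b><hr>This is a comment posted<br><br>(<b>Alex Boom</b>) at <b class="datetimeGMT">2013-02-07 00:08:06 GMT</b><hr>And let's put some more in here<br />with a new line.
TXT;

// Hypothetical pattern: author, timestamp, then the note body up to
// (but not including) the next "(<b>" or the end of the string.
$pattern = '~\(<b>(?<author>[^<]+)</b>\) at <b class="datetimeGMT">(?<when>[^<]+)</b><hr>(?<text>(?:(?!\(<b>).)*)~s';

$notes = [];
// PREG_SET_ORDER yields one array per note instead of one array per group.
if (preg_match_all($pattern, $notecomments, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $notes[] = [
            'author' => $m['author'],
            'when'   => $m['when'],
            'text'   => $m['text'],
        ];
    }
}

print_r($notes);
```

With PREG_SET_ORDER each element of $matches describes one note, so the author, timestamp and body stay together without any index juggling.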

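Can you spell out in plain words what your matching/extraction criteria are? (i.e. "I want to capture everything before the first <br>, or everything before the (", or whatever it is.) Since a regex is just as cumbersome in this case, you could consider a DOM approach: iterate over the nodes (text, tags), look for the b element, then for the final br tag or line break.

I used preg_replace_callback with two regular expressions to achieve this, like so: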
$notecomments = "This is the first of the notes, and so whatever comes later is appended.<br>(<b>John Smith</b>) at <b class=\"datetimeGMT\">2012-02-07 00:00:20 GMT</b><hr>This is a comment posted<br><br>(<b>Alex Boom</b>) at <b class=\"datetimeGMT\">2013-02-07 00:08:06 GMT</b><hr>And let's put some more in here<br />with a new line.";
// Pattern 1 matches <b ...> tags with attributes (the timestamps); pattern 2 matches bare <b> tags (the names).
$output = preg_replace_callback(
    array("~<b (.*?)>(.+?)</b>~si", "~<b>(.+?)</b>~si"),
    function ($matches) {
        if (isset($matches[2])) {
            print_r($matches[2] . "\n"); // tag contents from pattern 1
        } else {
            print_r($matches[1] . "\n"); // tag contents from pattern 2
        }
        return '';
    },
    ' ' . $notecomments . ' '
);

Output:

2012-02-07 00:00:20 GMT
2013-02-07 00:08:06 GMT
John Smith
Alex Boom
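
The DOM approach mentioned in the comment above could look roughly like the following. This is a minimal sketch, not code from the thread: it assumes the dom extension is available, wraps the fragment in a div so the parser has a single container, and assumes each timestamp b element is a sibling of the b element holding the author's name. It pairs only author and timestamp and leaves the note text aside.

```php
<?php
// Same sample text as in the question, as a nowdoc.
$notecomments = <<<'HTML'
This is the first of the notes, and so whatever comes later is appended.<br>
(<b>John Smith</b>) at <b class="datetimeGMT">2012-02-07 00:00:20 GMT</b><hr>This is a comment posted<br><br>(<b>Alex Boom</b>) at <b class="datetimeGMT">2013-02-07 00:08:06 GMT</b><hr>And let's put some more in here<br />with a new line.
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML('<div>' . $notecomments . '</div>');
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$notes = [];

// Every timestamp is a <b class="datetimeGMT">; the author is assumed to be
// the nearest preceding <b> sibling.
foreach ($xpath->query('//b[@class="datetimeGMT"]') as $stamp) {
    $author = $xpath->query('preceding-sibling::b[1]', $stamp)->item(0);
    $notes[] = [
        'author' => $author ? $author->textContent : null,
        'when'   => $stamp->textContent,
    ];
}

print_r($notes);
```

Unlike the callback version, this keeps each name next to its own timestamp, at the cost of depending on how the HTML parser structures the fragment.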