PHP: complex regular expression extraction
I have some strings to parse, and it is getting a bit complicated:
<?php
$notecomments = '
This is the first of the notes, and so whatever comes later is appended.<br>
(<b>John Smith</b>) at <b class="datetimeGMT">2012-02-07 00:00:20 GMT</b><hr>This is a comment posted<br><br>(<b>Alex Boom</b>) at <b class="datetimeGMT">2013-02-07 00:08:06 GMT</b><hr>And let\'s put some more in here<br />with a new line.';
// Each match starts at "(<b>" and runs until the next "(<b>" or the end of the string.
if (preg_match_all('/\(<b>(?:(?!\(<b>).)*/s', $notecomments, $matches)) {
    print_r($matches);
}
/* result of code:
Array
(
[0] => Array
(
[0] => (<b>John Smith</b>) at <b class="datetimeGMT">2012-02-07 00:00:20 GMT</b><hr>This is a comment posted<br><br>
[1] => (<b>Alex Boom</b>) at <b class="datetimeGMT">2013-02-07 00:08:06 GMT</b><hr>And let's put some more in here<br />with a new line.
)
)
*/
?>
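Once each entry is isolated like this, one possible next step is to split it into its parts. The following is only a sketch under assumptions: it takes for granted that every entry has exactly the shape shown in the sample (name in a bare <b>, timestamp in <b class="datetimeGMT">, then <hr> and the text), and the group names author, when and text as well as the $entries array are hypothetical names chosen for illustration.

```php
<?php
// Sample input from the question (apostrophe escaped for the single-quoted string).
$notecomments = 'This is the first of the notes, and so whatever comes later is appended.<br>
(<b>John Smith</b>) at <b class="datetimeGMT">2012-02-07 00:00:20 GMT</b><hr>This is a comment posted<br><br>(<b>Alex Boom</b>) at <b class="datetimeGMT">2013-02-07 00:08:06 GMT</b><hr>And let\'s put some more in here<br />with a new line.';

// Named groups (hypothetical names): author, when, text.
// The text group reuses the question's "anything until the next (<b>" idea.
$pattern = '~\(<b>(?<author>[^<]+)</b>\) at <b class="datetimeGMT">(?<when>[^<]+)</b><hr>(?<text>(?:(?!\(<b>).)*)~s';

$entries = [];
// PREG_SET_ORDER yields one array per entry instead of one array per group.
if (preg_match_all($pattern, $notecomments, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $entries[] = [
            'author' => $m['author'],
            'when'   => $m['when'],
            'text'   => $m['text'],
        ];
    }
}
print_r($entries);
```

If real entries deviate from the sample (extra attributes, nested tags inside the name), the pattern would need loosening, and a parser may then be the safer choice.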
I use preg_replace_callback with two regular expressions to do this, as in the code further below.
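Can you state in plain text exactly what your match/extraction criteria are? (I.e. "I want to capture everything before the first <br>, or anything before ()".) Since a regex is just as cumbersome in this case, you could consider a DOM approach that iterates over the nodes (text, tags), looks for the <b>, and then for the final <br> tag or line break.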
$notecomments = "This is the first of the notes, and so whatever comes later is appended.<br>(<b>John Smith</b>) at <b class=\"datetimeGMT\">2012-02-07 00:00:20 GMT</b><hr>This is a comment posted<br><br>(<b>Alex Boom</b>) at <b class=\"datetimeGMT\">2013-02-07 00:08:06 GMT</b><hr>And let's put some more in here<br />with a new line.";
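// Patterns run in order: <b ...> tags with attributes (timestamps) first, then bare <b> tags (names).
$output = preg_replace_callback(
    array("~<b (.*?)>(.+?)</b>~si", "~<b>(.+?)</b>~si"),
    function ($matches) {
        if (isset($matches[2])) {
            // First pattern: group 1 holds the attributes, group 2 the tag content.
            print_r($matches[2] . "\n");
        } else {
            print_r($matches[1] . "\n");
        }
        return ''; // remove the matched tag from the result
    },
    ' ' . $notecomments . ' '
);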
Output:
2012-02-07 00:00:20 GMT
2013-02-07 00:08:06 GMT
John Smith
Alex Boom
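The DOM approach suggested in the comment above could look roughly like the sketch below. It is not the commenter's code: it assumes PHP's DOM extension is available, swaps the node-by-node iteration for two XPath queries, and relies on the names sitting in bare <b> tags while the timestamps carry class="datetimeGMT". The variable names $names and $dates are made up for illustration.

```php
<?php
// Sample input from the question (apostrophe escaped for the single-quoted string).
$notecomments = 'This is the first of the notes, and so whatever comes later is appended.<br>
(<b>John Smith</b>) at <b class="datetimeGMT">2012-02-07 00:00:20 GMT</b><hr>This is a comment posted<br><br>(<b>Alex Boom</b>) at <b class="datetimeGMT">2013-02-07 00:08:06 GMT</b><hr>And let\'s put some more in here<br />with a new line.';

// The fragment is not a full HTML document, so collect parser warnings quietly.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($notecomments);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Assumption: author names are the <b> elements without a class attribute.
$names = [];
foreach ($xpath->query('//b[not(@class)]') as $node) {
    $names[] = trim($node->textContent);
}

// Assumption: timestamps are the <b> elements with class="datetimeGMT".
$dates = [];
foreach ($xpath->query('//b[@class="datetimeGMT"]') as $node) {
    $dates[] = trim($node->textContent);
}

// Pair each name with the timestamp at the same position.
print_r(array_map(null, $names, $dates));
```

Pairing by position only works while every entry has both a name and a timestamp; if either can be missing, walking the siblings of each name node would be more reliable.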