PHP - How to split a string into an array at whitespace and HTML tags using a regular expression?


Here is a sample string:

$string = '<strong>Lorem ipsum dolor</strong> sit <img src="test.png" /> amet <span class="test" style="color:red">consec<i>tet</i>uer</span>.';
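I want to split the string into an array so that it is split wherever whitespace or an HTML tag is encountered (ignoring whitespace inside HTML tags). For example: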

Array
(
    [0] => <strong>
    [1] => Lorem
    [2] => ipsum
    [3] => dolor
    [4] => </strong>
    [5] => sit
    [6] => <img src="test.png" />
    [7] => amet
    [8] => <span class="test" style="color:red">
    [9] => consec
    [10] => <i>
    [11] => tet
    [12] => </i>
    [13] => uer
    [14] => </span>
    [15] => .
)
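But I have not been able to get this working. I tried to implement the idea, but I think my regular expressions are wrong. Below are some of the expressions I tried; none of them gives the result I want.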

$chars = preg_split('/(<[^>]*[^\/]>)/i', $string, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

/* Results */

Array
(
    [0] => <strong>
    [1] => Lorem ipsum dolor
    [2] => </strong>
    [3] =>  sit <img src="test.png" /> amet 
    [4] => <span class="test" style="color:red">
    [5] => consec
    [6] => <i>
    [7] => tet
    [8] => </i>
    [9] => uer
    [10] => </span>
    [11] => .
)
The result with another regular expression:

$chars = preg_split('/\s+(?![^<>]*>)/x', $string, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

/* Results */
Array
(
    [0] => <strong>Lorem
    [1] => ipsum
    [2] => dolor</strong>
    [3] => sit
    [4] => <img src="test.png" />
    [5] => amet
    [6] => <span class="test" style="color:red">consec<i>tet</i>uer</span>.
)
The result with yet another expression (very close):

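$chars = preg_split('/\s*(<[^>]*>)/i', $string, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

/* Results */
Array
(
    [0] => <strong>
    [1] => Lorem ipsum dolor
    [2] => </strong>
    [3] =>  sit
    [4] => <img src="test.png" />
    [5] =>  amet
    [6] => <span class="test" style="color:red">
    [7] => consec
    [8] => <i>
    [9] => tet
    [10] => </i>
    [11] => uer
    [12] => </span>
    [13] => .
)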

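You are almost there. But you need to change <[^>]*> to a more specific pattern, and then add \s+ as an alternative so that whitespace is also a delimiter. You also don't need the i flag: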

preg_split('/(<\/?\w+[^<>]*>)|\s+/', $string, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE)
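For reference, here is a minimal runnable sketch that applies this pattern to the sample string from the question. The variable name $tokens is only illustrative, and -1 is passed as the limit (meaning "no limit"), since recent PHP versions deprecate passing null for that parameter.

```php
<?php
// Sample string from the question.
$string = '<strong>Lorem ipsum dolor</strong> sit <img src="test.png" /> amet '
        . '<span class="test" style="color:red">consec<i>tet</i>uer</span>.';

// Two alternatives act as delimiters:
//   (<\/?\w+[^<>]*>)  an HTML tag; captured, so DELIM_CAPTURE keeps it in the result
//   \s+               a run of whitespace; not captured, so it is discarded
// NO_EMPTY drops the empty pieces that appear between adjacent delimiters.
$tokens = preg_split(
    '/(<\/?\w+[^<>]*>)|\s+/',
    $string,
    -1, // no limit
    PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE
);

print_r($tokens);
```

Whitespace inside a tag, such as in <img src="test.png" />, is left alone because the tag alternative matches the whole tag starting at its opening <, before \s+ gets a chance to match anything inside it.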

Thank you very much. I got the result I wanted. :)
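One likely reason for the advice to make the tag pattern more specific is plain text that contains a bare < or >. The sketch below compares the loose and the stricter tag patterns on a made-up input; $sample, $loose and $strict are hypothetical names that are not part of the original thread.

```php
<?php
// Hypothetical input: a bare "<" and ">" appear in the text around a real tag.
$sample = 'if a < b then <b>bold</b> c > d';
$flags  = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE;

// Loose tag pattern: anything from "<" up to the next ">" counts as a tag.
$loose = preg_split('/(<[^>]*>)|\s+/', $sample, -1, $flags);

// Stricter tag pattern: "<", an optional "/", then a tag name, then no further "<" or ">".
$strict = preg_split('/(<\/?\w+[^<>]*>)|\s+/', $sample, -1, $flags);

print_r($loose);  // contains the bogus token "< b then <b>"
print_r($strict); // keeps "<" and "<b>" as separate tokens
```

With the stricter pattern a < followed by a space is treated as ordinary text, so only <b> and </b> are recognised as tags here. Neither pattern is a full HTML parser; for arbitrary markup a DOM-based approach is generally more robust.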