PHP preg_match optional groups
I wrote a regular expression:
(^.*)(\[{1}[0-9]+:[0-9]+:[0-9]+:[0-9]+\]{1}) (\"{1}.+\"{1}) ([0-9]+) ([0-9-]+)
to match the following string:
141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
using PHP's preg_match.
For example, when I remove the first part, 141.243.1.172, from the string, preg_match returns:
array(6
0 => [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
1 => // correctly empty
2 => [29:23:53:25]
3 => "GET /Software.html HTTP/1.0"
4 => 200
5 => 233
)
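where index 1 is correctly empty.
But if I also remove [29:23:53:25] from the string, preg_match gives me an empty array. How can I get the same kind of result as above, with only the indexes of the missing parts empty instead of the whole array?
It works for the first part because of the * quantifier. If you also want to be able to remove the second part, you can make both groups optional and make the first group non-greedy. Move the space into the second group as well.
Note that you do not have to escape the double quotes, and that the quantifier {1} is redundant, so it can be omitted.
There is only one double quote after the opening one in this string, but to prevent possible overmatching you can make that match non-greedy as well, or use a negated character class ("[^"]+") to prevent unnecessary backtracking.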
(^.*?)?(\[[0-9]+:[0-9]+:[0-9]+:[0-9]+\] )?(".+?") ([0-9]+) ([0-9-]+)
For example:
$strings = [
    '141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    '[29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    '"GET /Software.html HTTP/1.0" 200 233'
];
$pattern = '/(^.*?)?(\[[0-9]+:[0-9]+:[0-9]+:[0-9]+\] )?(".+?") ([0-9]+) ([0-9-]+)/';
foreach ($strings as $string) {
    preg_match($pattern, $string, $matches);
    print_r($matches);
}
Output:
Array
(
[0] => 141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
[1] => 141.243.1.172
[2] => [29:23:53:25]
[3] => "GET /Software.html HTTP/1.0"
[4] => 200
[5] => 233
)
Array
(
[0] => [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
[1] =>
[2] => [29:23:53:25]
[3] => "GET /Software.html HTTP/1.0"
[4] => 200
[5] => 233
)
Array
(
[0] => "GET /Software.html HTTP/1.0" 200 233
[1] =>
[2] =>
[3] => "GET /Software.html HTTP/1.0"
[4] => 200
[5] => 233
)
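The negated character class mentioned in this answer can be sketched as follows. This is a minimal variant, not the answerer's exact code: the only change to the pattern is that "[^"]+" replaces ".+?", and the $results array and the trim step are additions made here for illustration.

```php
<?php
// Variant of the pattern above: "[^"]+" replaces ".+?", so the quoted part
// cannot run past its closing double quote.
$pattern = '/(^.*?)?(\[[0-9]+:[0-9]+:[0-9]+:[0-9]+\] )?("[^"]+") ([0-9]+) ([0-9-]+)/';

$strings = [
    '141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    '[29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    '"GET /Software.html HTTP/1.0" 200 233',
];

$results = [];
foreach ($strings as $string) {
    // preg_match() returns 1 only when the pattern matched.
    if (preg_match($pattern, $string, $matches) === 1) {
        // Groups 1 and 2 capture their trailing space, so trim every entry.
        $results[] = array_map('trim', $matches);
    }
}
print_r($results);
```

The trim step is there because the first two groups include the space that follows them; print_r does not make that trailing space visible.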
Change the regex to this:
((^.*)(\[{1}[0-9]+:[0-9]+:[0-9]+:[0-9]+\]{1}) )?(\"{1}.+\"{1}) ([0-9]+) ([0-9-]+)
For 141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
the result will be:
Array
(
[0] => 141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
[1] => 141.243.1.172 [29:23:53:25]
[2] => 141.243.1.172
[3] => [29:23:53:25]
[4] => "GET /Software.html HTTP/1.0"
[5] => 200
[6] => 233
)
For [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
the result will be:
Array
(
[0] => [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233
[1] => [29:23:53:25]
[2] =>
[3] => [29:23:53:25]
[4] => "GET /Software.html HTTP/1.0"
[5] => 200
[6] => 233
)
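Since a bare regex is not the same as a working script, here is a minimal PHP sketch that runs this answer's pattern unchanged. The $rows array and its key names (ip, timestamp, request, status, size) are illustrative choices made here, not part of the answer.

```php
<?php
// The pattern from this answer, unchanged. The outer optional group becomes
// index 1, so the IP lands at index 2 and the timestamp at index 3.
$pattern = '/((^.*)(\[{1}[0-9]+:[0-9]+:[0-9]+:[0-9]+\]{1}) )?(\"{1}.+\"{1}) ([0-9]+) ([0-9-]+)/';

$strings = [
    '141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
    '[29:23:53:25] "GET /Software.html HTTP/1.0" 200 233',
];

$rows = [];
foreach ($strings as $string) {
    if (preg_match($pattern, $string, $m) === 1) {
        $rows[] = [
            'ip'        => trim($m[2]), // (^.*) also captures the trailing space
            'timestamp' => $m[3],
            'request'   => $m[4],
            'status'    => $m[5],
            'size'      => $m[6],
        ];
    }
}
print_r($rows);
```

Copying the numbered groups into named keys hides the index shift caused by the extra outer group, so callers do not need to remember that index 1 holds the combined IP and timestamp.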
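Note that you really should have a PHP demo set up, because a pure regex is not the same as a working PHP script.
I tried your solution, but it puts 141.243.1.172 [29:23:53:25] in the same index. I would like them in separate indexes.
@StefanoMaglione You are right, I have updated the answer.
@TimBiegeleisen Good suggestion, thanks! I see that the first pattern was not correct.
I see that it works, but if I remove '200' or '233' or 'GET /Software.html HTTP/1.0', I run into the same problem with an empty array.
Your regex no longer matches. In that last case preg_match() returns 0, which you do not seem to check. You have to mark the optional matches as such, but before that, please explain why you use the redundant {1}.
I used {1} to specify "I want exactly one". Yes, preg_match no longer matches. Is there a way to create optional groups?
Possible duplicate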
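The two points raised in these comments, checking the return value of preg_match() and marking every optional part as optional, can be sketched as below. This is one possible reading and not code from the thread: parseLogLine is a hypothetical helper, the pattern is anchored with ^ and $ so that an all-optional pattern cannot match an empty prefix, and it assumes that the first field, when present, is a dotted IPv4 address like the one in the sample line.

```php
<?php
// Hypothetical helper, not from the thread: parses a log line in which any
// field may be missing. Assumes the first field, when present, is a dotted
// IPv4 address like the one in the sample line.
function parseLogLine(string $line): ?array
{
    $pattern = '/^(?:([0-9]+(?:\.[0-9]+){3})\s*)?'              // 1: IP address
             . '(?:(\[[0-9]+:[0-9]+:[0-9]+:[0-9]+\])\s*)?'      // 2: timestamp
             . '(?:("[^"]+")\s*)?'                              // 3: request
             . '(?:([0-9]+)\s*)?'                               // 4: status
             . '([0-9-]+)?$/';                                  // 5: size

    // preg_match() returns 1 on a match, 0 on no match and false on error.
    // Every field is optional, so an empty line is rejected explicitly.
    if (trim($line) === ''
        || preg_match($pattern, $line, $m, PREG_UNMATCHED_AS_NULL) !== 1) {
        return null;
    }

    // "?? null" covers trailing groups that are absent from $m.
    return [
        'ip'        => $m[1] ?? null,
        'timestamp' => $m[2] ?? null,
        'request'   => $m[3] ?? null,
        'status'    => $m[4] ?? null,
        'size'      => $m[5] ?? null,
    ];
}

print_r(parseLogLine('141.243.1.172 [29:23:53:25] "GET /Software.html HTTP/1.0" 200 233'));
print_r(parseLogLine('141.243.1.172 [29:23:53:25]'));
var_dump(parseLogLine('not a log line'));
```

One limitation of this sketch: when only one of the two trailing numbers is present, it is reported as the status, because the pattern has no way to tell a lone status from a lone size.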