PHP regex to extract multiple substrings from a string with a predictable format
Tags: php, regex, string
I have the strings below:
string(49) "02/12/2018 (Assessment 2) = /86= | Weight: 50.00%"
string(49) "02/12/2018 (Assessment 2) = 50.83/86= | Weight: 50.00%"
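The first example shows no number before the /; in that case I need to use 00.00 as the default value.
I need to capture this information and put it into an array like this: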
$dados[ "date" ] = "02/12/2018"
$dados[ "markOK" ] = "50"
$dados[ "markTotal" ] = "86"
$dados[ "weight" ] = "50.00"
Other examples:
string(49) "02/12/2018 (Assessment 2) = /86= | Weight: 50.00%"
string(59) "06/11/2018 (Assessment 2) = 22.40/35=32.00 | Weight: 50.00%"
string(49) "04/12/2018 (Assessment 2) = /60= | Weight: 50.00%"
string(59) "11/09/2018 (Assessment 2) = 27.00/40=33.75 | Weight: 50.00%"
string(59) "09/09/2018 (Assessment 2) = 30.00/30=50.00 | Weight: 50.00%"
string(59) "14/08/2018 (Assessment 2) = 31.00/40=38.75 | Weight: 50.00%"
string(59) "19/06/2018 (Assessment 2) = 63.00/72=43.75 | Weight: 50.00%"
string(59) "17/06/2018 (Assessment 2) = 45.00/45=50.00 | Weight: 50.00%"
string(59) "22/05/2018 (Assessment 2) = 11.00/55=10.00 | Weight: 50.00%"
Like this:
(?P<date>\d{2}\/\d{2}\/\d{4})[^=]+=\s(?P<markOK>\d+(?:\.\d+)?)?\/(?P<markTotal>\d+)[^:]+:\s(?P<weight>\d+(?:\.\d+)?)
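To make the pattern above easier to follow, here is a sketch of the same expression written with the x (extended) flag, so that each piece can carry its own comment. The ~ delimiter and the sample input are choices made for this illustration only; the groups themselves are unchanged.

```php
<?php
// Same pattern as above, split across lines with the x flag.
// The ~ delimiter means the forward slashes need no escaping.
$pattern = '~
    (?P<date>\d{2}/\d{2}/\d{4})     # date such as 02/12/2018
    [^=]+=\s                        # skip ahead to the first "=" and the space after it
    (?P<markOK>\d+(?:\.\d+)?)?      # optional mark, integer or decimal
    /                               # literal slash between the two marks
    (?P<markTotal>\d+)              # total mark, digits only
    [^:]+:\s                        # skip ahead to the ":" after "Weight"
    (?P<weight>\d+(?:\.\d+)?)       # weight, integer or decimal
~x';

$str = '06/11/2018 (Assessment 2) = 22.40/35=32.00 | Weight: 50.00%';
if (preg_match($pattern, $str, $m)) {
    // prints: 06/11/2018 22.40 35 50.00
    echo $m['date'], ' ', $m['markOK'], ' ', $m['markTotal'], ' ', $m['weight'], "\n";
}
```

When the mark before the slash is missing, the optional markOK group takes no part in the match and preg_match reports it as an empty string, which is why a default still has to be applied afterwards.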
If you want to clean up the matches array, you can do it like this:
$str = '04/12/2018 (Assessment 2) = /60= | Weight: 50.00%';
$pattern = '/(?P<date>\d{2}\/\d{2}\/\d{4})[^=]+=\s(?P<markOK>\d+(?:\.\d+)?)?\/(?P<markTotal>\d+)[^:]+:\s(?P<weight>\d+(?:\.\d+)?)/';
if (preg_match($pattern, $str, $matches)) {
    $default = [
        'date' => '00/00/0000',
        'markOK' => '0.00',
        'markTotal' => '0',
        'weight' => '0.00'
    ];
    $matches = array_filter($matches);
    $matches = array_merge($default, array_intersect_key($matches, $default));
    var_export($matches);
}
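One caveat with the snippet above: array_filter without a callback drops every falsy value, so a genuine mark of '0' would also be replaced by the default. A possible alternative, sketched here assuming PHP 7.4 or later, is to ask preg_match to report unmatched groups as null and apply the default with the ?? operator. The helper name parseAssessment is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper for illustration; returns null when the string does not match.
function parseAssessment(string $str): ?array
{
    $pattern = '/(?P<date>\d{2}\/\d{2}\/\d{4})[^=]+=\s(?P<markOK>\d+(?:\.\d+)?)?\/(?P<markTotal>\d+)[^:]+:\s(?P<weight>\d+(?:\.\d+)?)/';

    // PREG_UNMATCHED_AS_NULL reports a group that took no part in the match as null, not ''.
    if (!preg_match($pattern, $str, $m, PREG_UNMATCHED_AS_NULL)) {
        return null;
    }

    return [
        'date'      => $m['date'],
        'markOK'    => $m['markOK'] ?? '00.00', // default only when the mark is absent
        'markTotal' => $m['markTotal'],
        'weight'    => $m['weight'],
    ];
}

var_export(parseAssessment('04/12/2018 (Assessment 2) = /60= | Weight: 50.00%'));
```

With this approach a mark that is present but equal to 0 is kept as it is, and only a truly missing mark falls back to 00.00.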
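While named capture groups certainly have their uses, in this case I find that the syntax makes the pattern needlessly bloated and hard to read. After matching the substrings (in particular the optional second one), you still need a condition that replaces a zero-length string with 00.00.
Here is my suggestion:
Code (the full snippet is listed at the end of this page):
Output: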
array (
0 =>
array (
'date' => '02/12/2018',
'markOK' => '00.00',
'markTotal' => '86',
'weight' => '50.00',
),
1 =>
array (
'date' => '06/11/2018',
'markOK' => '22.40',
'markTotal' => '35',
'weight' => '50.00',
),
2 =>
array (
'date' => '04/12/2018',
'markOK' => '00.00',
'markTotal' => '60',
'weight' => '50.00',
),
3 =>
array (
'date' => '11/09/2018',
'markOK' => '27.00',
'markTotal' => '40',
'weight' => '50.00',
),
4 =>
array (
'date' => '09/09/2018',
'markOK' => '30.00',
'markTotal' => '30',
'weight' => '50.00',
),
5 =>
array (
'date' => '14/08/2018',
'markOK' => '31.00',
'markTotal' => '40',
'weight' => '50.00',
),
6 =>
array (
'date' => '19/06/2018',
'markOK' => '63.00',
'markTotal' => '72',
'weight' => '50.00',
),
7 =>
array (
'date' => '17/06/2018',
'markOK' => '45.00',
'markTotal' => '45',
'weight' => '50.00',
),
8 =>
array (
'date' => '22/05/2018',
'markOK' => '11.00',
'markTotal' => '55',
'weight' => '50.00',
),
)
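Comments:
Even without using regex, you should be able to describe how to extract the parts of the string. For example: ➊ first a mix of digits and slashes, ➋ then find the first "=", ➌ take the digits before a period or forward slash, ➍ then the digits after the slash, ➎ and finally the numeric part after the literal "Weight: ". Once you put it into words like that, the regex usually writes itself.
This question does not appear to include any attempt at solving the problem. StackOverflow expects you to try first, because your attempts help us understand what you need. Please edit the question to show what you tried and the specific obstacle you ran into.
This works in the online tester... but it does not work in my code: $result = preg_match_all("/(?P<date>\d{2}\/\d{2}\/\d{4})[^=]+=\s(?P<markOK>\d+(?:\.\d+)?)?\/(?P<markTotal>\d+)[^:]+:\s(?P<weight>\d+(?:\.\d+)?)/", $contentInfo); var_dump($result) . "\n";
preg_match returns a boolean; try passing a third argument to receive the matches.
It works perfectly now :) Is there any chance of replacing "markOK" with 00.00 when it is empty?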
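The point made in the comments about the third argument can be sketched as follows. preg_match_all returns only the number of full matches (or false on error), so the captured substrings have to be read from the array passed as the third argument. The contents of $contentInfo are an assumption made for this illustration: several records in one string, one per line.

```php
<?php
// Assumed input: several records in a single string, one per line.
$contentInfo = "02/12/2018 (Assessment 2) = /86= | Weight: 50.00%\n"
             . "06/11/2018 (Assessment 2) = 22.40/35=32.00 | Weight: 50.00%\n";

$pattern = '~(\d\d/\d\d/\d{4})[^=]+= (\d+(?:\.\d+)?)?/(\d+)[^:]+: (\d+(?:\.\d+)?)~';

// The return value is only a count; the captures arrive in $matches.
// PREG_SET_ORDER groups the captures per match, which suits a foreach loop.
$count = preg_match_all($pattern, $contentInfo, $matches, PREG_SET_ORDER);

$results = [];
foreach ($matches as $m) {
    $results[] = [
        'date'      => $m[1],
        'markOK'    => $m[2] !== '' ? $m[2] : '00.00', // default for a missing mark
        'markTotal' => $m[3],
        'weight'    => $m[4],
    ];
}

echo $count, "\n"; // prints 2
var_export($results);
```

Dumping the return value alone, as in the comment, shows just the count, which can look as if nothing was captured.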
Output of the first answer's snippet (the single $str example):
array (
  'date' => '04/12/2018',
  'markOK' => '0.00',
  'markTotal' => '60',
  'weight' => '50.00',
)
Code for the second answer (the suggestion without named groups):
$strings = [
    "02/12/2018 (Assessment 2) = /86= | Weight: 50.00%",
    "06/11/2018 (Assessment 2) = 22.40/35=32.00 | Weight: 50.00%",
    "04/12/2018 (Assessment 2) = /60= | Weight: 50.00%",
    "11/09/2018 (Assessment 2) = 27.00/40=33.75 | Weight: 50.00%",
    "09/09/2018 (Assessment 2) = 30.00/30=50.00 | Weight: 50.00%",
    "14/08/2018 (Assessment 2) = 31.00/40=38.75 | Weight: 50.00%",
    "19/06/2018 (Assessment 2) = 63.00/72=43.75 | Weight: 50.00%",
    "17/06/2018 (Assessment 2) = 45.00/45=50.00 | Weight: 50.00%",
    "22/05/2018 (Assessment 2) = 11.00/55=10.00 | Weight: 50.00%"
];
// Initialise the result list so var_export() works even when nothing matches.
$results = [];
foreach ($strings as $string) {
    if (!preg_match('~(\d\d/\d\d/\d{4})[^=]+= (\d+(?:\.\d+)?)?/(\d+)[^:]+: (\d+(?:\.\d+)?)~', $string, $m)) {
        echo "No match for string: $string\n";
    } else {
        // An empty second group means the mark was missing, so use the default.
        $results[] = ['date' => $m[1], 'markOK' => strlen($m[2]) ? $m[2] : '00.00', 'markTotal' => $m[3], 'weight' => $m[4]];
    }
}
var_export($results);