PHP: Regex to extract multiple substrings from a string with a predictable format


I have the strings below:

string(49) "02/12/2018 (Assessment 2) = /86= | Weight: 50.00%"
string(54) "02/12/2018 (Assessment 2) = 50.83/86= | Weight: 50.00%"
The first example shows no number before the / character. In that case I need to use 00.00 as the default value.

I need to extract this information and put it into an array like this:

$dados[ "date" ] = "02/12/2018"
$dados[ "markOK" ] = "50"
$dados[ "markTotal" ] = "86"
$dados[ "weight" ] = "50.00"
Other examples:

string(49) "02/12/2018 (Assessment 2) = /86= | Weight: 50.00%"
string(59) "06/11/2018 (Assessment 2) = 22.40/35=32.00 | Weight: 50.00%"
string(49) "04/12/2018 (Assessment 2) = /60= | Weight: 50.00%"
string(59) "11/09/2018 (Assessment 2) = 27.00/40=33.75 | Weight: 50.00%"
string(59) "09/09/2018 (Assessment 2) = 30.00/30=50.00 | Weight: 50.00%"
string(59) "14/08/2018 (Assessment 2) = 31.00/40=38.75 | Weight: 50.00%"
string(59) "19/06/2018 (Assessment 2) = 63.00/72=43.75 | Weight: 50.00%"
string(59) "17/06/2018 (Assessment 2) = 45.00/45=50.00 | Weight: 50.00%"
string(59) "22/05/2018 (Assessment 2) = 11.00/55=10.00 | Weight: 50.00%"
Something like this:

(?P<date>\d{2}\/\d{2}\/\d{4})[^=]+=\s(?P<markOK>\d+(?:\.\d+)?)?\/(?P<markTotal>\d+)[^:]+:\s(?P<weight>\d+(?:\.\d+)?)
If you want to clean up the matches array, you can do it like this:

$str = '04/12/2018 (Assessment 2) = /60= | Weight: 50.00%';
$pattern = '/(?P<date>\d{2}\/\d{2}\/\d{4})[^=]+=\s(?P<markOK>\d+(?:\.\d+)?)?\/(?P<markTotal>\d+)[^:]+:\s(?P<weight>\d+(?:\.\d+)?)/';
if (preg_match($pattern, $str, $matches)) {
    // Fallback values for any group that captured nothing
    $default = [
        'date' => '00/00/0000',
        'markOK' => '0.00',
        'markTotal' => '0',
        'weight' => '0.00'
    ];

    // Drop empty captures (note: array_filter() also drops a literal '0')
    $matches = array_filter($matches);
    // Keep only the named keys, then fill the gaps from $default
    $matches = array_merge($default, array_intersect_key($matches, $default));

    var_export($matches);
}
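
To make each part of the pattern easier to follow, here is a minimal sketch of the same expression written in free-spacing mode (the x modifier), with one comment per step. The function name parseAssessment(), the ~ delimiter, and the choice to return null on failure are assumptions made for this illustration, not part of the answer above.

```php
<?php
// Hypothetical helper: the name and the null-on-failure convention are
// assumptions for this sketch.
function parseAssessment(string $line): ?array
{
    // The x modifier lets the pattern span lines and carry # comments.
    // Literal spaces are ignored in this mode, so \s is used instead.
    $pattern = '~
        (?P<date>\d{2}/\d{2}/\d{4})    # date as dd/mm/yyyy
        [^=]+=\s                       # skip to the first equals sign and the space after it
        (?P<markOK>\d+(?:\.\d+)?)?     # achieved mark, optional
        /(?P<markTotal>\d+)            # total mark after the slash
        [^:]+:\s                       # skip to the colon that follows Weight
        (?P<weight>\d+(?:\.\d+)?)      # weight, without the percent sign
    ~x';

    if (!preg_match($pattern, $line, $m)) {
        return null; // line does not follow the expected format
    }

    return [
        'date'      => $m['date'],
        // An optional group that took no part in the match comes back as ''.
        'markOK'    => $m['markOK'] !== '' ? $m['markOK'] : '00.00',
        'markTotal' => $m['markTotal'],
        'weight'    => $m['weight'],
    ];
}

var_export(parseAssessment('04/12/2018 (Assessment 2) = /60= | Weight: 50.00%'));
```

Checking for an empty string directly, rather than filtering the whole array, leaves a legitimate value of 0 untouched.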

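While named capture groups certainly have their uses, in this case I find that the syntax makes the pattern unnecessarily bloated and hard to read. After matching the substrings (notably the optional second one), you need to write a condition that replaces a zero-length string with 00.00.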

Here is my suggestion:

Code (the snippet is listed further down, after the comments):

Output:

array (
  0 => 
  array (
    'date' => '02/12/2018',
    'markOK' => '00.00',
    'markTotal' => '86',
    'weight' => '50.00',
  ),
  1 => 
  array (
    'date' => '06/11/2018',
    'markOK' => '22.40',
    'markTotal' => '35',
    'weight' => '50.00',
  ),
  2 => 
  array (
    'date' => '04/12/2018',
    'markOK' => '00.00',
    'markTotal' => '60',
    'weight' => '50.00',
  ),
  3 => 
  array (
    'date' => '11/09/2018',
    'markOK' => '27.00',
    'markTotal' => '40',
    'weight' => '50.00',
  ),
  4 => 
  array (
    'date' => '09/09/2018',
    'markOK' => '30.00',
    'markTotal' => '30',
    'weight' => '50.00',
  ),
  5 => 
  array (
    'date' => '14/08/2018',
    'markOK' => '31.00',
    'markTotal' => '40',
    'weight' => '50.00',
  ),
  6 => 
  array (
    'date' => '19/06/2018',
    'markOK' => '63.00',
    'markTotal' => '72',
    'weight' => '50.00',
  ),
  7 => 
  array (
    'date' => '17/06/2018',
    'markOK' => '45.00',
    'markTotal' => '45',
    'weight' => '50.00',
  ),
  8 => 
  array (
    'date' => '22/05/2018',
    'markOK' => '11.00',
    'markTotal' => '55',
    'weight' => '50.00',
  ),
)

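Even without regex, you should be able to describe how to extract the parts of the string. For example: ➊ first a mix of digits and slashes, ➋ then find the first =, ➌ take the digits before a period or forward slash, ➍ then the digits after the slash, ➎ and finally the numeric part after the literal "Weight: ". Once you put it into words like this, the regex usually writes itself.

This question doesn't seem to include any attempt at solving the problem. StackOverflow expects you to try first, because your attempt helps us better understand what you need. Please edit the question to show what you have tried and the specific roadblock you ran into.

This works on the online tester... but it doesn't work in my code: $result = preg_match_all("/(?P<date>\d{2}\/\d{2}\/\d{4})[^=]+=\s(?P<markOK>\d+(?:\.\d+)?)?\/(?P<markTotal>\d+)[^:]+:\s(?P<weight>\d+(?:\.\d+)?)/", $contentInfo); var_dump($result) . "\n";

preg_match_all() returns only the number of matches (or false on error), not the matches themselves. Try passing a third argument to receive the matches.

It works perfectly :) Is there any way to substitute 00.00 when "markOK" is empty?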
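
The fix suggested in the comments can be sketched as follows. This is only an illustration: the contents of $contentInfo are made up here (two of the sample lines joined by a newline), and the PREG_SET_ORDER flag is one possible way to group the captures per line.

```php
<?php
// Hypothetical input: the real $contentInfo is not shown in the thread.
$contentInfo = "06/11/2018 (Assessment 2) = 22.40/35=32.00 | Weight: 50.00%\n"
             . "04/12/2018 (Assessment 2) = /60= | Weight: 50.00%";

$pattern = '/(?P<date>\d{2}\/\d{2}\/\d{4})[^=]+=\s(?P<markOK>\d+(?:\.\d+)?)?\/(?P<markTotal>\d+)[^:]+:\s(?P<weight>\d+(?:\.\d+)?)/';

// The return value is just the number of full matches (or false on error).
// The captured text is written to the third argument.
$count = preg_match_all($pattern, $contentInfo, $matches, PREG_SET_ORDER);

$rows = [];
foreach ($matches as $m) {
    $rows[] = [
        'date'      => $m['date'],
        // Substitute the default when the optional group captured nothing.
        'markOK'    => $m['markOK'] !== '' ? $m['markOK'] : '00.00',
        'markTotal' => $m['markTotal'],
        'weight'    => $m['weight'],
    ];
}

var_export($rows);
```

With PREG_SET_ORDER each element of $matches holds all groups for one match, which keeps the loop body the same shape as the single-string case.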

Output of the first answer's snippet (preg_match() with default values):

array (
  'date' => '04/12/2018',
  'markOK' => '0.00',
  'markTotal' => '60',
  'weight' => '50.00',
)

Code for the second answer:

$strings = [
    "02/12/2018 (Assessment 2) = /86= | Weight: 50.00%",
    "06/11/2018 (Assessment 2) = 22.40/35=32.00 | Weight: 50.00%",
    "04/12/2018 (Assessment 2) = /60= | Weight: 50.00%",
    "11/09/2018 (Assessment 2) = 27.00/40=33.75 | Weight: 50.00%",
    "09/09/2018 (Assessment 2) = 30.00/30=50.00 | Weight: 50.00%",
    "14/08/2018 (Assessment 2) = 31.00/40=38.75 | Weight: 50.00%",
    "19/06/2018 (Assessment 2) = 63.00/72=43.75 | Weight: 50.00%",
    "17/06/2018 (Assessment 2) = 45.00/45=50.00 | Weight: 50.00%",
    "22/05/2018 (Assessment 2) = 11.00/55=10.00 | Weight: 50.00%"
];

// Initialise the result list so var_export() has a defined value even if nothing matches
$results = [];
foreach ($strings as $string) {
    if (!preg_match('~(\d\d/\d\d/\d{4})[^=]+= (\d+(?:\.\d+)?)?/(\d+)[^:]+: (\d+(?:\.\d+)?)~', $string, $m)) {
        echo "No match for string: $string\n";
    } else {
        $results[] = [
            'date' => $m[1],
            // Group 2 is optional; an empty capture falls back to the default
            'markOK' => strlen($m[2]) ? $m[2] : '00.00',
            'markTotal' => $m[3],
            'weight' => $m[4]
        ];
    }
}
var_export($results);