PHP regex for a td when it contains a date and time


I need to extract the date and time with a regular expression, but it doesn't work and I don't know why.

    <tr>
        <td align="center">13.44.333-3</td>
        <td align="center">asdf3</td>
        <td align="center">15/01/2016 00:22:16</td>
        <td align="center">$ 1531</td>
    </tr>
 <tr>
        <td align="center">13.333.333-3</td>
        <td align="center">asdf3</td>
        <td align="center">16/01/2016 00:22:16</td>
        <td align="center">$ 1531</td>
    </tr>
 <tr>
        <td align="center">13.333.333-3</td>
        <td align="center">asdf3</td>
        <td align="center">11/01/2015 00:22:16</td>
        <td align="center">$ 1531</td>
    </tr>

The regular expression I'm using:

preg_match_all("/<td align=\"center\"\>[\s]*([^\s\<\/]*)<\/td>[\s]*<td align=\"center\"\>/is",$content, $matches, null, 0);

Here is a parser/regex approach:

$html = '<tr>
                            <td align="center">13.333.333-3</td>
                            <td align="center">asdf3</td>
                            <td align="center">15/01/2016 00:22:16</td>
                            <td align="center">$ 1531</td>
                        </tr>';
$thedoc = new DOMDocument();
$thedoc->loadHTML($html);
$cells = $thedoc->getElementsByTagName('td');
foreach($cells as $cell){
    if(preg_match('~^(\d{2}/\d{2}/\d{4})\h(\d{2}:\d{2}:\d{2})$~', $cell->nodeValue, $matches)) {
         echo 'Date:' . $matches[1] . ' Time:'. $matches[2];
    }
}


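This will also accept invalid dates and times as long as they are formatted correctly, e.g. 22/22/2222 25:61:62. Depending on your requirements, you can leave it that way or make some parts (such as the seconds) optional. You could also capture the day, month, year, hour, minute and second in separate groups.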
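The remark about invalid values can be sketched as follows. The pattern only checks the shape of the text, so a stricter check needs each part captured separately. This is a minimal sketch, not part of the original answer: the helper name parseDateTimeCell is hypothetical, and it relies on PHP's built-in checkdate().

```php
<?php
// Hypothetical helper: returns the parts of a dd/mm/yyyy hh:mm:ss string,
// or null when either the shape or the values are invalid.
function parseDateTimeCell(string $text): ?array
{
    $pattern = '~^(\d{2})/(\d{2})/(\d{4})\h(\d{2}):(\d{2}):(\d{2})$~';
    if (!preg_match($pattern, trim($text), $m)) {
        return null; // wrong shape
    }
    // Skip $m[0] (the full match) and convert the six groups to integers.
    [, $day, $month, $year, $hour, $minute, $second] = array_map('intval', $m);
    // checkdate() takes month, day, year in that order.
    if (!checkdate($month, $day, $year)) {
        return null;
    }
    if ($hour > 23 || $minute > 59 || $second > 59) {
        return null;
    }
    return compact('day', 'month', 'year', 'hour', 'minute', 'second');
}

var_dump(parseDateTimeCell('15/01/2016 00:22:16') !== null); // valid date and time
var_dump(parseDateTimeCell('22/22/2222 25:61:62') !== null); // right shape, invalid values
```

Keeping the shape check and the value check apart makes it easy to relax one of them, for example by making the seconds group optional.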

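Parsing HTML with a proper DOM parser is better than running a regular expression over it, so I will give that solution first:

1. With DOMDocument

For this, use DOMDocument in combination with DOMXPath.

The following code retrieves only the content of the third column, which holds the date/time: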

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$elements = $xpath->query('//td[3]');
$matches = array_map(function($td) {
    return $td->textContent;
}, iterator_to_array($elements));
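This code runs an XPath query that finds the td elements in the given HTML that are the third td child of their parent (tr), then maps the text content of each td found into an array.

If the $html variable contains the following string: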

<table width="100%" border="0" cellspacing="0" cellpadding="0" id="facturas">
<tr>
    <td align="center">13.44.333-3</td>
    <td align="center">asdf3</td>
    <td align="center">15/01/2016 00:22:16</td>
    <td align="center">$ 1531</td>
 </tr>
 <tr>
    <td align="center">13.333.333-3</td>
    <td align="center">asdf3</td>
    <td align="center">16/01/2016 00:22:16</td>
    <td align="center">$ 1531</td>
 </tr>
 <tr>
    <td align="center">13.333.333-3</td>
    <td align="center">asdf3</td>
    <td align="center">11/01/2015 00:22:16</td>
    <td align="center">$ 1531</td>
</tr>
</table>
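then $matches holds the three date/time strings from the third column: 15/01/2016 00:22:16, 16/01/2016 00:22:16 and 11/01/2015 00:22:16.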
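Once the cell text has been extracted, it can be turned into date objects. A minimal sketch, assuming the cells always use the d/m/Y H:i:s layout seen in the sample data; the $parsed variable is my own name, and the input array is typed in by hand rather than taken from the DOM:

```php
<?php
// Strings in the shape the third column holds (sample data from the question).
$matches = ['15/01/2016 00:22:16', '16/01/2016 00:22:16', '11/01/2015 00:22:16'];

$parsed = [];
foreach ($matches as $text) {
    // createFromFormat() returns false when the text does not fit the format.
    $dt = DateTimeImmutable::createFromFormat('d/m/Y H:i:s', trim($text));
    if ($dt !== false) {
        $parsed[] = $dt;
    }
}

// Date and time can now be printed separately, in any format.
foreach ($parsed as $dt) {
    echo $dt->format('Y-m-d') . ' ' . $dt->format('H:i:s') . PHP_EOL;
}
```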

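Some alternative XPath queries:

If $html may contain other tables, restrict the search to the table you are interested in, for example the one whose id equals facturas:

//*[@id="facturas"]//td[3]
To make sure every matched td has its align attribute set to center:

//td[@align="center"]
To find elements containing specific text, such as "/2016":

//td[contains(., "/2016")]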
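The predicates above can also be chained in a single query. A minimal sketch, assuming a shortened two-row version of the sample table; the combined query string and the $dates variable are my own, not from the answer:

```php
<?php
$html = '<table id="facturas">
  <tr><td align="center">13.44.333-3</td><td align="center">asdf3</td>
      <td align="center">15/01/2016 00:22:16</td><td align="center">$ 1531</td></tr>
  <tr><td align="center">13.333.333-3</td><td align="center">asdf3</td>
      <td align="center">11/01/2015 00:22:16</td><td align="center">$ 1531</td></tr>
</table>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Third td of each row inside the element with id "facturas",
// keeping only centered cells whose text contains "/2016".
$query = '//*[@id="facturas"]//td[3][@align="center"][contains(., "/2016")]';

$dates = [];
foreach ($xpath->query($query) as $td) {
    $dates[] = trim($td->textContent);
}
print_r($dates);
```

Only the first row survives the filter here, because the second row's date is in 2015.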
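2. With a regular expression

Although it is not recommended, this can also be done with a regular expression.

If you still want to go that way, use the following code: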

preg_match_all("/<td[^>]*\>\s*(\d\d\/\d\d\/\d{4}\b[^<]*)<\/td\s*>/mis",
               $html, $matches);
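The pattern above returns the date and time together as one string. A minimal sketch, assuming the cells always hold dd/mm/yyyy followed by hh:mm:ss, that captures the two parts in separate named groups; the group names date and time and the $rows variable are my own choices:

```php
<?php
$html = '<tr>
    <td align="center">13.44.333-3</td>
    <td align="center">asdf3</td>
    <td align="center">15/01/2016 00:22:16</td>
    <td align="center">$ 1531</td>
</tr>
<tr>
    <td align="center">13.333.333-3</td>
    <td align="center">asdf3</td>
    <td align="center">16/01/2016 00:22:16</td>
    <td align="center">$ 1531</td>
</tr>';

// Named groups keep the date and the time apart.
$pattern = '~<td[^>]*>\s*(?<date>\d{2}/\d{2}/\d{4})\s+(?<time>\d{2}:\d{2}:\d{2})\s*</td\s*>~i';

// PREG_SET_ORDER groups the result per match instead of per capture group.
preg_match_all($pattern, $html, $rows, PREG_SET_ORDER);

foreach ($rows as $row) {
    echo 'Date: ' . $row['date'] . ' Time: ' . $row['time'] . PHP_EOL;
}
```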

Here is a solution that I think will help you:

<?php

$html=<<<HEREDOC
  <tr>
    <td align="center">13.44.333-3</td>
    <td align="center">asdf3</td>
    <td align="center">15/01/2016 00:22:16</td>
    <td align="center">$ 1531</td>
</tr>
<tr>
    <td align="center">13.333.333-3</td>
    <td align="center">asdf3</td>
    <td align="center">16/01/2016 00:22:16</td>
    <td align="center">$ 1531</td>
</tr>
 <tr>
    <td align="center">13.333.333-3</td>
    <td align="center">asdf3</td>
    <td align="center">11/01/2015 00:22:16</td>
    <td align="center">$ 1531</td>
</tr>
HEREDOC;

if(preg_match_all('~<td\s+[^>]*>((?:\d+(?:\/\d+){2})\s+(?:\d+(?:\:\d+){2}))<\/td>~mi',$html,$matchall)){
    print_r($matchall);
}
?>

The output will be:

Array
(
[0] => Array
    (
        [0] => <td align="center">15/01/2016 00:22:16</td>
        [1] => <td align="center">16/01/2016 00:22:16</td>
        [2] => <td align="center">11/01/2015 00:22:16</td>
    )

[1] => Array
    (
        [0] => 15/01/2016 00:22:16
        [1] => 16/01/2016 00:22:16
        [2] => 11/01/2015 00:22:16
    )

)

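You should never parse HTML with regex. Use a parser instead.
Please don't be harsh. I am building a bot and have no other way to do this; it's a client request.
Obligatory link to a famous related answer:
@JayBlanchard: It's not always true, so there are better posts than that one.
@JayBlanchard: Yes, but it's not clear to me whether the OP only wants the value of the third element. The preg_match_all statement given in the question doesn't even return the date, only the content of the first element. It may be a mistake made when writing up the question. I think the intent is to get all four elements.
You can tell me like this: @amachree. I can help, but I can't find the link.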
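The observation in the comments that the question's pattern never returns the date can be checked directly. A minimal sketch using one row of the sample HTML and the pattern from the question, written here in a single-quoted string and called without the extra flag arguments:

```php
<?php
$content = '<tr>
        <td align="center">13.44.333-3</td>
        <td align="center">asdf3</td>
        <td align="center">15/01/2016 00:22:16</td>
        <td align="center">$ 1531</td>
    </tr>';

// The pattern from the question. [^\s\<\/]* rejects whitespace and "/",
// so a cell such as "15/01/2016 00:22:16" can never fill the capture group.
// The trailing <td ...> also consumes the next cell's opening tag,
// which is why "asdf3" is skipped as well.
$pattern = '/<td align="center"\>[\s]*([^\s\<\/]*)<\/td>[\s]*<td align="center"\>/is';

preg_match_all($pattern, $content, $matches);
print_r($matches[1]);
```

Only the first cell of the row ends up in the capture group, which matches what the comment describes.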