Scraping a page with PHP

php dom xpath

I'm trying to scrape a page for the information I'm looking for:

 <tr class="defRowEven">
   <td align="right">label</td>
   <td>info</td>
 </tr>

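I already have code that fetches the page from its URL. Is there a way to grab the tr information? Is it better to use regular expressions or DOMXPath? I'm very unfamiliar with DOMXPath, so any information would be really helpful. Thanks, everyone!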

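I'm not familiar with XPath, but with an HTML parser that supports CSS-style selectors (the find() calls below match PHP Simple HTML DOM Parser) you can do this: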

foreach ($html->find('tr.defRowEven') as $row) {
    // Get the 'label' (first cell). Property names are lowercase: innertext, plaintext.
    echo $row->find('td', 0)->innertext;

    // Get the 'info' (second cell).
    echo $row->find('td', 1)->innertext;
}

Someone on SO recently posted a link to a kind of jQuery for PHP / the server side, which would make this easy. I haven't tried it, so I can't comment on it directly.

XPath can select on attributes. To find the rows you are after, use:

$rows = $xpath->query("//tr[@class='defRowEven']");

This returns a list of rows, so you can pick out the label and the info for each row without mixing them up:

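foreach ($rows as $row) {
    // Paths are evaluated relative to the current row.
    // string(...) returns the cell text; without it evaluate() returns a DOMNodeList.
    $label = $xpath->evaluate("string(td[@align='right'])", $row);
    $info = $xpath->evaluate("string(td[2])", $row);
}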
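
The snippets above assume that $xpath already exists. As a minimal end-to-end sketch, this is how it could be built with PHP's bundled DOM extension. The sample markup, the second row, and the $results array are assumptions made for illustration; the real page is not shown in the question.

```php
<?php
// Sample markup standing in for the fetched page (an assumption for this sketch).
$html = <<<HTML
<table>
  <tr class="defRowEven">
    <td align="right">label</td>
    <td>info</td>
  </tr>
  <tr class="defRowOdd">
    <td align="right">other label</td>
    <td>other info</td>
  </tr>
</table>
HTML;

// Collect libxml parse warnings internally instead of printing them;
// real-world HTML is rarely valid.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

$results = [];
foreach ($xpath->query("//tr[@class='defRowEven']") as $row) {
    // string(...) makes evaluate() return the text of the first matching cell.
    $label = trim($xpath->evaluate("string(td[@align='right'])", $row));
    $info  = trim($xpath->evaluate("string(td[2])", $row));
    $results[] = ['label' => $label, 'info' => $info];
}

print_r($results);
```

Note that @class='defRowEven' only matches when the class attribute is exactly that string; a row carrying several classes would need a contains()-based test instead.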

If that doesn't work, you can try going the regex route:

preg_match_all('/<tr class="defRowEven">\s*<td align="right">(.*?)<\/td>\s*<td>(.*?)<\/td>/',
    $html, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
    list($full, $label, $info) = $match;
}
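
To see what the regex route yields, here is a small sketch that runs the same pattern against the row from the question. The s modifier (so a cell's contents may span several lines), the $pairs array, and the $variant string are additions for illustration, not part of the answer above.

```php
<?php
// The row from the question, used as sample input.
$html = '<tr class="defRowEven">
   <td align="right">label</td>
   <td>info</td>
 </tr>';

// Same pattern as above, plus the "s" modifier so "." also matches newlines.
$pattern = '/<tr class="defRowEven">\s*<td align="right">(.*?)<\/td>\s*<td>(.*?)<\/td>/s';

$count = preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$pairs = [];
foreach ($matches as $match) {
    // $match[0] is the full match, [1] the label cell, [2] the info cell.
    $pairs[] = [trim($match[1]), trim($match[2])];
}
print_r($pairs);

// Hypothetical variant of the same row with one extra attribute on the first cell.
$variant = '<tr class="defRowEven"><td align="right" width="10">label</td><td>info</td></tr>';
$variantCount = preg_match_all($pattern, $variant, $unused, PREG_SET_ORDER);
var_dump($variantCount);
```

The variant no longer matches because the pattern spells out the exact attribute text of each tag. That brittleness is the main reason to prefer the DOMXPath approach whenever the markup might change.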

Tried it, no luck. Just a blank screen. I'll keep working with the class though, thanks.

@Frederico - you could try echo $row->find('td', 0)->plaintext; instead.

Tried the second example but couldn't get it to work. I'll keep working at it though. Thanks a lot.