Extracting data from HTML using PHP


Here is what I want:

I have a link that returns some data as HTML.

The data is formatted like this:

<div class="searchResult regular">

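John Bird 56 Leathwaite Road
London
SW11 6RS 020 7228 5576

I want my PHP page to request the URL above and, based on the markup above, extract/parse the data from the resulting HTML page as follows: h2 = name, then address, then phone number

and display them in a table.

This is what I have so far. It works to a degree, but it only prints the HTML page as it is: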

<?php
// Fetch a URL with cURL and return the response body as a string.
function get_content($url)
{
    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    // Make curl_exec() return the body instead of printing it,
    // so no output buffering is needed.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $string = curl_exec($ch);
    curl_close($ch);

    return $string;
}

// Fetch and print result pages 1 to 4.
for ($page = 1; $page <= 4; $page++) {
    $content = get_content("http://www.118.com/people-search.mvc?Supplied=true&Name=william&Location=Crabtree&pageSize=50&pageNumber=" . $page);
    echo $content;
}

?>

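You need to use a DOM parser or a similar tool.

Read the file into a DOM object and parse it with an appropriate selector:

// Requires the Simple HTML DOM library (simple_html_dom.php) to be included.
$html = new simple_html_dom("http://www.118.com/people-search.mvc...0&pageNumber=1");

// The selector must be a quoted string; this matches every div whose
// class list contains "searchResult".
foreach($html->find('div.searchResult') as $div) {
  //parse div contents here to extract name and address etc.
}
// Release the parsed tree to free memory.
$html->clear();
unset($html);

For more information, see the documentation.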
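
If Simple HTML DOM is not available, the same idea can be sketched with PHP's bundled DOM extension (DOMDocument and DOMXPath). This is only a sketch under assumptions: the question's markup lost its inner tags, so the `<p class="address">` and `<p class="phone">` wrappers in the sample are hypothetical, as is the helper name `extract_results`. Only the outer `div` class and the `h2` for the name come from the question.

```php
<?php
// Sample markup. ASSUMPTION: only the outer div class and the <h2> name are
// taken from the question; the <p> wrappers are hypothetical placeholders.
$sample = <<<HTML
<div class="searchResult regular">
  <h2>John Bird</h2>
  <p class="address">56 Leathwaite Road, London SW11 6RS</p>
  <p class="phone">020 7228 5576</p>
</div>
HTML;

/**
 * Return one row per div whose class list contains "searchResult".
 * Hypothetical helper, not part of any library.
 */
function extract_results(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; keep parse warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match the class as a whole word, so "searchResult regular" qualifies.
    $query = '//div[contains(concat(" ", normalize-space(@class), " "), " searchResult ")]';

    $rows = [];
    foreach ($xpath->query($query) as $div) {
        $rows[] = [
            'name'    => trim($xpath->evaluate('string(.//h2)', $div)),
            'address' => trim($xpath->evaluate('string(.//p[@class="address"])', $div)),
            'phone'   => trim($xpath->evaluate('string(.//p[@class="phone"])', $div)),
        ];
    }
    return $rows;
}

$rows = extract_results($sample);
print_r($rows);
```

Against the real page, the string returned by `get_content()` would be passed in place of `$sample`, and the two inner XPath expressions would need adjusting to whatever tags the page actually uses.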

Use a DOM parser. Here is a good example:
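
Whichever parser is used, the question also asks for the results to be shown in a table. A minimal sketch of that step, assuming the rows have already been extracted into arrays with the hypothetical keys `name`, `address` and `phone` (`render_table` is an illustrative helper, not a library function):

```php
<?php
/**
 * Render extracted rows as an HTML table.
 * ASSUMPTION: each row is an array with the hypothetical keys
 * 'name', 'address' and 'phone'; missing keys become empty cells.
 */
function render_table(array $rows): string
{
    $html = "<table>\n<tr><th>Name</th><th>Address</th><th>Phone</th></tr>\n";
    foreach ($rows as $row) {
        $html .= '<tr>';
        foreach (['name', 'address', 'phone'] as $key) {
            // Escape scraped text so it cannot inject markup into the page.
            $html .= '<td>' . htmlspecialchars($row[$key] ?? '', ENT_QUOTES, 'UTF-8') . '</td>';
        }
        $html .= "</tr>\n";
    }
    return $html . "</table>\n";
}

$table = render_table([
    [
        'name'    => 'John Bird',
        'address' => '56 Leathwaite Road, London SW11 6RS',
        'phone'   => '020 7228 5576',
    ],
]);
echo $table;
```

Escaping with `htmlspecialchars` matters here because the cell contents come from a third-party page and should not be trusted as markup.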