PHP cURL web scraping not working

php curl web-scraping


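I have cURL-based code that fetches the price of a product from a website. I want to extract the price from the page.

The markup containing the price is: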

<div class="prodbuy-price">
<div id="mrp-price-outer" class="">
<div id="seller-price-outer" class="">
<div id="offer-price-id">
<meta content="INR" itemprop="priceCurrency">
<strong class="voucherPrice">
Rs
<span id="selling-price-id" itemprop="price">36500</span>
</strong>




My code to fetch the price is:

<?php
$curl = curl_init('http://www.snapdeal.com/product/apple-iphone-5s-16-gb/1302850866');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');

$page = curl_exec($curl);

if(!empty($page)){ //if any html is actually returned (test the response, not the cURL handle)

    $pokemon_doc = new DOMDocument;
    libxml_use_internal_errors(true);
    $pokemon_doc->loadHTML($page);
    libxml_clear_errors(); //remove errors for yucky html

    $pokemon_xpath = new DOMXPath($pokemon_doc);

   // $price = $pokemon_xpath->evaluate('string(//div[@class="prices"]/meta[@itemprop="price"]/@content)');
   // echo $price;

    $rupees = $pokemon_xpath->evaluate('string(//div[@class="prodbuy-price"]/span[@itemprop="price"])');
    echo $rupees;
}
else {
    print "Not found";
}
?>


I do not get any errors, and no data (the price) is displayed either. I cannot trace the problem.
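One reason a run like this can be silent is that evaluate() with string(...) returns an empty string when nothing matches, which is not an error. A minimal sketch for making that case visible, assuming the markup quoted above (the closing tags and the exact nesting are filled in here as an assumption so the fragment parses on its own) and using query() to count matched nodes:

```php
<?php
// Markup from the question; closing tags and nesting are assumed.
$html = <<<'HTML'
<div class="prodbuy-price">
<div id="offer-price-id">
<meta content="INR" itemprop="priceCurrency">
<strong class="voucherPrice">
Rs
<span id="selling-price-id" itemprop="price">36500</span>
</strong>
</div>
</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// The expression from the question, without the string() wrapper.
$expression = '//div[@class="prodbuy-price"]/span[@itemprop="price"]';
$nodes = $xpath->query($expression);

if ($nodes === false) {
    $status = 'invalid expression'; // query() returns false for a malformed expression
} elseif ($nodes->length === 0) {
    $status = 'no match';           // valid expression, but it selects nothing
} else {
    $status = 'matched';
}

echo $status, PHP_EOL; // prints "no match"
```

Checking the node count separates "the page was not fetched" from "the page was fetched but the expression selects nothing", which an empty echo cannot do.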

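I made a very silly mistake: adding an extra "/" before span in the XPath expression solved the problem, because the span is nested deeper inside the div rather than being its direct child. Thanks to @DaveCast. The new code is: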

<?php
$curl = curl_init('http://www.snapdeal.com/product/apple-iphone-5s-16-gb/1302850866');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');

$page = curl_exec($curl);

if(!empty($page)){ //if any html is actually returned (test the response, not the cURL handle)

    $pokemon_doc = new DOMDocument;
    libxml_use_internal_errors(true);
    $pokemon_doc->loadHTML($page);
    libxml_clear_errors(); //remove errors for yucky html

    $pokemon_xpath = new DOMXPath($pokemon_doc);

   // $price = $pokemon_xpath->evaluate('string(//div[@class="prices"]/meta[@itemprop="price"]/@content)');
   // echo $price;

    $rupees = $pokemon_xpath->evaluate('string(//div[@class="prodbuy-price"]//span[@itemprop="price"])');
    echo $rupees;
}
else {
    print "Not found";
}
?>


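Hope this helps someone else.

Try an extra slash before span to match any descendant. A single slash only matches a direct child.

Yes, it works perfectly. Thank you, I might never have spotted that silly mistake. Thanks a lot!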
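The distinction made in the comment above can be checked offline. The following is a small sketch, assuming a condensed version of the question's markup (the closing tags are an assumption), that evaluates the same expression with "/" and with "//":

```php
<?php
// Condensed copy of the question's markup; closing tags are assumed.
$html = '<div class="prodbuy-price"><div id="offer-price-id">'
      . '<strong class="voucherPrice">Rs '
      . '<span id="selling-price-id" itemprop="price">36500</span>'
      . '</strong></div></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // suppress warnings about the bare fragment
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$base  = '//div[@class="prodbuy-price"]';

// "/" selects only direct children of the div; the span is two levels deeper.
$child = $xpath->evaluate('string(' . $base . '/span[@itemprop="price"])');

// "//" selects matching descendants at any depth below the div.
$descendant = $xpath->evaluate('string(' . $base . '//span[@itemprop="price"])');

var_dump($child);      // string(0) ""
var_dump($descendant); // string(5) "36500"
```

Spelling out the full path (here /div/strong/span) would also match, but "//" keeps working if the site adds or removes a wrapper element between the div and the span.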