PHP preg_match pattern to scrape these prices

Tags: php, sql, preg-match

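I'm trying to scrape prices from a page, and I want to use preg_match to extract the price from this div:
519,00€
What is the correct preg_match pattern?

Here is my extractor script: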

<?php
echo  "funziona!";

    if(!$fp = fopen("https://www.google.it/webhp?sourceid=chrome-instant&ion=1&espv=2&es_th=1&ie=UTF-8#tbs=vw:l,mr:1&tbm=shop&q=samsung+galaxy+note+4&tbas=0" ,"r" )) {
        return false;
    } //our fopen is right, so let's go
    $content = "";

    while(!feof($fp)) { //while it is not the last line, we will add the current line to our $content
        $content .= fgets($fp, 1024);
    }
    fclose($fp); //we are done here, don't need the main source anymore
?>

<?php
//our fopen, fgets here

//our magic regex here
preg_match_all('/<span class=\"price">(.*?)<\/span>/s',$content, $prices); //THIS IS PREG_MATCH 
    echo $prices[0][0]."<br />";
?>
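
Two details of the preg_match_all call above can be checked offline. The following is a minimal sketch that runs the same pattern against a hard-coded string; the sample markup in $content is an assumption, not the real page, which may be structured differently or filled in by JavaScript. It shows that the backslash before the double quote is unnecessary inside a single-quoted PHP string, and that the captured text lives in $prices[1], while $prices[0] holds the full matches including the tags.

```php
<?php
// Hypothetical sample markup; the real page may differ.
$content = '<div><span class="price">519,00€</span> <span class="price">515,00€</span></div>';

// No need to escape the double quote inside a single-quoted PHP string.
preg_match_all('/<span class="price">(.*?)<\/span>/s', $content, $prices);

// $prices[0] holds the full matches (tags included);
// $prices[1] holds the text of the first capture group.
foreach ($prices[1] as $price) {
    echo $price . "<br />\n";
}
```

If $prices[1] comes back empty on the live page, the markup being matched is probably not present in the HTML that PHP received.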


You should use a parser rather than a regular expression for this task. Here is an example of how to do it with Simple HTML DOM Parser:

include_once 'simple_html_dom.php';
$html = file_get_html('http://www.example.com');
foreach($html->find('span') as $element) {
    // strpos() returns 0 when the class starts with "price", so compare against false
    if(strpos($element->class, 'price') !== false){
        echo $element->innertext . "\n";
    }
}
This is also a fairly loose check, so you may get more results than you want: it only checks whether the span's class contains the word price.
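
A stricter variant of the same parser idea can be sketched with PHP's bundled DOM extension instead of Simple HTML DOM Parser. This is a swapped-in technique, not the answer's code: the sample markup is invented, and the XPath expression matches price only as a whole class token, so a class such as priceless is skipped. The sample uses plain ASCII text because DOMDocument::loadHTML needs an explicit charset hint for non-ASCII input.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$html = '<html><body>'
      . '<span class="price">519,00 EUR</span>'
      . '<span class="old price">599,00 EUR</span>'
      . '<span class="priceless">n/a</span>'
      . '</body></html>';

$doc = new DOMDocument();
// Collect libxml warnings internally; real-world HTML is rarely valid.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Match "price" as a whole class token, not as a substring.
$query = '//span[contains(concat(" ", normalize-space(@class), " "), " price ")]';

$found = array();
foreach ($xpath->query($query) as $node) {
    $found[] = trim($node->textContent);
}
print_r($found);
```

The trade-off is a longer XPath expression in exchange for fewer false positives than a substring test on the class attribute.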

Another approach, take a look:

<?php
function getUrl($Url,$Options = array(),&$optOut = array())
{

    $CURL_DEFAULT_SETTINGS  = array
    (
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_AUTOREFERER => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 10,
        CURLOPT_MAXREDIRS => 10,
        CURLOPT_TIMEOUT => 10,
        CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8'
    );

    if (!($ch = curl_init($Url)))
        throw new Exception("Couldn't initialize cURL library",100);

    if (is_array($CURL_DEFAULT_SETTINGS) && count($CURL_DEFAULT_SETTINGS) > 0)
        curl_setopt_array($ch,$CURL_DEFAULT_SETTINGS);

    if (is_array($Options) && count($Options) > 0)
    {
        foreach ($Options as $k => $v)
        {
            curl_setopt($ch,$k,$v);
        }
    }

    $Data = curl_exec($ch);
    $Error = curl_error($ch);

    $optOut['CURLINFO_HEADER_OUT'] = curl_getinfo($ch, CURLINFO_HEADER_OUT );

    curl_close($ch);

    if (!$Data)
    {
        if ($Error)
            throw new Exception($Error);

        return false;
    }

    return $Data;
}

function getPriceFor($query) {
    $data = getUrl('https://www.google.it/search?tbs=vw:l,mr:1&tbm=shop&q='.rawurlencode($query).'&tbas=0&bav=on.2,or.&cad=b&fp=6a24b60e09fe0b18&biw=1196&bih=703&dpr=2&ion=1&espv=2&tch=1&ech=1&psi=byWgVee9A4TNeIXRgLAK.1436558704099.3');
    $data = '['.preg_replace('/\/\*""\*\//msi',',',preg_replace('/\/\*""\*\/[\s]*$/msi','',$data)).']';
    $data = json_decode($data,true);
    preg_match_all('/<div[\s]+class="_OA"><div><b>([^<]+)[\s]*<\/b><\/div><div>([^<]+)<\/div><\/div>/msi',$data[3]['d'],$res);

    $re = array();

    foreach ($res[1] as $k=>$r)
        $re[] = array('price'=>$r,'from'=>$res[2][$k]);

    return $re;
}

print_r(getPriceFor('samsung galaxy note 4'));
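
The two nested preg_replace calls in getPriceFor are the least obvious step, so here is a minimal sketch of what they do, run on an invented payload. It assumes, as the code above implies, that the response is a series of JSON objects separated by the marker /*""*/; the sample string and its keys are made up for illustration.

```php
<?php
// Invented sample payload: JSON objects separated by the marker /*""*/.
$raw = '{"a":1}/*""*/{"d":"<b>519,00</b>"}/*""*/';

// Drop the trailing marker, then turn the remaining markers into commas.
$trimmed = preg_replace('/\/\*""\*\/[\s]*$/msi', '', $raw);
$joined  = '[' . preg_replace('/\/\*""\*\//msi', ',', $trimmed) . ']';

$chunks = json_decode($joined, true);
if (!is_array($chunks)) {
    // json_decode returns null when the text is not valid JSON.
    throw new Exception('Unexpected response format');
}

echo count($chunks), "\n";
echo $chunks[1]['d'], "\n";
```

Checking the json_decode result before indexing into it, as done here, makes a change in the response format fail loudly instead of producing an empty price list.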

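What happens with your current code? You don't need to escape the double quote (\"). You also want the first index, not the zero index, for the prices.

This should print the prices from the web page, but it has errors. The complete code is in this guide.

There is no "correct" preg_match. Regexes + HTML = bad idea. Use a DOM parser.

That content is loaded via JavaScript. Check whether the content you are retrieving in PHP actually contains it. Is there a "price" in it? You will hit the same problem when trying to use a parser; you need the content to be correct first. I'm not sure how it is populated. You would need to look into that and find a way to make PHP mimic that process.

Thanks tin and Chris, I really appreciate your support. Tin, when I try your code I get the following error:
Fatal error: Uncaught exception 'Exception' with message 'SSL certificate problem: unable to get local issuer certificate' in C:\xampp\htdocs\index.php:41 Stack trace: #0 C:\xampp\htdocs\index.php(50): getUrl('https://www.goo...') #1 C:\xampp\htdocs\index.php(63): getPriceFor('samsung galaxy...') #2 {main} thrown in C:\xampp\htdocs\index.php on line 41

I see you are using Windows. You have to set up the SSL certificates for curl, or use file_get_contents instead of the getUrl function I am calling. I can give you further guidance in a few hours.

Oh, OK, your program works fine on Ubuntu. Yes, initially I used your code on a XAMPP machine, but since I want to use it on an Ubuntu-based machine, I don't need to modify the code for Windows. I really appreciate your support; you have always been clear and direct. One last question: how did you build the URL used for the scan? If I want to know the price of another product, what should I do?

Upvote and mark your question as answered. That is the best way to say thanks.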

Output printed by print_r(getPriceFor('samsung galaxy note 4')):
Array
(
    [0] => Array
        (
            [price] => 515,00 €
            [from] => phoneshopping.it
        )

    [1] => Array
        (
            [price] => 519,00 €
            [from] => Smartyrama
        )

    [2] => Array
        (
            [price] => 519,00 €
            [from] => Smartyrama
        )

    [3] => Array
        (
            [price] => 519,00 €
            [from] => Smartyrama
        )

    [4] => Array
        (
            [price] => 690,45 €
            [from] => Amazon.it - Seller
        )

    [5] => Array
        (
            [price] => 673,99 €
            [from] => da 2 negozi
        )

    [6] => Array
        (
            [price] => 345,00 €
            [from] => da 2 negozi
        )

    [7] => Array
        (
            [price] => 342,00 €
            [from] => Amazon.it - Seller
        )

    [8] => Array
        (
            [price] => 699,99 €
            [from] => ePRICE.it
        )

    [9] => Array
        (
            [price] => 730,00 €
            [from] => in oltre 5 negozi
        )

    [10] => Array
        (
            [price] => 20,00 €
            [from] => Amazon.it - Seller
        )

    [11] => Array
        (
            [price] => 208,99 €
            [from] => eGlobal Central Italia
        )

    [12] => Array
        (
            [price] => 711,00 €
            [from] => in oltre 5 negozi
        )

    [13] => Array
        (
            [price] => 322,99 €
            [from] => eGlobal Central Italia
        )

    [14] => Array
        (
            [price] => 40,09 €
            [from] => da 4 negozi
        )

    [15] => Array
        (
            [price] => 15,99 €
            [from] => acadattatore.com
        )

    [16] => Array
        (
            [price] => 339,99 €
            [from] => ePRICE.it
        )

    [17] => Array
        (
            [price] => 412,90 €
            [from] => da 3 negozi
        )

    [18] => Array
        (
            [price] => 343,33 €
            [from] => Amazon.it - Seller
        )

    [19] => Array
        (
            [price] => 629,00 €
            [from] => BestPriceStore
        )

)
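
On the SSL certificate error reported in the comments: one way to "set up the SSL certificates for curl" is to point cURL at a CA bundle file through the $Options parameter that getUrl already accepts. The sketch below is an assumption-laden illustration, not the answerer's fix: caBundleOptions is a hypothetical helper, and the bundle path in the usage comment is made up. CURLOPT_CAINFO and CURLOPT_SSL_VERIFYPEER are standard cURL options.

```php
<?php
// Hypothetical helper: builds extra cURL options pointing at a CA bundle file.
function caBundleOptions($caFile)
{
    if (!is_readable($caFile)) {
        throw new Exception("CA bundle not readable: $caFile");
    }
    return array(
        CURLOPT_CAINFO         => $caFile, // file with trusted CA certificates
        CURLOPT_SSL_VERIFYPEER => true,    // keep certificate verification on
    );
}

// Usage with the getUrl() function from the answer (the path is an assumption):
// $html = getUrl('https://www.example.com/', caBundleOptions('C:\xampp\cacert.pem'));
```

The same bundle can instead be configured once for all scripts with the curl.cainfo directive in php.ini, which avoids passing options on every call.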