PHP: Error while sending INIT_DB packet. PID=7060

php exception web-crawler

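An exception was thrown while executing the function below, and after that this error appeared:

Error while sending INIT_DB packet. PID=7060

The script then gave several of the following warnings:

mysql_fetch_row() expects parameter 1 to be resource

The function: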

private function initiateChildCrawler($parent_Url_Html) {

        global $CFG;
        static $foundLink;
        static $parentID;
        static $urlToCrawl_InstanceOfChildren;

        $foundLinkArray = array();
        $tempHtml = $parent_Url_Html->getHTML();
        $foundLinkCount = 0;
        foreach($tempHtml->find('a') as $foundLinkArray[$foundLinkCount++]);

        $anotherArray = array();
        $x = 0;     
        for($i = 0; $i<$foundLinkCount; $i++) {
            $anotherArray[$x++] = $foundLinkArray[$i]->href;            
        }

        foreach($anotherArray as $foundLink) 
        {
            $foundLink = url_to_absolute($parent_Url_Html->getURL(), $foundLink);

            if($this->validateEduDomain($foundLink)) 
            {
                $parentID = $this->loadSaveInstance->parentExists_In_URL_DB_CRAWL($this->returnParentDomain($foundLink));
                if($parentID != FALSE) 
                {
                    if($this->loadSaveInstance->checkUrlDuplication_In_URL_DB_CRAWL($foundLink) == FALSE)
                    {
                        $urlToCrawl_InstanceOfChildren = new urlToCrawl($foundLink);
                        if($urlToCrawl_InstanceOfChildren->getSimpleDomSource($CFG->finalContext)!= FALSE)
                        {                           
                            try {
                                $this->loadSaveInstance->url_db_html($urlToCrawl_InstanceOfChildren->getURL(), $urlToCrawl_InstanceOfChildren->getHTML());
                                $this->loadSaveInstance->saveCrawled_To_URL_DB_CRAWL(NULL, $foundLink, "crawled", $parentID);
                            } catch (DbException $e) {
                                echo "<br><br>Exception Catched on line 303!!!<br><br>";
                                echo "The link where the exception was thrown was: {$foundLink}<br>";
                                if(strstr($e->getMessage(), 'MySQL server has gone away')) {
                                    $this->connection = mysql_connect("localhost", "root", "");
                                    mysql_select_db("crawler1", $this->connection);
                                }                               
                            }                                               
                        }
                    }
                }
            }
        }   
    }

The specific URL that caused the exception was:

I have no clue about this. Please help.

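I am getting the same error while doing the same kind of crawling for something else.

@Fabrizio? I get it randomly when the server has around 100-150 concurrent connections (the limit is set to 1000 and CPU usage is below 2-3%). Did you find a working solution?

I tweaked a lot of settings in MySQL and it seems to be working. You may want to check your indexes to see whether queries are taking too long, or whether you have too many open connections, which would explain why it starts giving you this error. Another thing I noticed is that free space in /tmp got really low while these errors were happening (temporary tables).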
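For what it's worth, the catch block in the question reconnects after "MySQL server has gone away" but never retries the save that failed, so that link is silently skipped. Below is a minimal sketch of a reconnect-and-retry wrapper, not a definitive fix. The function name `retryOnLostConnection`, its parameters, and the list of message fragments are my own assumptions and do not come from the question's code. The first and last fragments are the messages quoted in the question, and the middle one is MySQL's usual "Lost connection" client error.

```php
<?php
/**
 * Run $operation. If it throws an exception whose message suggests that the
 * MySQL connection was dropped, call $reconnect and try again, up to
 * $maxAttempts attempts in total. Any other exception is rethrown untouched.
 *
 * Hypothetical helper: all names here are illustrative assumptions.
 */
function retryOnLostConnection($operation, $reconnect, $maxAttempts = 3)
{
    // Message fragments treated as "the connection is gone" (assumed list).
    $lostConnectionMarkers = array(
        'MySQL server has gone away',
        'Lost connection to MySQL server',
        'Error while sending',
    );

    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation();
        } catch (Exception $e) {
            $isLostConnection = false;
            foreach ($lostConnectionMarkers as $marker) {
                if (strpos($e->getMessage(), $marker) !== false) {
                    $isLostConnection = true;
                    break;
                }
            }

            // Rethrow unrelated errors, or give up once attempts run out.
            if (!$isLostConnection || $attempt >= $maxAttempts) {
                throw $e;
            }

            // Re-establish the connection before the next attempt.
            $reconnect();
        }
    }
}
```

In the crawler, `$operation` would be a closure wrapping the two `loadSaveInstance` calls from the try block, and `$reconnect` a closure that opens the connection again. Note that the `mysql_*` functions used in the question were deprecated in PHP 5.5 and removed in PHP 7.0, so on a current PHP version the reconnect would have to go through mysqli or PDO instead. This only papers over dropped connections. If they are caused by slow queries, too many open connections, or a full /tmp as suggested above, those still need to be fixed on the server side.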