PHP: Error while sending INIT_DB packet. PID=7060
An exception was thrown while executing the function below, and then this error appeared:

Error while sending INIT_DB packet. PID=7060

After that, the script emitted several warnings like this one:

mysql_fetch_row() expects parameter 1 to be resource

The function:

    private function initiateChildCrawler($parent_Url_Html) {
        global $CFG;
        static $foundLink;
        static $parentID;
        static $urlToCrawl_InstanceOfChildren;
        $foundLinkArray = array();
        $tempHtml = $parent_Url_Html->getHTML();
        $foundLinkCount = 0;
        foreach($tempHtml->find('a') as $foundLinkArray[$foundLinkCount++]);
        $anotherArray = array();
        $x = 0;
        for($i = 0; $i<$foundLinkCount; $i++) {
            $anotherArray[$x++] = $foundLinkArray[$i]->href;
        }
        foreach($anotherArray as $foundLink)
        {
            $foundLink = url_to_absolute($parent_Url_Html->getURL(), $foundLink);
            if($this->validateEduDomain($foundLink))
            {
                $parentID = $this->loadSaveInstance->parentExists_In_URL_DB_CRAWL($this->returnParentDomain($foundLink));
                if($parentID != FALSE)
                {
                    if($this->loadSaveInstance->checkUrlDuplication_In_URL_DB_CRAWL($foundLink) == FALSE)
                    {
                        $urlToCrawl_InstanceOfChildren = new urlToCrawl($foundLink);
                        if($urlToCrawl_InstanceOfChildren->getSimpleDomSource($CFG->finalContext)!= FALSE)
                        {
                            try {
                                $this->loadSaveInstance->url_db_html($urlToCrawl_InstanceOfChildren->getURL(), $urlToCrawl_InstanceOfChildren->getHTML());
                                $this->loadSaveInstance->saveCrawled_To_URL_DB_CRAWL(NULL, $foundLink, "crawled", $parentID);
                            } catch (DbException $e) {
                                echo "<br><br>Exception Catched on line 303!!!<br><br>";
                                echo "The link where the exception was thrown was: {$foundLink}<br>";
                                if(strstr($e->getMessage(), 'MySQL server has gone away')) {
                                    $this->connection = mysql_connect("localhost", "root", "");
                                    mysql_select_db("crawler1", $this->connection);
                                }
                            }
                        }
                    }
                }
            }
        }
    }

The specific URL that caused the exception was:
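I have no idea what is going on here. Please help.

Comments:

I ran into the same error while doing the same kind of crawl on something else.

@Fabrizio? I get it at random when the server has about 100-150 concurrent connections (the limit is set to 1000, and CPU usage is below 2-3%).

Did you find a working solution?

I tuned a lot of settings in MySQL and it seems to be working now. You may want to check your indexes to see whether queries are taking too long, or whether you have too many open connections, which could be why it starts giving you this error. Another thing I noticed is that free space in /tmp got very low while these errors were occurring (temporary tables).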
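
The catch block in the question reconnects after "MySQL server has gone away" but never retries the save that failed, so the link is lost and later calls can still run against a dead connection. Below is a minimal sketch of a reconnect-and-retry wrapper, not the asker's actual fix. The names `withReconnect` and `isLostConnection` are hypothetical, and `DbException` is a stand-in for the question's own class, whose definition is not shown. Matching on the message text, including "Error while sending", is an assumption about how the driver reports a dropped connection.

```php
<?php
// Stand-in for the question's DbException (its real definition is not shown).
class DbException extends RuntimeException {}

/**
 * Returns true when the exception message looks like a dropped MySQL
 * connection. The list of substrings is an assumption, not an official list.
 */
function isLostConnection(Throwable $e): bool
{
    $needles = ['MySQL server has gone away', 'Error while sending'];
    foreach ($needles as $needle) {
        if (stripos($e->getMessage(), $needle) !== false) {
            return true;
        }
    }
    return false;
}

/**
 * Runs $operation. If it fails with a lost-connection DbException, calls
 * $reconnect and tries again, up to $maxAttempts attempts in total.
 * Any other DbException, or the last failed attempt, is rethrown.
 */
function withReconnect(callable $operation, callable $reconnect, int $maxAttempts = 2)
{
    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation();
        } catch (DbException $e) {
            if (!isLostConnection($e) || $attempt >= $maxAttempts) {
                throw $e;
            }
            $reconnect(); // e.g. open a fresh connection and select the database
        }
    }
}
```

In the crawler, the two `loadSaveInstance` calls inside the `try` would go in `$operation`, and the connect plus select-database step would go in `$reconnect`. This only covers recovery: if connections drop because of slow queries, connection limits, or a full /tmp, as the comments suggest, those causes still need to be fixed on the server.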