PHP curl multi on a password-protected site
I'm currently using the following (old) code to log in to a website:
    public function login() {
        $url1 = 'https://...'; /* Initial page load to collect initial session cookie data */
        $url2 = 'https://...'; /* The page to POST login data to */
        $url3 = 'https://...'; /* The page redirected to, used to test for success */
        $un = 'user';
        $pw = 'pass';
        $post_data = array(
            'authmethod' => 'on',
            'username' => $un,
            'password' => $pw,
            'hrpwd' => $pw
        );
        $curlOpt1 = array(
            CURLOPT_URL => $url1,
            CURLOPT_HTTPGET => TRUE, /* Reset to GET when reused after the POST below */
            CURLOPT_COOKIEJAR => self::COOKIEFILE,
            CURLOPT_COOKIEFILE => self::COOKIEFILE,
            CURLOPT_FOLLOWLOCATION => TRUE,
            CURLOPT_HEADER => FALSE,
            CURLOPT_RETURNTRANSFER => TRUE,
            CURLOPT_SSL_VERIFYPEER => FALSE
        );
        $curlOpt2 = array(
            CURLOPT_URL => $url2,
            CURLOPT_COOKIEJAR => self::COOKIEFILE,
            CURLOPT_COOKIEFILE => self::COOKIEFILE,
            CURLOPT_FOLLOWLOCATION => TRUE,
            CURLOPT_POST => TRUE,
            CURLOPT_POSTFIELDS => http_build_query($post_data)
        );
        $this->ch = curl_init();
        if ( !$this->ch ) {
            /* No handle exists at this point, so curl_error() cannot be called */
            throw new Exception( 'Unable to init curl.' );
        }
        /* Load the login page once to get the session ID cookies */
        curl_setopt_array( $this->ch, $curlOpt1 );
        if ( curl_exec( $this->ch ) === false ) {
            throw new Exception( 'Unable to retrieve initial auth cookie.' );
        }
        /* POST the login data to the login page */
        curl_setopt_array( $this->ch, $curlOpt2 );
        if ( curl_exec( $this->ch ) === false ) {
            throw new Exception( 'Unable to post login data.' );
        }
        /* Verify the login by checking the redirected url. */
        $header = curl_getinfo( $this->ch );
        $retUrl = $header['url'];
        if ( $retUrl != $url3 ) {
            throw new Exception( 'Login validation failure.' );
        }
        /* Reload the login page to get the auth cookies */
        curl_setopt_array( $this->ch, $curlOpt1 );
        if ( curl_exec( $this->ch ) === false ) {
            throw new Exception( 'Unable to retrieve final auth cookie.' );
        }
        return true;
    }
Then I use
    public function getHtml($url) {
        $html = FALSE;
        try {
            curl_setopt($this->ch, CURLOPT_URL, $url);
            $page = curl_exec($this->ch);
        } catch (Exception $e) {
            ...
        }
        /* curl_exec() reports failure by returning FALSE, not by throwing */
        if ($page === false) {
            return $html;
        }
        /* Remove all tabs and newlines from the HTML */
        $rmv = array("\n","\t");
        $html = str_replace($rmv, '', $page);
        return $html;
    }
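...for each page request. My question is: how do I convert this to use curl_multi_exec to speed up several hundred lookups? I can't find any examples of logging in with curl_multi. Do I simply replace every curl_exec with curl_multi_exec?
Also, if you see any other glaring mistakes, comments are certainly welcome.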
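To be clear, I want to log in once with a single user/pass and then reuse those credentials for multiple page requests.

It's been a while, but I wanted to post my final solution. I found a great multi-curl library, RollingCurl, that helped a lot. Basically, after collecting the login cookie (as shown in my original question), I feed it and the other options into a RollingCurl instance for each multi-request, then execute the batch. Works like a charm.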
    public function getMultiPage(array $urls, $url_prepend=NULL, $callback) {
        $rc = new RollingCurl(array('Att_Screen_Scraper', $callback));
        $rc->window_size = 15; /* number of requests to run concurrently */
        $rc->options = array(
            CURLOPT_COOKIEJAR => self::COOKIEFILE,
            CURLOPT_COOKIEFILE => self::COOKIEFILE,
            CURLOPT_FOLLOWLOCATION => TRUE,
            CURLOPT_HEADER => FALSE,
            CURLOPT_RETURNTRANSFER => TRUE,
            CURLOPT_SSL_VERIFYPEER => FALSE
        );
        foreach ($urls as $i=>$url) {
            $request = new RollingCurlRequest($url_prepend . $url);
            echo $url_prepend . $url . "<br>\n";
            $rc->add($request);
        }
        if(!$rc->execute()) {
            throw new Exception('RollingCurl execute failed');
        }
        return TRUE;
    }
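
getMultiPage() hands every finished response to a callback, so a rough idea of what such a callback might look like may help. The sketch below assumes the callback is invoked with the response body, the curl_getinfo() array, and the request object, which is how the RollingCurl README describes it; check the version you are using. The class name PageCollector and the method name collect are hypothetical and only for illustration.

```php
<?php
/**
 * Hypothetical collector for responses delivered by a multi-curl callback.
 * Class and method names are illustrative, not part of RollingCurl.
 */
class PageCollector
{
    /** Cleaned HTML keyed by the final URL of each successful request. */
    public static $pages = array();

    /** HTTP status codes of requests that did not return 200, keyed by URL. */
    public static $failures = array();

    /**
     * Assumed callback shape: ($response, $info, $request).
     * $info is expected to look like the array returned by curl_getinfo().
     */
    public static function collect($response, $info, $request = null)
    {
        $url  = isset($info['url']) ? $info['url'] : '';
        $code = isset($info['http_code']) ? (int) $info['http_code'] : 0;

        if ($code !== 200 || !is_string($response)) {
            self::$failures[$url] = $code;
            return;
        }

        // Same clean-up as getHtml() in the question: drop tabs and newlines.
        self::$pages[$url] = str_replace(array("\n", "\t"), '', $response);
    }
}
```

If a method like this lived on the scraper class itself, its name is what would be passed as the $callback argument of getMultiPage(). Storing failures separately makes it easy to retry only the pages that did not come back.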
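Note that this solution requires a callback to handle the response of each request. The RollingCurl documentation describes this well, so I won't repeat it here.

Why does this look so evil?
Don't worry, PHP can't do any real damage.
Because everything in curl looks evil? We use it at work to collect product data from another department without direct access to their database. I agree it makes no sense, but that's what they want. We have to look up hundreds of records and don't want to fill out a web form by hand for each one.
@isius, it does make sense: the database can change if needed while the "web service" stays the same.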
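
For anyone who would rather not add a library, the same idea can be sketched with PHP's built-in curl_multi_* functions. Simply swapping curl_exec for curl_multi_exec is not enough: each URL needs its own easy handle, the handles are added to one multi handle, and curl_multi_exec is called in a loop until all transfers finish. The sketch below is a minimal version under a few assumptions: login() has already written the session cookies to a file, the function names buildRequestOptions and fetchAll are hypothetical, and $cookieFile stands in for self::COOKIEFILE. It leaves SSL verification at its default and only reads the cookie file, so parallel transfers do not overwrite it.

```php
<?php
/**
 * Options for one GET request that reuses the cookies saved by login().
 * $cookieFile is a stand-in for self::COOKIEFILE from the question.
 */
function buildRequestOptions($url, $cookieFile)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_COOKIEFILE     => $cookieFile, // send the stored session cookies
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_HEADER         => false,
        CURLOPT_RETURNTRANSFER => true,
    );
}

/**
 * Fetch URLs in batches of at most $window parallel transfers.
 * Returns response bodies (or false on failure) keyed like $urls.
 */
function fetchAll(array $urls, $cookieFile, $window = 15)
{
    $results = array();
    foreach (array_chunk($urls, max(1, (int) $window), true) as $batch) {
        $mh = curl_multi_init();
        $handles = array();
        foreach ($batch as $key => $url) {
            $ch = curl_init();
            curl_setopt_array($ch, buildRequestOptions($url, $cookieFile));
            curl_multi_add_handle($mh, $ch);
            $handles[$key] = $ch;
        }

        // Drive every transfer in the batch until none is still running.
        do {
            $status = curl_multi_exec($mh, $running);
            if ($running > 0 && curl_multi_select($mh, 1.0) === -1) {
                usleep(1000); // select failed; back off briefly instead of spinning
            }
        } while ($running > 0 && $status === CURLM_OK);

        // Per-transfer result codes are reported through curl_multi_info_read().
        $failed = array();
        while (($info = curl_multi_info_read($mh)) !== false) {
            if ($info['result'] !== CURLE_OK) {
                $failed[] = array_search($info['handle'], $handles, true);
            }
        }

        foreach ($handles as $key => $ch) {
            $results[$key] = in_array($key, $failed, true)
                ? false
                : curl_multi_getcontent($ch);
            curl_multi_remove_handle($mh, $ch);
        }
        curl_multi_close($mh);
    }
    return $results;
}
```

Unlike a rolling window, this batches the URLs, so each batch waits for its slowest transfer before the next one starts. That keeps the code short at the cost of some throughput.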