elasticsearch: How to send a scroll_id to ElasticSearch using Curl
I can't figure out how to send the scroll_id to ElasticSearch using Curl. This is what I have tried so far, but it doesn't seem to work:
$url = "http://distribution.virk.dk/cvr-permanent/virksomhed/_search?scroll=2m&_scroll_id=".$_POST["scroll_id"];
$data = array(
"_scroll_id" => $_POST["scroll_id"],
"scroll_id" => $_POST["scroll_id"],
"size" => 10,
"_source" => array(
"Vrvirksomhed.cvrNummer",
"Vrvirksomhed.elektroniskPost",
"Vrvirksomhed.livsforloeb",
"Vrvirksomhed.hjemmeside",
"Vrvirksomhed.virksomhedMetadata.nyesteNavn.navn",
"Vrvirksomhed.hovedbranche",
"Vrvirksomhed.penheder",
"Vrvirksomhed.telefonnummer",
"Vrvirksomhed.virksomhedMetadata.nyesteBeliggenhedsadresse"
),
"query" => array (
"bool" => array (
"must_not" => array (
"exists" => array (
"field" => "Vrvirksomhed.livsforloeb.periode.gyldigTil"
)
)
)
)
);
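ElasticSearch returns the same 10 documents every time, so I assume it is not receiving the scroll id correctly.
Updated code after trying Val's suggestion. With setHosts, I get a timeout after a long wait. Without setHosts, the error I get is "No alive nodes found in your cluster".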
use Elasticsearch\ClientBuilder;
require 'vendor/autoload.php';
$username = "MY_USERNAME";
$password = "MY_PASSWORD";
$hosts = [
'host' => 'distribution.virk.dk',
'scheme' => 'http',
'path' => '/cvr-permanent',
'port' => '80',
'user' => $username,
'pass' => $password
];
$client = ClientBuilder::create()->setHosts($hosts)->build();
$params = [
'scroll' => '30s',
'size' => 50,
'type' => '/cvr-permanent/virksomhed',
'index' => 'virksomhed',
'body' => [
'query' => [
'match_all' => new \stdClass()
]
]
];
// Execute the search
// The response will contain the first batch of documents
// and a scroll_id
$response = $client->search($params);
// Now we loop until the scroll "cursors" are exhausted
while (isset($response['hits']['hits']) && count($response['hits']['hits']) > 0) {
// **
// Do your work here, on the $response['hits']['hits'] array
// **
// When done, get the new scroll_id
// You must always refresh your _scroll_id! It can change sometimes
$scroll_id = $response['_scroll_id'];
// Execute a Scroll request and repeat
$response = $client->scroll([
'body' => [
'scroll_id' => $scroll_id, //...using our previously obtained _scroll_id
'scroll' => '30s' // and the same timeout window
]
]);
}
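Using the scroll API takes two steps. In the first step, you send the query together with the duration of the scroll context. In the second step, you do not send the query again; you only send the scroll id returned by the previous scroll search.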
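The two steps can be sketched as plain HTTP requests sent with PHP's curl extension. This is a minimal sketch, not the answerer's code: the function names are made up for illustration, the base URL layout is an assumption, and it assumes a server version that accepts a JSON body on `/_search/scroll` (older 1.x servers expect the raw scroll id as the request body instead).

```php
<?php
// Sketch only: helper names are hypothetical, and the JSON scroll body
// assumes a server version that accepts it on /_search/scroll.

/** Step 1: the initial search carries the query and the scroll duration. */
function buildInitialScrollRequest(string $baseUrl, string $index, string $type, array $query, int $size, string $keepAlive): array
{
    return [
        'url'  => rtrim($baseUrl, '/') . "/$index/$type/_search?scroll=" . urlencode($keepAlive),
        'body' => json_encode(['size' => $size, 'query' => $query]),
    ];
}

/** Step 2: follow-up requests carry only the scroll id and the duration, no query. */
function buildNextScrollRequest(string $baseUrl, string $scrollId, string $keepAlive): array
{
    return [
        // Note: no index or type in the path for the follow-up request.
        'url'  => rtrim($baseUrl, '/') . '/_search/scroll',
        'body' => json_encode(['scroll' => $keepAlive, 'scroll_id' => $scrollId]),
    ];
}

/** Sends one prepared request as a JSON POST and returns the decoded response. */
function sendJson(array $request): array
{
    $ch = curl_init($request['url']);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $request['body'],
        CURLOPT_HTTPHEADER     => ['Content-Type: application/json'],
        CURLOPT_RETURNTRANSFER => true,
    ]);
    $raw = curl_exec($ch);
    if ($raw === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    $decoded = json_decode($raw, true);
    return is_array($decoded) ? $decoded : [];
}
```

The first response's `_scroll_id` is passed to `buildNextScrollRequest`, and each later response's `_scroll_id` replaces it, until `hits.hits` comes back empty. Sending the original `_search` URL again, as in the question, starts a new search each time, which would explain seeing the same 10 documents.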
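You can find a complete example.
I have tried to follow the example in your link. First I got "No alive nodes found in your cluster"; then I figured I needed to supply the username and password somewhere, so I added setHosts to the ClientBuilder line, but after a long wait I got a timeout error.
Could you update your question with the new code you are trying? Based on your first code snippet, I think you need to use port 80, not 9200.
As the second snippet shows, using port 80 doesn't change much.
POST /cvr-permanent/virksomhed/_search?search_type=scan&scroll=1m {"_source": ["Vrvirksomhed.cvrNummer", "Vrvirksomhed.virksomhedMetadata.nyesteNavn.navn", "Vrvirksomhed.virksomhedMetadata.nyesteBeliggenhedsadresse"], "query": {"match_all": {}}, "size": 200} Does this give a hint about what I should try?
I'm saying that
'port' => '9200',
should be 'port' => '80',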
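The host settings discussed in these comments can be sketched as follows. One likely cause of the "No alive nodes" and timeout errors is that `setHosts()` expects a list of hosts, so a single associative host array has to be wrapped in an outer array. The sketch assumes an elasticsearch-php version that supports the extended (associative) host format, and it assumes `cvr-permanent` is the index and `virksomhed` the type, as the URL in the first snippet suggests; the helper function names are made up for illustration.

```php
<?php
// Sketch only: buildHosts() and buildSearchParams() are hypothetical helpers.

function buildHosts(string $username, string $password): array
{
    // Outer array = list of hosts; inner array = one host's settings.
    return [
        [
            'host'   => 'distribution.virk.dk',
            'scheme' => 'http',
            'port'   => '80',
            'user'   => $username,
            'pass'   => $password,
        ],
    ];
}

function buildSearchParams(): array
{
    // Index and type are read off /cvr-permanent/virksomhed/_search (assumption).
    return [
        'scroll' => '30s',
        'size'   => 50,
        'index'  => 'cvr-permanent',
        'type'   => 'virksomhed',
        'body'   => ['query' => ['match_all' => new stdClass()]],
    ];
}

// With the library installed via Composer, these would be used roughly as:
// $client   = Elasticsearch\ClientBuilder::create()
//                 ->setHosts(buildHosts('MY_USERNAME', 'MY_PASSWORD'))
//                 ->build();
// $response = $client->search(buildSearchParams());
```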