Elasticsearch Courier Fetch: shards failed

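Why am I getting these warnings after adding more data to Elasticsearch? The warning is different every time I browse the dashboard.

"Courier Fetch: 30 of 60 shards failed."

More details:

It is a single node on CentOS 7.1.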

/etc/elasticsearch/elasticsearch.yml

index.number_of_shards: 3
index.number_of_replicas: 1

bootstrap.mlockall: true

threadpool.bulk.queue_size: 1000
indices.fielddata.cache.size: 50%
threadpool.index.queue_size: 400
index.refresh_interval: 30s

index.number_of_shards: 5
index.number_of_replicas: 1

/usr/share/elasticsearch/bin/elasticsearch.in.sh

ES_HEAP_SIZE=3G

#I use this Garbage Collector instead of the default one.

JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC"

Cluster status

{
  "cluster_name" : "my_cluster",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 61,
  "active_shards" : 61,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 61
}

Cluster details

{
  "cluster_name" : "my_cluster",
  "nodes" : {
    "some weird number" : {
      "name" : "ES 1",
      "transport_address" : "inet[localhost/127.0.0.1:9300]",
      "host" : "some host",
      "ip" : "150.244.58.112",
      "version" : "1.4.4",
      "build" : "c88f77f",
      "http_address" : "inet[localhost/127.0.0.1:9200]",
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 7854,
        "max_file_descriptors" : 65535,
        "mlockall" : false
      }
    }
  }
}

I'm curious about "mlockall" : false, because I did set bootstrap.mlockall: true in the yml file.

Logs

Lots of lines like:

org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution (queue capacity 1000) on org.elasticsearch.search.action.SearchServiceTransportAction$23@a9a34f5

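This probably indicates a problem with the health of your cluster. Without knowing more about the cluster, there is not much more that can be said.

For me, tuning the thread pool search queue size fixed the issue. I tried a number of other things, and this is the one that solved it.

I added this to my elasticsearch.yml: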

threadpool.search.queue_size: 10000

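and then restarted Elasticsearch.

Reasoning... (from the docs)

A node holds several thread pools in order to improve how thread memory consumption is managed within a node. Many of these pools also have queues associated with them, which allow pending requests to be held instead of discarded.

And in particular for search:

For count/search operations. Defaults to fixed with a size of int((# of available_processors * 3) / 2) + 1, queue_size of 1000.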
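The sizing formula quoted above is easy to check with a little arithmetic. This is only a minimal Python sketch, assuming the formula applies exactly as quoted for the Elasticsearch version in use; the function name is made up for illustration.

```python
def default_search_pool_size(available_processors: int) -> int:
    """Default size of the fixed search thread pool, per the formula quoted above:
    int((available_processors * 3) / 2) + 1."""
    return (available_processors * 3) // 2 + 1


if __name__ == "__main__":
    # The processor counts below are arbitrary examples, not values from the question.
    for cpus in (2, 4, 8):
        print(cpus, "processors ->", default_search_pool_size(cpus), "search threads")
```

With 4 processors this gives 7 search threads. Those threads run shard-level searches, up to 1000 more wait in the queue, and anything beyond that is rejected, which is the EsRejectedExecutionException seen in the log.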

For more information, refer to the Elasticsearch docs.

I had trouble finding this information, so I hope it helps others!

I agree with @Philip, but at least on Elasticsearch >= 1.5.2 there is no need to restart Elasticsearch, because you can set threadpool.search.queue_size dynamically:

curl -XPUT http://your_es:9200/_cluster/settings -d '{
    "transient":{
        "threadpool.search.queue_size":10000
    }
}'
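The same request can be built from a script. The following is a rough Python sketch using only the standard library; the helper name is hypothetical and `http://your_es:9200` is the placeholder host from the command above. It only constructs the request and does not send it. As other posts in this thread point out, the setting is not dynamically updatable on Elasticsearch 5 and later, so this should only be expected to work on the older versions discussed here.

```python
import json
import urllib.request


def build_queue_size_request(base_url: str, queue_size: int) -> urllib.request.Request:
    """Build (but do not send) a PUT to _cluster/settings with a transient setting."""
    body = {"transient": {"threadpool.search.queue_size": queue_size}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    req = build_queue_size_request("http://your_es:9200", 10000)
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req)
```

A transient setting is lost on a full cluster restart, so the elasticsearch.yml entry is still needed if the value should persist.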

With Elasticsearch 5.4, thread_pool has an underscore:

thread_pool.search.queue_size: 10000

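See the documentation.

In Elasticsearch >= version 5 it is not possible to update the cluster setting thread_pool.search.queue_size through the _cluster/settings API. In my case, updating the Elasticsearch node yml file was not an option either, because if a node fails, the autoscaling code brings up replacement ES nodes with the default yml settings.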

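I had a cluster with 3 nodes, 400 active primary shards, and 7 active threads with a queue size of 1000. Increasing the number of nodes to 5, with a similar configuration, resolved the issue, because queries are distributed horizontally across more available nodes.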
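To see why adding nodes can help, here is a deliberately simplified back-of-the-envelope model in Python. It assumes every shard-level request arrives at the same instant and is spread evenly across nodes, and it ignores replicas and request timing; the figure of 8 concurrent searches is an invented example, not a number from this thread, and both function names are hypothetical.

```python
def per_node_load(total_shard_requests: int, nodes: int) -> int:
    """Worst-case share of shard-level requests per node, assuming an even spread."""
    return -(-total_shard_requests // nodes)  # ceiling division


def rejected_shard_requests(shard_requests: int, threads: int, queue_size: int) -> int:
    """Requests rejected on one node if they all arrive at once:
    `threads` run immediately, `queue_size` wait, the rest are rejected."""
    return max(0, shard_requests - (threads + queue_size))


if __name__ == "__main__":
    concurrent_searches = 8        # assumed for illustration
    shards_per_search = 400        # every search touches all 400 primary shards
    total = concurrent_searches * shards_per_search
    for nodes in (3, 5):
        load = per_node_load(total, nodes)
        lost = rejected_shard_requests(load, threads=7, queue_size=1000)
        print(f"{nodes} nodes: {load} requests per node, {lost} rejected")
```

Under these assumptions each of 3 nodes receives more requests than 7 threads plus a queue of 1000 can hold, while 5 nodes stay under the limit.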

I got this error when the query was missing a closing quote:

field:"value

In the Elasticsearch logs I saw the following exceptions:

Caused by: org.elasticsearch.index.query.QueryShardException:
    Failed to parse query [field:"value]
...
Caused by: org.apache.lucene.queryparser.classic.ParseException: 
    Cannot parse 'field:"value': Lexical error at line 1, column 13.  
    Encountered: <EOF> after : "\"value"
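A cheap client-side check can catch this particular mistake before the query reaches Elasticsearch. The sketch below is plain Python with a hypothetical helper name; it only counts unescaped double quotes and is not a full Lucene query-syntax validator.

```python
def has_balanced_quotes(query: str) -> bool:
    """Return True if the query contains an even number of unescaped double quotes."""
    count = 0
    escaped = False
    for ch in query:
        if escaped:
            escaped = False      # the character after a backslash is taken literally
        elif ch == "\\":
            escaped = True
        elif ch == '"':
            count += 1
    return count % 2 == 0


if __name__ == "__main__":
    for q in ('field:"value', 'field:"value"'):
        print(repr(q), "->", "ok" if has_balanced_quotes(q) else "unbalanced quotes")
```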

This does not work on Elasticsearch 5.6:

{
  "error": {
    "root_cause": [
      {
        "type": "remote_transport_exception",
        "reason": "[colmbmiscxx.xx][172.29.xx.xx:9300][cluster:admin/settings/update]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "transient setting [threadpool.search.queue_size], not dynamically updateable"
  },
  "status": 400
}

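Comments:

I don't know which details of the cluster would be useful for solving this. Any ideas? It is just a single node. I will add more details to the question.

You need to show the cluster status, the memory allocated to the cluster, the available file descriptors, the OS, and so on. Look at the Elasticsearch logs to see whether there are any obvious problems (out of memory, too many open files, etc.).

I added more details. Regarding the exceptions, I may need to increase some thread pool or change something in the yml file. Thanks for your help.

Also, on a single-node system it is pointless to have replicas, because those shards are never assigned, so you may want to update your index settings to have 0 replicas (this is a setting you can change). In addition, index.number_of_shards appears twice, which means the second value will be used (although that does not matter once the indices have already been created).

Thanks, I solved it using the following:

# Don't use all the processors
processors: 6
threadpool:
  get:
    type: fixed
    size: 30
    queue_size: 3000
  search:
    type: fixed
    size: 30
    queue_size: 3000
index.number_of_shards: 2
index.number_of_replicas: 0

With Elasticsearch >= version 5 this is not possible. You have to use the yaml config file.

Thanks, it worked for me. The config key is thread_pool.search.queue_size, not threadpool.search.queue_size.

Is this a question rather than an answer? Looking at the Kibana query you are using, it does not seem to be quoted properly: Failed to parse query [field:"value]. Can you give more details?

It is an answer; this error can be caused by a malformed query, not only by queue size and the like, as the other answers suggest.

Exactly. Any malformed query causes this warning to be printed in Kibana.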
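The point about replicas on a single node also explains the yellow status and the 61 unassigned shards in the question. The Python sketch below assumes the only constraint is that two copies of the same shard never share a node, and ignores every other allocation rule. The function names are hypothetical; the request body is the index settings update for number_of_replicas, shown here without sending it.

```python
import json


def unassigned_copies(primary_shards: int, replicas: int, data_nodes: int) -> int:
    """Shard copies that cannot be placed when each copy of a shard needs its own node."""
    copies_per_shard = 1 + replicas
    return primary_shards * max(0, copies_per_shard - data_nodes)


def zero_replicas_body() -> str:
    """JSON body for an index settings update that turns replicas off."""
    return json.dumps({"index": {"number_of_replicas": 0}})


if __name__ == "__main__":
    # Values from the cluster health output in the question: 61 primaries, 1 replica, 1 node.
    print("unassigned now:", unassigned_copies(61, replicas=1, data_nodes=1))
    print("unassigned with 0 replicas:", unassigned_copies(61, replicas=0, data_nodes=1))
    print("settings body:", zero_replicas_body())
```

Setting replicas to 0 should clear the unassigned shards and the yellow status, but it does not by itself address the rejected search requests.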