Elasticsearch Python client reindex timed out
I am trying to reindex using the Elasticsearch Python client, using helpers.reindex. But I keep getting the following exception:
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeout
The stack trace of the error is:
Traceback (most recent call last):
  File "~/es_test.py", line 33, in <module>
    main()
  File "~/es_test.py", line 30, in main
    target_index='users-2')
  File "~/ENV/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 306, in reindex
    chunk_size=chunk_size, **kwargs)
  File "~/ENV/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 182, in bulk
    for ok, item in streaming_bulk(client, actions, **kwargs):
  File "~/ENV/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 124, in streaming_bulk
    raise e
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeout(HTTPSConnectionPool(host='myhost', port=9243): Read timed out. (read timeout=10))
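This may be caused by an OutOfMemoryError for Java heap space, which means you have not given Elasticsearch enough memory for the operation you are trying to perform.
Check the logs under /var/log/elasticsearch for any exception of that kind.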
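One way to run that check is sketched below in Python. It scans a log file for the text OutOfMemoryError and reports the matching lines. The function name find_oom_lines is made up for illustration, and the exact log file name under /var/log/elasticsearch depends on your installation, so treat the path as an assumption.

```python
def find_oom_lines(log_path, needle="OutOfMemoryError"):
    """Return (line_number, text) pairs for log lines that mention `needle`."""
    hits = []
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(log_path, "r", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if needle in line:
                hits.append((number, line.rstrip("\n")))
    return hits


# Hypothetical usage; the file name is an assumption and varies per cluster:
# for number, text in find_oom_lines("/var/log/elasticsearch/elasticsearch.log"):
#     print(number, text)
```

An empty result does not rule out memory pressure; it only means this particular file does not contain that message.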
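I was stuck on this problem for a few days. Changing the request_timeout parameter to 30 (i.e. 30 seconds) did not help.
In the end I had to edit the streaming_bulk and reindex APIs in elasticsearch.py,
changing the chunk_size parameter from its default of 500 (500 documents per batch) to a smaller number of documents per batch. I changed mine to 50, which works fine for me. No more read timeout errors.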
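def streaming_bulk(client, actions, chunk_size=50, raise_on_error=True,
                   expand_action_callback=expand_action, raise_on_exception=True,
                   **kwargs):

def reindex(client, source_index, target_index, query=None, target_client=None,
            chunk_size=50, scroll='5m', scan_kwargs={}, bulk_kwargs={}):

Can you show your Python code?
@Val I have included my code.
Can you try adding the parameter (maybe with value=100) to the reindex call?
Use chunk_size and you will be fine. I have been able to reindex millions of documents with a simple reindex call. Example: helpers.reindex(es, source_index=old_index, target_index=new_index, chunk_size=1000)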
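The same effect can probably be had without editing the installed library, since chunk_size is already a parameter of helpers.reindex. The sketch below is a rough illustration, not the library's recommended recipe. The names build_reindex_kwargs and reindex_in_small_batches are made up here. It also assumes that entries in bulk_kwargs are forwarded to each bulk request, so that request_timeout applies per batch; that matches the older client versions shown in the traceback, but check it against your version.

```python
def build_reindex_kwargs(chunk_size=100, request_timeout=60):
    """Collect keyword arguments for elasticsearch.helpers.reindex.

    chunk_size: documents sent per bulk request (the library default is 500).
    request_timeout: seconds to wait for each bulk request. Assumption: the
    value is forwarded through bulk_kwargs to the client's bulk() call.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return {
        "chunk_size": chunk_size,
        "bulk_kwargs": {"request_timeout": request_timeout},
    }


def reindex_in_small_batches(es, source_index, target_index, **options):
    """Call helpers.reindex with smaller batches instead of patching the library."""
    # Imported lazily so build_reindex_kwargs works without the client installed.
    from elasticsearch import helpers

    return helpers.reindex(
        es,
        source_index=source_index,
        target_index=target_index,
        **build_reindex_kwargs(**options)
    )
```

A call would look like reindex_in_small_batches(es, old_index, new_index, chunk_size=50). Note that the answer above reports that raising request_timeout alone did not help, so the smaller chunk_size is likely the setting that matters most.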
Where can I find elasticsearch.py?
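Going by the traceback in the question, streaming_bulk and reindex live in elasticsearch/helpers/__init__.py inside site-packages. If you want Python to tell you the exact path on your machine, something like the following should work. It is a small sketch that assumes Python 3 (importlib.util.find_spec), and module_file is a made-up helper name. On Python 2.7, importing elasticsearch.helpers and printing its __file__ attribute gives the same information.

```python
import importlib.util


def module_file(name):
    """Return the path of the file that defines module `name`, or None if it is not installed."""
    try:
        spec = importlib.util.find_spec(name)
    except ImportError:
        # Raised when the parent package of a dotted name is missing.
        return None
    return spec.origin if spec is not None else None


# Prints the location of the helpers module, or None if the client is not installed.
print(module_file("elasticsearch.helpers"))
```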
from elasticsearch import Elasticsearch, RequestsHttpConnection, helpers
from requests.auth import HTTPBasicAuth  # assumed source of HTTPBasicAuth used below

es = Elasticsearch(connection_class=RequestsHttpConnection,
                   host='myhost',
                   port=9243,
                   http_auth=HTTPBasicAuth(username, password),
                   use_ssl=True,
                   verify_certs=True,
                   timeout=600)
helpers.reindex(es, source_index=old_index, target_index=new_index)