Dumping bulk data into Elasticsearch using the Python API
I want to index the Shakespeare data set in Elasticsearch using its Python API, but I am getting the following error:
PUT http://localhost:9200/shakes/play/3 [status:400 request:0.098s]
{'error': {'root_cause': [{'type': 'mapper_parsing_exception', 'reason': 'failed to parse'}], 'type': 'mapper_parsing_exception', 'reason': 'failed to parse', 'caused_by': {'type': 'not_x_content_exception', 'reason': 'Compressor detection can only be called on some xcontent bytes or compressed xcontent bytes'}}, 'status': 400}
Python script:
from elasticsearch import Elasticsearch
from elasticsearch import TransportError
import json

data = []
for line in open('shakespeare.json', 'r'):
    data.append(json.loads(line))

es = Elasticsearch()
res = 0
cl = []
# filtering data which i need
for d in data:
    if res == 0:
        res = 1
        continue
    cl.append(data[res])
    res = 0

try:
    res = es.index(index = "shakes", doc_type = "play", id = 3, body = cl)
    print(res)
except TransportError as e:
    print(e.info)
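I also tried json.dumps, but I still get the same error. However, the code works when I add only a single element of the list to Elasticsearch.

You are not sending a bulk request to ES, only a simple create request. That method works with a dict that represents one new document, not with a list of documents. If you put an id on the create request, you need to make that value dynamic; otherwise every document is written to the same id and overwrites the previous one. If your JSON file has one record per line, you should try the following (and please read the bulk documentation):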
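import json
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch()
op_list = []
# Raw string so the backslashes in the Windows path are not read as escapes
with open(r"C:\ElasticSearch\shakespeare.json") as json_file:
    for record in json_file:
        # Parse each line so that _source is a dict rather than a raw string
        op_list.append({
            '_op_type': 'index',
            '_index': 'shakes',
            '_type': 'play',
            '_source': json.loads(record)
        })
helpers.bulk(client=es, actions=op_list)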
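
The two points above (one dict per document, and a distinct id for each) can be sketched without a running cluster. This is only an illustration under assumptions: `build_actions` and the sample lines are hypothetical, and it assumes the file alternates bulk metadata lines of the form {"index": {...}} with document lines, which is what the filtering loop in the question seems to be working around.

```python
import json


def build_actions(lines, index="shakes", doc_type="play"):
    """Yield one bulk action per document line, each with its own _id.

    Hypothetical helper: assumes metadata lines contain only an "index" key
    and should be skipped, and that every other non-empty line is a document.
    """
    doc_id = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        if set(record) == {"index"}:
            continue  # bulk metadata line, not a document
        yield {
            "_op_type": "index",
            "_index": index,
            "_type": doc_type,  # kept to match the question; assumes a client that still accepts types
            "_id": doc_id,      # dynamic id, so documents do not overwrite each other
            "_source": record,  # a dict, never a list
        }
        doc_id += 1


# Made-up sample lines standing in for the real file
sample = [
    '{"index": {"_index": "shakespeare", "_id": 0}}',
    '{"line_id": 1, "play_name": "Henry IV", "text_entry": "ACT I"}',
    '{"index": {"_index": "shakespeare", "_id": 1}}',
    '{"line_id": 2, "play_name": "Henry IV", "text_entry": "SCENE I"}',
]
actions = list(build_actions(sample))
```

Because helpers.bulk accepts any iterable of actions, the generator could be passed to it directly, for example helpers.bulk(client=es, actions=build_actions(json_file)) inside the with block, which avoids building the whole list in memory.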