
Indexing attachment files into Elasticsearch


I ran these commands to index a document in Elasticsearch.

Create the index:

curl -X PUT "localhost:9200/test_idx_1x"
Create the mapping:

curl -X PUT "localhost:9200/test_idx_1x/test_mapping_1x/_mapping" -d '{
  "test_mapping_1x": {
    "properties": {
      "my_attachments": {
        "type": "attachment"
      }
    }
  }
}'
Index a document:

curl -XPUT 'http://localhost:9200/test_idx_1x/test_mapping_1x/4' -d '{
  "post_date": "2009-11-15T14:12:12",
  "message": "test Elastic Search",
  "name": "N1"
}'
These three commands work fine. But when I run this command:

curl -XPOST 'http://localhost:9200/test_idx_1x/test_mapping_1x/1' -d '{
  "post_date": "2009-11-15T14:12:12",
  "message": "trying out Elastic Search",
  "name": "N2",
  "my_attachments": {
    "type": "attachment",
    "_content_type": "text/plain",
    "file": "http://localhost:5984/my_test_couch_db_7/ID2/test.txt"
  }
}'
I get this error message:

{
  "error": "NullPointerException[null]",
  "status": 500
}
{
  "error": "MapperParsingException[Failed to parse]; nested: JsonParseException[Unexpected character ('h' (code 104)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')\n at [Source: [B@1ae9565; line: 1, column: 132]]; ",
  "status": 400
}
I changed it to:

curl -XPOST 'http://localhost:9200/test_idx_1x/test_mapping_1x/1bis' -d '{
  "post_date": "2009-11-15T14:12:12",
  "message": "trying out Elastic Search",
  "name": "N2",
  "my_attachments": {
    "type": "attachment",
    "_content_type": "text/plain",
    "_name": "/inf/bd/my_home_directory/test.txt"
  }
}'

curl -XPUT 'http://localhost:9200/test_idx_1x/test_mapping_1x/1' -d '{
  "post_date": "2009-11-15T14:12:12",
  "message": "trying out Elastic Search",
  "name": "N2",
  "my_attachments": {
    "file": "http://localhost:5984/my_test_couch_db_7/ID2/test.txt"
  }
}'

curl -XPUT 'http://localhost:9200/test_idx_1x/test_mapping_1x/1' -d '{
  "post_date": "2009-11-15T14:12:12",
  "message": "trying out Elastic Search",
  "name": "N2",
  "my_attachments": {
    "file": "http://localhost:5984/my_test_couch_db_7/ID2/test.txt",
    "_content_type": "text/plain"
  }
}'
The output is the same error.

Then I changed it like this:

curl -XPUT 'http://localhost:9200/test_idx_1x/test_mapping_1x/1' -d '{
  "user": "kimchy",
  "post_date": "2009-11-15T14:12:12",
  "message": "trying out Elastic Search",
  "name": "N2",
  "my_attachments": {
    "file": "http://localhost:5984/my_test_couch_db_7/ID2/test.txt",
    "_content_type": "text/plain",
    "content": "... base64 encoded attachment ..."
  }
}'
The error is:

{
  "error": "MapperParsingException[Failed to parse]; nested: JsonParseException[Failed to decode VALUE_STRING as base64 (MIME-NO-LINEFEEDS): Illegal character '.' (code 0x2e) in base64 content\n at [Source: [B@159b3; line: 1, column: 241]]; ",
  "status": 400
}

curl -XPUT 'http://localhost:9200/test_idx_1x/test_mapping_1x/1' -d '{
  "post_date": "2009-11-15T14:12:12",
  "message": "trying out Elastic Search",
  "name": "N2",
  "my_attachments": "http://localhost:5984/my_test_couch_db_7/ID2/test.txt"
}'
I get this error message:

{
  "error": "NullPointerException[null]",
  "status": 500
}
{
  "error": "MapperParsingException[Failed to parse]; nested: JsonParseException[Unexpected character ('h' (code 104)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')\n at [Source: [B@1ae9565; line: 1, column: 132]]; ",
  "status": 400
}
If I run:

curl -XPUT 'http://localhost:9200/test_idx_1x/test_mapping_1x/1' -d '{
  "post_date": "2009-11-15T14:12:12",
  "message": "trying out Elastic Search",
  "name": "N2",
  "my_attachments": "http://localhost:5984/my_test_couch_db_7/ID2/test.txt"
}'
I get an error, which I can understand:

{
  "error": "MapperParsingException[Failed to parse]; nested: JsonParseException[Failed to decode VALUE_STRING as base64 (MIME-NO-LINEFEEDS): Illegal character ':' (code 0x3a) in base64 content\n at [Source: [B@1ffb7d4; line: 1, column: 137]]; ",
  "status": 400
}
How can I attach a file to ES so that ES can index it?


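Thank you for your answer. The attachment plugin was already installed when I ran these commands. The content of the text file is already encoded in Base64, so I do not encode it again. If I do not use the path of the file but pass its Base64 content directly, for example: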

curl -XPUT 'http://localhost:9200/test_idx_1x/test_mapping_1x/' -d '{
  "post_date": "2009-11-15T14:12:12",
  "message": "trying out Elastic Search",
  "name": "N2",
  "my_attachments": "file's content string encoded in base64"
}'
everything works: I can post the file and then search its content.


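But if I replace it with the path of the file, it fails. So I would like to know how to Base64-encode the file on the command line as part of the ES indexing command itself (I do not want to run a separate base64 command to encode the file before running a second command to index it in ES). Regarding your answer, do I have to install something like a "Perl library" to run your command?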

First, you don't say whether you have the attachment plugin installed. If not, you can install it with:

./bin/plugin -install mapper-attachments
You need to restart Elasticsearch to load the plugin.

Then, as you did above, map a field with the type attachment:

curl -XPUT 'http://127.0.0.1:9200/foo/?pretty=1'  -d '
{
   "mappings" : {
      "doc" : {
         "properties" : {
            "file" : {
               "type" : "attachment"
            }
         }
      }
   }
}
'
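When you index a document, you need to encode the file contents as Base64. You can do this on the command line with the base64 command-line utility. However, to be legal JSON, the newlines also need to be encoded, which you can do by piping the output of base64 through Perl: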

curl -XPOST 'http://127.0.0.1:9200/foo/doc?pretty=1'  -d '
{
   "file" : "'"$(base64 /path/to/file | perl -pe 's/\n/\\n/g')"'"
}
'
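If Perl is not available, the same payload can probably be built with standard tools only. The sketch below assumes a `base64` utility that reads from standard input and uses `tr` to drop the line breaks instead of escaping them, so the Base64 text becomes one JSON-safe line. The function name `attachment_json` is made up for illustration; the field name `file` is the one from the mapping above.

```shell
#!/bin/sh
# attachment_json is a hypothetical helper, not part of Elasticsearch or curl.
# It prints a JSON document whose "file" field holds the Base64 content of $1.
attachment_json() {
    # Some base64 implementations wrap their output; tr removes the newlines.
    # The Base64 alphabet contains no characters that need JSON escaping.
    encoded=$(base64 < "$1" | tr -d '\n')
    printf '{ "file" : "%s" }\n' "$encoded"
}

# Possible usage (needs a running node, so it is left commented out):
# attachment_json /path/to/file | curl -XPOST 'http://127.0.0.1:9200/foo/doc?pretty=1' -d @-
```

Piping the body to `curl -d @-` keeps the encoding and the indexing request in a single command line.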
Now you can search your file:

curl -XGET 'http://127.0.0.1:9200/foo/doc/_search?pretty=1'  -d '
{
   "query" : {
      "text" : {
         "file" : "text to look for"
      }
   }
}
'

See the attachment type documentation for more details.

Here is a complete shell script implementation:

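file_path='/path/to/file'
# Base64-encode the file and turn real newlines into literal \n so the JSON stays valid.
file=$(base64 "$file_path" | perl -pe 's/\n/\\n/g')
# POST without an ID lets Elasticsearch generate the document ID.
curl -XPOST "http://eshost.com:9200/index/type/" -d '{
    "file" : { "content" : "'"$file"'" }
}'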


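There is an alternative solution: the ElasticWarehouse plugin. You can upload a binary file with _ewupload?, read the newly generated ID, and update a different index with this reference.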

Install the plugin:

plugin -install elasticwarehouseplugin -u http://elasticwarehouse.org/elasticwarehouse/elasticsearch-elasticwarehouseplugin-1.2.2-1.7.0-with-dependencies.zip
Restart the cluster, then:

curl -XPOST "http://127.0.0.1:9200/_ewupload?folder=/myfolder&filename=mybinaryfile.bin" --data-binary @mybinaryfile.bin
Sample response:

{"id":"nWvrczBcSEywHRBBBwfy2g","version":1,"created":true}
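Reading the newly generated ID out of such a response could look roughly like the sketch below. It assumes the response is a single line shaped like the sample above and uses only `sed`; the helper name `extract_id` is made up for illustration. A real JSON parser would be more robust if one is installed.

```shell
#!/bin/sh
# extract_id is a hypothetical helper: it reads a one-line JSON response on
# stdin and prints the value of its "id" field.
extract_id() {
    sed -n 's/.*"id":"\([^"]*\)".*/\1/p'
}

# The sample response shown above, used here as test input.
response='{"id":"nWvrczBcSEywHRBBBwfy2g","version":1,"created":true}'
printf '%s\n' "$response" | extract_id   # prints nWvrczBcSEywHRBBBwfy2g
```

The printed ID is the reference you would then store in a document of your other index.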

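Yes, you can. Use a Base64 encoder in whatever programming language you are using. I used the command-line version because your example used curl on the command line.

When I typed the command: curl -XPOST '' {"post_date": "2009-11-15T14:12:12", "message": "trying out Elastic Search", "name": "N2", "my_attachments": `base64 /inf/bd/tran_chi/test4.txt | perl -pe 's/\n/\\n/g'` }' I received this message: {"error":"MapperParsingException[Failed to parse]; nested: JsonParseException[Unexpected character ('T' (code 84)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')\n at [Source: [B@8aeedc; line: 1, column: 134]]; ","status":400} At the moment, I always have to run the base64 command first to encode the file content. Then I run a second command to post this Base64-encoded content into the my_attachments field in Elasticsearch. I don't know how to index an attachment in ES with a single command on the command line. Thank you anyway.

I got a
-bash: /usr/bin/curl: Argument list too long
error when trying to run the post command.

The curl command treats the text in single quotes as two different arguments.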
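The "Argument list too long" error most likely appears because the whole Base64 string is passed as one command-line argument, and the operating system limits the size of arguments. One way around it, sketched below, is to stream the request body to curl on standard input so the encoded file never appears on the command line. The function name `index_attachment` is made up for illustration, and the field name `file` is taken from the answer above.

```shell
#!/bin/sh
# index_attachment is a hypothetical helper.
#   $1: document URL, e.g. http://127.0.0.1:9200/foo/doc
#   $2: path of the file to index
index_attachment() {
    url=$1
    path=$2
    {
        # Build the JSON body piece by piece on stdout.
        printf '{ "file" : "'
        base64 < "$path" | tr -d '\n'   # single-line Base64, no escaping needed
        printf '" }'
    } | curl -sS -XPOST "$url" --data-binary @-   # @- reads the body from stdin
}

# Possible usage (needs a running node, so it is left commented out):
# index_attachment 'http://127.0.0.1:9200/foo/doc?pretty=1' /path/to/file
```

Because the body travels through a pipe, the size of the file no longer affects the length of the argument list.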