
Elasticsearch: querying a sub-property


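I want to copy an index, but skip a property when it matches a particular value. I found out how to exclude a property altogether, but I need something like this:

Exclude the "terms" property where the "source_terminology" sub-property is different from "%earthMaterials%".

Is this possible in Elasticsearch, or should I approach it differently?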

POST _reindex
{
  "source" : {
    "index" : "documents3",
    "_source":{
      "excludes": [
        "terms"   
      ]
    }
  },
  "dest" : {
    "index" : "documents4"
  }
} 



Here is a simplified version of my mapping:

{
  "documents4": {
    "mappings": {
      "doc": {
        "properties": {
          "abstract": {
            "type": "text"
          },
          "author": {
            "type": "text",
            "fields": {
              "raw": {
                "type": "keyword"
              }
            }
          },
          "terms": {
            "properties": {
              "source_terminology": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword"
                  }
                }
              },
              "uri": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
This is what my data looks like now:

      {
        "_index": "documents4",
        "_type": "doc",
        "_id": "6bf03d1e-f7dc-40c6-a32d-c9aa09e7b051",
        "_score": 1,
        "_source": {
          "terms": [
            {
              "source_terminology": "exploration-activity-type",
              "label": "feasibility study",
              "uri": "http://resource.geosciml.org/classifier/cgi/exploration-activity-type/feasibility-study"
            },
            {
              "source_terminology": "earthMaterialsAT",
              "label": "rock",
              "uri": "http://www.similarto.com/ontologies/lithology/2010/12/earthMaterialsAT#rock"
            }
          ],
          "title": "Miguel Auza Initial Prospectus"
        }
      }

You can add the required condition with a Painless script:

POST _reindex
{
  "source" : {
    "index" : "documents4"
  },
  "dest" : {
    "index" : "documents4-copy3"
  },
  "script": {
    "source": "int index = 0; def list = new ArrayList(); for(term in ctx._source.terms) { if(term.source_terminology =~ /^(?:(?!earthMaterials).)+$/) { list.add(0, index) } index++;} for(item in list) { ctx._source.terms.remove(item)}",
    "lang": "painless"
  }
} 
For this to work, you need to set script.painless.regex.enabled to true in the elasticsearch.yml file.

Formatted version of the Painless script:

int index = 0;
def list = new ArrayList();
for (term in ctx._source.terms) {
  if (term.source_terminology =~ /^(?:(?!earthMaterials).)+$/) {
    // Need to add matched index at start to avoid
    // index_out_of_bounds_exception when removing items later
    list.add(0, index)
    // If you try to remove item as soon as match is found,
    // you will get concurrent_modification_exception
  }
  index++;
}
for (item in list) {
  ctx._source.terms.remove(item)
}

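This answer is concise and definitely pointed me in the right direction. I also found that I will have to use split or StringTokenizer, since I'm using an AWS ES cluster. Thanks!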
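
Where regex cannot be enabled, as on the managed cluster mentioned in the comment above, a regex-free variant may be enough. The request below is only a sketch. It assumes that removeIf and String.contains are available to Painless in your Elasticsearch version, and it reuses the index names from the answer purely for illustration. It drops every terms entry whose source_terminology is missing or does not contain "earthMaterials".

```
POST _reindex
{
  "source" : {
    "index" : "documents4"
  },
  "dest" : {
    "index" : "documents4-copy3"
  },
  "script": {
    "source": "if (ctx._source.terms != null) { ctx._source.terms.removeIf(t -> t.source_terminology == null || !t.source_terminology.contains('earthMaterials')); }",
    "lang": "painless"
  }
}
```

Because removeIf filters the list in place, there is no need to collect indices first, which also avoids the concurrent modification issue noted in the formatted script. One small difference from the regex version: an entry with an empty source_terminology is removed here, whereas the regex does not match an empty string and so keeps it.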