
Lucene: querying for a missing field in nested documents


I have a user document that contains many tags.
Here is the mapping:

{
  "user" : {
    "properties" : {
      "tags" : {
        "type" : "nested",
        "properties" : {
          "id" : {
            "type" : "string",
            "index" : "not_analyzed",
            "store" : "yes"
          },
          "current" : {
            "type" : "boolean"
          },
          "type" : {
            "type" : "string"
          },
          "value" : {
            "type" : "multi_field",
            "fields" : {
              "value" : {
                "type" : "string",
                "analyzer" : "name_analyzer"
              },
              "value_untouched" : {
                "type" : "string",
                "index" : "not_analyzed",
                "include_in_all" : false
              }
            }
          }
        }
      }
    }
  }
}
Here are the sample user documents:
User 1:

User 2:

{
  "created_at": 1318513355000,
  "updated_at": 1364888695000,
  "tags": [
    {
      "type": "college",
      "value": "Dhirubhai Ambani Institute of Information and Communication Technology",
      "id": "a6f51ef8b34eb8f24d1c5be5e4ff509e2a361829"
    },
    {
      "type": "college",
      "value": "Bharatiya Vidya Bhavan's Public School, Jubilee hills, Hyderabad",
      "id": "d20730345465a974dc61f2132eb72b04e2f5330c"
    },
    {
      "type": "company",
      "value": "Alma Connect",
      "id": "93bc8199c5fe7adfd181d59e7182c73fec74eab5"
    },
    {
      "type": "sector",
      "value": "Website and Software Development",
      "id": "dc387d78fc99ab43e6ae2b83562c85cf3503a8a4"
    }    
  ]
}
User 3:

{
  "created_at": 1318513355001,
  "updated_at": 1364888695010,
  "tags": [
    {
      "type": "college",
      "value": "Dhirubhai Ambani Institute of Information and Communication Technology",
      "id": "a6f51ef8b34eb8f24d1c5be5e4ff509e2a361821"
    },
    {
      "type": "sector",
      "value": "Website and Software Development",
      "id": "dc387d78fc99ab43e6ae2b83562c85cf3503a8a1"
    }    
  ]
}
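Searching over the ES documents above, I want to construct a query that returns users who either have a specific company tag among their nested tag documents or have no company tag at all. What should my search query be?

For example, in the case above, if I search for the google tag, the returned documents should be "User 1" and "User 3" (User 1 has the company tag google, and User 3 has no company tag). User 2 is not returned because it has a company tag other than google.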

This is not trivial at all, mainly because of the "does not have a type:company tag" clause. Here is what I came up with:

{
  "or" : {
    "filters" : [ {
      "nested" : {
        "filter" : {
          "and" : {
            "filters" : [ {
              "term" : {
                "tags.value" : "google"
              }
            }, {
              "term" : {
                "tags.type" : "company"
              }
            } ]
          }
        },
        "path" : "tags"
      }
    }, {
      "not" : {
        "filter" : {
          "nested" : {
            "filter" : {
              "term" : {
                "tags.type" : "company"
              }
            },
            "path" : "tags"
          }
        }
      }
    } ]
  }
}
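It is an or filter that contains two nested clauses: the first finds the documents that have a tag with tags.type:company and tags.value:google, while the second finds all the documents that do not have any tag with tags.type:company.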

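This needs optimizing, though: and/or/not filters do not take advantage of the caching available to filters that work with bitsets, such as the term filter. It would be better to spend some more time finding a way to express this with a bool filter and obtain the same result.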
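As a rough sketch of that optimization, the same logic could be written with bool filters in place of and/or/not. This is an untested rewrite, not part of the original answer. It assumes an Elasticsearch version that still offers the filter DSL with bool, nested, and term filters, which the string / multi_field mapping above suggests. Here should stands in for or, must for and, and must_not for not:

```json
{
  "bool" : {
    "should" : [ {
      "nested" : {
        "path" : "tags",
        "filter" : {
          "bool" : {
            "must" : [ {
              "term" : {
                "tags.value" : "google"
              }
            }, {
              "term" : {
                "tags.type" : "company"
              }
            } ]
          }
        }
      }
    }, {
      "bool" : {
        "must_not" : {
          "nested" : {
            "path" : "tags",
            "filter" : {
              "term" : {
                "tags.type" : "company"
              }
            }
          }
        }
      }
    } ]
  }
}
```

Like the or version, this matches a user who has a google company tag or has no company tag at all. It is worth running both filters against the sample documents and comparing the hits before switching.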
