Elasticsearch: boosting with function_score on a nested property

In Elasticsearch, given the following document structure:

"workhistory": {
  "positions": [{
    "company": "Some company",
    "position": "Some Job Title",
    "start": 1356998400,
    "end": 34546576576,
    "description": "",
    "source": [
       "some source", 
       "some other source"
    ]
  },
  {
    "company": "Some other company",
    "position": "Job Title",
    "start": 1356998400,
    "end": "",
    "description": "",
    "source": [
       "some other source"
    ]
  }]
}

and the following mapping for this structure:

  workhistory: {
    properties: {    
      positions: {
        type: "nested", 
        include_in_parent: true, 
        properties: {                 
          company: {
            type: "multi_field",
            fields: {
              company: {type: "string"},
              original: {type : "string", analyzer : "string_lowercase"} 
            }              
          }, 
          position: {
            type: "multi_field",
            fields: {
              position: {type: "string"},
              original: {type : "string", analyzer : "string_lowercase"} 
            }              
          }                                                       
        }
      }        
    }
  }

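I want to be able to search on "company" and match documents (for example, where company = "some company"), and I want to get the TF-IDF _score for that match. I also want to create a function_score query that boosts the score of this match based on the values in the "source" array. Basically, if source contains "some source", boost the score by some amount x. I can change the structure of the "source" property if needed.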

This is what I have so far:

{
   "bool": {
      "should": [
         {
            "filtered": {
               "query": {
                  "bool": {
                     "should": [
                        {
                           "bool": {
                              "should": [
                                 {
                                    "match": {
                                       "workhistory.positions.company.original": "some company"
                                    }
                                 }
                              ]
                           }
                        }
                     ],
                     "minimum_should_match": "100%"
                  }
               },
               "filter": {
                  "and": [
                     {
                        "bool": {
                           "should": [
                              {
                                 "term": {
                                    "workhistory.positions.company.original": "some company"
                                 }
                              }
                           ]
                        }
                     }
                  ]
               }
            }
         },
         {
            "function_score": {
               "query": {
                  "bool": {
                     "should": [
                        {
                           "bool": {
                              "should": [
                                 {
                                    "match": {
                                       "workhistory.positions.company.original": "some company"
                                    }
                                 }
                              ]
                           }
                        }
                     ],
                     "minimum_should_match": "100%"
                  }
               },
               "filter": {
                  "and": [
                     {
                        "bool": {
                           "should": [
                              {
                                 "term": {
                                    "workhistory.positions.company.original": "some company"
                                 }
                              }
                           ]
                        }
                     }
                  ]
               }
            }
         }
      ]
   }
}

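There are also some filters in there, because I only want documents with the filtered values to be returned. In this example the filter and the query are essentially the same, but in the larger version of this query I have other "optional" matches that boost optional values, and so on. The function_score part doesn't do much at the moment, because I don't really know how to use it. The goal is to be able to adjust the boost amounts in my application code and pass them into the search query.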

I am using Elasticsearch 1.3.4.

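Honestly, I don't know why you are repeating the filters and queries there. Maybe I'm missing something, but based on your description I believe all you need is a function_score query. From the documentation: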

The function_score allows you to modify the score of documents that are retrieved by a query.

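So you define a query (for example, one that matches the company name) and then a list of functions that should boost the _score of some subset of the matching documents. From the same documentation: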

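Furthermore, several functions can be combined. In this case one can optionally choose to apply the function only if a document matches a given filter.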

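So you can use a query to find companies with a certain name, and then use a function's filter to change the _score of the documents that match that filter. In this case, your filter is that "source" should contain a certain value. The function itself is a script: _score + 2. In the end, this is what I have in mind: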

    {
      "query": {
        "bool": {
          "should": [
            {
              "function_score": {
                "query": {
                  "bool": {
                    "should": [
                      {
                        "bool": {
                          "should": [
                            {
                              "match": {
                                "workhistory.positions.company.original": "some company"
                              }
                            }
                          ]
                        }
                      }
                    ],
                    "minimum_should_match": "100%"
                  }
                },
                "functions": [
                  {
                    "filter": {
                      "nested": {
                        "path": "workhistory.positions",
                        "query": {
                          "bool": {
                            "should": [
                              {
                                "match": {
                                  "workhistory.positions.source": "some source"
                                }
                              }
                            ]
                          }
                        }
                      }
                    },
                    "script_score": {
                      "script": "_score + 2"
                    }
                  },
                  {
                    "filter": {
                      "nested": {
                        "path": "workhistory.positions",
                        "query": {
                          "bool": {
                            "should": [
                              {
                                "match": {
                                  "workhistory.positions.source": "xxx"
                                }
                              }
                            ]
                          }
                        }
                      }
                    },
                    "script_score": {
                      "script": "_score + 4"
                    }
                  }
                ],
                "max_boost": 5,
                "score_mode": "sum",
                "boost_mode": "sum"
              }
            }
          ]
        }
      }
    }
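
Since the stated goal is to tune the boost amounts from application code, the request body above could be generated rather than written by hand. The following is only a minimal sketch in Python: the helper name `build_boosted_query` and the `source_boosts` mapping are made up for illustration, the redundant bool/should wrappers are collapsed into a plain match query, and it keeps the inline `_score + N` scripts from the answer (which assumes dynamic scripting is enabled; with script files the `script_score` part would look different).

```python
import json


def build_boosted_query(company, source_boosts, max_boost=5):
    """Build a function_score request body with one function per source.

    `source_boosts` (a hypothetical name) maps a source value to the
    amount that the function's script adds to _score.
    """
    functions = []
    for source, boost in source_boosts.items():
        functions.append({
            # Apply this function only to documents that have at least
            # one nested position whose "source" matches.
            "filter": {
                "nested": {
                    "path": "workhistory.positions",
                    "query": {
                        "match": {"workhistory.positions.source": source}
                    },
                }
            },
            # Same inline script form as in the answer above.
            "script_score": {"script": "_score + {}".format(boost)},
        })
    return {
        "query": {
            "function_score": {
                "query": {
                    "match": {
                        "workhistory.positions.company.original": company
                    }
                },
                "functions": functions,
                "max_boost": max_boost,
                "score_mode": "sum",
                "boost_mode": "sum",
            }
        }
    }


body = build_boosted_query("some company", {"some source": 2, "xxx": 4})
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the search request body; changing a boost then only means changing a number in the mapping passed to the helper.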

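This seems to work. I moved the script into a script file in the scripts folder so that I don't have to enable dynamic scripting. I'm almost there, but I need to score on the regular query first and then apply this query to boost the score with my function_score query, so that the results are first ranked by the matches in the document and then boosted by this function. I also don't know how to apply a different boost if the source has some other name or doesn't exist at all. Will this match the same nested element, or any element in that array?

A different boost for another "source" value is simple: you just define another

{"filter": {}, "script_score": {}}

block. I've updated my answer. I don't understand this sentence, though: "I need to score on the regular query first and then apply this query to boost the score with my function_score query, so that the results are first ranked by the matches in the document and then boosted by this function."

I see, great! Suppose I have 10 different sources: do I then write one function per source? Would that have a big impact on performance? Also, do these sources match the same element of the nested collection/array as the actual query, or just any hit in that collection?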
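
On the same-element question: as far as I understand it, each function's nested filter in the answer is evaluated on its own, so it is satisfied by any element of the positions array, not necessarily the one whose company matched. One possible way to tie the two together is to move the function_score inside a single nested query, so the company match and the source filters are evaluated against the same nested position. The sketch below is untested against 1.3.4: the helper name `build_same_element_query` is made up, the choice of `"score_mode": "max"` on the nested query is an assumption, and it reuses the inline `_score + N` scripts from the answer.

```python
import json


def build_same_element_query(company, source_boosts):
    """Build a nested query whose inner function_score scores each position.

    `source_boosts` (a hypothetical name) maps a source value to the
    amount added to the score of a position carrying that source.
    """
    functions = [
        {
            # Inside the nested query this filter is checked per position.
            # A query filter with match is used because "source" is not in
            # the mapping shown and is presumably an analyzed string.
            "filter": {
                "query": {"match": {"workhistory.positions.source": source}}
            },
            "script_score": {"script": "_score + {}".format(boost)},
        }
        for source, boost in source_boosts.items()
    ]
    return {
        "query": {
            "nested": {
                "path": "workhistory.positions",
                # Assumption: score the document by its best position.
                "score_mode": "max",
                "query": {
                    "function_score": {
                        "query": {
                            "match": {
                                "workhistory.positions.company.original": company
                            }
                        },
                        "functions": functions,
                        "score_mode": "sum",
                        "boost_mode": "sum",
                    }
                },
            }
        }
    }


nested_body = build_same_element_query("some company", {"some source": 2})
print(json.dumps(nested_body, indent=2))
```

With ten sources this still produces ten functions, one per entry in the mapping; whether that is noticeable performance-wise would have to be measured on the actual index.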