Elasticsearch filter doesn't work on a value that begins with a slash

string search filter nosql

My database is full of documents like this:

{
  _index: "bla_bla",
  .
  .
  .
  _source: {
    domain: "somedomain.extension",
    path: "/you/know/the/path",
    lang: "en",
    keywords: ["yeah", "you", "rock", "dude", "help", "me", "good", "samaritan"]
  }
}

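Whenever I search, no matter what I'm looking for, it works like a charm. But if I try to filter on the field named path, it simply doesn't work, and no error or warning is raised. After some exhausting research, I suspect the cause is the slash at the beginning of the path. I may be right or wrong, but either way I need to filter like this: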

{
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "should": {
                        "terms": {
                            "keywords": ["stackoverflow", "rocks", "!"]
                        }
                    },
                    "must_not": {
                        "term": {
                            "path": "/"
                            // This works, i.e -> "lang": "en"
                        }
                    }
                }       
            }
        }
    },
    "from": 0,
    "size": 9
}

TL;DR: Given a database of URLs, how can I get only the non-root ones [path longer than "/"]?

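Disclaimer: I'm not an ES expert, but if I understand correctly, what you want is to exclude all documents whose path is only "/". Since you always store the data as /path, a string that is exactly 1 character long should always be "/", so why not use a regular expression?

I think something like this should work: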

{
    "query": {
        "filtered": {
            "filter": {
                "and": [
                    {
                        "bool": {
                            "should": {
                                "terms": {
                                    "keywords": [
                                        "stackoverflow",
                                        "rocks",
                                        "!"
                                    ]
                                }
                            }
                        }
                    },
                    {
                        "regexp": {
                            "path": ".{2,}"
                        }
                    }
                ]
            }
        }
    },
    "from": 0,
    "size": 9
}

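In ElasticSearch, text is split on many characters, including slashes. What you need to do is use a "not_analyzed" index. Here is a working example; note the index specification on the "path" field: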
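A minimal sketch of such a mapping, assuming Elasticsearch 1.x syntax (the generation that still has the "filtered" query used in the question). The type name "page" is hypothetical, since the question does not show the document type. The body would be sent when creating the index, for example with PUT /bla_bla:

```json
{
    "mappings": {
        "page": {
            "properties": {
                "path": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}
```

With "path" indexed as one unanalyzed term, the must_not term filter on "/" from the question is compared against the whole stored value, so it should exclude only the root paths. On an analyzed field the default analyzer would typically index "/you/know/the/path" as the separate tokens you, know, the, path, and a term filter on "/" would have nothing to match.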

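Can I easily change this parameter of the mapping? @jhilden

If you put an updated mapping, it will apply to all new data, but to apply it to existing data you need to reindex. There are many posts that can help with that task.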