Python: How do I index a list of objects in Elasticsearch?


The documents I ingest into ElasticSearch have the following format:

{
   'id':'514d4e9f-09e7-4f13-b6c9-a0aa9b4f37a0',
   'created':'2019-09-06 06:09:33.044433',
   'meta':{
      'userTags':[
         {
            'intensity':'1',
            'sentiment':'0.84',
            'keyword':'train'
         },
         {
            'intensity':'1',
            'sentiment':'-0.76',
            'keyword':'amtrak'
         }
      ]
   }
}

...and they are ingested from Python like this:

r = requests.put(itemUrl, auth = authObj, json = document, headers = headers)

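The idea is that ElasticSearch treats keyword, intensity and sentiment as fields that can be queried later. On the ElasticSearch side, however, I can see that this is not happening (I use Kibana as the search UI). Instead, I see a single field "meta.userTags" whose value is the entire list of objects.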

How can I make ElasticSearch index the elements inside the list?

I created a new index "testind" with the type "testTyp" from the document body you provided, using the Postman REST client:

POST http://localhost:9200/testind/testTyp
{
   "id":"514d4e9f-09e7-4f13-b6c9-a0aa9b4f37a0",
   "created":"2019-09-06 06:09:33.044433",
   "meta":{
      "userTags":[
         {
            "intensity":"1",
            "sentiment":"0.84",
            "keyword":"train"
         },
         {
            "intensity":"1",
            "sentiment":"-0.76",
            "keyword":"amtrak"
         }
      ]
   }
}

When I query the mapping of the index, this is what I get:

GET http://localhost:9200/testind/testTyp/_mapping
{  
  "testind":{  
    "mappings":{  
      "testTyp":{  
        "properties":{  
          "created":{  
            "type":"text",
            "fields":{  
             "keyword":{  
                "type":"keyword",
                "ignore_above":256
              }
            }
          },
          "id":{  
            "type":"text",
            "fields":{  
              "keyword":{  
                "type":"keyword",
                "ignore_above":256
              }
            }
          },
          "meta":{  
            "properties":{  
              "userTags":{  
                "properties":{  
                  "intensity":{  
                    "type":"text",
                    "fields":{  
                      "keyword":{  
                        "type":"keyword",
                        "ignore_above":256
                      }
                    }
                  },
                  "keyword":{  
                    "type":"text",
                    "fields":{  
                      "keyword":{  
                        "type":"keyword",
                        "ignore_above":256
                      }
                    }
                  },
                  "sentiment":{  
                    "type":"text",
                    "fields":{  
                      "keyword":{  
                        "type":"keyword",
                        "ignore_above":256
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

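As you can see in the mapping, the fields are part of the mapping and can be queried as needed in the future, so I don't see a problem here as long as the field names are not reserved ones. (You may want to avoid "keyword" as a field name, since it can be confusing when writing search queries: the field name and the type are then both "keyword".) Also note that this mapping was created by Elasticsearch's dynamic mapping, so the data types were determined by Elasticsearch from the values you provided. That guess is not always accurate: here every field, including the numeric-looking intensity and sentiment, was mapped as text. To avoid this, define your own mapping for the index with the PUT _mapping API, and then prevent new fields in the type from being added to the mapping.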
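A minimal sketch of what such an explicit mapping could look like, written as a Python dict in the typed 6.x style used above (Elasticsearch 7 and later drop the "testTyp" level). The field types chosen here (integer, float, keyword) and the "dynamic": "strict" setting are assumptions about how the data should be treated, not something the question specifies, and build_mapping is a hypothetical helper name.

```python
import json


def build_mapping(doc_type="testTyp"):
    """Return an index-creation body with an explicit mapping for the sample document."""
    user_tag_fields = {
        "intensity": {"type": "integer"},
        "sentiment": {"type": "float"},
        "keyword": {"type": "keyword"},
    }
    properties = {
        "id": {"type": "keyword"},
        # Kept as keyword in this sketch; a date type with a matching format may suit better.
        "created": {"type": "keyword"},
        "meta": {"properties": {"userTags": {"properties": user_tag_fields}}},
    }
    # "dynamic": "strict" makes Elasticsearch reject documents that contain unmapped fields.
    return {"mappings": {doc_type: {"dynamic": "strict", "properties": properties}}}


mapping_body = build_mapping()
print(json.dumps(mapping_body, indent=2))

# To apply it when creating the index (assumes a local node, as in the example above):
# requests.put("http://localhost:9200/testind", json=mapping_body)
```

With numeric field types, string values such as "1" and "0.84" from the sample document should be coerced to numbers at index time, since coercion is enabled by default.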


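You do not need a special mapping to index a list: every field can contain one or more values of the same type.

A list of objects can be indexed as either the object or the nested datatype. By default, Elasticsearch uses the object datatype. In that case you can query meta.userTags.keyword and/or meta.userTags.sentiment. The results will always contain whole documents whose values match independently of one another, i.e. searching for keyword=train and sentiment=-0.76 will find the document with id=514d4e9f-09e7-4f13-b6c9-a0aa9b4f37a0, even though those two values belong to different tags.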
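The cross-object matching can be illustrated without a cluster. The sketch below is only a rough pure-Python model of how the object datatype flattens a list of objects into multi-valued fields. It is not Elasticsearch code, and flatten is a hypothetical helper written for this illustration.

```python
from collections import defaultdict


def flatten(doc, prefix=""):
    """Map each dotted field path to the set of all values found under it,
    roughly the way the object datatype flattens arrays of objects."""
    fields = defaultdict(set)
    for key, value in doc.items():
        path = prefix + key
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                # Merge the values of every inner object into the same field paths.
                for sub_path, sub_values in flatten(item, path + ".").items():
                    fields[sub_path] |= sub_values
            else:
                fields[path].add(item)
    return fields


doc = {
    "id": "514d4e9f-09e7-4f13-b6c9-a0aa9b4f37a0",
    "meta": {"userTags": [
        {"intensity": "1", "sentiment": "0.84", "keyword": "train"},
        {"intensity": "1", "sentiment": "-0.76", "keyword": "amtrak"},
    ]},
}

flat = flatten(doc)
# Both conditions hold, although no single tag has keyword=train AND sentiment=-0.76.
matches = ("train" in flat["meta.userTags.keyword"]
           and "-0.76" in flat["meta.userTags.sentiment"])
print(matches)  # True
```

Once the objects are flattened, the link between a keyword and its own sentiment is lost, which is why the combined search matches the document.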

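If this is not what you want, you need to define a mapping for the userTags field that uses the nested datatype, and search it with a nested query.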
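As a hedged sketch of that approach, the following builds a nested mapping fragment and a matching nested query as Python dicts. The field types, the range condition on sentiment, and the helper name nested_tag_query are assumptions made for illustration; only the "nested" type and the "nested" query with its "path" are the parts the answer refers to.

```python
import json

# Mapping fragment: declaring userTags as nested indexes each tag as its own unit.
nested_mapping = {
    "properties": {
        "meta": {
            "properties": {
                "userTags": {
                    "type": "nested",
                    "properties": {
                        "intensity": {"type": "integer"},
                        "sentiment": {"type": "float"},
                        "keyword": {"type": "keyword"},
                    },
                }
            }
        }
    }
}


def nested_tag_query(keyword, min_sentiment):
    """Build a search body whose conditions must all hold within one and the same tag."""
    return {
        "query": {
            "nested": {
                "path": "meta.userTags",
                "query": {
                    "bool": {
                        "must": [
                            {"term": {"meta.userTags.keyword": keyword}},
                            {"range": {"meta.userTags.sentiment": {"gte": min_sentiment}}},
                        ]
                    }
                },
            }
        }
    }


query_body = nested_tag_query("train", 0.5)
print(json.dumps(query_body, indent=2))
```

The resulting body would be sent to the index's _search endpoint. Under this mapping, the sample document should match for "train" (sentiment 0.84) but not for "amtrak" with the same threshold, because both conditions are now evaluated per tag.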
