Which data structure should I use in Python to retrieve dictionary entries based on multiple properties?

python data-structures

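I would like to know the most efficient data structure for representing the following. Essentially, I want to create a dictionary of lexical entries, each with specific grammatical properties. Each set of properties consists of attribute-value pairs.

Example:

dictionary = [
    {'lexeme': 'goes', 'person': '3', 'number': 'sg', 'tense': 'present'},
    {'lexeme': 'go', 'person': '3', 'number': 'pl', 'tense': 'present'},
    {'lexeme': 'went', 'person': '3', 'number': 'sg', 'tense': 'past'},
    ...
]

Now I want to be able to retrieve all dictionary entries that have a given set of properties, for example all entries with

person=3

or

tense=past

or

person=3 and tense=past

What is a suitable and efficient way to implement this in Python?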

You can use a list comprehension to get the matching entries:

[entry for entry in dictionary if entry['person'] == '3' and entry['tense'] == 'past']
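The comprehension above hard-codes the two properties. As a minimal sketch of how it could be generalized to any combination of properties, the helper below (find_entries is a hypothetical name, not part of the original answer) takes the wanted values as keyword arguments:

```python
def find_entries(entries, **criteria):
    """Return the entries whose properties match every key/value in criteria."""
    return [
        entry for entry in entries
        # entry.get avoids a KeyError if an entry lacks one of the properties
        if all(entry.get(key) == value for key, value in criteria.items())
    ]


dictionary = [
    {'lexeme': 'goes', 'person': '3', 'number': 'sg', 'tense': 'present'},
    {'lexeme': 'go', 'person': '3', 'number': 'pl', 'tense': 'present'},
    {'lexeme': 'went', 'person': '3', 'number': 'sg', 'tense': 'past'},
]

past_third = find_entries(dictionary, person='3', tense='past')
print([entry['lexeme'] for entry in past_third])  # prints ['went']
```

Each call still scans the whole list, so this is linear in the number of entries per query.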

Have you considered a pandas DataFrame? It is designed to store and manipulate tabular data efficiently.
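A minimal sketch of what that might look like, assuming pandas is installed and the entries are the list of dicts from the question (the variable names here are illustrative only):

```python
import pandas as pd

entries = [
    {'lexeme': 'goes', 'person': '3', 'number': 'sg', 'tense': 'present'},
    {'lexeme': 'go', 'person': '3', 'number': 'pl', 'tense': 'present'},
    {'lexeme': 'went', 'person': '3', 'number': 'sg', 'tense': 'past'},
]

# Each dict becomes a row; each property becomes a column.
df = pd.DataFrame(entries)

# Boolean masks combine with & (and) and | (or); the parentheses are required.
past_third = df[(df['person'] == '3') & (df['tense'] == 'past')]

# The same filter written as a query string.
same_result = df.query("person == '3' and tense == 'past'")

print(past_third['lexeme'].tolist())
```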

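I am not a Python person, so this may well be very easy to express in Python syntax. Here is how I would approach the problem.

From a data-structure point of view, you can create a structure to represent each entry in the dictionary: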

class Entry {
  String lexeme;
  Integer person;
  String number;
  String tense;
}

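Now, to run fast queries against this data structure, you can create a HashMap for each Entry property you want to search on. For example, create one map per queried property (the question filters on person and tense). Looking up a property value then gives you the matching Entry objects in O(1). The value part of each map is a list of all Entry objects with that property value; you can keep these lists in sorted order so that an intersection (the and operation) can be performed in linear time.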
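A rough Python sketch of this idea follows. It is not the answerer's code: it swaps the sorted lists for Python sets, whose intersection is also roughly linear in the size of the smaller set, and it stores person as a string to match the question's data. The names build_index and lookup are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    lexeme: str
    person: str
    number: str
    tense: str


def build_index(entries, fields):
    """Map each (field, value) pair to the positions of the entries that have it."""
    index = defaultdict(set)
    for position, entry in enumerate(entries):
        for field in fields:
            index[(field, getattr(entry, field))].add(position)
    return index


def lookup(entries, index, **criteria):
    """Intersect the position sets of every requested property value."""
    position_sets = [index.get((field, value), set())
                     for field, value in criteria.items()]
    if not position_sets:
        return list(entries)
    matching = set.intersection(*position_sets)
    return [entries[position] for position in sorted(matching)]


entries = [
    Entry('goes', '3', 'sg', 'present'),
    Entry('go', '3', 'pl', 'present'),
    Entry('went', '3', 'sg', 'past'),
]
index = build_index(entries, ('person', 'tense'))
print([e.lexeme for e in lookup(entries, index, person='3', tense='past')])
```

The index costs extra memory and must be rebuilt or updated when entries change, so it pays off mainly when the same collection is queried many times.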

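Storing a list of dictionaries seems like overkill for the data you describe. If every dictionary has the same structure, as in your description, there is no need for one dictionary per entry, because hashing the property names brings no benefit. I think a single dictionary keyed by lexeme will meet your needs: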

dictionary = {
    'goes':['3','sg','present'],
    'go':['3','pl','present'],
    'went':['3','sg','past'],
    ...
}

With a single dictionary, you can look up an individual word directly by its key.
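As a small illustration of such a lookup (a sketch using the three sample words; the position of each property in the list is a convention you have to remember):

```python
dictionary = {
    'goes': ['3', 'sg', 'present'],
    'go': ['3', 'pl', 'present'],
    'went': ['3', 'sg', 'past'],
}

# Direct key lookup; the list is unpacked in the order person, number, tense.
person, number, tense = dictionary['went']
print(person, number, tense)  # prints: 3 sg past

# dict.get returns None instead of raising KeyError for an unknown word.
missing = dictionary.get('gone')
```

One trade-off of this layout is that each lexeme can appear only once as a key, so a form with several grammatical readings would need a list of property lists as its value.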

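If you want to return all the entries that have a certain value, you still need to iterate over the dictionary with a for loop and an if statement:

tmp_list = []
for word in dictionary:
    # index 0 holds person and index 2 holds tense
    if dictionary[word][0] == '3' and dictionary[word][2] == 'past':
        tmp_list.append(word)

The and operator keeps a word only when both conditions hold; use or instead to keep words that satisfy either one. (The ^ operator is Python's bitwise XOR, not a logical and/or, and it binds more tightly than ==, so it does not work here.)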

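The itemgetter approach

The pandas solution is a good one. If you want a solution that uses only the Python standard library, one option is operator.itemgetter(). You pass itemgetter the dictionary keys you care about, and it returns a function that fetches those keys from a dictionary (it also works on lists with numeric indexes).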

This lets you handle any number of keys to match. For example, you can specify several sets of things to match:

from operator import itemgetter

match_keys = (
    'person',            # must not be a tuple if it's a single item
                         # itemgetter will return a single value and not a tuple
    ('person', 'tense')
)
match_values = (
    '3',
    ('3', 'past')
)

matches = []
for mk, mv in zip(match_keys, match_values):
    getter = itemgetter(*mk) if isinstance(mk, tuple) else itemgetter(mk)
    matches.extend(
        [row for row in dictionary if getter(row) == mv]
    )

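This approach will return duplicates. Ideally you would make the matches object a set rather than a list and update it inside the loop. Unfortunately, a dict is not hashable and so cannot be added to a set, which means this does not work directly.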
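One possible workaround, sketched here under the assumption that every property value is hashable (plain strings in the question's data), is to track a frozenset of each row's items in a separate set. The function name unique_matches and the queries layout are illustrative only.

```python
from operator import itemgetter


def unique_matches(rows, queries):
    """Collect rows matching any (keys, wanted_values) query, without duplicates."""
    seen = set()
    matches = []
    for keys, wanted in queries:
        getter = itemgetter(*keys)
        for row in rows:
            values = getter(row)
            if len(keys) == 1:
                values = (values,)  # itemgetter returns a bare value for one key
            fingerprint = frozenset(row.items())  # hashable stand-in for the dict
            if values == wanted and fingerprint not in seen:
                seen.add(fingerprint)
                matches.append(row)
    return matches


dictionary = [
    {'lexeme': 'goes', 'person': '3', 'number': 'sg', 'tense': 'present'},
    {'lexeme': 'go', 'person': '3', 'number': 'pl', 'tense': 'present'},
    {'lexeme': 'went', 'person': '3', 'number': 'sg', 'tense': 'past'},
]
queries = [
    (('person',), ('3',)),
    (('person', 'tense'), ('3', 'past')),
]
print([row['lexeme'] for row in unique_matches(dictionary, queries)])
```

Here every key set is always written as a tuple, which sidesteps the single-key special case at the call site and handles it once inside the function.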

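The attrgetter approach

Finally, if you have a list of class instances where the fields you care about are attributes of the objects, you can use operator.attrgetter in much the same way as above. You can then use a set to remove duplicates.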
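A minimal sketch of that variant, assuming the entries are modelled as a frozen dataclass so that instances are hashable and can live in a set (the Entry class here is an illustrative assumption, not something from the question):

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass(frozen=True)  # frozen plus the default eq=True makes instances hashable
class Entry:
    lexeme: str
    person: str
    number: str
    tense: str


entries = [
    Entry('goes', '3', 'sg', 'present'),
    Entry('go', '3', 'pl', 'present'),
    Entry('went', '3', 'sg', 'past'),
]

# With several names attrgetter returns a tuple; with one name, a bare value.
person_and_tense = attrgetter('person', 'tense')
person_only = attrgetter('person')

matches = {e for e in entries if person_and_tense(e) == ('3', 'past')}
matches |= {e for e in entries if person_only(e) == '3'}  # the set drops repeats

print(sorted(e.lexeme for e in matches))
```

Because sets are unordered, sort the result (as in the print call) if a stable order matters.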