Python: Inverting a dictionary when some of the original values are identical


python dictionary


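Suppose I have a dictionary called word_counter_dictionary that counts how many times each word appears in a document, in the form {'word': number}. For example, the word "secondary" appears once, so its key/value pair is {'secondary': 1}. I want to build an inverted version of it in which the numbers become the keys and the words become the values of those keys, so that I can then graph the top 25 most used words. I can see where setdefault() might come in handy, but I can't use it, because so far in the class I'm taking we have only covered get().

My current approach works fine until it reaches another word with the same value. For example, the word "saves" also appears once in the document, so Python adds the new key/value pair. But that overwrites {1: 'secondary'} with the new pair, so the dictionary only contains {1: 'saves'}.

So, bottom line, my goal is to get ALL of the words and their respective number of repetitions into this new dictionary called inversed_dictionary.

Python dicts do not allow repeated keys, so you can't use a simple dictionary to store multiple elements under the same key (1 in your case). For your example, I would rather have a list as the value in the inverted dictionary and store in that list the words that share the same number of appearances, like: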

inverted_dictionary = {}
for key in word_counter_dictionary:
    new_key = word_counter_dictionary[key]
    if new_key in inverted_dictionary:
        inverted_dictionary[new_key].append(key)
    else:
        inverted_dictionary[new_key] = [key]
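Since the question says that only get() has been covered so far, the same inversion can also be written with dict.get() instead of the if/else. This is a minimal sketch rather than part of the original answer, and the sample data is made up for illustration.

```python
# Hypothetical sample data, for illustration only.
word_counter_dictionary = {'secondary': 1, 'saves': 1, 'the': 7}

inverted_dictionary = {}
for word in word_counter_dictionary:
    count = word_counter_dictionary[word]
    # get() returns the list already stored under this count,
    # or a fresh empty list if the count has not been seen yet.
    words = inverted_dictionary.get(count, [])
    words.append(word)
    inverted_dictionary[count] = words

print(inverted_dictionary)
```

Both words that appear once end up in the same list under the key 1, so nothing is overwritten.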

To get the 25 most repeated words, iterate over the (sorted) keys of inverted_dictionary and store the words:

common_words = []
for key in sorted(inverted_dictionary.keys(), reverse=True):
    if len(common_words) < 25:
        common_words.extend(inverted_dictionary[key])
    else: 
        break

common_words = common_words[:25] # In case there are more than 25 words


You can turn the values into lists of words that share the same key:

word_counter_dictionary = {'first':1, 'second':2, 'third':3, 'fourth':2}

inverted_dictionary = {}
for key in word_counter_dictionary:
    new_key = word_counter_dictionary[key]
    if new_key in inverted_dictionary:
        inverted_dictionary[new_key].append(str(key))
    else:
        inverted_dictionary[new_key] = [str(key)]

print(inverted_dictionary)

# Output: {1: ['first'], 2: ['second', 'fourth'], 3: ['third']}

Here is a version that doesn't "invert" the dictionary:

>>> import operator
>>> A = {'a':10, 'b':843, 'c': 39, 'd': 10}
>>> B = sorted(A.items(), key=operator.itemgetter(1), reverse=True)
>>> B
[('b', 843), ('c', 39), ('a', 10), ('d', 10)]

Instead, it creates a list sorted by value from highest to lowest.

To get the top 25, simply slice it:

B[:25]

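There are a couple of ways to separate the keys and values again once they are in a list of tuples.

Note that if you only want to extract the keys or the values (rather than both), you should do that earlier. This is just an example of how to work with a list of tuples.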
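As a rough sketch of two common ways to split a list of (key, value) tuples, assuming B is the sorted list built above (the names keys, values, keys_only and values_only are illustrative and not from the original answer):

```python
import operator

A = {'a': 10, 'b': 843, 'c': 39, 'd': 10}
B = sorted(A.items(), key=operator.itemgetter(1), reverse=True)

# Option 1: unzip the pairs in one step. zip() yields tuples,
# and this unpacking assumes B is not empty.
keys, values = zip(*B)

# Option 2: one list comprehension per side.
keys_only = [key for key, value in B]
values_only = [value for key, value in B]

print(keys)
print(values_only)
```

The second option is the safer choice when B might be empty, since it simply produces two empty lists.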

A collections.defaultdict is well suited for this:

word_counter_dictionary = {'first':1, 'second':2, 'third':3, 'fourth':2}
from collections import defaultdict

d = defaultdict(list)
for key, value in word_counter_dictionary.items():
    d[value].append(key)

print(d)

Output:

defaultdict(<class 'list'>, {1: ['first'], 2: ['second', 'fourth'], 3: ['third']})

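To get the largest elements of some dataset, an inverted dictionary might not be the best data structure.

Either put the items into a sorted list (the example assumes you want the two most frequent words):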
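One way to build such a sorted list is sketched here, under the assumption that counter_word_list holds (count, word) tuples sorted in ascending order. The name top_two is illustrative.

```python
word_counter_dictionary = {'first': 1, 'second': 2, 'third': 3, 'fourth': 2}

# Put the count first so that the tuples sort by count, then by word.
counter_word_list = sorted(
    (count, word) for word, count in word_counter_dictionary.items()
)

# The largest counts end up at the end of the list.
top_two = counter_word_list[-2:]
```

Sorting the whole list costs more than strictly necessary when only a few top entries are needed, but it is easy to read and fine for small word counts.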

Result:

>>> print(counter_word_list[-2:])
[(2, 'second'), (3, 'third')]

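Or use Python's included batteries (heapq.nlargest in this case):

import heapq, operator
print(heapq.nlargest(2, word_counter_dictionary.items(), key=operator.itemgetter(1)))

Result:

[('third', 3), ('second', 2)]

Comments:

I think you've realized that your problem is that a dictionary can't hold multiple values for one key, such as the number 1. However, as a single value it can hold a collection of other values.

OK, every time before you try to do that, you need to check whether the key is already in the dictionary. If the word is already there, increase its count. Rinse and repeat ad infinitum.

If all you want to do is extract the keys of the 25 largest values, you don't have to create this inverted dict first.

@ᴋᴇʏsᴇʀ Forgive my lack of expertise, haha, I'm a beginner. So how would I extract the keys of the 25 largest values, along with their values, so that I can plot a histogram without creating an inverted dictionary?

@UnworthyToast I posted an answer that takes a different approach. I'm a novice as well, so keep that in mind.

Thank you, this works great! Now, how do I grab just the 25 largest keys so I can graph them? I can't think of a way other than slicing, but obviously I can't do that with a dictionary :P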
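For the follow-up about grabbing only the top entries for plotting, one possible sketch uses collections.Counter, whose most_common(n) method returns the n highest-count (word, count) pairs. The names top, top_words and top_counts are illustrative, and n is 2 here only to keep the example small; 25 would work the same way.

```python
from collections import Counter

word_counter_dictionary = {'first': 1, 'second': 2, 'third': 3, 'fourth': 2}

n = 2  # use 25 for the top 25 words
top = Counter(word_counter_dictionary).most_common(n)

# Split the pairs into two parallel lists, e.g. for bar labels and heights.
top_words = [word for word, count in top]
top_counts = [count for word, count in top]
```

The two lists line up by position, so they can be handed to whatever plotting tool is in use as labels and values.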