Least common words in a Python list
I counted the most common words so that I keep only the 128 most common words in the list, preserving their order:
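from collections import Counter, OrderedDict

words = my_list
# the 128 most common words
mcommon_words = [word for word, word_count in Counter(words).most_common(128)]
# keep only those words, then drop duplicates while preserving order
my_list = [x for x in my_list if x in mcommon_words]
my_list = OrderedDict.fromkeys(my_list)
my_list = list(my_list.keys())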
But now I want to get the 128 least common words in the same way. A faster solution would also help me a lot.

You can try the following approach:
from collections import Counter

def common_words(words, number_of_words, reverse=False):
    counter = Counter(words)
    # sort the distinct words by count (ascending unless reverse=True)
    return sorted(counter, key=counter.get, reverse=reverse)[:number_of_words]
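Here we sort the counter's keys by their counts. After sorting, we return the first number_of_words words: the least common ones by default, or the most common ones when reverse=True. Here is a test example: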
words = []
for i, num in enumerate('one two three four five six seven eight nine ten'.split()):
    words.extend([num] * (i + 1))
print(common_words(words, 5))
This example gets the 5 least common words from the word list.
Result:
['one', 'two', 'three', 'four', 'five']
We can also get the most common words:
print(common_words(words, 5, reverse=True))
Result:
['ten', 'nine', 'eight', 'seven', 'six']
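To tie this back to the original goal (keeping only the least common words, in their first-seen order), here is a minimal sketch. The function name `keep_least_common` and the sample data are assumptions made for illustration. It swaps the full sort for `heapq.nsmallest`, which may help when the number of distinct words is much larger than the number requested, and it uses a set so each membership test is fast.

```python
from collections import Counter
import heapq

def keep_least_common(words, n):
    """Hypothetical helper: the n least common distinct words, in first-seen order."""
    counter = Counter(words)
    # nsmallest gives the same result as sorted(counter, key=counter.get)[:n]
    rare = set(heapq.nsmallest(n, counter, key=counter.get))
    # dict.fromkeys drops duplicates while preserving insertion order (Python 3.7+)
    return list(dict.fromkeys(w for w in words if w in rare))

words = "b a c b a b d".split()
print(keep_least_common(words, 2))  # ['c', 'd']
```

For the question's case, the call would be `keep_least_common(my_list, 128)`.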
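most_common returns the words and their counts as a list of tuples. Moreover, the fact that the method returns a list means you can use slicing to get the first and last n elements.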
Demo:
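A minimal sketch of the slicing idea, using toy data and a hypothetical variable name `ranked`. Calling `Counter.most_common()` with no argument returns every element, ordered from most to least common, so the head of the list holds the most common words and the tail holds the least common ones.

```python
from collections import Counter

# toy data: 'one' appears once, 'two' twice, ... 'five' five times
words = []
for i, num in enumerate('one two three four five'.split()):
    words.extend([num] * (i + 1))

ranked = Counter(words).most_common()  # list of (word, count) tuples
n = 2
most = [word for word, count in ranked[:n]]            # first n entries
least = [word for word, count in ranked[:-n - 1:-1]]   # last n entries, rarest first
print(most)   # ['five', 'four']
print(least)  # ['one', 'two']
```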