Get the most frequent POS tags from a list using Python and NLP


list python-2.7 nlp


I'm trying to get the most common POS tags (the top five) from a list.

pos_list = nltk.pos_tag(list)
#pos_list = [('caught', 'NN'), ('black', 'NN'), ('a', 'DT'), ('striped', 'JJ'), ('eel', 'NN')]
tag_fd = nltk.FreqDist(tag for (word, tag) in pos_list)

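I also tried looping through pos_list to count the tags myself, but it seems there should be a way to do this with NLTK. I also tried building a string from the list and applying the same approach, but that didn't work either: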

str_of_list = " ".join(list)
tag_fd = nltk.FreqDist(tag for (word, tag) in str_of_list)

Thanks, I appreciate the help.
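
For context, the string attempt fails for a reason that has nothing to do with NLTK: iterating over a str yields one-character strings, so the (word, tag) unpacking has nothing to unpack. A minimal sketch of that, assuming a made-up `tokens` list in place of the original list and leaving the tagger out entirely:

```python
# Hypothetical token list standing in for the list in the question.
tokens = ["caught", "black", "a", "striped", "eel"]
str_of_list = " ".join(tokens)

# Iterating over a string yields single characters, not (word, tag) pairs.
first_items = list(str_of_list)[:3]

try:
    tags = [tag for (word, tag) in str_of_list]
    error = None
except ValueError as exc:
    # A one-character string cannot be unpacked into two names.
    error = exc

print(first_items)
print(type(error).__name__)
```

The tagged list returned by nltk.pos_tag is what should be iterated, since it already holds (word, tag) tuples.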

I'm not sure whether NLTK has a built-in way to do this, but collections.Counter certainly does:
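import collections
import nltk

# `list` is the token list from the question
pos_list = nltk.pos_tag(list)
# count the tag (second element) of every (word, tag) pair
pos_counts = collections.Counter(subl[1] for subl in pos_list)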
print "the five most common tags are", pos_counts.most_common(5)
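
To try the counting step on its own, here is a small sketch that skips the tagger and uses the tagged pairs from the comment in the question as literal data. The `top_tags` helper name is made up for illustration and is not part of NLTK or collections.

```python
from collections import Counter


def top_tags(tagged, n=5):
    """Return the n most frequent tags from an iterable of (word, tag) pairs."""
    return Counter(tag for _word, tag in tagged).most_common(n)


# Tagged pairs copied from the comment in the question.
pos_list = [('caught', 'NN'), ('black', 'NN'), ('a', 'DT'),
            ('striped', 'JJ'), ('eel', 'NN')]

top = top_tags(pos_list)
print(top)
```

With this data NN comes first with a count of 3. Only three distinct tags occur, so asking for five returns just three entries.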

The same approach as the one @inspectorG4dget suggested can be written with nltk.FreqDist, which stays within NLTK (i.e. without collections), like this:
pos_list = nltk.pos_tag(list)
pos_counts = nltk.FreqDist(tag for (word, tag) in pos_list)
print "the five most common tags are", pos_counts.most_common(5)

import nltk
from collections import Counter
text='apple is a good company'
tagged = nltk.pos_tag(text.split())
print(Counter(i[1] for i in tagged).most_common(2))

Output: [('NN', 2), ('VBZ', 1)]