How to return a list of unique words from a file, sorted alphabetically


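I have been trying to return a list of the unique words in a file, sorted alphabetically, using NLTK. I have had no success, even though I tried several different approaches. Here is my code: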

import nltk
from nltk import FreqDist

def get_vocabulary(self):
    with open(self.path, "r") as file:
        split = [line.split('\n') for line in file]
    fdist1 = FreqDist(split)
    unique_words = fdist1.hapaxes()
    return sorted(set(unique_words))

The error is:

TypeError: unhashable type: 'list'

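The other, similar approaches I tried produced similar errors. The solution does not have to use nltk, but I would appreciate it if you could tell me what mistakes I made in my own solution.

TL;DR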

from collections import Counter
from nltk import word_tokenize

with open('filename.txt') as fin:
    word_count = Counter(word_tokenize(fin.read()))


# Sorted by most common.
word_count.most_common()

# Sorted alphabetically
sorted(word_count.items())

# If you just need the words. 
sorted(word_count)
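
As for what goes wrong in the question's code: `line.split('\n')` returns a list for every line, so `split` ends up as a list of lists. `FreqDist` is a `Counter` subclass and has to hash each item it counts, and lists are not hashable, hence the `TypeError`. Separately, `hapaxes()` returns only the words that occur exactly once, not every distinct word. Below is a minimal sketch without nltk, assuming that words are separated by whitespace and that the file is UTF-8 encoded. The `path` parameter replaces `self.path` for illustration only.

```python
def get_vocabulary(path):
    """Return the distinct whitespace-separated words in a file, sorted."""
    words = set()
    # Assumption: the file is UTF-8 encoded.
    with open(path, "r", encoding="utf-8") as file:
        for line in file:
            # str.split() with no argument splits on any run of whitespace
            # and yields plain strings, which are hashable.
            words.update(line.split())
    # A set keeps one copy of each word; sorted() returns them as a list.
    return sorted(words)
```

Note that splitting on whitespace leaves punctuation attached to words (for example `word,`), so a tokenizer such as `word_tokenize` may still be the better choice for real text.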

It doesn't work. I need unique words like ['albanologie', 'AllgeMine', 'als'], but what I get from your code is [('(', 2), (')', 2), ('albanologie', 1)]
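
The tuples in that output appear to come from `sorted(word_count.items())`. `sorted(word_count)` iterates over the keys only, but it would still include punctuation tokens such as `(` and `)`. A small sketch of one way to drop them, assuming that only purely alphabetic tokens count as words; `unique_words` and the sample `tokens` list are hypothetical and stand in for the result of `word_tokenize(fin.read())`:

```python
from collections import Counter


def unique_words(tokens):
    """Return the distinct alphabetic tokens, sorted alphabetically."""
    word_count = Counter(tokens)
    # Iterating over a Counter yields its keys, so no (word, count) tuples.
    # str.isalpha() filters out punctuation tokens such as "(" and ")".
    return sorted(word for word in word_count if word.isalpha())


# Hypothetical token list standing in for word_tokenize(fin.read()).
tokens = ["(", "albanologie", ")", "als", "(", ")", "als"]
print(unique_words(tokens))
```

Be aware that `str.isalpha()` also rejects hyphenated words and tokens containing digits, so the filter may need loosening depending on the text.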