python nltk: stemming a list of sentences/phrases
I have a lot of sentences in a list and I want to stem them with the nltk library. I am able to stem one sentence at a time, but I run into problems when I try to stem the sentences from a list and join them back together. Is there a step I'm missing? I'm very new to the nltk library. Thanks.
import nltk
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

ps = PorterStemmer()

# Success: one sentence at a time
data = 'the gamers playing games'
words = word_tokenize(data)
for w in words:
    print(ps.stem(w))
# Fails:
data_list = ['the gamers playing games',
             'higher scores',
             'sports']
words = word_tokenize(data_list)
for w in words:
    print(ps.stem(w))
# Error: TypeError: expected string or bytes-like object
# result should be:
['the gamer play game',
 'higher score',
 'sport']
You are passing a list to word_tokenize, which expects a single string, so it raises the TypeError.
The solution is to wrap the logic in another for loop:
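from nltk import tokenize

data_list = ['the gamers playing games', 'higher scores', 'sports']
for sentence in data_list:
    # Tokenize one sentence (a string) at a time, then stem each word
    words = tokenize.word_tokenize(sentence)
    for w in words:
        print(ps.stem(w))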
Output:
the
gamer
play
game
higher
score
sport
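The loop above prints one stem per line, whereas the question asks for a list of stemmed phrases. One way to get that is to join the stems of each sentence back together with spaces. The following is a minimal sketch, not the only approach; the helper name stem_sentence is made up for illustration, and preserve_line=True is used on the assumption that each list item is already a single phrase and needs no sentence splitting.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

ps = PorterStemmer()


def stem_sentence(sentence):
    """Hypothetical helper: stem every word of one sentence and re-join them."""
    # preserve_line=True skips the sentence-splitting step, so only
    # word-level tokenization is applied to the phrase.
    words = word_tokenize(sentence, preserve_line=True)
    return ' '.join(ps.stem(w) for w in words)


data_list = ['the gamers playing games', 'higher scores', 'sports']

# Apply the helper to each sentence; the result is a list of strings
stemmed_list = [stem_sentence(s) for s in data_list]
print(stemmed_list)
# ['the gamer play game', 'higher score', 'sport']
```

Note that punctuation comes back from word_tokenize as separate tokens, so joining with a space will put a space before commas and periods. If preserve_line=True is dropped, word_tokenize first splits the text into sentences, which requires the Punkt sentence tokenizer data to be downloaded.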