How do I remove stop words from an array of strings in Python?
I have a problem: when the user enters a string and I then call the stop-word removal, it gives me an error.
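It's not clear what you're trying to accomplish, but this code may solve your problem. The code below reads 3 questions from the user, normalizes and tokenizes them, and removes English stop words and common punctuation characters.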
import sys
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
# Import the string module needed to remove punctuation characters
from string import punctuation
# English stop words to remove from text.
# A stop word is a commonly used word, such
# as “the”, “a”, “an”, “in”
stop_words = set(stopwords.words('english'))
# ASCII characters which are considered punctuation characters.
# These characters will be removed from the text
exclude_punctuation = set(punctuation)
# Combine the stop words and the punctuations to remove
exclude_combined = set.union(stop_words, exclude_punctuation)
question_input = []
for i in range(3):
    question_input.append(input("Please Enter Your Question: "))
# Join the list of questions into a single string, separated by commas
questions = ', '.join(question_input)
# Normalize (lowercase, strip whitespace) and tokenize the questions
tokenize_input = word_tokenize(questions.lower().strip())
# Remove the English stop words and punctuations
expunge_stopwords_punctuations = [word for word in tokenize_input if word not in exclude_combined]
print(expunge_stopwords_punctuations)
sys.exit(0)
#####################################################
# INPUT
# Please Enter Your Question: This is a question.
# Please Enter Your Question: This is another question.
# Please Enter Your Question: This is the final question.
#####################################################
#####################################################
# OUTPUT
# ['question', 'another', 'question', 'final', 'question']
#####################################################
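The answer above joins all the questions into one string before filtering. If you want one cleaned token list per input string instead (closer to "an array of strings"), a minimal sketch could look like the following. The function name `remove_stop_words`, the regex-based tokenizer, and the tiny `demo_stop_words` set are assumptions made for illustration so that the example runs without NLTK data; in practice you would likely pass `set(stopwords.words('english'))` and tokenize with `word_tokenize` as in the answer.

```python
import re
from string import punctuation


def remove_stop_words(strings, stop_words):
    """Return one list of kept tokens for each input string.

    Hypothetical helper: `stop_words` is any iterable of lowercase words.
    """
    # Exclude both the stop words and single punctuation characters
    exclude = set(stop_words) | set(punctuation)
    cleaned = []
    for text in strings:
        # Simple stand-in tokenizer (assumption): runs of lowercase
        # letters, digits, and apostrophes. word_tokenize could be used instead.
        tokens = re.findall(r"[a-z0-9']+", text.lower())
        cleaned.append([token for token in tokens if token not in exclude])
    return cleaned


# Tiny illustrative set, NOT the full NLTK English stop word list
demo_stop_words = {"this", "is", "a", "the"}
questions = ["This is a question.", "This is the final question."]
print(remove_stop_words(questions, demo_stop_words))
# [['question'], ['final', 'question']]
```

Passing the stop words in as a parameter keeps the function independent of where the list comes from, so the same code should work with NLTK's list or with a custom one.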
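Please post the complete code and the error message in your question, not just a picture.
@nick I edited my question. Take a look.
@RanaEssam Did my answer help solve your problem?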