NLP: Counting filtered bigrams (nlp, nltk, python-3.7)


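I am working on a hands-on NLP problem and am stuck on the task given below.

The statements below need to be executed in order.

I have completed the following steps, but the Fresco platform does not accept the solution.

Please let me know what I am doing wrong in the code and steps below.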

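Task 1. Import the text corpus brown.

Extract the list of words associated with the text collections belonging to the news genre. Store the result in the variable news_words.

Convert each word of the list into lower case and store the result in lc_news_words.

Compute the bigrams of the list lc_news_words and store them in the variable lc_news_bigrams.

From lc_news_bigrams, filter the bigrams in which both words contain only alphabetic characters. Store the result in lc_news_alpha_bigrams.

Extract the list of words associated with the stopwords corpus. Store the result in stop_words.

Convert each word of the list into lower case and store the result in lc_stop_words.

Filter only the bigrams from lc_news_alpha_bigrams in which neither word is part of lc_stop_words. Store the result in lc_news_alpha_nonstop_bigrams.

Print the total number of filtered bigrams.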

Below is the code I have written so far, but the Fresco platform does not accept the output.

    import nltk
    import nltk.corpus
    from nltk.corpus import brown
    from nltk.util import bigrams
    from nltk.corpus import stopwords

    news_words = brown.words(categories='news')
    lc_news_words = [w.lower() for w in news_words]
    lc_news_bigrams = list(nltk.bigrams(lc_news_words))
    lc_news_alpha_bigrams = [(word1, word2) for word1, word2 in lc_news_bigrams
                             if (word1.isalpha() and word2.isalpha())]
    stop_words = stopwords.words('english')
    lc_stop_words = [w.lower() for w in stop_words]
    lc_news_alpha_nonstop_bigrams = [(w1, w2) for w1, w2 in lc_news_alpha_bigrams
                                     if (w1.lower() not in lc_stop_words and w2.lower() not in lc_stop_words)]
    len(lc_news_alpha_nonstop_bigrams)

Everything you did is correct. Just change

    stop_words = stopwords.words('english')

to

    stop_words = stopwords.words()

and it will work.
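
One likely reason the change matters: called without arguments, `stopwords.words()` returns the stop words from every language file in the corpus rather than from the English list alone, so the filter removes more bigrams. The sketch below mimics that behaviour in plain Python. The `toy_stopwords` dictionary, the `words` helper and the sample bigrams are illustrative assumptions, not NLTK data or NLTK functions.

```python
# Toy stand-in for per-language stop-word lists (hypothetical data, not NLTK's).
toy_stopwords = {
    "english": ["the", "a", "of"],
    "french": ["la", "de", "son"],
}


def words(language=None):
    """Return one language's list, or all lists concatenated when language is None."""
    if language is not None:
        return list(toy_stopwords[language])
    return [w for lang_words in toy_stopwords.values() for w in lang_words]


def drop_stop_bigrams(pairs, stop_list):
    """Keep only the bigrams in which neither word appears in stop_list."""
    return [(w1, w2) for w1, w2 in pairs
            if w1 not in stop_list and w2 not in stop_list]


bigram_list = [("his", "son"), ("son", "said"), ("grand", "jury")]

english_only = drop_stop_bigrams(bigram_list, words("english"))
all_languages = drop_stop_bigrams(bigram_list, words())

print(len(english_only))   # 3
print(len(all_languages))  # 1
```

With the larger combined list, any English word that happens to be a stop word in another language is also filtered out, which would explain a different final count.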
    from nltk.corpus import brown
    from nltk.corpus import stopwords
    import nltk

    news_words = [word for word in brown.words(categories='news')]
    lc_news_words = [word.lower() for word in news_words]
    len_news_words = [len(word) for word in lc_news_words]
    news_len_bigrams = list(nltk.bigrams(len_news_words))
    cfd_news = nltk.ConditionalFreqDist(news_len_bigrams)
    print(cfd_news[4][6])
    lc_news_bigrams = list(nltk.bigrams(lc_news_words))
    lc_news_alpha_bigrams = [(w1, w2) for w1, w2 in lc_news_bigrams
                             if w1.isalpha() and w2.isalpha()]
    stop_words = stopwords.words()
    lc_stop_words = [word.lower() for word in stop_words]
    lc_news_alpha_nonstop_bigrams = [(w1, w2) for w1, w2 in lc_news_alpha_bigrams
                                     if not ((w1 in lc_stop_words) or (w2 in lc_stop_words))]
    print(len(lc_news_alpha_nonstop_bigrams))

I added the code for task 2 and task 3 on the Fresco platform, but the platform does not accept it.
What could be the problem?
Use and instead of or in

    (w1 in lc_stop_words) or (w2 in lc_stop_words)
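
For reference, by De Morgan's law `not (A or B)` is equivalent to `(not A) and (not B)`, so the `not (... or ...)` condition already keeps only bigrams in which neither word is a stop word. Swapping `or` for `and` inside the `not (...)` gives a weaker filter that drops a bigram only when both words are stop words. The sketch below checks this on a tiny made-up vocabulary; the word lists and function names are illustrative assumptions.

```python
from itertools import product

stop = {"the", "of"}                       # hypothetical stop words
candidates = ["the", "jury", "of", "said"]


def neither_is_stop(w1, w2):
    """Condition as written in the question: not (A or B)."""
    return not ((w1 in stop) or (w2 in stop))


def both_not_stop(w1, w2):
    """De Morgan equivalent: (not A) and (not B)."""
    return (w1 not in stop) and (w2 not in stop)


def not_both_stop(w1, w2):
    """What results from replacing `or` with `and`: not (A and B)."""
    return not ((w1 in stop) and (w2 in stop))


pairs = list(product(candidates, repeat=2))

same = all(neither_is_stop(a, b) == both_not_stop(a, b) for a, b in pairs)
kept_or = [p for p in pairs if neither_is_stop(*p)]
kept_and = [p for p in pairs if not_both_stop(*p)]

print(same)           # True
print(len(kept_or))   # 4
print(len(kept_and))  # 12
```

Which of the two filters the platform expects depends on how it reads "the words are not part of lc_stop_words", so it may be worth trying both.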
Final working code (Python 3)
The code above is logically valid. This may be a problem with Katacode: I have seen a difference of 256 between the actual answer and the expected answer.
    import nltk
    from nltk.corpus import brown
    from nltk.corpus import stopwords

    news_words = brown.words(categories='news')
    lc_news_words = [l.lower() for l in news_words]

    # Conditional frequency distribution over bigrams of word lengths
    len_news_words = [len(w) for w in lc_news_words]
    news_len_bigrams = list(nltk.bigrams(len_news_words))
    cfd_news = nltk.ConditionalFreqDist(news_len_bigrams)
    cfd_news.tabulate(conditions=[6, 4])

    # Bigrams in which both words are purely alphabetic
    lc_news_bigrams = list(nltk.bigrams(lc_news_words))
    lc_news_alpha_bigrams = [(w1, w2) for w1, w2 in lc_news_bigrams
                             if (w1.isalpha() and w2.isalpha())]

    # Stop words from all languages in the corpus, lower-cased
    stop_words = stopwords.words()
    lc_stop_words = [l.lower() for l in stop_words]

    # Keep bigrams in which neither word is a stop word
    lc_news_alpha_nonstop_bigrams = [
        (w1, w2) for w1, w2 in lc_news_alpha_bigrams
        if (w1.lower() not in lc_stop_words and w2.lower() not in lc_stop_words)
    ]
    print(len(lc_news_alpha_nonstop_bigrams))
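
The membership tests in this code scan a Python list for every word, which can be slow when the stop-word list is long. Converting the stop words to a `set` first is a common optimisation that should not change the count. The sketch below is a minimal illustration on a short hand-typed sentence and a made-up stop-word list, with `zip` used as a stand-in for `nltk.bigrams` so that it runs without NLTK; none of the data comes from the corpora themselves.

```python
# Hand-typed sample data (assumption: stands in for brown.words and stopwords.words).
words = ["The", "grand", "jury", "said", "Friday", "an",
         "investigation", "of", "Atlanta's", "election"]
stop_words = ["the", "an", "of"]

lc_words = [w.lower() for w in words]

# zip(seq, seq[1:]) yields consecutive pairs, like nltk.bigrams on a list.
bigram_pairs = list(zip(lc_words, lc_words[1:]))
alpha_bigrams = [(w1, w2) for w1, w2 in bigram_pairs
                 if w1.isalpha() and w2.isalpha()]

lc_stop_list = [w.lower() for w in stop_words]
lc_stop_set = set(lc_stop_list)  # O(1) average lookup instead of a linear scan

with_list = [(w1, w2) for w1, w2 in alpha_bigrams
             if w1 not in lc_stop_list and w2 not in lc_stop_list]
with_set = [(w1, w2) for w1, w2 in alpha_bigrams
            if w1 not in lc_stop_set and w2 not in lc_stop_set]

print(with_list == with_set)  # True
print(len(with_set))          # 3
```

Only the lookup structure changes; the filtering condition and the resulting list stay the same.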