
Python 3.x: Why does iterating over a list of lists not work?

python-3.x list nlp


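I am trying to find keywords in sentences that are stored as a list of lists. The outer list contains the sentences, and each inner list contains the words of one sentence. I want to iterate over every word in every sentence, look for the defined keywords, and return the ones that are found.

This is what my tokenized sentences look like.

I got help from this post. However, I get an empty list in return.

Here is the code I wrote: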

 import nltk
 from nltk.tokenize import TweetTokenizer, sent_tokenize, word_tokenize

 text = "MDCT SCAN OF THE CHEST:     HISTORY: Follow-up LUL nodule.   TECHNIQUES: Non-enhanced and contrast-enhanced MDCT scans were performed with a slice thickness of 2 mm.   COMPARISON: Chest CT dated on 01/05/2018, 05/02/207, 28/09/2016, 25/02/2016, and 21/11/2015.     FINDINGS:   Lung parenchyma: There is further increased size and solid component of part-solid nodule associated with internal bubbly lucency and pleural tagging at apicoposterior segment of the LUL (SE 3; IM 38-50), now measuring about 2.9x1.7 cm in greatest transaxial dimension (previously size 2.5x1.3 cm in 2015). Also further increased size of two ground-glass nodules at apicoposterior segment of the LUL (SE 3; IM 37), and superior segment of the LLL (SE 3; IM 58), now measuring about 1 cm (previously size 0.4 cm in 2015), and 1.1 cm (previously size 0.7 cm in 2015) in greatest transaxial dimension, respectively."  

 tokenizer_words = TweetTokenizer()
 tokens_sentences = [tokenizer_words.tokenize(t) for t in 
 nltk.sent_tokenize(text)]

 nodule_keywords = ["nodules","nodule"]
 count_nodule =[]
 def GetNodule(sentence, keyword_list):
     s1 = sentence.split(' ')
     return [i for i in  s1 if i in keyword_list]

 for sub_list in tokens_sentences:
     result_calcified_nod = GetNodule(sub_list[0], nodule_keywords)
     count_nodule.append(result_calcified_nod)

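However, as a result I get an empty list in the variable count_nodule.

These are the values of the first two rows of tokens_sentences.

Please help me find out what I am doing wrong.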

The error is here:

for sub_list in tokens_sentences:
     result_calcified_nod = GetNodule(sub_list[0], nodule_keywords)

You loop over each sub-list in tokens_sentences, but pass only the first word, sub_list[0], to GetNodule.

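This kind of error is quite common and somewhat hard to catch, because Python code that expects a list of strings will happily accept a single string and iterate over its individual characters if you call it incorrectly. If you want to be defensive, it may be a good idea to add something like
assert not all(len(x)==1 for x in sentence)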
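To illustrate the pitfall, here is a minimal sketch. The helper names find_keywords and find_keywords_checked and the sample tokens are made up for illustration and are not from the original post. It shows a function written for a list of words quietly iterating over characters when it is handed a single string, plus one possible guard based on isinstance, offered as an alternative to the assert above.

```python
def find_keywords(sentence, keyword_list):
    # Written for a list of word strings, but a plain string is also iterable.
    return [w for w in sentence if w in keyword_list]


def find_keywords_checked(sentence, keyword_list):
    # Hypothetical guard: reject a bare string before iterating over it.
    if isinstance(sentence, str):
        raise TypeError("expected a list of words, got a single string")
    return [w for w in sentence if w in keyword_list]


tokens = ["Follow-up", "LUL", "nodule", "."]   # hand-written sample tokens
keywords = ["nodules", "nodule"]

from_list = find_keywords(tokens, keywords)        # compares whole words
from_string = find_keywords(tokens[2], keywords)   # compares 'n', 'o', 'd', ...

print(from_list)
print(from_string)
```

No exception is raised in the second call; the characters simply never match a keyword, which is why the mistake is easy to miss.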

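Of course, as @dyz points out in their answer, if you expect sentence to already be a list of words, there is no need to split anything inside the function. Just loop over the sentence:
return [w for w in sentence if w in keyword_list]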

Also, you may want to extend the final result with the list result_calcified_nod rather than append it.
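The difference between the two can be sketched as follows. The per-sentence matches are invented sample data, and which shape you want depends on whether you need the matches grouped by sentence or as one flat list.

```python
# Hypothetical per-sentence results, as GetNodule might return them.
matches_per_sentence = [["nodule"], [], ["nodules", "nodule"]]

appended = []
extended = []
for matches in matches_per_sentence:
    appended.append(matches)   # adds the list itself: one entry per sentence
    extended.extend(matches)   # adds the list's items: one flat list of words

print(appended)
print(extended)
```

With append, sentences without a match still leave an empty list behind; with extend they contribute nothing, so len(extended) gives the total number of matches.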
You need to remove s1 = sentence.split(' ') from GetNodule, because the sentence is already tokenized (it is already a list).

Remove the [0] from GetNodule(sub_list[0], nodule_keywords). I am not sure why you are passing the first word of each sentence to GetNodule.
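Putting both changes together, the corrected loop might look like the sketch below. To keep it runnable without nltk, tokens_sentences is hand-written here and only approximates what TweetTokenizer could produce for the report; that stand-in is an assumption, not the asker's actual data.

```python
# Hand-written stand-in for the tokenizer output (assumption).
tokens_sentences = [
    ["HISTORY", ":", "Follow-up", "LUL", "nodule", "."],
    ["TECHNIQUES", ":", "Non-enhanced", "MDCT", "scans", "were", "performed", "."],
    ["Also", "further", "increased", "size", "of", "two", "ground-glass", "nodules", "."],
]
nodule_keywords = ["nodules", "nodule"]


def GetNodule(sentence, keyword_list):
    # sentence is already a list of words, so no split() is needed.
    return [w for w in sentence if w in keyword_list]


count_nodule = []
for sub_list in tokens_sentences:
    # Pass the whole sentence, not sub_list[0].
    count_nodule.append(GetNodule(sub_list, nodule_keywords))

print(count_nodule)
```

Note that the comparison is case-sensitive and exact, so a token such as "Nodule" or "nodule," would not match unless the tokens are normalized first.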

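Can you isolate whether the problem comes from the nltk part of the code or from the list-iteration part? Also, please include the value of tokens_sentences.

@Devesh Kumar Singh The list-iteration part.

OK, please show us what tokens_sentences looks like!

I added a picture of tokens_sentences.

Please paste the data, not a picture!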