
Python: lemmatizing words in lists inside a DataFrame


After applying tokenization, I have a pandas DataFrame as shown below. I want to apply the NLTK lemmatizer to this DataFrame. What I tried is given below; it raises an error of the form TypeError: unhashable type: 'list'. How do I apply the lemmatizer correctly here?

Also note that the fifth DataFrame cell contains an empty list. How can I remove such empty lists from this DataFrame?

 [[ive, searching, right, words, thank, breather], [i, promise, wont, take, help, granted, fulfil, promise], [you, wonderful, blessing, times]]                     

 [[free, entry, 2, wkly, comp, win, fa, cup, final, tkts, 21st, may, 2005], [text, fa, 87121, receive, entry, questionstd, txt, ratetcs, apply, 08452810075over18s]]

 [[nah, dont, think, goes, usf, lives, around, though]]                                                                                                             

 [[even, brother, like, speak, me], [they, treat, like, aids, patent]]                                                                                              

 [[i, date, sunday, will], []] 
The lemmatizer function I tried:


You can try the following:

import nltk

def lemmatize(fullCorpus):
    lemmatizer = nltk.stem.WordNetLemmatizer()
    # Lemmatize each word inside each inner list. The original TypeError came
    # from passing a whole list to lemmatize(), which expects a single string.
    lemmatized = fullCorpus['tokenized'].apply(
            lambda row: [[lemmatizer.lemmatize(word) for word in sent] for sent in row])
    return lemmatized
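For the second part of the question, the empty inner lists can be dropped with a plain list comprehension applied to each cell. A minimal sketch, assuming a column named 'tokenized' whose cells mirror the nested structure shown in the question (the sample data below is abbreviated for illustration):

```python
import pandas as pd

# Hypothetical data mimicking the question's tokenized column:
# each cell is a list of sentences, each sentence a list of word tokens.
df = pd.DataFrame({
    "tokenized": [
        [["i", "date", "sunday", "will"], []],          # contains an empty list
        [["nah", "dont", "think", "goes", "usf"]],
    ]
})

# Keep only non-empty inner lists in every cell; empty lists are falsy.
df["tokenized"] = df["tokenized"].apply(
    lambda row: [sent for sent in row if sent]
)

print(df["tokenized"].tolist())
```

This step is independent of lemmatization, so it can be run either before or after calling the lemmatize function above.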