How to apply the WordNet Lemmatizer on a pyspark dataframe?


I am trying to apply WordNet lemmatization to one of my dataframe columns.

My dataframe looks like this:

+--------------------+-----+
|             removed|stars|
+--------------------+-----+
|[today, second, t...|  1.0|
|[ill, first, admi...|  4.0|
|[believe, things,...|  1.0|
|[great, lunch, to...|  4.0|
|[weve, huge, slim...|  5.0|
|[plumbsmart, prov...|  5.0|
So each row is a list of tokens, and I want to lemmatize each of those tokens.

I tried:

from nltk.stem import WordNetLemmatizer
lemmatizer = WordNetLemmatizer() 

df_lemma= df_removed.select(lemmatizer.lemmatize('removed')) 
df_lemma.show()
I don't get any error message, but my dataframe is unchanged:

+--------------------+
|             removed|
+--------------------+
|[today, second, t...|
|[ill, first, admi...|
|[believe, things,...|
|[great, lunch, to...|
|[weve, huge, slim...|
|[plumbsmart, prov...|
Is there an error in my code? How should I use the lemmatizer?