Lemmatizing a pandas column with WordNet after POS tagging in Python

Tags: python, pandas, nltk, wordnet, lemmatization


I have a pandas column containing text, df_travail['line_text'], and I want to lemmatize every word in it.

First, I lowercase the text:

df_travail['lowercase'] = df_travail['line_text'].str.lower()

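Then I tokenize it and apply POS tagging (because, by default, the WordNet lemmatizer treats every word as a noun). This gives me the following (an excerpt of df_travail['tok_and_tag']):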

"[('so', 'RB'), ('you', 'PRP'), (""'ve"", 'VBP'), ('come', 'VBN'), ('to', 'TO'), ('the', 'DT'), ('master', 'NN'), ('for', 'IN'), ('guidance', 'NN'), ('?', '.'), ('is', 'VBZ'), ('this', 'DT'), ('what', 'WP'), ('you', 'PRP'), (""'re"", 'VBP'), ('saying', 'VBG'), (',', ','), ('grasshopper', 'NN'), ('?', '.')]"
[('actually', 'RB'), (',', ','), ('you', 'PRP'), ('called', 'VBD'), ('me', 'PRP'), ('in', 'IN'), ('here', 'RB'), (',', ','), ('but', 'CC'), ('yeah', 'UH'), ('.', '.')]
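
A side note on the excerpt: the first row is wrapped in double quotes with doubled inner quotes, which is how a CSV writer escapes a field. That may mean the column holds the string form of each list rather than real lists; this is an assumption, since the post does not say how the column was stored. If so, a minimal sketch for turning a cell back into pairs could look like this (`to_pairs` is a hypothetical helper name):

```python
import ast

def to_pairs(cell):
    """Return a list of (word, tag) tuples from a cell that is either
    already a list or the string representation of one."""
    if isinstance(cell, str):
        # literal_eval only parses Python literals, so it is safer than eval
        return ast.literal_eval(cell)
    return cell

# A shortened version of the first row, stored as a string
row = "[('so', 'RB'), ('you', 'PRP'), (\"'ve\", 'VBP')]"
pairs = to_pairs(row)
```

Cells that are already lists pass through unchanged, so the helper can be applied to the whole column either way.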

However, I don't know which lemmatization function (using WordNet) to apply so that it takes into account the POS tags I have already computed.

Edit: the linked post does not cover the POS part of my question.

Here is sample code that treats every word as a verb instead of a noun.

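from nltk.stem import WordNetLemmatizer

# Requires the WordNet corpus, e.g. nltk.download('wordnet')
wordnet_lemmatizer = WordNetLemmatizer()

def convert(text):
    # Lemmatize each whitespace-separated token, treating all of them as verbs
    lemmatized_text = []
    for word in text.split():
        lemmatized_text.append(wordnet_lemmatizer.lemmatize(word, pos="v"))
    return ' '.join(lemmatized_text)

# Assumes df is a DataFrame with a string column named 'text'
df['text'] = df['text'].apply(convert)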
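
The snippet above passes pos="v" for every token. To use the tags from the question instead, the Penn Treebank tags (assumed here because the excerpt shows tags such as 'VBN' and 'RB') have to be reduced to the four POS values the WordNet lemmatizer accepts. A minimal sketch of that reduction, with `penn_to_wordnet` as a hypothetical helper name:

```python
# The single letters returned below are the values of
# nltk.corpus.wordnet.ADJ, VERB, ADV and NOUN.
def penn_to_wordnet(tag):
    """Reduce a Penn Treebank tag to a WordNet POS letter."""
    if tag.startswith('JJ'):
        return 'a'  # adjectives: JJ, JJR, JJS
    if tag.startswith('VB'):
        return 'v'  # verbs: VB, VBD, VBG, VBN, VBP, VBZ
    if tag.startswith('RB'):
        return 'r'  # adverbs: RB, RBR, RBS
    return 'n'      # everything else falls back to noun, the lemmatizer's default
```

Falling back to noun for pronouns, punctuation and the like is a design choice; the lemmatizer returns a word unchanged when it finds no entry for it.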

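No, I know that post; it doesn't mention POS.

Can you provide some sample data?

Does this answer your question?

This should be particularly helpful:

OK, so I understand that I have to change the tag categories to make them less specific, so that they match what the WordNet lemmatizer expects. But how do I combine all of that with my pandas column?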
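
One possible way to combine the pieces, offered as a sketch rather than a definitive answer: lemmatize each (word, tag) pair with a POS derived from its own tag, then apply that function to the column row by row. It assumes each cell is a list of (word, Penn Treebank tag) pairs as in the excerpt; `WORDNET_POS` and `lemmatize_tagged` are hypothetical names.

```python
# `lemmatizer` is any object with a lemmatize(word, pos=...) method,
# for example an nltk.stem.WordNetLemmatizer instance.

# Two-letter tag prefix -> WordNet POS letter; anything else is treated as a noun.
WORDNET_POS = {'JJ': 'a', 'VB': 'v', 'RB': 'r'}

def lemmatize_tagged(pairs, lemmatizer):
    """Lemmatize (word, tag) pairs using each word's own POS and
    return the lemmas joined by spaces."""
    return ' '.join(
        lemmatizer.lemmatize(word, pos=WORDNET_POS.get(tag[:2], 'n'))
        for word, tag in pairs
    )
```

With the `wordnet_lemmatizer` object from the answer, the column could then be processed with `df_travail['lemmatized'] = df_travail['tok_and_tag'].apply(lemmatize_tagged, lemmatizer=wordnet_lemmatizer)`; the column name `lemmatized` is made up for the example.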
