Python: How to tokenize a text column in a dataframe using NLTK

Tags: python, pandas, dataframe, nltk

My df looks like this:

team_name   text
---------   ----
red         this is text from red team
blue        this is text from blue team
green       this is text from green team
yellow      this is text from yellow team
I am trying to get this:

team_name   text                             text_token
---------   ----                             ----------
red         this is text from red team       'this', 'is', 'text', 'from', 'red','team'
blue        this is text from blue team      'this', 'is', 'text', 'from', 'blue','team'
green       this is text from green team     'this', 'is', 'text', 'from', 'green','team'
yellow      this is text from yellow team    'this', 'is', 'text', 'from', 'yellow','team'
What I have tried:

df['text_token'] = nltk.word_tokenize(df['text'])

This does not work. How can I get my desired result? Also, is it possible to compute a frequency distribution (Freq Dist)?

Stack Overflow has several examples you can study.

This has already been solved in the linked questions:

df['text_token'] = df.apply(lambda row: nltk.word_tokenize(row['text']), axis=1)
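
The attempt in the question fails because nltk.word_tokenize expects a single string, while df['text'] is a whole Series; applying the tokenizer to each row avoids that. A minimal sketch, assuming the sample data from the question. The tokenize helper is hypothetical, and the fallback to wordpunct_tokenize is an assumption for environments where the Punkt data used by word_tokenize has not been downloaded:

```python
import nltk
import pandas as pd
from nltk.tokenize import wordpunct_tokenize

# Sample data copied from the question.
df = pd.DataFrame({
    "team_name": ["red", "blue", "green", "yellow"],
    "text": [
        "this is text from red team",
        "this is text from blue team",
        "this is text from green team",
        "this is text from yellow team",
    ],
})


def tokenize(text):
    """Hypothetical helper: tokenize one string, not a whole Series."""
    try:
        return nltk.word_tokenize(text)
    except LookupError:
        # word_tokenize needs downloaded Punkt data; wordpunct_tokenize is a
        # regex-based tokenizer that works without any extra data.
        return wordpunct_tokenize(text)


# Apply the tokenizer to each value of the text column.
df["text_token"] = df["text"].apply(tokenize)
print(df[["team_name", "text_token"]])
```

Applying over the single column (df["text"].apply) should give the same result here as the row-wise df.apply(..., axis=1) form, and it avoids building a row object for every record.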
Does this answer your question?

Thanks for writing the answer. How do I skip NA values?

Use df['column'].fillna(value=myValue, inplace=True)

Thanks!! How do I get the Freq Dist for each row of text_token?
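
For the NA question in the comments, two options seem reasonable: fill missing text with an empty string before tokenizing, or tokenize only the rows that have text. The sketch below uses made-up data with one missing value, and swaps in wordpunct_tokenize (an assumption, chosen because it needs no downloaded tokenizer data) in place of nltk.word_tokenize; the column name text_token_or_nan is hypothetical:

```python
import numpy as np
import pandas as pd
from nltk.tokenize import wordpunct_tokenize

# Made-up data: the second row has no text.
df = pd.DataFrame({
    "team_name": ["red", "blue"],
    "text": ["this is text from red team", np.nan],
})

# Option 1: replace missing text with "" so those rows get an empty list.
df["text_token"] = df["text"].fillna("").apply(wordpunct_tokenize)

# Option 2: tokenize only the rows that have text; the assignment aligns on
# the index, so rows without text are left as NaN.
has_text = df["text"].notna()
df["text_token_or_nan"] = df.loc[has_text, "text"].apply(wordpunct_tokenize)

print(df)
```

Option 1 keeps every row a list, which is convenient if later steps iterate over the tokens; option 2 keeps the missing values visible.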
df['text_token'] = df.apply(lambda row: nltk.word_tokenize(row['text']), axis=1)
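
For the Freq Dist question, once text_token holds a list of tokens per row, nltk.FreqDist can be applied to each list. A sketch under assumptions: the token lists are typed in by hand to match the question's expected output, and the freq_dist column and overall variable are hypothetical names:

```python
import pandas as pd
from nltk import FreqDist

# Token lists as they would look after tokenizing the question's sample text.
df = pd.DataFrame({
    "team_name": ["red", "blue"],
    "text_token": [
        ["this", "is", "text", "from", "red", "team"],
        ["this", "is", "text", "from", "blue", "team"],
    ],
})

# One FreqDist (a Counter-like mapping of token -> count) per row.
df["freq_dist"] = df["text_token"].apply(lambda tokens: FreqDist(tokens))

# A single FreqDist over all rows, in case a whole-column count is wanted.
overall = FreqDist(token for tokens in df["text_token"] for token in tokens)

print(df.loc[0, "freq_dist"].most_common(3))
print(overall["this"])
```

Rows containing NaN instead of a list would need to be filled or filtered first, since FreqDist expects an iterable of tokens.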