Remove custom stop words from a DataFrame in Python
I am following this question: but for me, the custom stop word list does not work. Please see the following code:
pos_tweets = [('I love this car', 'positive'),
('This view is amazing', 'positive'),
('I feel great this morning', 'positive'),
('I am so excited about the concert', 'positive'),
('He is my best friend', 'positive')]
import pandas as pd
test = pd.DataFrame(pos_tweets)
test.columns = ["tweet","col2"]
test["tweet"] = test["tweet"].str.lower().str.split()
stop = ['love','car','amazing']
test['tweet'].apply(lambda x: [item for item in x if item not in stop])
print(test)
The result is:
                                       tweet      col2
0                       [i, love, this, car]  positive
1                  [this, view, is, amazing]  positive
2            [i, feel, great, this, morning]  positive
3  [i, am, so, excited, about, the, concert]  positive
4                 [he, is, my, best, friend]  positive
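The words love, car, and amazing are still there. What am I missing?
Thanks
apply returns a new Series and does not modify the column in place, so you need to assign the output back to the column tweet: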
test['tweet'] = test['tweet'].apply(lambda x: [item for item in x if item not in stop])
print (test)
                                       tweet      col2
0                                  [i, this]  positive
1                           [this, view, is]  positive
2            [i, feel, great, this, morning]  positive
3  [i, am, so, excited, about, the, concert]  positive
4                 [he, is, my, best, friend]  positive
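To make the difference visible, here is a minimal sketch on a shortened copy of the data. The names remove_stop, filtered, unchanged, and changed are only for illustration and are not from the original post, and the stop list is written as a set, which is an optional substitution for the list used above.

```python
import pandas as pd

# Shortened copy of the sample data (two rows only)
test = pd.DataFrame(
    [("I love this car", "positive"), ("This view is amazing", "positive")],
    columns=["tweet", "col2"],
)
test["tweet"] = test["tweet"].str.lower().str.split()

# Assumption: a set instead of a list; membership tests behave the same
stop = {"love", "car", "amazing"}


def remove_stop(words):
    """Return the words that are not in the stop set."""
    return [w for w in words if w not in stop]


# apply() builds a new Series; the DataFrame column is left untouched
filtered = test["tweet"].apply(remove_stop)
unchanged = test["tweet"].tolist()

# Only the assignment stores the filtered lists in the DataFrame
test["tweet"] = filtered
changed = test["tweet"].tolist()

print(unchanged)
print(changed)
```

The first print still contains the stop words, while the second one does not, which matches the two outputs shown in the question and the answer.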
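Your solution works great! One more question: what do I have to do to remove the commas from the text, so that it looks like this: tweet col2 0 [i this] positive 1 [this view is] positive 2 [i feel great this morning] positive 3 [i am so excited about the concert] positive 4 [he is my best friend] positive
Do you need to convert the list in each row to a string?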
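If the goal is plain text per row rather than a list, one possible approach (a sketch, not something confirmed in the thread) is to join each list back into a single string with Series.str.join. The small DataFrame below is a hypothetical stand-in for the filtered result, and result is an illustrative name.

```python
import pandas as pd

# Hypothetical stand-in for the already filtered DataFrame
test = pd.DataFrame({
    "tweet": [["i", "this"], ["this", "view", "is"]],
    "col2": ["positive", "positive"],
})

# Series.str.join concatenates the elements of each list with the separator
test["tweet"] = test["tweet"].str.join(" ")
result = test["tweet"].tolist()

print(test)
```

Note that this yields strings such as "i this" without the square brackets. The commas and brackets in the earlier output are only how pandas displays a list, so they disappear once the cell holds a string.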