Python Pandas: Replace all words in a row with a specific value, except for words in a list
My data frame looks like this, but much larger:
df = {"text": ["it is two degrees warmer", "it is five degrees warmer today", "it was ten degrees warmer and not cooler", "it is ten degrees cooler", "it is too frosty today", "it is a bit icy and frosty today" ]}
allowed_list= ["cooler", "warmer", "frosty", "icy"]
I want to replace every word except the ones in the list with "O", keeping the result comma-separated, like this:
desired output:
text
0 O,O,O,O,warmer
1 O,O,O,O,warmer,O
2 O,O,O,O,warmer,O,O,cooler
3 O,O,O,O,cooler
4 O,O,O,frosty,O
5 O,O,O,O,icy,O,frosty,O,
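What I have done so far is split each string row into a list on whitespace with str.split(" "), but I am not sure how to get rid of the words that are not in the list.

You can use a list comprehension and join the result with "," as the separator. Also, by building a set from allowed_list, we get faster lookups: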
allowed_set= set(["cooler","warmer","frosty","icy"])
df['text'] = [','.join([w if w in allowed_set else 'O' for w in s.split()])
for s in df['text']]
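
For reference, here is a minimal end-to-end sketch of the same idea. It assumes the dict from the question is wrapped in pd.DataFrame (the question shows a plain dict), and mask_words is a hypothetical helper name used only for illustration:

```python
import pandas as pd

# Assumption: the question's dict is meant to be a DataFrame.
df = pd.DataFrame({"text": [
    "it is two degrees warmer",
    "it is five degrees warmer today",
    "it was ten degrees warmer and not cooler",
    "it is ten degrees cooler",
    "it is too frosty today",
    "it is a bit icy and frosty today",
]})

allowed_list = ["cooler", "warmer", "frosty", "icy"]
allowed_set = set(allowed_list)  # set membership tests are O(1) on average


def mask_words(sentence):
    """Hypothetical helper: keep allowed words, replace every other word with 'O'."""
    return ",".join(w if w in allowed_set else "O" for w in sentence.split())


# Apply the helper to every row of the column.
df["text"] = df["text"].map(mask_words)
print(df)
```

Calling split() with no argument splits on any run of whitespace, so doubled spaces do not produce empty tokens the way split(" ") would. With this approach the last row comes out as O,O,O,O,icy,O,frosty,O, without the trailing comma shown in the desired output.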

Did you leave out pd.DataFrame in the definition of df?