Python: how to remove a specific word from a string only when it appears as a whole word

python regex pandas

I have a dataframe that looks like this:

1                                               Hello?
2                                             Control.
3                                        that nan far.
4    Just in the last 20 years since your father di...
5    nan your father made all the financial nan nan...

I want to remove the substring "nan" from the text. To do that, I have been using the following:

df['words_no_nan'] = df['words'].replace(regex=True,to_replace=r'nan',value=r'')

This results in:

1                                               Hello?
2                                             Control.
3                                            that far.
4    Just in the last 20 years since your father di...
5                      your father made all the ficial

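This mostly works, but it also removes "nan" when it occurs inside a larger word. For example, in row 5 the word "financial" becomes "ficial". How can I remove "nan" if and only if it appears as a whole word, and not as part of another word (as in "financial")?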

Try using the word boundary \b, so that the pattern matches "nan" only when there is a word boundary both before and after it:

df['words'].str.replace(r'\bnan\b', '', regex=True)

Output:

0                                               Hello?
1                                             Control.
2                                           that  far.
3    Just in the last 20 years since your father di...
4              your father made all the financial  ...
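Note that removing the word leaves the surrounding spaces behind (for example "that  far." with two spaces). A minimal sketch of one way to tidy that up, assuming the column is named 'words' as in the question and using a small made-up sample rather than the original data:

```python
import pandas as pd

# Hypothetical sample rows modelled on the question's data.
df = pd.DataFrame({
    "words": [
        "Hello?",
        "that nan far.",
        "nan your father made all the financial nan nan",
    ]
})

df["words_no_nan"] = (
    df["words"]
    .str.replace(r"\bnan\b", "", regex=True)  # remove "nan" only as a whole word
    .str.replace(r"\s+", " ", regex=True)     # collapse the leftover runs of whitespace
    .str.strip()                              # trim spaces at the start and end
)

print(df["words_no_nan"].tolist())
```

Here "financial" is left intact because there is no word boundary on either side of the "nan" inside it. Keep in mind that \b also treats punctuation as a boundary, so "nan." would become ".". Rows that hold a real missing value (NaN) rather than the text "nan" are not changed by the .str methods.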