Python inconsistency with regex "." dot metacharacter?


Consider df:

              Cost
Store 1       22.5
Store 1  .........
Store 2        ...

To convert these dots to NaN, I can use:

df.replace(r'^\.+$', np.nan, regex=True)

         Cost
Store 1  22.5
Store 1   NaN
Store 2   NaN

What I don't understand is why the following pattern also works:

df.replace('^.+$', np.nan, regex=True)

         Cost
Store 1  22.5
Store 1   NaN
Store 2   NaN

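Note that in this case I did not escape the . character, so it should be treated as the match-any-character metacharacter, which should turn every row into NaN. But it doesn't: only the rows made up of dots are matched, even though I used the match-all character.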

Contrast this with:

import re
re.sub('^.+$', '', '22.5') 
''

which returns an empty string.

What is going on?

Halfway through writing this question, I realized what the problem was:

df.Cost.dtype
dtype('O')

df.Cost.values
array([22.5, '.........', '...'], dtype=object)

So 22.5 happens to be a numeric value, and when replacing, the regex pattern simply skips non-string values. Casting with astype makes this obvious:

df.astype(str).replace('.+', np.nan, regex=True)

         Cost
Store 1   NaN
Store 1   NaN
Store 2   NaN
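
For anyone who wants to reproduce this, here is a minimal sketch. The DataFrame construction is my reconstruction of the data shown above (an assumption, since the original setup code is not in the post), and the variable names are purely illustrative. It checks that the unescaped pattern leaves the float alone and only touches the strings, while the astype(str) version matches every row.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the example frame: one float and two strings
# in the same object-dtype column.
df = pd.DataFrame(
    {"Cost": [22.5, ".........", "..."]},
    index=["Store 1", "Store 1", "Store 2"],
)

# Show the Python type of each cell: the first one is a float, not a str.
print(df["Cost"].map(type))

# Unescaped dot: the pattern is only tried against string cells,
# so the float 22.5 is left untouched.
unescaped = df.replace(r"^.+$", np.nan, regex=True)
print(unescaped)

# After casting everything to str, "22.5" is a string too and also matches.
as_str = df.astype(str).replace(r"^.+$", np.nan, regex=True)
print(as_str)
```

The key point is that the column's dtype is object, which says nothing about the type of each individual cell, so checking the element types is worth doing before blaming the regex.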

Problem solved. Leaving this here in case anyone else is confused by it.

@PJProudhon That is only available after 48 hours.

Sorry, I wasn't aware of that restriction.

Alternatively, you can use Series.str.replace
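
A rough sketch of the Series.str.replace route, under the same assumed data as in the question (the frame construction and variable names are my own illustration). Series.str.replace substitutes strings rather than NaN, so this version casts to str first, blanks out the dot-only cells, and then uses pd.to_numeric with errors="coerce" to get numbers and NaN back. The to_numeric step is my addition, not part of the comment.

```python
import pandas as pd

# Assumed reconstruction of the example frame from the question.
df = pd.DataFrame(
    {"Cost": [22.5, ".........", "..."]},
    index=["Store 1", "Store 1", "Store 2"],
)

# Cast to str so every cell is a string before using the .str accessor.
as_text = df["Cost"].astype(str)

# Escaped dot: only cells consisting entirely of dots are blanked out.
cleaned = as_text.str.replace(r"^\.+$", "", regex=True)
print(cleaned.tolist())

# Convert back to numbers; cells that cannot be parsed become NaN.
numeric = pd.to_numeric(cleaned, errors="coerce")
print(numeric)
```

Note that the dot has to be escaped here: once everything is a string, an unescaped . would match "22.5" as well.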