Python: extracting numbers from variable strings

python string python-3.x pandas

Given this DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({'ID':['a','b','c','d','e','f','g','h','i','j','k'],
                   'value':['None',np.nan,'6D','7','10D','NONE','x','10D aaa','1 D','10 D aa',7]
                   })
df


    ID  value
0   a   None
1   b   NaN
2   c   6D
3   d   7
4   e   10D
5   f   NONE
6   g   x
7   h   10D aaa
8   i   1 D
9   j   10 D aa
10  k   i7D

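I want to extract the number where one exists and return 0 otherwise, for any of the messy cases shown above.

The expected result is:

Thanks in advance.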

Alternatively, you can apply a function to the DataFrame that catches several exceptions while extracting the number:

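import re

def get_number(item):
    try:
        # re.search returns None when there are no digits, so .group raises AttributeError
        return int(re.search(r"\d+", str(item)).group(0))
    except (AttributeError, ValueError, IndexError):
        return 0

# applymap runs get_number on every cell, including the ID column
# (newer pandas versions deprecate applymap in favor of DataFrame.map)
print(df.applymap(get_number))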

Output:

    ID  value
0    0      0
1    0      0
2    0      6
3    0      7
4    0     10
5    0      0
6    0      0
7    0     10
8    0      1
9    0     10
10   0      7
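Because the function above is applied to every cell, the ID column is also turned into zeros. A minimal sketch of restricting it to the value column follows; the column name number is a hypothetical choice for illustration, and get_number is rewritten here without the try/except only to keep the example short.

```python
import re

import numpy as np
import pandas as pd


def get_number(item):
    """Return the first run of digits in item as an int, or 0 if there is none."""
    match = re.search(r"\d+", str(item))
    return int(match.group(0)) if match else 0


df = pd.DataFrame({
    "ID": ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k"],
    "value": ["None", np.nan, "6D", "7", "10D", "NONE", "x",
              "10D aaa", "1 D", "10 D aa", 7],
})

# Series.map calls get_number once per element of the 'value' column only,
# so the letters in 'ID' are left untouched.
# 'number' is a hypothetical column name, not part of the original question.
df["number"] = df["value"].map(get_number)
print(df)
```

Converting each item with str() first means NaN and the plain integer 7 go through the same code path as the strings.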

Try the following:

Here is my approach using re.findall and apply:

df['value'].apply(lambda x: 0 if not re.findall(r'\d+', str(x)) else int(re.findall(r'\d+', str(x))[0]))

I would do it like this:

pd.to_numeric(df.value.str.replace(r'\D+', '', regex=True), errors='coerce').fillna(0).astype(int)
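Stripping every non-digit character joins all the digits in a string together, and the .str accessor yields NaN for entries that are not strings, such as the integer 7. The sketch below compares that approach with a str.extract variant that keeps only the first run of digits. It assumes the values are converted to text with astype(str) first, and the extra entry "a1b2" is a hypothetical value added only to show the difference.

```python
import numpy as np
import pandas as pd

# The question's values plus one hypothetical entry, "a1b2", with two digit runs.
values = pd.Series(["None", np.nan, "6D", "7", "10D", "NONE", "x",
                    "10D aaa", "1 D", "10 D aa", 7, "a1b2"])

# Convert everything to text so the integer 7 is handled like the strings.
as_text = values.astype(str)

# Variant 1: keep only the first run of digits; rows without digits become 0.
first_run = pd.to_numeric(
    as_text.str.extract(r"(\d+)", expand=False), errors="coerce"
).fillna(0).astype(int)

# Variant 2: delete every non-digit character, then convert what is left.
all_digits = pd.to_numeric(
    as_text.str.replace(r"\D+", "", regex=True), errors="coerce"
).fillna(0).astype(int)

print(pd.DataFrame({"value": values, "first_run": first_run, "all_digits": all_digits}))
```

The two variants agree on the question's data and differ only on the added "a1b2" row, where the first gives 1 and the second gives 12.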