Python: extracting a number from variable strings

Given this dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({'ID':['a','b','c','d','e','f','g','h','i','j','k'],
                   'value':['None',np.nan,'6D','7','10D','NONE','x','10D aaa','1 D','10 D aa','i7D']
                   })
df


    ID  value
0   a   None
1   b   NaN
2   c   6D
3   d   7
4   e   10D
5   f   NONE
6   g   x
7   h   10D aaa
8   i   1 D
9   j   10 D aa
10  k   i7D
I want to extract the number where one exists and return 0 otherwise, for any of the messy cases shown above.

The expected result is:

    ID  value
0   a   0
1   b   0
2   c   6
3   d   7
4   e   10
5   f   0
6   g   0
7   h   10
8   i   1
9   j   10
10  k   7

Thanks in advance.

Alternatively, you can apply a function to the dataframe that catches the possible exceptions while extracting the number:

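import re

def get_number(item):
    try:
        # First run of digits in the cell, as an int
        return int(re.search(r"\d+", str(item)).group(0))
    except (AttributeError, ValueError, IndexError):
        # re.search returned None (no digits): fall back to 0
        return 0

# applymap runs on every cell, so the ID column also becomes 0
print(df.applymap(get_number))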
Prints:

    ID  value
0    0      0
1    0      0
2    0      6
3    0      7
4    0     10
5    0      0
6    0      0
7    0     10
8    0      1
9    0     10
10   0      7
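The output above zeroes out the ID column because the function is applied to every cell. If only the value column should change, one option is to apply the same idea to that column alone. The following is a minimal sketch, assuming the dataframe from the question with 'i7D' as the last value (as in the displayed table); get_number is rewritten here with an explicit match check instead of exception handling.

```python
import re

import numpy as np
import pandas as pd


def get_number(item):
    """Return the first run of digits in item as an int, or 0 if there is none."""
    match = re.search(r"\d+", str(item))
    return int(match.group(0)) if match else 0


# Sample data assumed from the question (last value taken from the shown table).
df = pd.DataFrame({
    "ID": list("abcdefghijk"),
    "value": ["None", np.nan, "6D", "7", "10D", "NONE", "x",
              "10D aaa", "1 D", "10 D aa", "i7D"],
})

# Only the 'value' column is transformed, so 'ID' keeps its letters.
df["value"] = df["value"].apply(get_number)
print(df)
```

Converting each cell with str() first means NaN and plain integers go through the same code path as strings.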
Here is my approach, using re.findall and apply:

# Take the first run of digits as an int; fall back to 0 when findall returns an empty list
df['value'].apply(lambda x: int((re.findall(r'\d+', str(x)) or [0])[0]))

I would do it like this:
pd.to_numeric(df.value.astype(str).str.replace(r'\D+', '', regex=True), errors='coerce').fillna(0).astype(int)
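Stripping every non-digit character and extracting the first run of digits give the same result on the question's data, but they can differ when a string contains more than one number. The sketch below illustrates this on a small made-up Series (the sample strings are assumptions, not from the question); Series.str.extract is used here as a vectorized stand-in for the re.search approach.

```python
import pandas as pd

# Hypothetical sample; "3 of 12" contains two separate numbers.
s = pd.Series(["10D", "10 D aa", "3 of 12", "x"])

# Remove all non-digits, then convert: every digit in the string is joined,
# so "3 of 12" turns into 312.
stripped = (
    pd.to_numeric(s.astype(str).str.replace(r"\D+", "", regex=True), errors="coerce")
    .fillna(0)
    .astype(int)
)

# Keep only the first run of digits, so "3 of 12" turns into 3.
first_run = (
    pd.to_numeric(s.astype(str).str.extract(r"(\d+)", expand=False), errors="coerce")
    .fillna(0)
    .astype(int)
)

print(pd.DataFrame({"raw": s, "stripped": stripped, "first_run": first_run}))
```

Which behavior is wanted depends on the data; for values like '10 D aa' both approaches return 10.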