Python: split a string at each occurrence of a number

I have this string:

a = "IN 744301 Mus Andaman & Nicobar Islands   01  Nicobar 638 Carnicobar 9.2333  92.7833 4"
I want to split it with a regular expression wherever a number occurs, so that the output looks like this:

['IN' , '744301', 'Mus Andaman & Nicobar Islands', '01' , 'Nicobar', '638', 'Carnicobar', '9.2333','92.7833', '4' ]

You can use lookahead and lookbehind assertions:

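import re
a = "IN 744301 Mus Andaman & Nicobar Islands   01  Nicobar 638 Carnicobar 9.2333  92.7833 4"
# Split on whitespace that follows a digit or precedes a digit.
# A raw string keeps \d and \s from being read as string escapes.
new_a = re.split(r'(?<=\d)\s+|\s+(?=\d)', a)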
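
To see what the lookaround split returns on the sample string, here is a minimal self-contained sketch. The helper name split_around_numbers is hypothetical and not part of the answer; it only wraps the same pattern.

```python
import re

def split_around_numbers(text):
    """Split on whitespace that has a digit immediately before or after it."""
    return re.split(r'(?<=\d)\s+|\s+(?=\d)', text)

a = "IN 744301 Mus Andaman & Nicobar Islands   01  Nicobar 638 Carnicobar 9.2333  92.7833 4"
print(split_around_numbers(a))
```

Because the pattern looks only at the character next to the whitespace, a token that merely ends in a digit (for example "A1") also creates a split point after it.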
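Regex explanation:

(?<=\d)\s+ matches a run of whitespace that is preceded by a digit.
\s+(?=\d) matches a run of whitespace that is followed by a digit.
| is alternation, so the string is split on whitespace that touches a digit on either side.

You can split by a number-like pattern and then findall by the same pattern. Since split and findall are "sister" functions, you get both the non-number parts and the number parts. Now zip them into one list and strip the whitespace: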

import re
from itertools import chain, zip_longest
# You can improve the regex to cover numbers that start with a .
NUMBER = r'\d+(?:\.\d*)?'
# split() returns one more piece than findall(); zip_longest keeps
# any text that follows the last number instead of dropping it.
combined = chain.from_iterable(zip_longest(re.split(NUMBER, a),
                                           re.findall(NUMBER, a),
                                           fillvalue=''))
result = [x for x in map(str.strip, combined) if x]
#['IN', '744301', 'Mus Andaman & Nicobar Islands', '01', 'Nicobar',
# '638', 'Carnicobar', '9.2333', '92.7833', '4']
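
The comment in the code hints that the pattern could also cover numbers that start with a dot. One possible way to do that is sketched below; the extra alternative \.\d+ and the function name split_numbers are assumptions for illustration, not part of the answer.

```python
import re
from itertools import chain, zip_longest

# Assumed extension: also accept numbers written like ".5".
NUMBER = re.compile(r'\d+(?:\.\d*)?|\.\d+')

def split_numbers(text):
    """Interleave the non-number pieces with the numbers found between them."""
    pieces = NUMBER.split(text)      # text between numbers (one more item than numbers)
    numbers = NUMBER.findall(text)   # the numbers themselves, in order
    merged = chain.from_iterable(zip_longest(pieces, numbers, fillvalue=''))
    return [s for s in map(str.strip, merged) if s]

print(split_numbers("ratio .5 of 12 units"))
```

Pairing with zip_longest rather than zip matters here because the text after the last number ("units") has no number to pair with.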

You can use re.split with a group (capturing parentheses) to keep the separators (the numbers) in the result:

>>> import re
>>> a = "IN 744301 Mus Andaman & Nicobar Islands   01  Nicobar 638 Carnicobar 9.2333  92.7833 4"
>>> re.split(r'(\d+(?:\.\d+)?)', a)
['IN ', '744301', ' Mus Andaman & Nicobar Islands   ', '01', '  Nicobar ', '638', ' Carnicobar ', '9.2333', '  ', '92.7833', ' ', '4', '']
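
The list above still carries surrounding whitespace and an empty trailing string. A minimal sketch of one way to clean it up, assuming the goal is the exact output requested in the question:

```python
import re

a = "IN 744301 Mus Andaman & Nicobar Islands   01  Nicobar 638 Carnicobar 9.2333  92.7833 4"

# The capturing group keeps each number in the result of re.split.
parts = re.split(r'(\d+(?:\.\d+)?)', a)

# Strip every piece and drop the ones that were only whitespace or empty.
cleaned = [p.strip() for p in parts if p.strip()]
print(cleaned)
```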
