Using Python's regex .match() method to get the strings before and after an underscore

python regex string

I have the following code:

tablesInDataset = ["henry_jones_12345678", "henry_jones", "henry_jones_123"]

for table in tablesInDataset:
    tableregex = re.compile("\d{8}")
    tablespec = re.match(tableregex, table)

    everythingbeforedigits = tablespec.group(0)
    digits = tablespec.group(1)

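My regex should only match a string if it contains 8 digits after an underscore. Once a string matches, I want to use .match() to get two groups: the first should contain all the characters before the digits, and the second should contain the 8-digit string.

How can I do this using .match() and .group()?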

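I think this pattern should fit your needs:

(.*?)(\d{8})

The first group contains everything up to the 8 digits, including the underscore. The second group is the 8 digits.

If you don't want the underscore included, use this instead:

(.*?)_(\d{8})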
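A minimal sketch of how the second pattern could be dropped into the loop from the question. Two things about the original code are worth noting: re.match only matches at the start of the string, so a bare \d{8} never matches these names, and group(1) only exists when the pattern contains a capture group. The helper name split_table_name is made up for illustration.

```python
import re

# Pattern from the answer above: lazy prefix, an underscore, then 8 digits.
TABLE_RE = re.compile(r"(.*?)_(\d{8})")


def split_table_name(table):
    """Return (prefix, digits) if the name contains '_' + 8 digits, else None.

    Hypothetical helper, not part of the original question or answers.
    """
    m = TABLE_RE.match(table)  # re.match anchors at the start of the string
    if m is None:
        return None
    return m.group(1), m.group(2)


tables_in_dataset = ["henry_jones_12345678", "henry_jones", "henry_jones_123"]
results = {t: split_table_name(t) for t in tables_in_dataset}
for table, parts in results.items():
    print(table, "->", parts)
```

Checking for None before calling .group() matters here, because two of the three names do not match and re.match returns None for them.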

Use capture groups:

>>> import re
>>> s = 'henry_jones_12345678'
>>> pat = re.compile(r'(?P<name>.*)_(?P<number>\d{8})')
>>> pat.findall(s)
[('henry_jones', '12345678')]

Here you go:

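import re

tablesInDataset = ["henry_jones_12345678", "henry_jones", "henry_jones_123"]
# Non-digits, an underscore, then exactly 8 digits up to the end of the string
rx = re.compile(r'^(\D+)_(\d{8})$')

# Keep only the names that match; each entry is a (name, digits) tuple
matches = [match.groups()
           for item in tablesInDataset
           for match in [rx.search(item)]
           if match]
print(matches)  # [('henry_jones', '12345678')]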

Better than any dot-star soup :)
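To illustrate why the anchored pattern is stricter, here is a small sketch comparing it with the unanchored dot-star pattern. The name with nine trailing digits is a made-up test case, not part of the original data.

```python
import re

loose = re.compile(r"(.*?)_(\d{8})")      # unanchored, lazy dot-star
strict = re.compile(r"^(\D+)_(\d{8})$")   # whole string, exactly 8 digits at the end

# Hypothetical input: 9 trailing digits, which should arguably be rejected.
name = "henry_jones_123456789"

loose_match = loose.match(name)
strict_match = strict.search(name)

# The loose pattern accepts the first 8 of the 9 digits and ignores the rest.
print(loose_match.groups())
# The strict pattern finds nothing, because of the trailing "$".
print(strict_match)
```

One trade-off of the strict pattern: because the prefix is \D+, a name whose prefix itself contains a digit (for example table2_12345678) will not match either.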

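With the named groups in pat from the capture-group answer (and the same s), the match can also be read as a dictionary:

>>> match = pat.match(s)
>>> match.groupdict()
{'name': 'henry_jones', 'number': '12345678'}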
