Python re: finding the start and end indices of a group match


Python's re match objects have .start() and .end() methods. I want to find the start and end indices of a group match. How can I do this? For example:

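>>> import re
>>> REGEX = re.compile(r'h(?P<num>[0-9]{3})p')
>>> test = "hello h889p something"
>>> match = REGEX.search(test)
>>> match.group('num')
'889'
>>> match.start()
6
>>> match.end()
11
>>> match.group('num').start()  # tried this; it doesn't work
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'start'
>>> REGEX.groupindex
mappingproxy({'num': 1})  # this is the index of the group within the regex, not the index of the group match, so it is not what I'm looking for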

The expected output for the above is (7, 10).

You can use string indexing together with the index() method:

>>> import re
>>> REGEX = re.compile(r'h(?P<num>[0-9]{3})p')
>>> test = "hello h889p something"
>>> match = REGEX.search(test)
>>> test.index(match.group('num')[0])
7
>>> test.index(match.group('num')[-1])
9

If you want the result as a tuple:

>>> str_match = match.group('num')
>>> results = (test.index(str_match[0]), test.index(str_match[-1]))
>>> results
(7, 9)

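Note: you may want to consider using results = (test.index(str_match), test.index(str_match) + len(str_match)) instead, to guard against errors when the matched string contains repeated characters. For example, if the number ended in two 9s (such as 899), the result would be (7, 8), because the first instance of 9 would be at index 8.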
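The pitfall described in the note can be reproduced with a small sketch. The input string below is a hypothetical variation of the question's example (899 instead of 889), chosen only to show the repeated-character problem:

```python
import re

REGEX = re.compile(r'h(?P<num>[0-9]{3})p')
# Hypothetical input: the group ends in a repeated digit.
test = "hello h899p something"
match = REGEX.search(test)
str_match = match.group('num')

# Looking up the first and last characters separately finds the
# first '9' in the string, not the last character of the group.
per_char = (test.index(str_match[0]), test.index(str_match[-1]))

# Looking up the whole group and adding its length avoids that.
whole = (test.index(str_match), test.index(str_match) + len(str_match))

print(per_char)  # (7, 8)
print(whole)     # (7, 10)
```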
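A slight modification of the approach above is to use index() to find the whole group, rather than the group's first and last characters: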
import re
REGEX = re.compile(r'h(?P<num>[0-9]{3})p')
test = "hello h889p something"
match = REGEX.search(test)
group = match.group('num')

# modification here to find the start point
idx = test.index(group)

# find the end point using len of group
output = (idx, idx + len(group)) #(7, 10)

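This checks the whole string "889" when determining the indices, so it is less error-prone than checking the first 8 and the first 9 separately. It is still not perfect, though: it fails if "889" appears earlier in the string without being surrounded by "h" and "p". One workaround for the given example is to use lookarounds: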
import re
REGEX = re.compile(r'(?<=h)[0-9]{3}(?=p)')
test = "hello h889p something"
match = REGEX.search(test)
print(match)

<re.Match object; span=(7, 10), match='889'>
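For comparison, and not taken from the answers above: in the standard re module, the match object's start(), end() and span() methods accept an optional group name or number, so the group's position can be read from the match itself without searching the string again. The sketch below uses a hypothetical input in which "889" also appears earlier in the string, which is the case where the index()-based lookup goes wrong:

```python
import re

REGEX = re.compile(r'h(?P<num>[0-9]{3})p')
# Hypothetical input: "889" also occurs before the real match.
test = "889 hello h889p something"
match = REGEX.search(test)

# index() finds the first "889" in the string, not the matched one.
group = match.group('num')
by_index = (test.index(group), test.index(group) + len(group))

# span() with a group argument reports where that group matched.
by_span = match.span('num')

print(by_index)                              # (0, 3)
print(by_span)                               # (11, 14)
print(match.start('num'), match.end('num'))  # 11 14
```

On the question's original string, "hello h889p something", match.span('num') gives (7, 10), which is the expected output.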