Python regular expressions: repetition on a subgroup

python regex

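I am currently trying to get a grip on regular expressions in Python. I am not an expert in either. Perhaps I used the wrong regex when searching, which is why I did not find an answer. If so, my apologies.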

I created a teststring and two different regular expressions, as in the following code:

import re

teststring = "This is just a string of literal text with some 0987654321 and an issue in it"

# Zero or more repetitions of a three-digit group
reg = re.compile(r"([0-9]{3})*", re.DEBUG)
outSearch = reg.search(teststring)

print("Test with ([0-9]{3})*")
if outSearch:
    print("groupSearch = " + outSearch.group())
    print("")

# One or more repetitions of a three-digit group
reg = re.compile(r"([0-9]{3})+", re.DEBUG)
outSearch = reg.search(teststring)

print("Test with ([0-9]{3})+")
if outSearch:
    print("groupSearch = " + outSearch.group())

This test code produces the following output:

max_repeat 0 4294967295
  subpattern 1
    max_repeat 3 3
      in
        range (48, 57)
Test with ([0-9]{3})*
groupSearch = 

max_repeat 1 4294967295
  subpattern 1
    max_repeat 3 3
      in
        range (48, 57)
Test with ([0-9]{3})+
groupSearch = 098765432

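Now the interesting part: I expected both regular expressions to return the same result, namely the one I get with ([0-9]{3})+. When I use ([0-9]{3})*, the regex does match the teststring, but outSearch.group() is empty. Can someone explain why?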

By the way, neither regex has any practical use; I am just trying to understand how regular expressions work in Python.

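Your first pattern uses `*` for the repetition, which means it matches zero or more occurrences of the preceding group. With `+`, at least one occurrence is required. So a regex that consists only of an optional group will match at the very start of the string, and it will match zero characters there if the group cannot accept the string's first character. This becomes clearer if you check `start()` and `end()` for each match: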

import re

teststring = "some 0987654321"
reg = re.compile(r"([0-9]{3})*", re.DEBUG)
outSearch = reg.search(teststring)

print("Test with ([0-9]{3})*")
if outSearch:
    print("groupSearch = " + outSearch.group() + ' , ' + str(outSearch.start()) + ' , ' + str(outSearch.end()))

reg = re.compile(r"([0-9]{3})+", re.DEBUG)
outSearch = reg.search(teststring)

print("Test with ([0-9]{3})+")
if outSearch:
    print("groupSearch = " + outSearch.group() + ' , ' + str(outSearch.start()) + ' , ' + str(outSearch.end()))

Output:

Test with ([0-9]{3})*
groupSearch =  , 0 , 0

Test with ([0-9]{3})+
groupSearch = 098765432 , 5 , 14

(The first regex's match starts at index 0 and ends at index 0, so it is the empty string.)

This is not unique to Python; it is the expected behavior in other regex engines too, which likewise start and end the match at index 0.
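As a rough illustration of why the position of the digits matters, the sketch below runs the same `*` pattern against two made-up strings (`digits_first` and `digits_later` are hypothetical names, not from the question). It assumes only the standard `re` module.

```python
import re

pattern = re.compile(r"([0-9]{3})*")

# Hypothetical inputs: the same digits, placed first or later in the string.
digits_first = "0987654321 some"
digits_later = "some 0987654321"

m_first = pattern.search(digits_first)
m_later = pattern.search(digits_later)

# Digits at index 0: the group can repeat three times right away.
print(repr(m_first.group()), m_first.span())
# A letter at index 0: zero repetitions already succeed, so the match is empty.
print(repr(m_later.group()), m_later.span())
```

In both cases `search()` stops at the first position where the pattern succeeds, which is index 0; only the length of the match differs.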

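In regular expressions:

`+`: matches one or more of the preceding token
`*`: matches zero or more of the preceding token

Now:

`([0-9]{3})+` matches three consecutive digits (`[0-9]{3}`) one or more times (`+`), so the overall match (group 0) contains nine digits, `098765432`, and leaves out the final `1` of `0987654321`. The match covers indices 48 through 56 (`teststring[48:57]`). You can check this with the `span()` method of the `SRE_Match` object, e.g. `outSearch.span()`.

`([0-9]{3})*` matches three consecutive digits zero or more times (`*`). Because zero repetitions are allowed, it matches at the start of the string and stops there, yielding the empty string as the overall match, i.e. a match that spans string index 0 to 0.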

I would add that printing the result of re.findall() is also helpful. You can then see every match found with the first regex (the result may be interesting too).
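A minimal sketch of that suggestion, using the shorter test string from the earlier answer. Two details are worth noting: with a capturing group, `findall()` returns the group's text rather than the whole match, and the exact number of empty strings in the list may differ between Python versions. The non-capturing variant `(?:...)` is an addition for illustration, not part of the original answers.

```python
import re

teststring = "some 0987654321"

# With a capturing group, findall() returns the group's text for each match,
# and '' for matches in which the group did not take part.
results = re.findall(r"([0-9]{3})*", teststring)
print(results)

# A non-capturing group (?:...) makes finditer() report whole matches;
# dropping the empty ones leaves only the run of digits.
non_empty = [m.group() for m in re.finditer(r"(?:[0-9]{3})*", teststring) if m.group()]
print(non_empty)
```

The many empty strings come from the pattern succeeding with zero repetitions at each position that does not start a three-digit run.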