C#: Why is Group.Value always the last matched group string?


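Recently I ran into a C# Regex API behavior that I find really annoying.

I have the regular expression (([0-9]+)|([a-z]+))+ and I want to find every matched string. The code looks like this: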

string regularExp = "(([0-9]+)|([a-z]+))+";
string str = "abc123xyz456defFOO";

Match match = Regex.Match(str, regularExp, RegexOptions.None);
int matchCount = 0;

while (match.Success)
{
    Console.WriteLine("Match" + (++matchCount));

    Console.WriteLine("Match group count = {0}", match.Groups.Count);
    for (int i = 0; i < match.Groups.Count; i++)
    {
        Group group = match.Groups[i];
        Console.WriteLine("Group" + i + "='" + group.Value + "'");
    }

    match = match.NextMatch();
    Console.WriteLine("go to next match");
    Console.WriteLine();
}
It seems that every group.Value is just the last matched string ("def" and "456"). It took me a while to figure out that I should rely on group.Captures rather than group.Value:

string regularExp = "(([0-9]+)|([a-z]+))+";
string str = "abc123xyz456def";
//Console.WriteLine(str);

Match match = Regex.Match(str, regularExp, RegexOptions.None);
int matchCount = 0;

while (match.Success)
{
    Console.WriteLine("Match" + (++matchCount));

    Console.WriteLine("Match group count = {0}", match.Groups.Count);
    for (int i = 0; i < match.Groups.Count; i++)
    {
        Group group = match.Groups[i];
        Console.WriteLine("Group" + i + "='" + group.Value + "'");

        CaptureCollection cc = group.Captures;
        for (int j = 0; j < cc.Count; j++)
        {
            Capture c = cc[j];
            System.Console.WriteLine("    Capture" + j + "='" + c + "', Position=" + c.Index);
        }
    }

    match = match.NextMatch();
    Console.WriteLine("go to next match");
    Console.WriteLine();
}

Now I wonder why the API is designed this way. Why does Group.Value return only the last matched string? This design doesn't look good to me.

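The main reason is historical: regexes have always worked this way, going back to Perl and beyond. But it isn't really bad design. Usually, if you want every match like that, you just drop the outermost quantifier (+ in this case) and use the Matches() method instead of Match(). Every regex-enabled language provides a way to do that: in Perl or JavaScript you match in /g mode; in Ruby you use the scan method; in Java you call find() repeatedly until it returns false. Similarly, if you're doing a replace operation, you can plug the captured substrings back in with placeholders ($1, $2 or \1, \2, depending on the language).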
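As a minimal sketch of that advice in C#: the outer (...)+ is dropped so that each run of digits or lowercase letters becomes its own match, and Regex.Replace uses the $1 placeholder. The class name TokenDemo and the helpers Tokens and BracketDigits are made up for illustration; they are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenDemo
{
    // Hypothetical helper: returns each digit run or lowercase run as a separate match.
    public static List<string> Tokens(string input)
    {
        var result = new List<string>();
        // Same alternatives as the question's pattern, but without the outer (...)+ quantifier.
        foreach (Match m in Regex.Matches(input, "[0-9]+|[a-z]+"))
        {
            result.Add(m.Value);
        }
        return result;
    }

    // Hypothetical helper: wraps every digit run in brackets via the $1 placeholder.
    public static string BracketDigits(string input)
    {
        return Regex.Replace(input, "([0-9]+)", "[$1]");
    }

    public static void Main()
    {
        // Prints: abc,123,xyz,456,def  (the uppercase "FOO" is not matched)
        Console.WriteLine(string.Join(",", Tokens("abc123xyz456defFOO")));
        // Prints: abc[123]xyz[456]def
        Console.WriteLine(BracketDigits("abc123xyz456def"));
    }
}
```

With this shape, each Match has a single Value and there is no repeated group whose earlier iterations could be overwritten.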

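On the other hand, as far as I know, no other Perl 5-derived regex flavor lets you retrieve the intermediate capture-group matches the way .NET does with its CaptureCollections. And I'm not surprised: it's actually quite rare that you really need to grab all the matches in one go like that. Think of all the storage and/or processing power it can take to keep track of those intermediate matches. It's a nice feature, though.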
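To make the contrast concrete, here is a small sketch that keeps the question's original pattern and reads the repeated group two ways. The class name CaptureDemo and the helpers LastValue and AllCaptures are illustrative names only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    private const string Pattern = "(([0-9]+)|([a-z]+))+";

    // Group.Value reports only the final iteration of the repeated group 1.
    public static string LastValue(string input)
    {
        Match m = Regex.Match(input, Pattern);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    // Group.Captures keeps every iteration of group 1, in match order.
    public static string[] AllCaptures(string input)
    {
        Match m = Regex.Match(input, Pattern);
        if (!m.Success) return new string[0];
        return m.Groups[1].Captures.Cast<Capture>().Select(c => c.Value).ToArray();
    }

    public static void Main()
    {
        // Prints: def
        Console.WriteLine(LastValue("abc123xyz456def"));
        // Prints: abc,123,xyz,456,def
        Console.WriteLine(string.Join(",", AllCaptures("abc123xyz456def")));
    }
}
```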

For reference, the Captures-based code from the question prints the following for the input "abc123xyz456def":

Match1
Match group count = 4
Group0='abc123xyz456def'
    Capture0='abc123xyz456def', Position=0
Group1='def'
    Capture0='abc', Position=0
    Capture1='123', Position=3
    Capture2='xyz', Position=6
    Capture3='456', Position=9
    Capture4='def', Position=12
Group2='456'
    Capture0='123', Position=3
    Capture1='456', Position=9
Group3='def'
    Capture0='abc', Position=0
    Capture1='xyz', Position=6
    Capture2='def', Position=12
go to next match