C# regex: replace only non-nested matches

c# .net regex

Given text such as:

This is my [position].
Here are some items:
[items]
    [item]
         Position within the item: [position]
    [/item]
[/items]

Once again, my [position].

I need to match the first and last [position], but not the [position] inside [items]…[/items]. Is this possible with a regular expression? So far, all I have is:

Regex.Replace(input, @"\[position\]", "replacement value")

But this matches more than I want.

You might be able to get away with:

Regex.Replace(input,@"(?=\[position\])(!(\[item\].+\[position\].+\[/item\]))","replacement value");

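I don't know; I hate regexes like this. This is really a job for an XML parser, not a regular expression. If your brackets really are square brackets, just search and replace them with angle brackets, then parse the result as XML.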

What about doing it in two passes? Something like:

var s1 = Regex.Replace(input, @"\[items\][\s\S]*?\[/items\]", "");

This gives you:

This is my [position].
Here are some items:
Once again, my [position].

As you can see, the items section has been removed. Then, on s1, you can replace the positions you want. Something like:

var s2 = Regex.Replace(s1, @"\[position\]", "replacement_value");

This may not be the best solution. I tried very hard to solve it with a single regex, but without success.
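A possible variation on the two-pass idea keeps the [items]…[/items] block in the output instead of deleting it. The sketch below matches either a whole block or a bare [position] in one pass and lets a match evaluator decide what to emit. The class name `SinglePassReplacer`, the method name `ReplaceOutsideItems`, and the group name `block` are made up for illustration, and it assumes [items] blocks are never nested inside other [items] blocks.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglePassReplacer
{
    // Alternation: an entire [items]...[/items] block, OR a bare [position] tag.
    // The block alternative comes first, so at an "[items]" opening the regex
    // consumes the whole block, including any [position] inside it.
    // Assumption: [items] blocks do not nest within each other.
    private static readonly Regex Pattern = new Regex(
        @"(?<block>\[items\][\s\S]*?\[/items\])|\[position\]");

    public static string ReplaceOutsideItems(string input, string replacement)
    {
        // If the "block" group matched, emit the block unchanged;
        // otherwise the match is a top-level [position], so emit the replacement.
        return Pattern.Replace(input, m =>
            m.Groups["block"].Success ? m.Value : replacement);
    }
}
```

Because the evaluator returns the matched block verbatim, nothing inside [items]…[/items] is touched, which the plain two-pass version cannot guarantee.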

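As Wug mentioned, regular expressions aren't good at counting. A simpler option is to just find the positions of all the tags you're looking for, then iterate over them and build your output accordingly. Maybe something like this: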

// Requires: using System.Text; using System.Text.RegularExpressions;
public static string Replace(string input, string replacement)
{
    // find all the tags
    var regex = new Regex(@"\[(?:position|/?item)\]");
    var matches = regex.Matches(input);

    // loop through the tags and build up the output string
    var builder = new StringBuilder();
    int lastIndex = 0;
    int nestingLevel = 0;
    foreach (Match match in matches)
    {
        // append everything since the last tag;
        builder.Append(input.Substring(lastIndex, match.Index - lastIndex));

        switch(match.Value)
        {
            case "[item]":
                nestingLevel++;
                builder.Append(match.Value);
                break;
            case "[/item]":
                nestingLevel--;
                builder.Append(match.Value);
                break;
            case "[position]":
                // Append the replacement text if we're outside of any [item]/[/item] pairs
                // Otherwise append the tag
                builder.Append(nestingLevel == 0 ? replacement : match.Value);
                break;
        }
        lastIndex = match.Index + match.Length;
    }

    builder.Append(input.Substring(lastIndex));
    return builder.ToString();
}

(Disclaimer: not tested. I haven't even tried to compile it. Apologies in advance for the inevitable bugs.)
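For anyone who wants to try the tag-scanning idea against the sample text, here is a self-contained sketch of the same approach. The names `TagReplacer` and `ReplaceTopLevel` are hypothetical, and clamping the depth at zero for a stray [/item] is an added assumption, not something the answer above specifies.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class TagReplacer
{
    // Matches [position], [item] and [/item]; [items] and [/items] are not matched.
    private static readonly Regex TagPattern = new Regex(@"\[(?:position|/?item)\]");

    public static string ReplaceTopLevel(string input, string replacement)
    {
        var builder = new StringBuilder();
        int lastIndex = 0;
        int depth = 0; // how many [item] tags are currently open

        foreach (Match match in TagPattern.Matches(input))
        {
            // Copy the text between the previous tag and this one.
            builder.Append(input, lastIndex, match.Index - lastIndex);

            if (match.Value == "[item]") depth++;
            // Assumption: ignore an unbalanced [/item] rather than going negative.
            else if (match.Value == "[/item]") depth = Math.Max(0, depth - 1);

            // Only a [position] outside every [item] pair is replaced.
            bool replace = match.Value == "[position]" && depth == 0;
            builder.Append(replace ? replacement : match.Value);
            lastIndex = match.Index + match.Length;
        }

        // Copy whatever follows the last tag.
        builder.Append(input, lastIndex, input.Length - lastIndex);
        return builder.ToString();
    }
}
```

Calling `TagReplacer.ReplaceTopLevel(text, "42")` on the sample text from the question should replace the first and last [position] and leave the one inside [item]…[/item] as it is.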

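This isn't HTML, but it's close enough to cite the obligatory post about parsing HTML with regular expressions.

Parse the text word by word. If you find a position inside a nested element (you'll have to maintain a flag for this), ignore it. For the others, replace the data. This algorithm is simple to write.

@Wug I disagree, because the OP wants to exclude all the [item]…[/item] bits from the search.

@Wug I'm definitely not parsing HTML.

I made my comment because you have to handle intentional exclusions based on other nested tags, which is a counting problem. Regular expressions can't count, or at least can't count well.

It would make an XML parser fail, because there is no root node and there are lots of unclosed "tags".

I tried this pattern in Expresso, but it didn't work. Using a verbatim string literal would make it more readable, i.e.

@"(?=\[position\])(!(\[item\].+\[position\]\[/item\]))"

@wug Yes, if it were me I'd do it in code, but I did it this way because that's the way he was doing it.

@PhillipSchmidt I did see one small problem in your pattern: \[position\]\[/item\] should be \[position\].+\[/item\]. Even with that change, it still doesn't work.

@chrisofspades Hang on, I'm actually testing it right now :P

Interesting suggestion, but I still need to keep the content inside [items]…[/items].

I was considering a similar approach myself, based on @shiplu.mokadd.im's comment above. This is probably the best solution, since a pure regex approach doesn't seem feasible.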