Regex works on a .NET test site but not in my C# environment

c# regex

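Regarding this question: -

I got an answer to my question here: -

The .NET answer, which works on the regex .NET test site, does not work in my C# Visual Studio environment. Here is the unit test for it: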

[Test]
public void GetAllHtmlSubsectionsWorksAsExpected()
{
    var regPattern = new Regex(@"(?'o'<)(.*)(?'-o'>)+");

    var html = 
        "<%@ Page Language=\"C#\" %>" +
        "<td class=\"c1 c2 c3\" colspan=\"2\">" + 
        "lorem ipsum" + 
        "<div class=\"d1\" id=\"div2\" attrid=\"<%# Eval(\"CategoryID\") %>\">" + 
        "testing 123" + 
        "</div>" + 
        "asdf" + 
        "</td>";

    List<string> results = new List<string>();

    MatchCollection matches = regPattern.Matches(html);
    for (int mnum = 0; mnum < matches.Count; mnum++)
    {   
        Match match = matches[mnum];
        results.Add("Match #" + (mnum + 1) + " - Value: " + match.Value);
    }

    Assert.AreEqual(5, results.Count()); //Fails: results.Count() == 1
}


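Why does this work on the regexstorm site but not in my unit test?
A regex does two different things: matching and capturing.
What you need is capture group 1.
So you need to use this: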
results.Add("Match #" + (mnum + 1) + " - Value: " + match.Groups[1].Value);
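A minimal sketch of the difference between a match and a capture, assuming a shortened, hypothetical input string rather than the page from the question. The class and method names are made up for illustration. `Match.Value` is the whole matched text, while `Groups[1].Value` is only what the unnamed `(.*)` group captured:

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchVersusCapture
{
    // Returns the whole match and the text captured by group 1.
    public static (string Whole, string Group1) Inspect(string input)
    {
        // Same pattern as in the question. (.*) is the only unnamed group,
        // so .NET numbers it 1; the named group 'o' is numbered after it.
        var pattern = new Regex(@"(?'o'<)(.*)(?'-o'>)+");
        Match match = pattern.Match(input);
        return (match.Value, match.Groups[1].Value);
    }

    public static void Main()
    {
        // Hypothetical sample input, not taken from the question.
        var (whole, group1) = Inspect("<td class=\"c1\">");
        Console.WriteLine(whole);   // <td class="c1">
        Console.WriteLine(group1);  // td class="c1"
    }
}
```

Note that switching to `Groups[1]` only changes which text is reported for each match; it does not change how many matches are found.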

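Also, as the other answer points out, your input has no newlines, so the regex captures everything in the first match.
Note that parsing HTML with a regex is not best practice; you should use a dedicated parser.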
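Now, as for the problem itself: the pattern you are using only works on lines that contain a single substring starting with `<` and ending with the corresponding `>`. However, your input string contains no newlines. It looks like this:
<%@ Page Language="C#" %><td class="c1 c2 c3" colspan="2">lorem ipsum<div class="d1" id="div2" attrid="<%# Eval("CategoryID") %>">testing 123</div>asdf</td>

Because `.*` is greedy and `.` matches every character except a newline, it runs from the first `<` to the last `>` on the line, so the whole string comes back as a single match.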
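The effect of the missing newlines can be reproduced with a small sketch. The two input strings are shortened, hypothetical stand-ins for the markup in the question, and `GreedyDotDemo` is an illustrative name only:

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyDotDemo
{
    // Counts the matches of the question's pattern in the given input.
    public static int CountMatches(string input)
    {
        var pattern = new Regex(@"(?'o'<)(.*)(?'-o'>)+");
        return pattern.Matches(input).Count;
    }

    public static void Main()
    {
        // Hypothetical inputs: the same markup with and without newlines.
        string oneLine = "<td>lorem ipsum<div>testing 123</div>asdf</td>";
        string multiLine = "<td>\nlorem ipsum\n<div>\ntesting 123\n</div>\nasdf\n</td>";

        // Without newlines, .* swallows everything up to the last '>'.
        Console.WriteLine(CountMatches(oneLine));   // 1

        // With one tag per line, '.' stops at each newline.
        Console.WriteLine(CountMatches(multiLine)); // 4
    }
}
```

This would explain why a tester that is fed the markup one tag per line reports several matches, while the concatenated string in the unit test yields only one.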

To fix this, match balanced angle brackets instead. The pattern, and a commented C# version of it, appear in the code blocks at the end of this page.

Disclaimer: even this regex will fail if an element node contains an unpaired `<` or `>`, which is why you should not parse HTML with a regex.
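Are you sure `.*` needs to be changed? I can't find an example that shows the difference. So what is the difference between the `(?'o'…)` construct and `[^<>]+|<(?<c>)|>(?<-c>))*(?(c)(?!))`?

I am sure `.*` is not the best construct for matching delimited text. It will not match balanced substrings inside angle brackets. My last regex will match balanced substrings correctly.

Can you give an example of a balanced substring? I think my regex matches those as well. My answer is more straightforward.

My regex can do that, see my demo link.

I have seen it too. Proper balanced-construct matching has to check the state of the stack with a conditional clause.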
<((?>[^<>]+|<(?<c>)|>(?<-c>))*(?(c)(?!)))>

var r = new Regex(@"
    <                      # First '<'
      (                    # Capturing group 1
        (?>                # Atomic group start 
        [^<>]              # Match all characters other than `<` or `>`
        |
         < (?<c>)          # Match '<', and add a capture into group 'c'
        |
         > (?<-c>)         # Match '>', and delete 1 value from capture stack
        )*
        (?(c)(?!))         # Fails if 'c' stack isn't empty!
      )                
    >                      # Last closing `>`
", RegexOptions.IgnorePatternWhitespace);
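As a usage sketch, the one-line form of the balanced pattern can be run against the input string from the question. The class name `BalancedAngleBrackets` and the method `FindAll` are hypothetical helpers introduced here for illustration, not part of the original answer:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BalancedAngleBrackets
{
    // '<' pushes onto the 'c' stack, '>' pops it; the conditional
    // (?(c)(?!)) rejects the match if anything is left on the stack.
    private static readonly Regex Pattern =
        new Regex(@"<((?>[^<>]+|<(?<c>)|>(?<-c>))*(?(c)(?!)))>");

    // Returns every balanced <...> substring found in the input.
    public static List<string> FindAll(string input)
    {
        var results = new List<string>();
        foreach (Match match in Pattern.Matches(input))
        {
            results.Add(match.Value);
        }
        return results;
    }

    public static void Main()
    {
        // The input string from the question's unit test.
        string html =
            "<%@ Page Language=\"C#\" %>" +
            "<td class=\"c1 c2 c3\" colspan=\"2\">" +
            "lorem ipsum" +
            "<div class=\"d1\" id=\"div2\" attrid=\"<%# Eval(\"CategoryID\") %>\">" +
            "testing 123" +
            "</div>" +
            "asdf" +
            "</td>";

        List<string> results = FindAll(html);
        Console.WriteLine(results.Count); // 5
        foreach (string value in results)
        {
            Console.WriteLine(value);
        }
    }
}
```

The third match is the whole `<div ...>` tag, including the nested `<%# ... %>` block, because the inner `<` and `>` are paired on the `c` stack before the closing `>` is accepted.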