C# Regex: find all strings except those in Debug strings

c# regex


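First of all, I can only use C# regex, so suggesting other languages or non-regex solutions won't help. Now the question.

I have to find all strings in our code (a few thousand files). There are basically 6 cases: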

   string a = "something"; // output: "something"
   string b = "something else" + " more"; // output: "something else" and " more"
   Print("this should match"); // output: "this should match"
   Print("So" + "should this"); // output: "So" and "should this"
   Debug("just some bebug text"); // output: this should not match
   Debug("more " + "debug text"); // output: this should not match

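The regex should match the first 4 cases (I only need the content inside the quotes, and Print could be any other function).

So far I have this one, which returns anything in quotes: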

 ".*?"

In short:

@"^(?!Debug\("")([^""]*""(?<Text>[^""]*)"")*.*$"

What it does:

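1. Does not match if the string starts with `Debug("`.
2. Runs along the string until it encounters the first `"`, then passes it.
   - If no `"` is found and the end of the string has been reached, it stops.
3. Starts "recording" into a group named `Text`.
4. Runs along the string until it encounters the next `"`, stops recording, and passes it.
5. Goes back to step 2.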

Result: in the group named `Text` you have all the strings that sit between `"` characters.

What's left to do: convert it into a multiline regex, and support whitespace (`\s`) before `Debug`.
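The two remaining items could be handled roughly as follows. This is only a sketch under stated assumptions, not part of the original answer: the class name `MultilineStringFinder` and the method `Find` are made up for illustration, `[ \t]*` is used instead of `\s*` so the lookahead cannot run across a line break, and `\n` is excluded from the character classes so a quoted string cannot span lines.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MultilineStringFinder
{
    // Assumption: the pattern from the answer, adjusted for whole-file input.
    // RegexOptions.Multiline makes ^ and $ anchor at every line, not just the
    // start and end of the input.
    private static readonly Regex LinePattern = new Regex(
        @"^(?![ \t]*Debug\("")([^""\n]*""(?<Text>[^""\n]*)"")*.*$",
        RegexOptions.Multiline);

    // Returns the contents of every quoted string on lines that do not
    // start with (optionally indented) Debug(".
    public static List<string> Find(string source)
    {
        var found = new List<string>();
        foreach (Match line in LinePattern.Matches(source))
        {
            foreach (Capture capture in line.Groups["Text"].Captures)
            {
                found.Add(capture.Value);
            }
        }
        return found;
    }

    public static void Main()
    {
        var source =
            "string a = \"something\";\n" +
            "    Debug(\"skip me\");\n" +
            "Print(\"So\" + \"should this\");";

        foreach (var text in Find(source))
        {
            Console.WriteLine(text); // something, So, should this (one per line)
        }
    }
}
```

Like the original pattern, this sketch does not understand escaped quotes (`\"`), verbatim strings, or quotes inside comments, and it only skips a line when `Debug("` is the first thing on it.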

Further usage example and test:

// Requires: using System; using System.Text.RegularExpressions;
var regex = new Regex(@"^(?!Debug\("")([^""]*""(?<Text>[^""]*)"")*.*$");

var inputs = new[]
                 {
                     @"string a = ""something"";",
                     @"string b = ""something else"" + "" more"";",
                     @"Print(""this should match"");",
                     @"Print(""So"" + ""should this"");",
                     @"Debug(""just some bebug text"");",
                     @"Debug(""more "" + ""debug text"");"
                 };

foreach (var input in inputs)
{
    Console.WriteLine(input);
    Console.WriteLine("=====");

    var match = regex.Match(input);

    // Captures holds every repetition of the Text group, not just the last one.
    var captures = match.Groups["Text"].Captures;

    for (var i = 0; i < captures.Count; i++)
    {
        Console.WriteLine(captures[i].Value);
    }

    Console.WriteLine("=====");
    Console.WriteLine();
}

Output:

string a = "something";
=====
something
=====

string b = "something else" + " more";
=====
something else
 more
=====

Print("this should match");
=====
this should match
=====

Print("So" + "should this");
=====
So
should this
=====

Debug("just some bebug text");
=====
=====

Debug("more " + "debug text");
=====
=====

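Comments:

I would suggest using a regex tool to help you with your homework. I use Expresso.

Can you provide some sample data? Or do you just want to exclude any line that contains "debug"?

The output for the above should be: "something", "something else", " more", "this should match", "So", "should this". What I do not want is "just some bebug text".

Regarding your comment: I misunderstood the question, so I will delete my answer.

As for the second part, I would recommend any equally good solution that is simpler than a regex, since he is writing the extraction program in C#.

@nhahdh Totally agree, but "I can only use C# regex" sounds like homework, so there is nothing to be done. It is what the OP needs.

@YoryeNathan, thanks for your help. It works, but I still have trouble understanding it. From my understanding, `^(?!Debug\(""` gets rid of `Debug(`; `([^""]*""(?<Text>[^""]*)"")*` is for the actual strings, using `[^""]*` to get rid of the `"`; and then `.*$` consumes the `)` at the end of the line. When I write a regex I use an online tool to test it, but the regex above still does not work there. I assume Ruby has different rules for regex?

`(?!Debug\("")` rejects lines that start with `Debug("`, but it does not actually "move the reader": it only makes sure the line does not start with that text, then goes back to the beginning and tries to match the rest of the string. `[^""]*` passes over all non-`"` characters, and then `""` consumes the `"` character, so now we are at the start of a text. `(?<Text>[^""]*)` consumes everything up to the next `"` into the `Text` group, and then another `""` consumes that `"`. That whole last part is wrapped in parentheses and followed by `*`, which allows multiple texts on the same line (`*` means "repeat zero or more times").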
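The two points in this explanation that tend to cause confusion can be checked directly. The sketch below is illustrative only (the class name `LookaheadDemo` is made up): it shows that the lookahead matches with length zero, and that `Groups["Text"].Value` holds only the last repetition while `Captures` keeps all of them. Note also that `""` is just how a C# verbatim string (`@"..."`) writes a single `"`, so when the pattern is pasted into a tester for another language each `""` probably needs to become a single `"`, and that tester may not expose per-repetition captures at all.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // The pattern from the answer, unchanged.
    public const string Pattern = @"^(?!Debug\("")([^""]*""(?<Text>[^""]*)"")*.*$";

    public static void Main()
    {
        var line = @"Print(""So"" + ""should this"");";

        // The lookahead on its own succeeds without consuming any characters.
        var guard = Regex.Match(line, @"^(?!Debug\("")");
        Console.WriteLine(guard.Success); // True
        Console.WriteLine(guard.Length);  // 0

        // A repeated group: Value is the last repetition, Captures has them all.
        var text = Regex.Match(line, Pattern).Groups["Text"];
        Console.WriteLine(text.Value);          // should this
        Console.WriteLine(text.Captures.Count); // 2

        // A line starting with Debug(" is rejected as a whole.
        Console.WriteLine(Regex.IsMatch(@"Debug(""more "" + ""debug text"");", Pattern)); // False
    }
}
```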