C#: Optimizing Regex replacement in a for loop?

c# regex optimization replace

This is basically a follow-up to my previous question. I have been using the following code to replace strings contained in an array:

string[] replacements = {"these",
                         "words",
                         "will",
                         "get",
                         "replaced"};

string newString = "Hello.replacedthesewordswillgetreplacedreplaced";

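for (int j = 0; j < replacements.Length; j++)
{
    // Match one occurrence of the word followed by any number of immediate repeats.
    // Group 3 captures each repeat, so Captures.Count + 1 is the length of the run.
    newString = Regex.Replace(newString,
    @"((?<firstMatch>(" + replacements[j] + @"))(\k<firstMatch>)*)",
    m => "[" + j + "," + (m.Groups[3].Captures.Count + 1) + "]");
}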

After running this code, newString will be:

Hello.[4,1][0,1][1,1][2,1][3,1][4,2]

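This works well for small replacements like the one above; the string is replaced essentially instantly. With a large number of replacements, however, it tends to slow down.

Can anyone see a way to optimize this so that the replacement runs faster?

I assume the for loop is what slows it down. The array always contains some strings that do not need to be replaced (because they do not occur in the main newString string), so I wonder whether there is a way to check for them before the for loop. That might turn out to be even slower, though.

I can't think of a better way to do this, so I thought I would ask. Thanks for any help! :)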
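
The pre-check the question wonders about could look roughly like the sketch below. It keeps the loop from the question but skips the regex pass for any word that an ordinal substring search does not find. The class name PreFilterSketch, the method name ReplaceWithPreCheck, and the group name "repeat" are made up for illustration, and whether this is actually faster depends on the data, so it would need to be measured.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PreFilterSketch
{
    // Hypothetical helper: same per-word loop as in the question,
    // but words that do not occur in the text are skipped cheaply.
    public static string ReplaceWithPreCheck(string input, string[] replacements)
    {
        string result = input;
        for (int j = 0; j < replacements.Length; j++)
        {
            // Ordinal substring test before paying for a regex pass
            if (result.IndexOf(replacements[j], StringComparison.Ordinal) < 0)
                continue;

            int index = j; // local copy for the lambda
            string word = Regex.Escape(replacements[j]); // treat the word literally

            // One occurrence of the word, then zero or more immediate repeats
            result = Regex.Replace(result,
                "(?<firstMatch>" + word + @")(?<repeat>\k<firstMatch>)*",
                m => "[" + index + "," + (m.Groups["repeat"].Captures.Count + 1) + "]");
        }
        return result;
    }
}
```

With the array and input string from the question, this should return the same result the question reports, Hello.[4,1][0,1][1,1][2,1][3,1][4,2].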

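There are two approaches you can try (note that both are untested, but I believe they should work and be faster than your current code).

One uses a static compiled regex: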

// Maps each word to its position in the replacements array from the question
private static readonly Dictionary<string, int> Indexes = new Dictionary<string, int>
{
  { "these", 0 },
  { "words", 1 },
  { "will", 2 },
  { "get", 3 },
  { "replaced", 4 },
};

// A single alternation of all words, compiled once and reused
private static readonly Regex ReplacementRegex = new Regex(string.Join("|", Indexes.Keys), RegexOptions.Compiled);

...
// Running count of how many times each word has been seen so far
var occurrences = Indexes.Keys.ToDictionary(k => k, k => 0);
return ReplacementRegex.Replace(newString, m => {
  var count = occurrences[m.Value] + 1;
  occurrences[m.Value] = count;
  return "[" + Indexes[m.Value] + "," + count + "]";
});
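
The code above numbers occurrences of each word across the whole string, whereas the loop in the question counts immediate repeats. A single-pass variant that keeps the question's counting could be sketched as follows. This is only an illustration under assumptions: the class name SinglePassSketch and the group name "word" are invented here, the words are assumed to be distinct and non-empty, and because everything is matched in one pass the result can differ from the per-word loop when words overlap or one word is a prefix of another.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SinglePassSketch
{
    private static readonly string[] Words = { "these", "words", "will", "get", "replaced" };

    // Maps each word to its position in the array (assumes distinct words)
    private static readonly Dictionary<string, int> Indexes =
        Words.Select((w, i) => new { Word = w, Index = i })
             .ToDictionary(x => x.Word, x => x.Index);

    // (?<word>these|words|...) followed by zero or more repeats of the same word
    private static readonly Regex Pattern = new Regex(
        "(?<word>" + string.Join("|", Words.Select(w => Regex.Escape(w))) + @")(?:\k<word>)*",
        RegexOptions.Compiled);

    public static string Replace(string input)
    {
        return Pattern.Replace(input, m =>
        {
            string word = m.Groups["word"].Value;
            // The whole match is the word repeated back to back,
            // so its length divided by the word length is the run length.
            int run = m.Length / word.Length;
            return "[" + Indexes[word] + "," + run + "]";
        });
    }
}
```

For the input string from the question, SinglePassSketch.Replace should produce Hello.[4,1][0,1][1,1][2,1][3,1][4,2], scanning the text once instead of once per word.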

One without regex:

for (int j = 0; j < replacements.Length; j++)
{
  var index = 0;
  var count = 0;
  var replacement = replacements[j];
  while((index = newString.IndexOf(replacement, index)) > -1) 
  {
    count++;
    newString = newString.Substring(0, index) + "[" + j + "," + count + "]" + newString.Substring(index + replacement.Length);
  }
}
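
The loop above increments count for every occurrence of a word, wherever it appears. If the second number is meant to be the length of a run of immediately repeated words, as in the question's regex, a non-regex variant might look like the sketch below. The class name NoRegexSketch is hypothetical, and the StringBuilder is a swapped-in choice to avoid rebuilding the string with Substring on every hit; it is not part of the original answer.

```csharp
using System;
using System.Text;

public static class NoRegexSketch
{
    // Hypothetical helper: replaces each run of a repeated word with "[index,runLength]"
    public static string Replace(string input, string[] replacements)
    {
        string result = input;
        for (int j = 0; j < replacements.Length; j++)
        {
            string word = replacements[j];
            if (word.Length == 0) continue; // an empty word would never advance

            var sb = new StringBuilder(result.Length);
            int pos = 0;
            int hit;
            while ((hit = result.IndexOf(word, pos, StringComparison.Ordinal)) >= 0)
            {
                // Copy the untouched text before the match
                sb.Append(result, pos, hit - pos);

                // Count how many copies of the word follow back to back
                int run = 0;
                int next = hit;
                while (next + word.Length <= result.Length &&
                       string.CompareOrdinal(result, next, word, 0, word.Length) == 0)
                {
                    run++;
                    next += word.Length;
                }

                sb.Append('[').Append(j).Append(',').Append(run).Append(']');
                pos = next; // continue after the whole run
            }

            // Copy whatever is left after the last match
            sb.Append(result, pos, result.Length - pos);
            result = sb.ToString();
        }
        return result;
    }
}
```

With the array and input string from the question, this should give Hello.[4,1][0,1][1,1][2,1][3,1][4,2], the result stated in the question.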

Thanks a lot! But I'm having some trouble with the non-regex code. The text "Hello.replacedthesewordswillgetreplacedreplaced" should be replaced with

Hello.[4,1][0,1][1,1][2,1][3,1][3,1][4,2][4,2]

but it is turning into

Hello.[4,1][0,1][1,1][2,1][3,1][4,3]

The second number should be the number of adjacent matching strings that were replaced. That is the trouble I ran into when I asked my previous question, which Kendall Frey and Sean solved. Do you know how I could do this? Thanks again, rich.