C# Optimizing Regex replace in a for loop?
This is basically a follow-up to my previous question. I've been using the following code to replace strings contained in an array:
string[] replacements = {"these",
                         "words",
                         "will",
                         "get",
                         "replaced"};
string newString = "Hello.replacedthesewordswillgetreplacedreplaced";
for (int j = 0; j < replacements.Length; j++)
{
    // Match one occurrence of the word plus any immediate repeats of it.
    newString = Regex.Replace(newString,
        @"((?<firstMatch>(" + replacements[j] + @"))(\k<firstMatch>)*)",
        // Groups[3] collects the repeats, so +1 gives the length of the run.
        m => "[" + j + "," + (m.Groups[3].Captures.Count + 1) + "]");
}
After running this code, newString will be:
Hello.[4,1][0,1][1,1][2,1][3,1][4,2]
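This works well for small replacement sets like the one above; the string is replaced essentially instantly. With a large number of replacements, though, it tends to slow down.
Can anyone see a way I could optimize this so that it replaces faster?
I assume the for loop is what slows it down. The array always contains some strings that don't need replacing (because they don't occur in the main newString string), so I wondered whether there is a way to check for them before the for loop. That might turn out to be even slower, though.
I can't think of a better way to do this, so I thought I'd ask. Thanks for any help! :)

There are two approaches you could try (note that both are untested, but I believe they should work and be faster than your current code).
One uses a static compiled regex: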
private static readonly Dictionary<string, int> Indexes = new Dictionary<string, int>
{
    { "these", 0 },
    { "words", 1 },
    { "will", 2 },
    { "get", 3 },
    { "replaced", 4 },
};
// One alternation over all words, compiled once and reused for every call.
private static readonly Regex ReplacementRegex = new Regex(string.Join("|", Indexes.Keys), RegexOptions.Compiled);
...
// Running count of how many times each word has been replaced so far.
var occurrences = Indexes.Keys.ToDictionary(k => k, k => 0);
return ReplacementRegex.Replace(newString, m => {
    var count = occurrences[m.Value];
    occurrences[m.Value] = count + 1;
    return "[" + Indexes[m.Value] + "," + count + "]";
});
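The snippet above keeps a running count per word, starting at zero, whereas the output in the question reports how many copies of a word sit back to back. Below is a minimal sketch of a single-pass variant that reports run lengths instead. The names RunReplacer and ReplaceRuns are hypothetical (they are not from the original answer), and the sketch assumes ordinal, case-sensitive matching and that a single pass over the input is acceptable in place of one pass per word.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class RunReplacer
{
    public static string ReplaceRuns(string input, string[] replacements)
    {
        // Map each non-empty word to its position in the array (first occurrence wins).
        var indexes = new Dictionary<string, int>();
        for (int i = 0; i < replacements.Length; i++)
        {
            if (replacements[i].Length > 0 && !indexes.ContainsKey(replacements[i]))
                indexes[replacements[i]] = i;
        }
        if (indexes.Count == 0) return input;

        // Longest words first, so a word is not shadowed by a shorter word that is its prefix.
        var alternatives = indexes.Keys
            .OrderByDescending(w => w.Length)
            .Select(w => Regex.Escape(w));

        // Capture one word, then consume any immediate repeats of that same word.
        var pattern = "(?<word>" + string.Join("|", alternatives) + @")(?:\k<word>)*";
        var regex = new Regex(pattern);

        return regex.Replace(input, m =>
        {
            string word = m.Groups["word"].Value;
            int runLength = m.Length / word.Length; // the match is runLength copies of word
            return "[" + indexes[word] + "," + runLength + "]";
        });
    }
}
```

Because the whole input is scanned once, words that do not occur in it cost nothing beyond being part of the pattern. If the word list is fixed, the Regex could be built once and cached, as in the static field above.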
Without regex:
for (int j = 0; j < replacements.Length; j++)
{
    var index = 0;
    var count = 0; // running total for this word across the whole string
    var replacement = replacements[j];
    while ((index = newString.IndexOf(replacement, index)) > -1)
    {
        count++;
        var token = "[" + j + "," + count + "]";
        newString = newString.Substring(0, index) + token + newString.Substring(index + replacement.Length);
        index += token.Length; // resume searching after the inserted token
    }
}
Thanks a lot! I've run into a problem with the non-regex code, though. The text "Hello.replacedthesewordswillgetreplacedreplaced" should be replaced with Hello.[4,1][0,1][1,1][2,1][3,1][4,2]
but it is turning into Hello.[4,1][0,1][1,1][2,1][3,1][4,2][4,3].
The second number should be the count of adjacent matching strings that were replaced. That is the trouble I had when I asked my previous question, which Kendall Frey and Sean solved. Do you know how I could do that? Thanks again, Rich.
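One way to get adjacent-run counts without regex is to count the back-to-back copies at each hit before emitting a single token. Below is a minimal sketch under stated assumptions: the names IndexOfRunReplacer and ReplaceRuns are hypothetical, matching is ordinal and case-sensitive, and words are processed one at a time in array order, as in the loop from the answer.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original thread.
public static class IndexOfRunReplacer
{
    public static string ReplaceRuns(string input, string[] replacements)
    {
        string current = input;
        for (int j = 0; j < replacements.Length; j++)
        {
            string word = replacements[j];
            if (word.Length == 0) continue; // an empty word would match at every position

            int found = current.IndexOf(word, StringComparison.Ordinal);
            if (found < 0) continue; // word absent: skip it without allocating anything

            var sb = new StringBuilder(current.Length);
            int pos = 0;
            while (found >= 0)
            {
                sb.Append(current, pos, found - pos); // text before this run
                pos = found;
                int run = 0;
                // Count how many copies of the word sit back to back.
                while (string.CompareOrdinal(current, pos, word, 0, word.Length) == 0)
                {
                    run++;
                    pos += word.Length;
                }
                sb.Append('[').Append(j).Append(',').Append(run).Append(']');
                found = current.IndexOf(word, pos, StringComparison.Ordinal);
            }
            sb.Append(current, pos, current.Length - pos); // tail after the last run
            current = sb.ToString();
        }
        return current;
    }
}
```

The early IndexOf check also covers the idea raised in the question of skipping words that never occur in the string, and building each pass in a StringBuilder avoids creating a new string for every single replacement.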