C#: making a string checker more efficient
I am using the code below to check whether one string is contained in another:
foreach (string testrecord in testlist)
{
    foreach (string realrecord in reallist)
    {
        if ((Regex.Replace(testrecord, "[^0-9a-zA-Z]+", "")
             .Contains((
             Regex.Replace(realrecord, "[^0-9a-zA-Z]+", "")))
             &&
             ((Regex.Replace(realrecord, "[^0-9a-zA-Z]+", "") != "")
             &&
             ((Regex.Replace(realrecord, "[^0-9a-zA-Z]+", "").Length >= 4)))))
        {
            matchTextBox.AppendText("Match: " + testrecord + " & " + realrecord + Environment.NewLine);
        }
    }
}
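However, this takes quite a long time to run. Since I added the regex that removes special characters, the run time has become much longer, but the regex is definitely required.
Is there a more efficient way to apply this regex? I tried applying it to the foreach string variables, but you cannot modify them because they are foreach iteration variables.

I wonder whether you are using a regex for this while overlooking the fact that you can achieve the same thing with the .Contains() method alone, which should make your code simpler and faster than before: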
foreach (string testrecord in testlist)
{
    foreach (string realrecord in reallist)
    {
        if (testrecord.Contains(realrecord))
        {
            matchTextBox.AppendText("Match: " + testrecord + " & " + realrecord + Environment.NewLine);
        }
    }
}
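
Note that a plain `Contains` compares the raw strings, so it does not behave like the question's code whenever a record contains punctuation or spaces. The following is a minimal sketch of that difference. The class name `ContainsComparison`, the helper `Strip`, and the sample values in the usage note are made up for illustration and do not come from the thread.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ContainsComparison
{
    // Hypothetical helper: removes everything except ASCII letters and digits,
    // using the same pattern as the question.
    public static string Strip(string s)
    {
        return Regex.Replace(s, "[^0-9a-zA-Z]+", "");
    }

    // Raw comparison, as in the answer above.
    public static bool RawMatch(string testrecord, string realrecord)
    {
        return testrecord.Contains(realrecord);
    }

    // Stripped comparison, as in the question (including the minimum length of 4).
    public static bool StrippedMatch(string testrecord, string realrecord)
    {
        string realCleaned = Strip(realrecord);
        return realCleaned.Length >= 4 && Strip(testrecord).Contains(realCleaned);
    }
}
```

For example, with `testrecord = "Order AB-1234 shipped"` and `realrecord = "AB1234"`, `RawMatch` returns false while `StrippedMatch` returns true, so the two approaches are only interchangeable if the data never contains such characters.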
Optimized version:
// Do not write to matchTextBox directly inside the loops:
// the control repaints every time its text changes.
// Instead, collect all the text in a StringBuilder.
StringBuilder Sb = new StringBuilder();
// Pull as much work as you can out of the inner loop;
// that is why I've changed the order of the loops:
// first loop over reallist, then over testlist.
foreach (string realrecord in reallist) {
    // Cache the Regex.Replace result
    String realCleaned = Regex.Replace(realrecord, "[^0-9a-zA-Z]+", "");
    // Test as early as possible
    if (realCleaned.Length < 4)
        continue;
    // You don't need to test realCleaned != "": the realCleaned.Length < 4 check covers it
    foreach (string testrecord in testlist) {
        // Cache the Regex.Replace result: it's slightly overkill here, but if
        // more tests are added it will be helpful
        String testCleaned = Regex.Replace(testrecord, "[^0-9a-zA-Z]+", "");
        if (testCleaned.Contains(realCleaned))
            Sb.AppendLine("Match: " + testrecord + " & " + realrecord);
    }
}
// Finally, update matchTextBox once
matchTextBox.AppendText(Sb.ToString());
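
The version above still runs the regex on every `testrecord` once per `realrecord`. A possible further step, in the same spirit of pulling work out of the inner loop, is to clean each list exactly once before comparing. This is a minimal sketch, assuming both lists fit comfortably in memory. The names `RecordMatcher`, `FindMatches`, and `NonAlphanumeric` are made up for illustration, and the UI update is left to the caller.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RecordMatcher
{
    // One shared Regex instance; RegexOptions.Compiled trades start-up cost
    // for faster repeated matching.
    private static readonly Regex NonAlphanumeric =
        new Regex("[^0-9a-zA-Z]+", RegexOptions.Compiled);

    // Returns a "Match: ..." line for every (testrecord, realrecord) pair whose
    // cleaned forms satisfy the same conditions as the question's code.
    public static List<string> FindMatches(
        IEnumerable<string> testlist, IEnumerable<string> reallist)
    {
        // Clean every record exactly once, keeping the original text for output.
        var tests = testlist
            .Select(t => new { Original = t, Cleaned = NonAlphanumeric.Replace(t, "") })
            .ToList();
        var reals = reallist
            .Select(r => new { Original = r, Cleaned = NonAlphanumeric.Replace(r, "") })
            .Where(r => r.Cleaned.Length >= 4)
            .ToList();

        var matches = new List<string>();
        foreach (var test in tests)
        {
            foreach (var real in reals)
            {
                if (test.Cleaned.Contains(real.Cleaned))
                    matches.Add("Match: " + test.Original + " & " + real.Original);
            }
        }
        return matches;
    }
}
```

The caller could then update the text box once, for example with `matchTextBox.AppendText(string.Join(Environment.NewLine, RecordMatcher.FindMatches(testlist, reallist)))`. The number of `Contains` calls is unchanged; only the regex work drops from one call per pair to one call per record, so any gain should be measured on the real data.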
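This should be a bit faster (one regex operation per testrecord):
// Strip each real record once, up front, and keep only results of 4+ characters.
var strippedRealList = reallist.Select(s => Regex.Replace(s, "[^0-9a-zA-Z]+", ""))
                               .Where(s => s.Length >= 4)
                               .ToArray();
foreach (string testrecord in testlist)
{
    // One regex operation per testrecord.
    string testCleaned = Regex.Replace(testrecord, "[^0-9a-zA-Z]+", "");
    foreach (string real in strippedRealList.Where(r => testCleaned.Contains(r)))
    {
        // Note: real is the stripped form of the record, not its original text.
        matchTextBox.AppendText("Match: " + testrecord + " & " + real + Environment.NewLine);
    }
}
If you need performance, you could implement the string processing yourself. As far as I can see, all you are doing is restricting the character set.

First of all, you probably want to run Regex.Replace(realrecord, "[^0-9a-zA-Z]+", "") only once and cache the result in a variable, rather than calling it three times per iteration.
The != "" check seems to be redundant with Length >= 4.
@O.R.Mapper He should do that even leaving performance aside. Right now it is copy-paste programming.
@usr: True. While we are at it, Regex.Replace(testrecord, "[^0-9a-zA-Z]+", "") is called once in every iteration of the inner loop, even though its result does not appear to change within the inner loop, so it could be called once in the outer loop instead.
-1: The regex is evidently needed to remove characters that are not letters or digits before comparing.
First run took 30 minutes; with the changes, 14 minutes. Gold star Dmitry, the speed doubled!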
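
The suggestion above to implement the string processing yourself could look roughly like the following. It is only a sketch of one way to restrict the character set without a regex. The names `AsciiFilter` and `StripNonAlphanumeric` are made up for illustration, and whether it actually beats `Regex.Replace` on the real data would need to be measured.

```csharp
using System;
using System.Text;

public static class AsciiFilter
{
    // Keeps only ASCII letters and digits, mirroring what replacing
    // [^0-9a-zA-Z]+ with an empty string does.
    public static string StripNonAlphanumeric(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            bool keep = (c >= '0' && c <= '9')
                     || (c >= 'a' && c <= 'z')
                     || (c >= 'A' && c <= 'Z');
            if (keep)
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Explicit ranges are used rather than `char.IsLetterOrDigit`, because the latter also accepts non-ASCII letters and digits, which the original pattern removes.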