C# regular expressions are slowing down my program
I'm trying to build a program that parses data out of a game's chat log. So far I've managed to get it working and it parses the data I want, but my problem is that the program keeps getting slower.

Currently it takes 5 seconds to parse a 10 MB text file. I noticed that if I add RegexOptions.Compiled to my regexes, that drops to 3 seconds.

I believe I've pinpointed the problem to my regex matching. Since there are 5 regexes, each line is currently scanned 5 times, so the program will get even slower when I add more regexes later.

What should I do so that my program doesn't slow down because of multiple regexes? All suggestions for making the code better are appreciated.
if (sender.Equals(ButtonParse))
{
    var totalShots = 0f;
    var totalHits = 0f;
    var misses = 0;
    var crits = 0;
    var regDmg = new Regex(@"(?<=\bSystem\b.* You inflicted )\d+.\d", RegexOptions.Compiled);
    var regMiss = new Regex(@"(?<=\bSystem\b.* Target evaded attack)", RegexOptions.Compiled);
    var regCrit = new Regex(@"(?<=\bSystem\b.* Critical hit - additional damage)", RegexOptions.Compiled);
    var regHeal = new Regex(@"(?<=\bSystem\b.* You healed yourself )\d+.\d", RegexOptions.Compiled);
    var regDmgrec = new Regex(@"(?<=\bSystem\b.* You take )\d+.\d", RegexOptions.Compiled);
    var dmgList = new List<float>(); //New list for damage values
    var healList = new List<float>(); //New list for heal values
    var dmgRecList = new List<float>(); //New list for damage received values
    using (var sr = new StreamReader(TextBox1.Text))
    {
        while (!sr.EndOfStream)
        {
            var line = sr.ReadLine();
            var match = regDmg.Match(line);
            var match2 = regMiss.Match(line);
            var match3 = regCrit.Match(line);
            var match4 = regHeal.Match(line);
            var match5 = regDmgrec.Match(line);
            if (match.Success)
            {
                dmgList.Add(float.Parse(match.Value, CultureInfo.InvariantCulture));
                totalShots++;
                totalHits++;
            }
            if (match2.Success)
            {
                misses++;
                totalShots++;
            }
            if (match3.Success)
            {
                crits++;
            }
            if (match4.Success)
            {
                healList.Add(float.Parse(match4.Value, CultureInfo.InvariantCulture));
            }
            if (match5.Success)
            {
                dmgRecList.Add(float.Parse(match5.Value, CultureInfo.InvariantCulture));
            }
        }
        TextBlockTotalShots.Text = totalShots.ToString(); //Show total shots
        TextBlockTotalDmg.Text = dmgList.Sum().ToString("0.##"); //Show total damage inflicted
        TextBlockTotalHits.Text = totalHits.ToString(); //Show total hits
        var hitChance = totalHits / totalShots; //Calculate hit chance
        TextBlockHitChance.Text = hitChance.ToString("P"); //Show hit chance
        TextBlockTotalMiss.Text = misses.ToString(); //Show total misses
        var missChance = misses / totalShots; //Calculate miss chance
        TextBlockMissChance.Text = missChance.ToString("P"); //Show miss chance
        TextBlockTotalCrits.Text = crits.ToString(); //Show total crits
        var critChance = crits / totalShots; //Calculate crit chance
        TextBlockCritChance.Text = critChance.ToString("P"); //Show crit chance
        TextBlockDmgHealed.Text = healList.Sum().ToString("F1"); //Show damage healed
        TextBlockDmgReceived.Text = dmgRecList.Sum().ToString("F1"); //Show damage received
        var pedSpent = dmgList.Sum() / (float.Parse(TextBoxEco.Text, CultureInfo.InvariantCulture) * 100); //Calculate ped spent
        TextBlockPedSpent.Text = pedSpent.ToString("0.##") + " PED"; //Estimated ped spent
    }
}
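Here is the problem as I see it.

The pattern below first checks whether [System] is contained in the line. If it is not, nothing on that line matches. If the line does contain [System], the pattern looks for specific keywords and an optional value, and places them into regex named match captures, giving a key/value pair.

Once that is done, LINQ sums up the values that were found. Note that I have commented the pattern and told the regex parser to ignore the comments.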
// Requires: using System.Globalization; using System.Linq; using System.Text.RegularExpressions;
string pattern = @"^                # Beginning of line to anchor it.
    (?=.+\[System\])                # Within the line a literal '[System]' has to occur.
    (?=.+                           # Somewhere within that line, search for these keywords:
        (?<Action>                  # Named match capture group 'Action' will hold a keyword.
            inflicte?d?             # If the line has 'inflict' or 'inflicted', put it into 'Action'
            |                       # or
            evaded                  # evaded
            | take                  # or take
            | yourself              # or yourself (heal)
        )
        (\s(?<Value>[\d.]+))?)      # If a points value exists, place it into 'Value'.
    .+                              # Match one or more characters to complete the line.
    $                               # End of line to stop on.";

// IgnorePatternWhitespace only lets us lay out and comment the pattern; it does not change what is matched.
var tokens =
    Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline)
         .OfType<Match>()
         .Select(mt => new
         {
             Action = mt.Groups["Action"].Value,
             // InvariantCulture keeps '.' as the decimal separator regardless of the system locale.
             Value = mt.Groups["Value"].Success
                 ? double.Parse(mt.Groups["Value"].Value, CultureInfo.InvariantCulture)
                 : 0,
             Count = 1,
         })
         .GroupBy(itm => itm.Action, // Key selector: each action is grouped under its name for summing.
                  itm => itm,        // Element selector: the items to be summed within each group.
                  (action, values) => new
                  {
                      Action = action,
                      Count = values.Sum(itm => itm.Count),
                      Total = values.Sum(itm => itm.Value)
                  });
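
To make it easier to see what the grouped query produces, here is a minimal, self-contained sketch. It is not the answer's exact code: it compacts the same pattern and wraps it in a hypothetical Summarize helper (the class name LogSummary and the dictionary return type are assumptions made for illustration), then prints one entry per keyword for a few lines taken from the sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class LogSummary
{
    // Same pattern as above, with the comments removed.
    private const string Pattern = @"^
        (?=.+\[System\])
        (?=.+(?<Action>inflicte?d?|evaded|take|yourself)
             (\s(?<Value>[\d.]+))?)
        .+$";

    // Hypothetical helper: maps each keyword to (number of lines, sum of values).
    public static Dictionary<string, (int Count, double Total)> Summarize(string data)
    {
        return Regex.Matches(data, Pattern,
                RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline)
            .OfType<Match>()
            .GroupBy(m => m.Groups["Action"].Value)
            .ToDictionary(
                g => g.Key,
                g => (Count: g.Count(),
                      Total: g.Sum(m => m.Groups["Value"].Success
                          ? double.Parse(m.Groups["Value"].Value, CultureInfo.InvariantCulture)
                          : 0d)));
    }

    public static void Main()
    {
        const string sample =
            "2014-09-02 23:07:22 [System] [] You inflicted 45.2 points of damage.\n" +
            "2014-09-02 23:07:24 [System] [] Target evaded attack.\n" +
            "2014-10-15 12:40:12 [System] [] You take 18.4 points of damage.\n" +
            "2014-10-19 05:59:39 [System] [] You healed yourself 84.0 points.";

        foreach (var entry in Summarize(sample))
        {
            Console.WriteLine(
                $"{entry.Key}: count={entry.Value.Count}, total={entry.Value.Total}");
        }
    }
}
```

Note that the keys are the matched keywords themselves, so "inflict" (which in the sample data appears only on critical-hit lines) and "inflicted" end up in separate groups.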
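Don't use lookarounds (variable-length lookbehinds, i.e. (?

Thanks a lot, Qtax! I didn't know about the performance of lookbehinds, and now I've learned about groups. Processing the whole thing now takes less than a second.

Thank you for showing me the regex pattern. I had tried to do something similar but lacked the skill; I'm learning now. I implemented this in my program, but I have two problems. First, it works fine on the sample data, but on the real log it gives me the error "Input string was not in a correct format". Second, I got it working after feeding in only the evade and critical data, but the program now takes 10 seconds to process the data.

@NHAHDH I agree that the pattern could be modified further to remove unnecessary parts and speed it up. This was my first attempt, so there may well be a better pattern. I have that part at the beginning (to match but not capture, (?:) ), or the beginning could be stripped off to find the actual match the user is looking for. That part anchors the match to each line. But I agree that at some point it could be changed to look only for the specific word matches, without the line anchors.

I'd suggest removing the whole thing, because anchoring on the keywords works better than having the + part.

@NHAHDH I should point out that I need to check that the message comes from [System] and not from anywhere else. The log also contains chat data, so if a player types those words I'd get inaccurate data, because the line didn't come from [System]. Anyway, as Qtax commented on my question, the lookbehinds were a huge performance hit in my program and removing them solved the problem, but if you know a better way to handle this, feel free to answer.

@Iceyou90 I updated my regex to verify that it only picks up system lines, but changed the logic to put each item into a key/value pair. As mentioned in the new post, since this is WPF, don't do any processing on the GUI thread; put the operation on a background thread/task so that the user isn't interrupted.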
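
The comments above suggest dropping the lookbehinds in favour of capture groups, while still requiring that the line comes from [System]. The following is one possible sketch of that idea, not the code the asker ended up with: a single compiled regex with one named group per event, applied once per line. The names CombatStats, SinglePassParser and the group names are made up for illustration, the line format is assumed to match the sample data, and, mirroring the question's code, a critical-hit line ("You inflict ...") is counted only as a crit.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public sealed class CombatStats
{
    public int Misses;
    public int Crits;
    public List<double> Damage = new List<double>();
    public List<double> Healed = new List<double>();
    public List<double> Taken = new List<double>();
}

public static class SinglePassParser
{
    // One pattern, no lookbehind: "[System]" must appear first, then exactly
    // one of the alternatives. The decimal point is escaped here.
    private static readonly Regex SystemEvent = new Regex(
        @"\[System\].*?(?:" +
        @"(?<crit>Critical hit - additional damage)" +
        @"|You inflicted (?<dmg>\d+\.\d+)" +
        @"|(?<miss>Target evaded attack)" +
        @"|You healed yourself (?<heal>\d+\.\d+)" +
        @"|You take (?<taken>\d+\.\d+))",
        RegexOptions.Compiled);

    private static double Number(Group g)
    {
        return double.Parse(g.Value, CultureInfo.InvariantCulture);
    }

    // Each line is matched once; the group that succeeded tells us the event.
    public static CombatStats Parse(IEnumerable<string> lines)
    {
        var stats = new CombatStats();
        foreach (var line in lines)
        {
            var m = SystemEvent.Match(line);
            if (!m.Success) continue;

            if (m.Groups["dmg"].Success) stats.Damage.Add(Number(m.Groups["dmg"]));
            else if (m.Groups["miss"].Success) stats.Misses++;
            else if (m.Groups["crit"].Success) stats.Crits++;
            else if (m.Groups["heal"].Success) stats.Healed.Add(Number(m.Groups["heal"]));
            else if (m.Groups["taken"].Success) stats.Taken.Add(Number(m.Groups["taken"]));
        }
        return stats;
    }

    public static void Main()
    {
        var stats = Parse(new[]
        {
            "2014-09-02 23:07:22 [System] [] You inflicted 45.2 points of damage.",
            "2014-09-02 23:07:24 [System] [] Target evaded attack.",
            "2014-10-19 05:59:39 [System] [] You healed yourself 84.0 points."
        });
        Console.WriteLine("Hits: " + stats.Damage.Count + ", misses: " + stats.Misses);
        Console.WriteLine("Total damage: " +
            stats.Damage.Sum().ToString("0.##", CultureInfo.InvariantCulture));
    }
}
```

For a real log, File.ReadLines(path) could be passed to Parse so the file is streamed line by line. Adding a new event type then means adding one alternative and one branch rather than one more full scan of every line.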
string data=@"2014-09-02 23:07:22 [System] [] You inflicted 45.2 points of damage.
2014-09-02 23:07:23 [System] [] You inflicted 45.4 points of damage.
2014-09-02 23:07:24 [System] [] Target evaded attack.
2014-09-02 23:07:25 [System] [] You inflicted 48.4 points of damage.
2014-09-02 23:07:26 [System] [] You inflicted 48.6 points of damage.
2014-10-15 12:39:55 [System] [] Target evaded attack.
2014-10-15 12:39:58 [System] [] You inflicted 56.0 points of damage.
2014-10-15 12:39:59 [System] [] You inflicted 74.6 points of damage.
2014-10-15 12:40:02 [System] [] You inflicted 78.6 points of damage.
2014-10-15 12:40:04 [System] [] Target evaded attack.
2014-10-15 12:40:06 [System] [] You inflicted 66.9 points of damage.
2014-10-15 12:40:08 [System] [] You inflicted 76.2 points of damage.
2014-10-15 12:40:12 [System] [] You take 18.4 points of damage.
2014-10-15 12:40:14 [System] [] You inflicted 76.1 points of damage.
2014-10-15 12:40:17 [System] [] You inflicted 88.5 points of damage.
2014-10-15 12:40:19 [System] [] You inflicted 69.0 points of damage.
2014-10-19 05:56:30 [System] [] Critical hit - additional damage! You inflict 275.4 points of damage.
2014-10-19 05:59:29 [System] [] You inflicted 92.8 points of damage.
2014-10-19 05:59:31 [System] [] Critical hit - additional damage! You inflict 251.5 points of damage.
2014-10-19 05:59:35 [System] [] You take 59.4 points of damage.
2014-10-19 05:59:39 [System] [] You healed yourself 84.0 points.";
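
On the remark that a WPF application should not do this work on the GUI thread: a rough sketch of one way to move the file processing onto a thread-pool thread with Task.Run. The helper CountSystemLinesAsync and the class name are hypothetical, and the work it does (counting [System] lines) is only a stand-in for the real parsing; the handler shown in the comment, including the name ButtonParse_Click, is likewise an assumption about how the code-behind might look.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class BackgroundParsing
{
    // Hypothetical stand-in for the real parsing: reads the file and counts
    // the lines containing "[System]" on a thread-pool thread.
    public static Task<int> CountSystemLinesAsync(string path)
    {
        return Task.Run(() =>
            File.ReadLines(path).Count(line => line.Contains("[System]")));
    }

    // In a WPF code-behind, the click handler could be declared async and
    // await the task; the controls are only touched after the await, when
    // execution has returned to the UI thread:
    //
    //   private async void ButtonParse_Click(object sender, RoutedEventArgs e)
    //   {
    //       int count = await BackgroundParsing.CountSystemLinesAsync(TextBox1.Text);
    //       TextBlockTotalShots.Text = count.ToString();
    //   }

    public static async Task Main()
    {
        // Console demonstration using a temporary file.
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllLines(path, new[]
            {
                "2014-09-02 23:07:22 [System] [] You inflicted 45.2 points of damage.",
                "2014-09-02 23:07:23 [Local] [Bob] hello",
                "2014-09-02 23:07:24 [System] [] Target evaded attack."
            });

            int count = await CountSystemLinesAsync(path);
            Console.WriteLine("System lines: " + count);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```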