Java regex log parsing
I am using regular expressions to parse logs. Previously I read the file into a list of strings and iterated over it: while a line did not start with a timestamp, I appended it to the current entry; when a line did start with a timestamp, I stored the accumulated entry and began a new one. Once I had a complete log entry, I parsed it with another regular expression.

Scanning the file:
try {
    List<String> lines = Files.readAllLines(filepath);
    Pattern pattern = Pattern.compile("\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}");
    Matcher matcher;
    String currentEntry = "";
    for (String line : lines) {
        matcher = pattern.matcher(line);
        // If this is a new entry, then wrap up the previous one and start again
        if (matcher.lookingAt()) {
            // If the previous entry was not empty
            if (!StringUtils.trimWhitespace(currentEntry).isEmpty()) {
                entries.add(new LogEntry(currentEntry));
            }
            // Clear the current entry
            currentEntry = "";
        }
        if (!currentEntry.trim().isEmpty())
            currentEntry += "\n";
        currentEntry += line;
    }
    // At the end, if we have one leftover entry, add it
    if (!currentEntry.isEmpty()) {
        entries.add(new LogEntry(currentEntry));
    }
} catch (Exception ex) {
    return null;
}
You need to define the pattern like this:
final static String timestampRgx = "(?<timestamp>\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})";
final static String levelRgx = "(?<level>INFO|ERROR|WARN|TRACE|DEBUG|FATAL)";
final static String classRgx = "\\[(?<class>[^\\]]+)]";
final static String threadRgx = "\\[(?<thread>[^\\]]+)]";
final static String textRgx = "(?<text>.*?)(?=\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}|\\Z)";
private static Pattern PatternFullLog = Pattern.compile(timestampRgx + " " + levelRgx + "\\s+" + classRgx + "-" + threadRgx + "\\s+" + textRgx, Pattern.DOTALL);
Here is what the pattern looks like:
(?<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (?<level>INFO|ERROR|WARN|TRACE|DEBUG|FATAL)\s+\[(?<class>[^\]]+)]-\[(?<thread>[^\]]+)]\s+(?<text>.*?)(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}|\Z)
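To show how the pattern above might be applied, here is a minimal sketch that runs it over the whole file content with find() and reads the named groups. The class name LogParserSketch, the method parse, and the use of a Map per entry are assumptions made for illustration; they are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not from the original answer.
class LogParserSketch {
    // The pattern pieces are copied from the answer; only the constant names differ.
    static final String TIMESTAMP = "(?<timestamp>\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})";
    static final String LEVEL = "(?<level>INFO|ERROR|WARN|TRACE|DEBUG|FATAL)";
    static final String CLASS = "\\[(?<class>[^\\]]+)]";
    static final String THREAD = "\\[(?<thread>[^\\]]+)]";
    static final String TEXT = "(?<text>.*?)(?=\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}|\\Z)";

    // DOTALL lets '.' cross line breaks, so a multi-line entry stays in one match.
    static final Pattern FULL_LOG = Pattern.compile(
            TIMESTAMP + " " + LEVEL + "\\s+" + CLASS + "-" + THREAD + "\\s+" + TEXT,
            Pattern.DOTALL);

    // Returns one map per log entry, keyed by the named groups of the pattern.
    static List<Map<String, String>> parse(String content) {
        List<Map<String, String>> entries = new ArrayList<>();
        Matcher m = FULL_LOG.matcher(content);
        while (m.find()) {
            Map<String, String> entry = new LinkedHashMap<>();
            entry.put("timestamp", m.group("timestamp"));
            entry.put("level", m.group("level"));
            entry.put("class", m.group("class"));
            entry.put("thread", m.group("thread"));
            // The text group keeps the line break before the next entry, so trim it.
            entry.put("text", m.group("text").trim());
            entries.add(entry);
        }
        return entries;
    }

    public static void main(String[] args) {
        String content =
                "2017-03-14 22:43:14,476 INFO [org.springframework.beans.factory.xml.XmlBeanDefinitionReader]-[localhost-startStop-1] Here is a multiline\n"
              + "log entry with another entry after\n"
              + "2017-03-14 22:43:14,476 INFO [org.springframework.beans.factory.xml.XmlBeanDefinitionReader]-[localhost-startStop-1] Here is a multiline\n"
              + "log entry with no entries after\n";
        for (Map<String, String> entry : parse(content)) {
            System.out.println(entry);
        }
    }
}
```

With this approach the line-by-line grouping loop is no longer needed: the file can be read into a single string and passed to parse in one call. For very large files, keeping the line-based loop may still be preferable, since this sketch holds the whole content in memory.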
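A few notes:
- Inside a character class, a literal ] should be escaped, which is why the class and thread groups use [^\\]]+ in the Java string.
- Outside a character class, - is an ordinary character and needs no backslash.
- The pattern that matches the text up to the next datetime or the end of the string is (?<text>.*?)(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}|\Z): a lazy .*? followed by a lookahead for either a timestamp or the end of the string (\Z).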
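The two constructs from these notes can be tried in isolation. The following is a small sketch under assumed names (NotesSketch, firstBracketed and entryTexts are hypothetical and not part of the answer): one pattern extracts the content of a bracket pair with an escaped ] inside the character class, the other cuts the text of each entry at the next timestamp or at the end of the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original answer.
class NotesSketch {
    // Same timestamp shape as in the answer, without the named group.
    static final String TS = "\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}";

    // [^\]]+ reads "one or more characters that are not ]".
    static final Pattern BRACKETED = Pattern.compile("\\[([^\\]]+)]");

    // Lazy .*? plus a lookahead for the next timestamp or the end of the input.
    static final Pattern ENTRY_TEXT =
            Pattern.compile(TS + "\\s+(.*?)(?=" + TS + "|\\Z)", Pattern.DOTALL);

    // Returns the content of the first [...] pair, or null if there is none.
    static String firstBracketed(String s) {
        Matcher m = BRACKETED.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    // Returns the trimmed text that follows each timestamp.
    static List<String> entryTexts(String s) {
        List<String> texts = new ArrayList<>();
        Matcher m = ENTRY_TEXT.matcher(s);
        while (m.find()) {
            texts.add(m.group(1).trim());
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(firstBracketed("[org.example.Foo]-[main]"));
        System.out.println(entryTexts(
                "2017-03-14 22:43:14,405 first\ncontinued\n2017-03-14 22:43:14,476 second"));
    }
}
```

The lookahead does not consume the next timestamp, so the following call to find() can start its match exactly there. Without Pattern.DOTALL the dot would stop at the first line break and a multi-line entry would not match as a whole.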
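1) ] -> "[^\\]]", 2) to match up to the next timestamp or the end of the file, use (?<text>.*?)(?=\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}|\\Z).
Thanks, haha, the \ were an artifact of where I tried putting them inside []. Thanks for the answer!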
2017-03-14 22:43:14,405 FATAL [org.springframework.web.context.support.XmlWebApplicationContext]-[localhost-startStop-1] Refreshing Root WebApplicationContext: startup date [Tue Mar 14 22:43:14 UTC 2017]; root of context hierarchy
2017-03-14 22:43:14,476 INFO [org.springframework.beans.factory.xml.XmlBeanDefinitionReader]-[localhost-startStop-1] Loading XML bean definitions from Serv
2017-03-14 22:43:14,476 INFO [org.springframework.beans.factory.xml.XmlBeanDefinitionReader]-[localhost-startStop-1] Here is a multiline
log entry with another entry after
2017-03-14 22:43:14,476 INFO [org.springframework.beans.factory.xml.XmlBeanDefinitionReader]-[localhost-startStop-1] Here is a multiline
log entry with no entries after
Matcher matcher = PatternFullLog.matcher(line);