How to parse a log file in Java using java.util.regex.Matcher
I am trying to understand regular expressions in Java. I am experimenting with a log file so that I can extract the log fields. For example, I have the following line:
Apr 10 21:08:55 kali sshd[37727]: Failed password for root from 127.0.0.1 port 42035 ssh2
I want output like this:
"Date&Time" = Apr 10 21:08:55
"Hostname" = kali
"Program Name" = sshd
"Log" = Failed password for root from 127.0.0.1 port 42035 ssh2
Here is my Java code so far:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogRegExp {
    public static void main(String[] argv) {
        String logEntryLine = "Apr 10 21:08:55 kali sshd[37727]: Failed password for root from 127.0.0.1 port 42035 ssh2";
        String logEntryPattern = "(\\w.+) (\\d.+) (\\w.+) (\\w.+)";
        Pattern p = Pattern.compile(logEntryPattern);
        Matcher matcher = p.matcher(logEntryLine);
        if (!matcher.matches()) {
            System.err.println("Bad log entry (or problem with RE?):");
            System.err.println(logEntryLine);
            return;
        }
        System.out.println("Date&Time: " + matcher.group(1));
        System.out.println("Hostname: " + matcher.group(2));
        System.out.println("Program Name: " + matcher.group(3));
        System.out.println("Log: " + matcher.group(4));
    }
}
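I tried the following example:
But I could not adapt it to my needs. I understand how to apply escape characters, digits, and so on, but I do not know how to put them together for my case. Can anyone help me?

You can modify your code as follows: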
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogRegExp {
    public static void main(String[] argv) {
        String logEntryLine = "Apr 10 21:08:55 kali sshd[37727]: Failed password for root from 127.0.0.1 port 42035 ssh2";
        // Groups: 1 = date and time, 2 = hostname, 3 = program name, 4 = "[pid]:", 5 = message
        String logEntryPattern = "([\\w]+\\s[\\d]+\\s[\\d:]+) (\\w+) (\\w{4})(\\[\\d{5}\\]:) (\\w.+)";
        Pattern p = Pattern.compile(logEntryPattern);
        Matcher matcher = p.matcher(logEntryLine);
        if (!matcher.matches()) {
            System.err.println("Bad log entry (or problem with RE?):");
            System.err.println(logEntryLine);
            return;
        }
        System.out.println("Date&Time: " + matcher.group(1));
        System.out.println("Hostname: " + matcher.group(2));
        System.out.println("Program Name: " + matcher.group(3));
        System.out.println("Log: " + matcher.group(5));
    }
}
The output of this program is:
Date&Time: Apr 10 21:08:55
Hostname: kali
Program Name: sshd
Log: Failed password for root from 127.0.0.1 port 42035 ssh2
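
The pattern above fixes the program name at four characters (`\w{4}`) and the PID at five digits (`\d{5}`), so it may reject lines from other programs. A more tolerant variant with named groups could look like the sketch below. It assumes every line has the layout "Mon day hh:mm:ss host program[pid]: message" with the "[pid]" part optional; the class name `NamedGroupLogParser` and the group names are illustrative choices, not part of the original answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroupLogParser {
    // Assumed layout: "<Mon> <day> <hh:mm:ss> <host> <program>[<pid>]: <message>".
    // The "[<pid>]" part is treated as optional.
    private static final Pattern LOG_LINE = Pattern.compile(
            "(?<timestamp>\\w{3}\\s+\\d{1,2}\\s+\\d{2}:\\d{2}:\\d{2})\\s+"
          + "(?<host>\\S+)\\s+"
          + "(?<program>[^\\[\\s:]+)(?:\\[\\d+\\])?:\\s+"
          + "(?<message>.*)");

    /** Returns the extracted fields in output order, or null when the line does not match. */
    public static Map<String, String> parse(String line) {
        Matcher m = LOG_LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("Date&Time", m.group("timestamp"));
        fields.put("Hostname", m.group("host"));
        fields.put("Program Name", m.group("program"));
        fields.put("Log", m.group("message"));
        return fields;
    }

    public static void main(String[] args) {
        String line = "Apr 10 21:08:55 kali sshd[37727]: Failed password for root from 127.0.0.1 port 42035 ssh2";
        Map<String, String> fields = parse(line);
        if (fields == null) {
            System.err.println("Bad log entry (or problem with RE?): " + line);
            return;
        }
        fields.forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

Named groups keep the printing code independent of group positions, so adding or removing a capturing group does not shift the indices used elsewhere.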

Try the following pattern:
String logEntryPattern = "(.+\\d\\d?:\\d\\d?:\\d\\d?) (\\S+) ([^\\[]+)\\S+ (.+)";
The first group ends at the time, in hh:mm:ss form.
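
As a rough illustration of how this pattern could be dropped into the original program, here is a minimal sketch. The class name `TimeAnchoredLogParser` and the `parse` helper are hypothetical; the regular expression is the one given above, unchanged. Note that it assumes the program name is followed by a bracketed PID, as in the sample line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TimeAnchoredLogParser {
    // Group 1 runs up to and including the hh:mm:ss time, group 2 is the host,
    // group 3 is everything before the first '[', group 4 is the message.
    private static final Pattern LOG_LINE = Pattern.compile(
            "(.+\\d\\d?:\\d\\d?:\\d\\d?) (\\S+) ([^\\[]+)\\S+ (.+)");

    /** Returns {date and time, hostname, program name, log}, or null when the line does not match. */
    public static String[] parse(String line) {
        Matcher m = LOG_LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String line = "Apr 10 21:08:55 kali sshd[37727]: Failed password for root from 127.0.0.1 port 42035 ssh2";
        String[] fields = parse(line);
        if (fields == null) {
            System.err.println("Bad log entry (or problem with RE?): " + line);
            return;
        }
        System.out.println("Date&Time: " + fields[0]);
        System.out.println("Hostname: " + fields[1]);
        System.out.println("Program Name: " + fields[2]);
        System.out.println("Log: " + fields[3]);
    }
}
```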

Use this code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogRegExp {
    public static void main(String[] argv) {
        String logEntryLine = "Apr 10 21:08:55 kali sshd[37727]: Failed password for root from 127.0.0.1 port 42035 ssh2";
        // Groups: 1 = date and time, 2 = hostname, 3 = program name, 4 = message; "[pid]:" is matched but not captured
        String logEntryPattern = "([\\w]+\\s[\\d]+\\s[\\d:]+)\\s([\\w]+)\\s([\\w]+)\\[.+\\]:\\s(.+)";
        Pattern p = Pattern.compile(logEntryPattern);
        Matcher matcher = p.matcher(logEntryLine);
        if (!matcher.matches()) {
            System.err.println("Bad log entry (or problem with RE?):");
            System.err.println(logEntryLine);
            return;
        }
        System.out.println("Date&Time: " + matcher.group(1));
        System.out.println("Hostname: " + matcher.group(2));
        System.out.println("Program Name: " + matcher.group(3));
        System.out.println("Log: " + matcher.group(4));
    }
}
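
The examples so far match a single hard-coded line, while the question is about a log file. One possible way to apply the same pattern to many lines is sketched below: the `Pattern` is compiled once, each line gets its own `Matcher`, and lines that do not match are skipped. The class name `LogFileScanner`, the `parseAll` helper, and the second sample line are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogFileScanner {
    // Same expression as in the answer above, compiled once and reused for every line.
    private static final Pattern LOG_LINE = Pattern.compile(
            "([\\w]+\\s[\\d]+\\s[\\d:]+)\\s([\\w]+)\\s([\\w]+)\\[.+\\]:\\s(.+)");

    /** Reads all lines and returns {date and time, hostname, program name, log} for each matching line. */
    public static List<String[]> parseAll(BufferedReader reader) {
        List<String[]> entries = new ArrayList<>();
        reader.lines().forEach(line -> {
            Matcher m = LOG_LINE.matcher(line);
            if (m.matches()) {
                entries.add(new String[] { m.group(1), m.group(2), m.group(3), m.group(4) });
            }
            // Lines that do not match are silently skipped in this sketch.
        });
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for a log file; the second line is a made-up non-matching entry.
        String sample =
                "Apr 10 21:08:55 kali sshd[37727]: Failed password for root from 127.0.0.1 port 42035 ssh2\n"
              + "this line is not a log entry\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            for (String[] entry : parseAll(reader)) {
                System.out.println("Date&Time: " + entry[0]);
                System.out.println("Hostname: " + entry[1]);
                System.out.println("Program Name: " + entry[2]);
                System.out.println("Log: " + entry[3]);
            }
        }
    }
}
```

For a real file, the reader could come from `Files.newBufferedReader(Paths.get("auth.log"))`, where `auth.log` is only a placeholder path.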
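
First of all, you should not be parsing log files at all. If you need an application to communicate with itself or with other applications, use a database. In this case there are very few constraints on what a log line can look like, so a regular expression can easily be flawed. That said, whether parsing log files with regular expressions is right or wrong in the first place is a separate matter.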