Java: explaining how a regular expression works
java, regex, csv
I found this code, which splits a line into CSV fields and handles fields wrapped in double quotes. However, I don't really understand how the pattern matching in the regular expression works. I would appreciate it if someone could give me a step-by-step explanation of how this expression evaluates a pattern.
"([^\"]*)"|(?<=,|^)([^,]*)(?:,|$)
I tried to give you some hints and the vocabulary you need in order to find really good explanations.
If needed, you can link to another answer; that is more useful than copying an entire answer without any reference.
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CSVParser {
    /*
     * This Pattern will match on either quoted text or text between commas, including
     * whitespace, and accounting for beginning and end of line.
     */
    private final Pattern csvPattern = Pattern.compile("\"([^\"]*)\"|(?<=,|^)([^,]*)(?:,|$)");
    private final ArrayList<String> allMatches = new ArrayList<String>();

    public String[] parse(String csvLine) {
        Matcher matcher = csvPattern.matcher(csvLine);
        allMatches.clear();
        while (matcher.find()) {
            // Group 1 is set for a quoted field, group 2 for an unquoted one.
            String match = matcher.group(1);
            if (match != null) {
                allMatches.add(match);
            } else {
                allMatches.add(matcher.group(2));
            }
        }
        // toArray returns an empty array when nothing matched.
        return allMatches.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String lineinput = "the quick,\"brown, fox jumps\",over,\"the\",,\"lazy dog\"";
        CSVParser myCSV = new CSVParser();
        System.out.println("Testing CSVParser with: \n " + lineinput);
        for (String s : myCSV.parse(lineinput)) {
            System.out.println(s);
        }
    }
}
"([^\"]*)"|(?<=,|^)([^,]*)(?:,|$)
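() capturing group: the text it matches is available through matcher.group(n)
(?:) non-capturing group: groups a sub-expression without saving what it matched
[] character class: matches any one character listed inside the brackets; a leading ^ negates it, so [^,] is any character except a comma
\ escape character: \" matches a literal double quote
(?<=) positive lookbehind: succeeds only if the contained expression matches immediately before the current position; it consumes no characters
| alternation: matches either the expression on its left or the one on its right, trying the left one first
^ start of input (start of every line only when Pattern.MULTILINE is set)
* zero or more repetitions of the preceding element
$ end of input, also just before a final line terminator; \z matches only at the very end of input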
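Putting the vocabulary together: at each position the matcher first tries the left alternative, a double quote, any run of non-quote characters (captured as group 1), and a closing quote. If that fails, it tries the right alternative: the position must be preceded by a comma or be the start of the input, then any run of non-comma characters is captured as group 2, followed by a comma or the end of the input. The sketch below is one way to watch this happen; it is not part of the original code. The class name PatternWalkthrough, the helper trace, and the NO_LOOKBEHIND variant are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternWalkthrough {
    // The pattern from the question, written as a Java string literal.
    static final Pattern ORIGINAL =
            Pattern.compile("\"([^\"]*)\"|(?<=,|^)([^,]*)(?:,|$)");

    // Hypothetical variant with the lookbehind removed, for comparison only.
    static final Pattern NO_LOOKBEHIND =
            Pattern.compile("\"([^\"]*)\"|([^,]*)(?:,|$)");

    /** Describes every match: its span, the group that captured, and the field. */
    static List<String> trace(Pattern pattern, String line) {
        List<String> steps = new ArrayList<>();
        Matcher m = pattern.matcher(line);
        while (m.find()) {
            // Group 1 is non-null only when the quoted alternative matched.
            int group = (m.group(1) != null) ? 1 : 2;
            steps.add(String.format("[%d,%d) group %d -> '%s'",
                    m.start(), m.end(), group, m.group(group)));
        }
        return steps;
    }

    public static void main(String[] args) {
        String line = "\"a\",b";
        System.out.println("with lookbehind:    " + trace(ORIGINAL, line));
        System.out.println("without lookbehind: " + trace(NO_LOOKBEHIND, line));
    }
}
```

With the lookbehind, the comma that follows the closing quote cannot start a match because it is preceded by a quote, so only the fields a and b are reported. Without it, the right alternative also matches an empty field at that comma and another at the end of the input, which is why the lookbehind is in the pattern.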