Java: Apache Commons CSV error with quotes
I am using org.apache.commons-csv 1.4, and this week I found some strange behavior in a JUnit test:
CSVReader reader = null;
List<String[]> linesCsv = new ArrayList<>();
FileInputStream fileStream = null;
InputStreamReader inputStreamReader = null;
try {
    fileStream = new FileInputStream(file);
    inputStreamReader = new InputStreamReader(fileStream, "ISO-8859-1");
    reader = new CSVReader(inputStreamReader, ',', '"', 0);
    String[] record = null;
    while ((record = reader.readNext()) != null) {
        linesCsv.add(record);
    }
} catch (Exception e) {
    logger.error("Error in ", e);
} finally {
    if (inputStreamReader != null) {
        inputStreamReader.close();
    }
    if (fileStream != null) {
        fileStream.close();
    }
    if (reader != null) {
        reader.close();
    }
}
Failing case, Input.csv:
DAR_123451 ,"XXXXX Hello World "Hello World XXX "
DAR_123452 ,"XXXXX Hello World "Hello World XXX "
Java:
[0.0]DAR_123451 [0.1]XXXXX Hello World "Hello World XXX\nDAR_123456,XXXXX Hello World "Hello World XXX
*Correct case, Input.csv:
DAR_123451 ,"XXXXX Hello World "Hello World" XXX "
DAR_123452 ,"XXXXX Hello World "Hello World" XXX "
Java OK:
[0.0]DAR_123451
[0.1]XXXXX Hello World "Hello World" XXX
[1.0]DAR_123452
[1.1]XXXXX Hello World "Hello World" XXX
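I can't get the commons-csv library to handle this correctly, and it looks like a bug. How can we correctly read a quoted string that contains a lone quote character?

If a value is enclosed in quotes, CSV normally uses two consecutive double quotes to represent one literal double quote inside the text. When I use the latest version of commons-csv, I even get an IOException:
IOException: (line 1) invalid char between encapsulated token and delimiter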
So, to include double quotes correctly, you need to use the following:
DAR_123451 ,"XXXXX Hello World ""Hello World"" XXX "
DAR_123452 ,"XXXXX Hello World ""Hello World"" XXX "
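If you generate the file yourself, the doubling can be applied at the moment each field is written. The following is a minimal sketch in plain Java and is not part of commons-csv; the class `CsvQuoting` and the helpers `escapeField` and `toCsvLine` are hypothetical names introduced here for illustration. It assumes a comma delimiter, quotes a field only when it contains a comma, a double quote, or a line break, and doubles every embedded double quote.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CsvQuoting {

    // Hypothetical helper: quote a field only if it needs it,
    // doubling any double quote that appears inside the value.
    public static String escapeField(String value) {
        boolean needsQuotes = value.contains(",")
                || value.contains("\"")
                || value.contains("\n")
                || value.contains("\r");
        if (!needsQuotes) {
            return value;
        }
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Hypothetical helper: join already-escaped fields with commas.
    public static String toCsvLine(List<String> fields) {
        return fields.stream()
                .map(CsvQuoting::escapeField)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList(
                "DAR_123451 ",
                "XXXXX Hello World \"Hello World\" XXX ");
        // prints: DAR_123451 ,"XXXXX Hello World ""Hello World"" XXX "
        System.out.println(toCsvLine(row));
    }
}
```

For the sample row this yields the same line as the corrected input above. In real code, a CSV writer such as commons-csv's `CSVPrinter` is usually a safer choice than hand-rolled escaping.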
Then the test case works as expected:
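import java.io.Reader;
import java.io.StringReader;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVRecord;

// Each embedded double quote is written as two double quotes in the input.
Reader in = new StringReader(
        "DAR_123451 ,\"XXXXX Hello World \"\"Hello World XXX\"\" \"\n" +
        "DAR_123452 ,\"XXXXX Hello World \"\"Hello World XXX\"\" \"");
Iterable<CSVRecord> records = CSVFormat.DEFAULT.parse(in);
for (CSVRecord record : records) {
    for (int i = 0; i < record.size(); i++) {
        System.out.println("At " + i + ": " + record.get(i));
    }
}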
This prints:
At 0: DAR_123451
At 1: XXXXX Hello World "Hello World XXX"
At 0: DAR_123452
At 1: XXXXX Hello World "Hello World XXX"
Also check the line ending at the end of the first line in the file input.csv.
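To see why a strict parser rejects the original input, it can help to walk through the quoting rule by hand. The sketch below is a simplified single-line parser written for illustration only; it is not how commons-csv is implemented, and `QuotedLineParser` and `parseLine` are hypothetical names. It assumes a comma delimiter and no embedded line breaks, treats `""` inside a quoted field as one literal quote, and throws as soon as a closing quote is followed by anything other than a comma or the end of the line.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedLineParser {

    // Hypothetical helper: split one CSV line into fields.
    public static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int n = line.length();
        int i = 0;
        while (true) {
            current.setLength(0);
            if (i < n && line.charAt(i) == '"') {
                i++; // skip the opening quote
                boolean closed = false;
                while (i < n) {
                    char c = line.charAt(i);
                    if (c == '"' && i + 1 < n && line.charAt(i + 1) == '"') {
                        current.append('"'); // "" is one literal quote
                        i += 2;
                    } else if (c == '"') {
                        i++; // a lone quote closes the field
                        closed = true;
                        break;
                    } else {
                        current.append(c);
                        i++;
                    }
                }
                if (!closed) {
                    throw new IllegalArgumentException("unterminated quoted field");
                }
                if (i < n && line.charAt(i) != ',') {
                    throw new IllegalArgumentException(
                            "invalid char after closing quote at index " + i);
                }
            } else {
                // Unquoted field: read up to the next comma.
                while (i < n && line.charAt(i) != ',') {
                    current.append(line.charAt(i));
                    i++;
                }
            }
            fields.add(current.toString());
            if (i >= n) {
                break;
            }
            i++; // skip the comma
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseLine(
                "DAR_123451 ,\"XXXXX Hello World \"\"Hello World\"\" XXX \""));
        try {
            parseLine("DAR_123451 ,\"XXXXX Hello World \"Hello World\" XXX \"");
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With the doubled quotes the second field comes back containing the literal quotes. With the original input, the quote before `Hello` is taken as the end of the field, and the `H` that follows triggers the error, which is the same kind of situation the commons-csv message describes.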