Java: My CSV reader doesn't recognize missing values at the start and end of a line
Hi, I'm working on a simple imitation of pandas' fillna method, which requires me to replace the empty/missing values in a CSV file with an input (passed as a parameter). Almost everything works fine, but I have one problem: my CSV reader doesn't recognize empty/missing values at the start and end of a line. For example,
Name,Age,Class
John,20,CLass-1
,18,Class-1
,21,Class-3
It will return an error.
The same goes for this example:
Name,Age,Class
John,20,CLass-1
Mike,18,
Tyson,21,
For this case (the problem at the end of the line), however, I can work around it by adding another comma at the end, like this:
Name,Age,Class
John,20,CLass-1
Mike,18,,
Tyson,21,,
For the start-of-line problem, however, I don't know how to solve it.
Here is my CSV file reader code:
public void readCSV(String fileName) {
    fileLocation = fileName;
    File csvFile = new File(fileName);
    Scanner sfile;
    // noOfColumns = 0;
    // noOfRows = 0;
    data = new ArrayList<ArrayList>();
    int colCounter = 0;
    int rowCounter = 0;
    try {
        sfile = new Scanner(csvFile);
        while (sfile.hasNextLine()) {
            String aLine = sfile.nextLine();
            Scanner sline = new Scanner(aLine);
            sline.useDelimiter(",");
            colCounter = 0;
            while (sline.hasNext()) {
                if (rowCounter == 0)
                    data.add(new ArrayList<String>());
                data.get(colCounter).add(sline.next());
                colCounter++;
            }
            rowCounter++;
            sline.close();
        }
        // noOfColumns = colCounter;
        // noOfRows = rowCounter;
        sfile.close();
    } catch (FileNotFoundException e) {
        System.out.println("File to read " + csvFile + " not found!");
    }
}
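Unless you are writing the CSV file yourself, the writer mechanism will never add extra delimiters just to suit the needs of your application's method, so abandon that line of thinking entirely; you shouldn't be doing that anyway. If you do have access to the CSV file creation process, the simple solution is to not allow null or empty values into the file. In other words, place a default value (whatever fits the case) into empty elements at the time the CSV file is written.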
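If you do control the writing side, the idea above could look roughly like the following sketch. The class name CsvLineWriter, the method toCsvLine, and the "N/A" default are hypothetical names chosen for illustration, and the sketch assumes fields never contain the delimiter or quotes (it does no CSV quoting).

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

class CsvLineWriter {

    /**
     * Joins the fields of one row with commas, substituting defaultValue
     * for any field that is null or blank. No quoting or escaping is done.
     */
    static String toCsvLine(List<String> fields, String defaultValue) {
        StringJoiner joiner = new StringJoiner(",");
        for (String field : fields) {
            boolean missing = (field == null || field.trim().isEmpty());
            joiner.add(missing ? defaultValue : field);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Missing first field
        System.out.println(toCsvLine(Arrays.asList("", "18", "Class-1"), "N/A"));  // N/A,18,Class-1
        // Missing last field
        System.out.println(toCsvLine(Arrays.asList("Mike", "18", null), "N/A"));   // Mike,18,N/A
    }
}
```

With defaults written this way, the reader never sees an empty field at the start or end of a line.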
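The header line in a CSV file is there for a reason: it tells you the number of data columns in each line (row) of the file and the names of those columns. Between the header line and the actual data in the file, you can also get a good idea of each column's data type.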
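In my opinion, the first thing your readCSV() method should do is read this header line (if it exists) and gather some information about the file the method is about to iterate through. In your case, the header line consists of: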
Name,Age,Class
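Right from the start we know that every line in the file consists of three (3) data columns. The first column holds a Name, the second an Age, and the third a Class. From everything the CSV file provides, we can quickly assume the data types: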
Name (String)
Age (Integer)
Class (String)
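I only point this out because, in my opinion and although it is not mandatory, I think it is better to store the CSV data in an ArrayList (or List interface) of an object class, for example: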
ArrayList<Student> studentData = new ArrayList<>();
// OR //
List<Student> studentData = new ArrayList<>();
where Student is an object class.
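As a rough sketch of that approach (not part of the original answer), a Student class and a line parser might look like this. The names StudentCsvDemo, parseStudent, and fieldOrDefault, as well as the default values "Unknown" and -1, are assumptions made for illustration. The parser relies on String.split with a limit of 3, which keeps empty leading and trailing fields, and it will throw a NumberFormatException if the Age column holds non-numeric text.

```java
import java.util.ArrayList;
import java.util.List;

class StudentCsvDemo {

    /** Hypothetical holder for one CSV row; fields follow the header Name,Age,Class. */
    static final class Student {
        final String name;
        final int age;
        final String className;

        Student(String name, int age, String className) {
            this.name = name;
            this.age = age;
            this.className = className;
        }

        @Override
        public String toString() {
            return name + ", " + age + ", " + className;
        }
    }

    /** Parses one data line; missing or blank fields fall back to the given defaults. */
    static Student parseStudent(String line, String defaultName, int defaultAge, String defaultClass) {
        // A limit of 3 keeps empty strings for missing first and last columns.
        String[] parts = line.split("\\s*,\\s*", 3);
        String name = fieldOrDefault(parts, 0, defaultName);
        String ageText = fieldOrDefault(parts, 1, "");
        String className = fieldOrDefault(parts, 2, defaultClass);
        int age = ageText.isEmpty() ? defaultAge : Integer.parseInt(ageText);
        return new Student(name, age, className);
    }

    /** Returns the trimmed field at index, or fallback if it is absent or blank. */
    private static String fieldOrDefault(String[] parts, int index, String fallback) {
        if (index >= parts.length) {
            return fallback;
        }
        String value = parts[index].trim();
        return value.isEmpty() ? fallback : value;
    }

    public static void main(String[] args) {
        List<Student> studentData = new ArrayList<>();
        String[] lines = {"John,20,CLass-1", ",18,Class-1", "Mike,18,"};
        for (String line : lines) {
            studentData.add(parseStudent(line, "Unknown", -1, "Unknown"));
        }
        studentData.forEach(System.out::println);
    }
}
```

Keeping Age as an int makes later numeric work simpler, at the cost of having to pick a sentinel (here -1) for a missing age.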
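You seem to want everything in a 2D ArrayList, so with that in mind, below is a method that reads the CSV file and places its contents into a 2D ArrayList. Any column element in the file that contains the word null, or nothing at all, will have a default string applied. There are a lot of comments in the code explaining what is going on, and I suggest you read them. This code can easily be modified to suit your needs. At the very least, I hope it gives you an idea of how to apply default values to empty values in a CSV file: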
/**
 * Reads a supplied CSV file with any number of columnar rows and returns
 * the data within a 2D ArrayList of String ({@code ArrayList<ArrayList<String>>}).
 * <br><br>File delimited data that contains 'null' or nothing (a Null String (""))
 * will have a supplied common default applied to that column element before it is
 * stored within the 2D ArrayList.<br><br>
 *
 * Modify this code to suit your needs.<br>
 *
 * @param fileName (String) The CSV file to process.<br>
 *
 * @param csvDelimiterUsed (String) The delimiter used in the CSV file.<br>
 *
 * @param commonDefault (String) A default String value that can be common
 * to all columnar elements within the CSV file that contain the string
 * 'null' or nothing at all (a Null String ("")). Those empty elements will
 * end up containing this supplied string value postfixed with the name of
 * that column. As an example, if the CSV file header line was
 * 'Name,Age,Class Room' and if the string "Unknown " is supplied to the
 * commonDefault parameter and during file parsing a specific data column
 * (let's say Age) contained the word 'null' or nothing at all (ex:
 * Bob,null,Class-Math OR Bob,,Class-Math) then this line will be stored
 * within the 2D ArrayList as:<pre>
 *
 *     Bob, Unknown Age, Class-Math</pre>
 *
 * @return (2D ArrayList of String Type - {@code ArrayList<ArrayList<String>>})
 */
public ArrayList<ArrayList<String>> readCSV(final String fileName, final String csvDelimiterUsed,
                                            final String commonDefault) {
    String fileLocation = fileName;         // The student data file name to process.
    File csvFile = new File(fileLocation);  // Create a File Object (used in the Scanner reader).
    /* The 2D ArrayList that will be returned containing all the CSV Row/Column data.
       You should really consider creating a Class to hold Student instances of this
       data; however, this can be accomplished by working the ArrayList later on when
       it is received. */
    ArrayList<ArrayList<String>> fileData = new ArrayList<>();
    // Open the supplied data file using Scanner (as per OP).
    try (Scanner reader = new Scanner(csvFile)) {
        /* Read the Header Line and gather information... This array
           will ultimately be set up to hold default values should
           any file columnar data hold null OR null-string (""). */
        String[] columnData = reader.nextLine().split("\\s*\\" + csvDelimiterUsed + "\\s*");
        /* How many columns of data will be expected per row.
           This will be used in the String#split() method later
           on as the limiter when we parse each file data line.
           This limiter value is rather important in this case
           since it ensures that a Null String ("") is put in
           place wherever a valid Array element should be when
           no data is available, instead of just providing an
           array of 'lesser length'. */
        int csvValuesPerLineCount = columnData.length;
        // Copy column Names Array: To just hold the column Names.
        String[] columnName = new String[columnData.length];
        System.arraycopy(columnData, 0, columnName, 0, columnData.length);
        /* Create default data for columns based on the supplied
           commonDefault String. Here the supplied default prefixes
           the actual column name (see JavaDoc). */
        for (int i = 0; i < columnData.length; i++) {
            columnData[i] = commonDefault + columnData[i];
        }
        // An ArrayList to hold each row of columnar data.
        ArrayList<String> rowData;
        // Iterate through each row of file data...
        while (reader.hasNextLine()) {
            rowData = new ArrayList<>();  // Initialize a new ArrayList.
            // Read file line and trim off any leading or trailing white-spaces.
            String aLine = reader.nextLine().trim();
            // Only process lines that contain something (blank lines are ignored).
            if (!aLine.isEmpty()) {
                /* Split the read-in line based on the supplied CSV file
                   delimiter used and the number of columns established
                   from the Header line. We do this to determine if a
                   default value will be required for a specific column
                   that contains no value at all (null or Null String("")). */
                String[] aLineParts = aLine.split("\\s*\\" + csvDelimiterUsed + "\\s*", csvValuesPerLineCount);
                /* Here we determine if default values will be required
                   and apply them. We then add the columnar row data to
                   the rowData ArrayList. */
                for (int i = 0; i < aLineParts.length; i++) {
                    rowData.add((aLineParts[i].isEmpty() || aLineParts[i].equalsIgnoreCase("null"))
                            ? columnData[i] : aLineParts[i]);
                }
                /* Add the rowData ArrayList to the fileData
                   ArrayList since we are now done with this
                   file row of data and will now iterate to
                   the next file line for processing. */
                fileData.add(rowData);
            }
        }
    }
    // Process the 'File Not Found Exception'.
    catch (FileNotFoundException ex) {
        System.err.println("The CSV file to read (" + csvFile + ") can not be found!");
    }
    // Return the fileData ArrayList to the caller.
    return fileData;
}
Example usage:
ArrayList<ArrayList<String>> list = readCSV("MyStudentsData.txt", ",", "Unknown ");
if (list == null) { return; }
StringBuilder sb;
for (int i = 0; i < list.size(); i++) {
    sb = new StringBuilder("");
    for (int j = 0; j < list.get(i).size(); j++) {
        if (!sb.toString().isEmpty()) { sb.append(", "); }
        sb.append(list.get(i).get(j));
    }
    System.out.println(sb.toString());
}
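The comments in readCSV() stress that the limit argument passed to String#split() is what preserves empty columns. The short sketch below shows that behaviour in isolation. The class name SplitLimitDemo and the helper splitRow are hypothetical, and the helper uses Pattern.quote to escape the delimiter, which is a swapped-in alternative to the "\\" prefix used in the method above.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitLimitDemo {

    /**
     * Splits one CSV line into at most columnCount parts, keeping empty
     * strings for columns that have no value (including first and last).
     */
    static String[] splitRow(String line, String delimiter, int columnCount) {
        return line.split("\\s*" + Pattern.quote(delimiter) + "\\s*", columnCount);
    }

    public static void main(String[] args) {
        // Without a limit, trailing empty strings are dropped.
        System.out.println(Arrays.toString("Mike,18,".split(",")));       // [Mike, 18]

        // With the column count as limit, the empty last column survives.
        System.out.println(Arrays.toString(splitRow("Mike,18,", ",", 3))); // [Mike, 18, ]

        // An empty first column is kept as an empty string as well.
        System.out.println(Arrays.toString(splitRow(",18,Class-1", ",", 3))); // [, 18, Class-1]

        // A negative limit also keeps trailing empty strings.
        System.out.println(Arrays.toString("Mike,18,".split(",", -1)));   // [Mike, 18, ]
    }
}
```

Once every row has the same number of elements, replacing the empty strings with a default becomes a simple per-element check, as done in readCSV().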