Java: tab-delimited string to array
What I'm trying to achieve is to read data line by line from an Excel sheet (saved in Txt format, tab-delimited), where each column is a different piece of data that I want to store in an array.
I've tried different approaches. I even downloaded CSVReader classes from the web, but nothing worked.
At least this time it is reading real characters rather than gibberish.
My current version uses a BufferedReader and a StringTokenizer, but it doesn't read correctly. The code is as follows:
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.StringTokenizer;
import com.csvreader.CsvReader;
import au.com.bytecode.opencsv.CSVReader;
public class excelToText{
public static void main(String[] args) throws IOException {
try
{
//csv file containing data
BufferedReader CSVFile = new BufferedReader(new FileReader("C:/Users/nhajjar/workspace/MB/src/yoyo.txt"));
// Read first line.
String dataRow;
int lineNumber = 0;
while ((dataRow = CSVFile.readLine() )!= null)
{
String [] dataArray;
lineNumber++;
String delimiter = "\t";
/* given string will be split by the argument delimiter provided. */
dataArray = dataRow.split(delimiter);
/* print substrings */
//for(int i=0;i
Printer name Model IP Location Department Primary Server Secondary Server Share Name
Boundary Sprinter Techs Lexmark E360dn Boundary Sprinter s173m928site s173mho1site 928-Sprinter-techs-L-E360dn
Boundary Sprinter Xerox 7232 Xerox WorkCentre 7232 Boundary Sprinter s173m928site s173mho1site 928-Sprinter-WC7232
Boundry Parts HP LaserJet P2055dn Boundary Parts s173m928site s173mho1site 928-Parts-LJ-P2055dn
Boundry Sales HP Color LaserJet CP4005 Boundary Sales s173m928site s173mho1site 928-Sales-Main-CP4005
Boundry Techs East HP LaserJet P3015 Boundary Techs East s173m928site s173mho1site 928-Techs-east-LJ-P3015
Boundry Techs West Lexmark E352dn Boundary Techs West s173m928site s173mho1site 928-Techs-west-L-E352dn
Concord Lexmark E360dn Concord s173mho1site
Dundas Parts Xerox WorkCentre 7232 Dundas Parts s173m910site s173mho1site 910-Parts-WC-7232
Dundas Preowned Xerox WorkCentre 7425 Dundas Preowned s173m910site s173mho1site 910-PreOwned-WC-7425
Dundas Sales 2nd Floor HP Color LaserJet CP4025 Dundas Sales s173m910site s173mho1site 910-Sales-2nd-CP4025
Dundas Sales Main Floor HP Color LaserJet CP4025 Dundas Sales s173m910site s173mho1site 910-Sales-Main-CP4025
The output I'm getting is:
PrinterName is : Printer name
Model is : Model
IP is : IP
Location is : Location
Department is : Department
PrimServer is : Primary Server
SecServer is : Secondary Server
ShareName is : Share Name
GroupNamePrefix is : Group Name Prefix
GroupNameSuffix is : Group Name Suffix
GroupNameFinal is : Group Name Final
WSPPrefix is : WSP Prefix
WSPFull is : WSP Full
PrimWSP is : Primary WSP
PrimWSP is : Primary WSP
SecWSP is : Secondary WSP
PrinterName is : Boundary Sprinter Techs
Model is : Lexmark E360dn
IP is : 53.254.177.138
Location is : Boundary
Department is : Sprinter
PrimServer is : s173m928site
SecServer is : s173mho1site
ShareName is : 928-Sprinter-techs-L-E360dn
GroupNamePrefix is : D173_PRINTER-
GroupNameSuffix is : 928-Sprinter-techs-L-E360dn
GroupNameFinal is : D173_PRINTER-928-Sprinter-techs-L-E360dn
WSPPrefix is : #;\D173\_GLOBALRESOURCES\GROUPS\Printers\
WSPFull is : #;\D173\_GLOBALRESOURCES\GROUPS\Printers\D173_PRINTER-928-Sprinter-techs-L-E360dn.wsp;D173_PRINTER-928-Sprinter-techs-L-E360dn
PrimWSP is : >;%;\\s173m928site.cambc.corpintra.net\928-Sprinter-techs-L-E360dn
PrimWSP is : >;%;\\s173m928site.cambc.corpintra.net\928-Sprinter-techs-L-E360dn
SecWSP is : ;>;%;\\s173mho1site.cambc.corpintra.net\928-Sprinter-techs-L-E360dn
PrinterName is : Boundary Sprinter Xerox 7232
Model is : Xerox WorkCentre 7232
IP is : 53.254.177.136
Location is : Boundary
Department is : Sprinter
PrimServer is : s173m928site
SecServer is : s173mho1site
ShareName is : 928-Sprinter-WC7232
GroupNamePrefix is : D173_PRINTER-
GroupNameSuffix is : 928-Sprinter-WC7232
GroupNameFinal is : D173_PRINTER-928-Sprinter-WC7232
WSPPrefix is : #;\D173\_GLOBALRESOURCES\GROUPS\Printers\
WSPFull is : #;\D173\_GLOBALRESOURCES\GROUPS\Printers\D173_PRINTER-928-Sprinter-WC7232.wsp;D173_PRINTER-928-Sprinter-WC7232
PrimWSP is : >;%;\\s173m928site.cambc.corpintra.net\928-Sprinter-WC7232
PrimWSP is : >;%;\\s173m928site.cambc.corpintra.net\928-Sprinter-WC7232
SecWSP is : ;>;%;\\s173mho1site.cambc.corpintra.net\928-Sprinter-WC7232
PrinterName is : Boundry Parts
Model is : HP LaserJet P2055dn
IP is : 53.254.193.222
Location is : Boundary
Department is : Parts
PrimServer is : s173m928site
SecServer is : s173mho1site
ShareName is : 928-Parts-LJ-P2055dn
GroupNamePrefix is : D173_PRINTER-
GroupNameSuffix is : 928-Parts-LJ-P2055dn
GroupNameFinal is : D173_PRINTER-928-Parts-LJ-P2055dn
WSPPrefix is : #;\D173\_GLOBALRESOURCES\GROUPS\Printers\
WSPFull is : #;\D173\_GLOBALRESOURCES\GROUPS\Printers\D173_PRINTER-928-Parts-LJ-P2055dn.wsp;D173_PRINTER-928-Parts-LJ-P2055dn
PrimWSP is : >;%;\\s173m928site.cambc.corpintra.net\928-Parts-LJ-P2055dn
PrimWSP is : >;%;\\s173m928site.cambc.corpintra.net\928-Parts-LJ-P2055dn
SecWSP is : ;>;%;\\s173mho1site.cambc.corpintra.net\928-Parts-LJ-P2055dn
PrinterName is : Boundry Sales
Model is : HP Color LaserJet CP4005
IP is : 53.254.193.117
Location is : Boundary
Department is : Sales
PrimServer is : s173m928site
SecServer is : s173mho1site
ShareName is : 928-Sales-Main-CP4005
GroupNamePrefix is : D173_PRINTER-
GroupNameSuffix is : 928-Sales-Main-CP4005
GroupNameFinal is : D173_PRINTER-928-Sales-Main-CP4005
WSPPrefix is : #;\D173\_GLOBALRESOURCES\GROUPS\Printers\
WSPFull is : #;\D173\_GLOBALRESOURCES\GROUPS\Printers\D173_PRINTER-928-Sales-Main-CP4005.wsp;D173_PRINTER-928-Sales-Main-CP4005
PrimWSP is : >;%;\\s173m928site.cambc.corpintra.net\928-Sales-Main-CP4005
PrimWSP is : >;%;\\s173m928site.cambc.corpintra.net\928-Sales-Main-CP4005
SecWSP is : ;>;%;\\s173mho1site.cambc.corpintra.net\928-Sales-Main-CP4005
PrinterName is : Boundry Techs East
Model is : HP LaserJet P3015
IP is : 53.254.193.220
Location is : Boundary
Department is : Techs East
PrimServer is : s173m928site
SecServer is : s173mho1site
ShareName is : 928-Techs-east-LJ-P3015
GroupNamePrefix is : D173_PRINTER-
GroupNameSuffix is : 928-Techs-east-LJ-P3015
GroupNameFinal is : D173_PRINTER-928-Techs-east-LJ-P3015
WSPPrefix is : #;\D173\_GLOBALRESOURCES\GROUPS\Printers\
WSPFull is : #;\D173\_GLOBALRESOURCES\GROUPS\Printers\D173_PRINTER-928-Techs-east-LJ-P3015.wsp;D173_PRINTER-928-Techs-east-LJ-P3015
PrimWSP is : >;%;\\s173m928site.cambc.corpintra.net\928-Techs-east-LJ-P3015
PrimWSP is : >;%;\\s173m928site.cambc.corpintra.net\928-Techs-east-LJ-P3015
SecWSP is : ;>;%;\\s173mho1site.cambc.corpintra.net\928-Techs-east-LJ-P3015
PrinterName is : Boundry Techs West
Model is : Lexmark E352dn
IP is : 53.254.193.221
Location is : Boundary
Department is : Techs West
PrimServer is : s173m928site
SecServer is : s173mho1site
ShareName is : 928-Techs-west-L-E352dn
GroupNamePrefix is : D173_PRINTER-
GroupNameSuffix is : 928-Techs-west-L-E352dn
GroupNameFinal is : D173_PRINTER-928-Techs-west-L-E352dn
WSPPrefix is : #;\D173\_GLOBALRESOURCES\GROUPS\Printers\
WSPFull is : #;\D173\_GLOBALRESOURCES\GROUPS\Printers\D173_PRINTER-928-Techs-west-L-E352dn.wsp;D173_PRINTER-928-Techs-west-L-E352dn
PrimWSP is : >;%;\\s173m928site.cambc.corpintra.net\928-Techs-west-L-E352dn
PrimWSP is : >;%;\\s173m928site.cambc.corpintra.net\928-Techs-west-L-E352dn
SecWSP is : ;>;%;\\s173mho1site.cambc.corpintra.net\928-Techs-west-L-E352dn
Exception while reading/writing csv file: java.lang.ArrayIndexOutOfBoundsException: 7
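Note: I truncated the output. There are many more "empty blocks" like these.
The tokens are being split only on spaces and plus signs. This is because the second argument of StringTokenizer does not accept a regexp; it is treated as a set of delimiter characters. Try using "\t" to separate the fields.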
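To make the difference concrete, here is a minimal sketch comparing StringTokenizer with String.split on a single row. The class name TokenizerVsSplit, the sample row, and the delimiter string " +" are assumptions made up for illustration; they are not taken from the question's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerVsSplit {
    // Collects the tokens produced by StringTokenizer.
    // Every character in delims is an individual delimiter; it is not a regex.
    static List<String> tokenize(String row, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(row, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    // Splits on tabs; the limit -1 keeps empty cells, including trailing ones.
    static List<String> split(String row) {
        return Arrays.asList(row.split("\t", -1));
    }

    public static void main(String[] args) {
        // Hypothetical row with an empty third cell.
        String row = "Concord\tLexmark E360dn\t\tConcord";
        System.out.println(tokenize(row, " +"));  // splits on ' ' and '+', not on tabs
        System.out.println(tokenize(row, "\t"));  // splits on tabs, silently skips the empty cell
        System.out.println(split(row));           // splits on tabs, keeps the empty cell
    }
}
```

Note that even with "\t" as the delimiter, StringTokenizer skips empty cells, so the column positions shift when a cell is blank. split("\t", -1) keeps the positions stable.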
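By the way, you don't actually need an array of 100 strings per line. That looks overly complicated.
A CSV file has comma-separated values, so your file data would look like this:
Row 1: cell 1, cell 2, cell 3, cell 4, etc.
Row 2: row 2 cell 1, row 2 cell 2, etc.
You are using a tab-delimited file, so you need to split on tabs instead.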
//Read first line
String dataRow;
int lineNumber = 0;
while((dataRow = CSVFile.readLine()) != null)
{
String [] dataArray;
lineNumber++;
String delimiter = "\t";
dataArray = dataRow.split(delimiter); //Now dataArray contains all the tab delimited cells that were on line one of the .txt file
//Start assigning the information to your variables since it is stored in dataArray
String PrinterName = dataArray[0];//This would assign the first cell (row 1, column 1 in excel) that was read from the text file. From looking at your input this should be "Printer name"
String Model = dataArray[1];
String IP = dataArray[2];
//etc...Assign the rest
//Print your output
//Do anything else you need to do
}//end the while loop
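A self-contained sketch of the same loop follows. It reads from an in-memory string instead of the file so it can be run as-is, and it guards the column access, since a row with fewer cells than expected is one possible cause of an ArrayIndexOutOfBoundsException (String.split without a limit drops trailing empty strings). The class name TsvRows, the helper names, and the sample data are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class TsvRows {
    // Splits one line on tabs; the limit -1 keeps trailing empty cells.
    static String[] splitRow(String line) {
        return line.split("\t", -1);
    }

    // Returns the cell at index, or "" when the row has fewer cells than that.
    static String cell(String[] row, int index) {
        return index < row.length ? row[index] : "";
    }

    // Reads every line from the reader and splits each one into cells.
    static List<String[]> readAll(BufferedReader reader) throws IOException {
        List<String[]> rows = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            rows.add(splitRow(line));
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample: a header row and a short row with an empty last cell.
        String sample = "Printer name\tModel\tIP\n" + "Concord\tLexmark E360dn\t\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            for (String[] row : readAll(reader)) {
                System.out.println("PrinterName is : " + cell(row, 0));
                System.out.println("Model is : " + cell(row, 1));
                System.out.println("ShareName is : " + cell(row, 7)); // absent here, prints ""
            }
        }
    }
}
```

To read the real file, the StringReader would be replaced with a FileReader on the .txt path.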
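Here are some sources:
Note: this example is closer to what you are doing, just use "\t" instead of ",":
Hope this helps.

I suggest using the String.split() function, as it will greatly simplify your code.

I would not recommend splitting the data yourself with .split or the StringTokenizer class. They do not handle things like quoted fields.
Use a library such as OpenCSV or StrTokenizer (part of commons lang) to do the parsing for you.
An example of StrTokenizer using the tab-delimited parser: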
StrTokenizer tokenizer = StrTokenizer.getTSVInstance();
while ( (line = ...) != null) {
tokenizer.reset(line);
String tokArray[] = tokenizer.getTokenArray();
}
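"But it doesn't read correctly." Can you be more specific? Do you have a question? What is the expected output?
Why are you using "+" as a delimiter when the file is tab-delimited? Have you tried "\t", or using split("\t")?
My problem is obv that it's not reading correctly. I don't understand why. The expected output is printer name, model, ip, location...
The input you specified doesn't look like a list of csv values, but like space/tab-separated values. What is your question? What are you trying to achieve?
Your title says CSV, but there are no commas in your input. Is the file really tab-delimited? The first thing I would fix is your delimiters, because they don't make any sense to me.
It crashes with: Exception while reading/writing csv file: java.lang.ArrayIndexOutOfBoundsException: 7
I would look at the part of the code that inserts data into the array. The error just means you are trying to access a slot that doesn't exist. Are you going past the size of the array? My first guess is this line: dataArray[i++] = tokens.nextToken(); // if i is 100 or greater, this goes out of the array's bounds, because the declared size is 100 (the array's indices are 0-99).
Usually each cell contains no more than 50 characters or so. I set it to 100 because it crashed when I didn't declare a size for the array. Maybe the array should be bigger?
I don't mean the number of characters in a cell, I mean the number of cells you have. If you create the tokens using "\t" and tokens.countTokens() returns more than 100 (meaning you have more than 100 cells), then i will go past the bounds of the String array you declared. Declaring String[] dataArray = new String[100]; // allocates room for up to 100 strings (e.g. the contents of 100 cells), not 100 characters. I can explain further if that's not clear.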
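As a rough illustration of the sizing point above, the array can be sized from countTokens() instead of a fixed 100, so the index can never run past the end. This is only a sketch; the class name TokenArray and the sample row are made up for the example.

```java
import java.util.StringTokenizer;

class TokenArray {
    // Copies all tokens of a row into an array sized to the actual token count.
    static String[] toArray(String row, String delims) {
        StringTokenizer tokens = new StringTokenizer(row, delims);
        // countTokens() reports how many tokens remain, so the array is exactly big enough.
        String[] data = new String[tokens.countTokens()];
        int i = 0;
        while (tokens.hasMoreTokens()) {
            data[i++] = tokens.nextToken();
        }
        return data;
    }

    public static void main(String[] args) {
        // Hypothetical tab-delimited row.
        String[] cells = toArray("Dundas Parts\tXerox WorkCentre 7232\tDundas", "\t");
        System.out.println(cells.length + " cells, first is: " + cells[0]);
    }
}
```

Reading a fixed index such as cells[7] still needs a length check, because a short row yields a short array.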