Java: How to parse an input file
I need to parse this input file, but I can't figure out how to do it. I tried using Scanner.useDelimiter(), but I still have problems. Does anyone know how to do this?
Here is one line from the input file:
200.88.223.98 - - [01/Feb/2007:04:02:22 -0500] "GET /gallery/v/events/album02/contests/programmingContest05/?g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_fromNavId=x332be852 HTTP/1.1" 200 52464 "http://cs.tcnj.edu/gallery/main.php?g2_view=comment.AddComment&g2_itemId=664&g2_return=http%3A%2F%2Fcs.tcnj.edu%2Fgallery%2Fv%2Fevents%2Falbum02%2Fcontests%2FprogrammingContest05%2F%3Fg2_GALLERYSID%3D3be9666f9c07e16b7f33e2ea8acb8dd2&g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_returnName=album" "Opera/6.01 (Windows 98; U) [en]"
It is supposed to be split into the following parts:
address = 200.88.223.98
date = 01/Feb/2007:04:02:22 -0500
request = GET /gallery/v/events/album02/contests/programmingContest05/?g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_fromNavId=x332be852 HTTP/1.1
status = 200
bytes = 52464
refer = http://cs.tcnj.edu/gallery/main.php?g2_view=comment.AddComment&g2_itemId=664&g2_return=http%3A%2F%2Fcs.tcnj.edu%2Fgallery%2Fv%2Fevents%2Falbum02%2Fcontests%2FprogrammingContest05%2F%3Fg2_GALLERYSID%3D3be9666f9c07e16b7f33e2ea8acb8dd2&g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_returnName=album
agent = Opera/6.01 (Windows 98; U) [en]
Scanner scan = new Scanner(input);
scan.useDelimiter("[-']+");
while (scan.hasNextLine())
{
    String address = scan.next();
    String date = scan.next();
    String request = scan.next();
    int status = scan.nextInt();
    int bytes = scan.nextInt();
    String refer = scan.next();
    String agent = scan.next();
}
It shows the following error:
Exception in thread "main" java.util.InputMismatchException
    at java.util.Scanner.throwFor(Scanner.java:840)
    at java.util.Scanner.next(Scanner.java:1461)
    at java.util.Scanner.nextInt(Scanner.java:2091)
    at java.util.Scanner.nextInt(Scanner.java:2050)
    at Analyzer.startup(Unknown Source)
    at Driver.main(Unknown Source)
Java Result: 1
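The exception is consistent with the delimiter itself: "[-']+" splits only on hyphens and apostrophes, not on spaces, brackets, or double quotes. The sketch below prints the tokens Scanner actually produces under that delimiter, so it is easy to see what nextInt() receives. The class name DelimiterDemo, the helper tokens(), and the shortened sample line (URLs abbreviated) are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterDemo {
    // Collects every token Scanner yields for one line under the given delimiter regex.
    static List<String> tokens(String line, String delimiterRegex) {
        List<String> out = new ArrayList<>();
        try (Scanner scan = new Scanner(line)) {
            scan.useDelimiter(delimiterRegex);
            while (scan.hasNext()) {
                out.add(scan.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the sample line (hypothetical, URLs abbreviated).
        String line = "200.88.223.98 - - [01/Feb/2007:04:02:22 -0500] "
                + "\"GET /gallery/ HTTP/1.1\" 200 52464 "
                + "\"http://cs.tcnj.edu/gallery/main.php\" "
                + "\"Opera/6.01 (Windows 98; U) [en]\"";

        // Same delimiter as in the question.
        List<String> parts = tokens(line, "[-']+");
        for (int i = 0; i < parts.size(); i++) {
            // Angle brackets make leading and trailing spaces visible.
            System.out.println(i + " -> <" + parts.get(i) + ">");
        }
    }
}
```

With this input the line breaks only at its three hyphens, so the token that the first nextInt() call sees starts with the time zone digits and runs on through the rest of the line. That is not an integer, which would explain the InputMismatchException.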
Consider this: split the line on spaces and extract the data from the resulting array.
String s = "200.88.223.98 - - [01/Feb/2007:04:02:22 -0500] \"GET /gallery/v/events/album02/contests/programmingContest05/?g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_fromNavId=x332be852 HTTP/1.1\" 200 52464 \"http://cs.tcnj.edu/gallery/main.php?g2_view=comment.AddComment&g2_itemId=664&g2_return=http%3A%2F%2Fcs.tcnj.edu%2Fgallery%2Fv%2Fevents%2Falbum02%2Fcontests%2FprogrammingContest05%2F%3Fg2_GALLERYSID%3D3be9666f9c07e16b7f33e2ea8acb8dd2&g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_returnName=album\" \"Opera/6.01 (Windows 98; U) [en]\"";
String[] arr = s.split(" ");
for (int i = 0; i < arr.length; i++) {
    System.out.println(i + " : " + arr[i]);
}
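So element 0 is the IP address, elements 3 and 4 are the date, elements 5 through 7 are the request, and so on; that is how you can extract the data.
What is the actual problem you are running into?
Does the data always follow the same pattern? If so, a regular expression may be the answer.
Each line of data may be different. Some lines may be missing one or more of these fields, each line may have a different length, and so on. Can that still be handled with a regular expression somehow?
When I try to extract the data I get an ArrayIndexOutOfBoundsException, and I don't know why.
For the sample line, the split produces the following elements: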
0 : 200.88.223.98
1 : -
2 : -
3 : [01/Feb/2007:04:02:22
4 : -0500]
5 : "GET
6 : /gallery/v/events/album02/contests/programmingContest05/?g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_fromNavId=x332be852
7 : HTTP/1.1"
8 : 200
9 : 52464
10 : "http://cs.tcnj.edu/gallery/main.php?g2_view=comment.AddComment&g2_itemId=664&g2_return=http%3A%2F%2Fcs.tcnj.edu%2Fgallery%2Fv%2Fevents%2Falbum02%2Fcontests%2FprogrammingContest05%2F%3Fg2_GALLERYSID%3D3be9666f9c07e16b7f33e2ea8acb8dd2&g2_GALLERYSID=3be9666f9c07e16b7f33e2ea8acb8dd2&g2_returnName=album"
11 : "Opera/6.01
12 : (Windows
13 : 98;
14 : U)
15 : [en]"
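The regular-expression idea raised in the comments could be sketched as follows. This is a minimal sketch, not a definitive parser: it assumes the lines follow the usual Apache access-log layout seen in the sample, treats the quoted referer and agent as optional, and accepts "-" for a missing byte count. The class name AccessLogParser, the method parse(), the map keys, and the shortened sample line are all hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AccessLogParser {
    // address, two ignored fields, [date], "request", status, bytes,
    // then an optional pair of quoted fields: "refer" "agent".
    private static final Pattern LINE = Pattern.compile(
            "^(\\S+) \\S+ \\S+ \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\d+|-)"
            + "(?: \"([^\"]*)\" \"([^\"]*)\")?$");

    // Returns the named fields, or an empty Optional if the line does not fit the pattern.
    static Optional<Map<String, String>> parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("address", m.group(1));
        fields.put("date", m.group(2));
        fields.put("request", m.group(3));
        fields.put("status", m.group(4));
        fields.put("bytes", m.group(5));
        // The optional groups are null when the line stops after the byte count.
        fields.put("refer", m.group(6) == null ? "-" : m.group(6));
        fields.put("agent", m.group(7) == null ? "-" : m.group(7));
        return Optional.of(fields);
    }

    public static void main(String[] args) {
        // Shortened stand-in for the sample line (hypothetical, URLs abbreviated).
        String line = "200.88.223.98 - - [01/Feb/2007:04:02:22 -0500] "
                + "\"GET /gallery/ HTTP/1.1\" 200 52464 "
                + "\"http://cs.tcnj.edu/gallery/main.php\" "
                + "\"Opera/6.01 (Windows 98; U) [en]\"";
        parse(line).ifPresent(fields ->
                fields.forEach((name, value) -> System.out.println(name + " = " + value)));
    }
}
```

Because a line that does not match yields an empty Optional instead of a short array, the caller can skip or log malformed lines rather than hit an ArrayIndexOutOfBoundsException. Lines that deviate from the assumed layout in other ways would need a looser pattern.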