Java Scanner Delimiter


I am using Scanner with a delimiter to tokenize my .txt file (this is homework I have to do). The first version of the file looks like this:

5,5,5,6,5,8,9,5,6,8, good, very good, excellent, good
7,7,8,7,6,7,8,8,9,7,very good, Good, excellent, very good
8,7,6,7,8,7,5,6,8,7 ,GOOD, VERY GOOD, GOOD, AVERAGE
9,9,9,8,9,7,9,8,9,9 ,Excellent, very good, very good, excellent
7,8,8,7,8,7,8,9,6,8 ,very good, good, excellent, excellent
6,5,6,4,5,6,5,6,6,6 ,good, average, good, good
7,8,7,7,6,8,7,8,6,6 ,good, very good, good,  very good
5,7,6,7,6,7,6,7,7,7  ,excellent, very good, very good, very good
For that version I used
useDelimiter("[ ]*(,)[ ]*")
The second version of the file looks like this:

5 5 5 6 5 8 9 5 6 8 good, very good, excellent, good
7 7 8 7 6 7 8 8 9 7 very good, Good, excellent, very good
8 7 6 7 8 7  5 6 8 7 GOOD, VERY GOOD, GOOD, AVERAGE
9 9 9 8 9 7 9  8 9 9 Excellent, very good, very good, excellent
7 8 8 7 8 7 8 9 6 8 very good, good, excellent, excellent
6 5 6 4 5 6 5 6 6 6 good, average, good, good
7  8 7 7 6 8 7 8 6 6 good, very good, good,  very good
5 7 6 7 6 7 6 7 7 7  excellent, very good, very good, very good
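I also can't come up with a regexp that lets me separate the numbers by whitespace and the words by commas. Essentially I need an array of 14 values (a single variable is fine).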

Note that there are runs of multiple spaces (they are there to make it harder for us).

So any kind of help would be much appreciated.


Also, we are only allowed to use delimiters (no split and so on).

You can try this:
(?)?

Try this:

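Note that Scanner lets you change the delimiter at any time. If you can rely on the input text always having 10 numbers at the start and 4 word phrases at the end, you can simply start with a delimiter that splits on whitespace (\s+) and, after 10 calls to nextInt(), switch to a delimiter that splits on commas with optional surrounding whitespace (\s*,\s*).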

For example:

String input = "5 5 5 6 5 8 9 5 6 8 good, very good, excellent, good";
Scanner scanner = new Scanner(input).useDelimiter("\\s+");
int[] results = new int[14];
for (int i = 0; i < 10; ++i) {
    results[i] = scanner.nextInt();
}
scanner.useDelimiter("\\s*,\\s*");
scanner.skip("\\s*");
for (int i = 10; i < 14; ++i) {
    String wordPhrase = scanner.next();
    int wordValue;
    if ("average".equalsIgnoreCase(wordPhrase))
        wordValue = 1;
    else if ("good".equalsIgnoreCase(wordPhrase))
        wordValue = 2;
    else if ("very good".equalsIgnoreCase(wordPhrase))
        wordValue = 3;
    else if ("excellent".equalsIgnoreCase(wordPhrase))
        wordValue = 4;
    else
        wordValue = 0;
    results[i] = wordValue;
}
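
The scanner.skip("\\s*") call matters: changing the delimiter does not consume the whitespace left between the last number and the first phrase. A minimal sketch of the difference, assuming a shortened made-up input line and a hypothetical class name SkipDemo:

```java
import java.util.Scanner;

public class SkipDemo {
    // Reads one number with a whitespace delimiter, switches to the comma
    // delimiter, and returns the first phrase. The skipWhitespace flag
    // decides whether the leftover whitespace is skipped first.
    public static String firstPhrase(String input, boolean skipWhitespace) {
        Scanner scanner = new Scanner(input).useDelimiter("\\s+");
        scanner.nextInt();                   // consume the leading number
        scanner.useDelimiter("\\s*,\\s*");
        if (skipWhitespace) {
            scanner.skip("\\s*");            // drop the gap before the phrase
        }
        return scanner.next();
    }

    public static void main(String[] args) {
        String input = "8 good, very good";  // shortened sample line (assumption)
        System.out.println("[" + firstPhrase(input, false) + "]"); // prints "[ good]"
        System.out.println("[" + firstPhrase(input, true) + "]");  // prints "[good]"
    }
}
```

Without the skip, the first phrase keeps its leading space, so none of the equalsIgnoreCase comparisons match and the value falls through to 0.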

It is also possible to do this with a single delimiter regex, but that is probably a bit advanced for a simple homework problem.
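
For the curious, here is one way a single delimiter might look. This is only a sketch under assumptions: the regex is an illustrative guess and not the one from the other answers, the class and method names are hypothetical, and it assumes the phrases never contain digits. The idea is to treat whitespace as a delimiter only when it touches a digit, so the space inside "very good" stays part of the token.

```java
import java.util.Arrays;
import java.util.Scanner;

public class SingleDelimiterDemo {
    // One delimiter for both file layouts (illustrative assumption):
    //   \s*,\s*      a comma with optional whitespace around it
    //   \s+(?=\d)    whitespace followed by a digit (lookahead)
    //   (?<=\d)\s+   whitespace preceded by a digit (lookbehind)
    // Whitespace between two letters matches none of these alternatives.
    static final String DELIMITER = "\\s*,\\s*|\\s+(?=\\d)|(?<=\\d)\\s+";

    public static int[] parseLine(String line) {
        Scanner scanner = new Scanner(line).useDelimiter(DELIMITER);
        int[] results = new int[14];
        for (int i = 0; i < 10; ++i) {
            results[i] = scanner.nextInt();
        }
        for (int i = 10; i < 14; ++i) {
            results[i] = phraseValue(scanner.next().trim());
        }
        return results;
    }

    // Same phrase-to-number mapping as in the example above.
    static int phraseValue(String phrase) {
        if ("average".equalsIgnoreCase(phrase)) return 1;
        if ("good".equalsIgnoreCase(phrase)) return 2;
        if ("very good".equalsIgnoreCase(phrase)) return 3;
        if ("excellent".equalsIgnoreCase(phrase)) return 4;
        return 0;
    }

    public static void main(String[] args) {
        String line = "5 5 5 6 5 8 9 5 6 8 good, very good, excellent, good";
        // prints [5, 5, 5, 6, 5, 8, 9, 5, 6, 8, 2, 3, 4, 2]
        System.out.println(Arrays.toString(parseLine(line)));
    }
}
```

The same delimiter also handles the comma-separated layout, because the first alternative absorbs any whitespace around each comma.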

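This should do it; the key is the positive lookahead.

The regex character classes for "whitespace" (\s), "word" (\w) and "digit" (\d), as well as the "word boundary" (\b), might help you.

Do you want a comma between the last number and the first word?

I don't need to add anything to the file; I just need to extract the values and put them into a multidimensional array. In this case it would be int[8][14], with the words replaced by the appropriate numbers.

First, this is wrong. I guess you meant (\s*,\s*)|(\s+). But that doesn't work either. It would split "very good" into two tokens.

The numbers seem fine, but the strings each contain only one letter.

Works great, but if he asks me to explain it, I wouldn't know where to start. Hehe :) Actually, | (alternation) is kind of like an "or" condition?

Also, "very good" gets split into separate tokens when it should be a single token.

Thanks for the suggestion, why didn't I think of that before :) I have one problem: index 10 always seems to be 0.

Ah, oops. I think that when you switch delimiters, the Scanner does not consume the whitespace between the last number and the first word, so the first word phrase ends up as something like " good". I updated the answer to tell the Scanner to skip whitespace after changing the delimiter.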
Token: .9.
Token: .9.
Token: .9.
Token: .8.
Token: .9.
Token: .7.
Token: .9.
Token: .8.
Token: .9.
Token: .9.
Token: .Excellent.
Token: .very good.
Token: .very good.
Token: .excellent.