Splitting a string with a Java regular expression

I have a string like this:

Snt:It was the most widespread day of environmental action in the planet's history
====================
-----------
Snt:Five years ago, I was working for just over minimum wage
====================
-----------
I want to split the string on this delimiter:

====================
-----------
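and, of course, strip the Snt: prefix from the first sentence. What is the best way to do this?

I used this regex, but it does not work:

String[] content1 = content.split("\\n\\====================\\n\\-----------\\n");

Thanks in advance.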

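Because the last line has no trailing newline, your pattern does not match the final ==== / ---- pair. You need to add the end-of-line anchor $ at the end of the regex as an alternative to \n: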

String s = "Snt:It was the most widespread day of environmental action in the planet's history\n" +
"====================\n" +
"-----------\n" +
"Snt:Five years ago, I was working for just over minimum wage\n" +
"====================\n" +
"-----------";
String m = s.replaceAll("(?m)^Snt:", "");
String[] tok = m.split("\\n\\====================\\n\\-----------(?:\\n|$)");
System.out.println(Arrays.toString(tok));
Output:

[It was the most widespread day of environmental action in the planet's history, Five years ago, I was working for just over minimum wage]
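To make the difference concrete, here is a minimal sketch that runs the original delimiter and the corrected one against the same input. The class name SplitDemo, the method splitRecords, and the shortened sample sentences are made up for illustration; the delimiter is the one from the answer above, written with {20} and {11} quantifiers instead of literal runs.

```java
import java.util.Arrays;

public class SplitDemo {
    // Newline, 20 '=', newline, 11 '-', then either a newline or the end of the input.
    static final String DELIMITER = "\\n={20}\\n-{11}(?:\\n|$)";

    // Hypothetical helper: strips "Snt:" at the start of every line, then splits on the delimiter.
    public static String[] splitRecords(String content) {
        return content.replaceAll("(?m)^Snt:", "").split(DELIMITER);
    }

    public static void main(String[] args) {
        String s = "Snt:first sentence\n====================\n-----------\n"
                 + "Snt:second sentence\n====================\n-----------";

        // Original pattern: it demands a newline after the dashes, so the
        // final delimiter is not matched and stays glued to the last token.
        String[] broken = s.split("\\n={20}\\n-{11}\\n");
        System.out.println(broken[broken.length - 1].endsWith("-----------")); // prints true

        // Corrected pattern: the final delimiter is matched through the $ alternative.
        System.out.println(Arrays.toString(splitRecords(s)));
    }
}
```

The same helper should also cope with input that does end in a newline, because split drops trailing empty strings by default.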

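Because of the way the data is structured, I would flip the concept around and use a Matcher instead of a split. This also lets you match the Snt: prefix neatly: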

private static final String VAL = "Snt:It was the most widespread day of environmental action in the planet's history\n"
        + "====================\n"
        + "-----------\n"
        + "Snt:Five years ago, I was working for just over minimum wage\n"
        + "====================\n"
        + "-----------";

public static void main(String[] args) {
    List<String> phrases = new ArrayList<String>();
    Matcher mat = Pattern.compile("Snt:(.+?)\n={20}\n-{11}\\s*").matcher(VAL);
    while (mat.find()) {
        phrases.add(mat.group(1));
    }

    System.out.printf("Value: %s%n", phrases); 
}
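I use the regex: Snt:(.+?)\n={20}\n-{11}\\s*

This assumes that the first word in the file is Snt:. The pattern then captures the phrase that follows, up to the delimiter, and consumes any trailing whitespace so that the expression is positioned at the start of the next record.

The advantage of this approach is that each match covers exactly one record, rather than having an expression that matches the tail end of one record and possibly the start of the next.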
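The "one match per record" claim can be checked directly by looking at match offsets. The sketch below is only an illustration: the class name RecordSpans and both method names are hypothetical, and the pattern is the one from this answer with the newlines written as regex escapes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RecordSpans {
    // Same pattern as in the answer; "\\n" is the regex escape for a newline.
    static final Pattern RECORD = Pattern.compile("Snt:(.+?)\\n={20}\\n-{11}\\s*");

    // Collects the captured sentence of every record.
    public static List<String> sentences(String text) {
        List<String> result = new ArrayList<>();
        Matcher mat = RECORD.matcher(text);
        while (mat.find()) {
            result.add(mat.group(1));
        }
        return result;
    }

    // Returns true if the matches tile the input without gaps,
    // i.e. every match starts exactly where the previous one ended.
    public static boolean coversWholeInput(String text) {
        Matcher mat = RECORD.matcher(text);
        int expectedStart = 0;
        while (mat.find()) {
            if (mat.start() != expectedStart) {
                return false;
            }
            expectedStart = mat.end();
        }
        return expectedStart == text.length();
    }

    public static void main(String[] args) {
        String text = "Snt:first sentence\n====================\n-----------\n"
                    + "Snt:second sentence\n====================\n-----------";
        System.out.println(sentences(text));
        System.out.println(coversWholeInput(text));
    }
}
```

If coversWholeInput returns false, the input contains something that is not a well-formed record, which a plain split would silently pass through.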

How about this:

Pattern p = Pattern.compile("^Snt:(.*)$", Pattern.MULTILINE);
Matcher m = p.matcher(str);

while (m.find()) {
    String sentence = m.group(1);
}
Rather than hacking around with split and doing extra parsing, this simply looks for lines that start with Snt: and captures whatever follows.
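As a runnable version of that idea, the sketch below collects the captured sentences into a list instead of discarding them inside the loop. The class name SntLines and the method extract are made up for illustration; the pattern and the MULTILINE flag are taken from the snippet above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SntLines {
    // With MULTILINE, ^ and $ match at the start and end of each line,
    // so the delimiter lines are simply never matched.
    private static final Pattern SNT_LINE = Pattern.compile("^Snt:(.*)$", Pattern.MULTILINE);

    // Returns the text after "Snt:" for every line that starts with that prefix.
    public static List<String> extract(String str) {
        List<String> sentences = new ArrayList<>();
        Matcher m = SNT_LINE.matcher(str);
        while (m.find()) {
            sentences.add(m.group(1));
        }
        return sentences;
    }

    public static void main(String[] args) {
        String str = "Snt:first sentence\n====================\n-----------\n"
                   + "Snt:second sentence\n====================\n-----------";
        System.out.println(extract(str));
    }
}
```

Note that this version does not verify that each sentence is actually followed by the ==== / ---- delimiter; it only relies on the Snt: prefix.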

Matcher m = Pattern.compile("([^=\\-]+)([=\\-]+[\\t\\n\\s]*)+").matcher(str);   
while (m.find()) {
    String match = m.group(1);
    System.out.println(match);
}

Use content.replaceAll("Snt:", "") and then split.

This is probably not the best use of split. Are you reading these lines from a file? Perhaps checking the lines returned by a BufferedReader is what you really want to do.

You forgot to use the Pattern.MULTILINE flag to let $ match the end of a line rather than only the end of the string. Anyway, +1: this cannot reasonably be done with split, unless we are willing to ignore the first element of the result array, because the Snt: prefix also needs to be removed.
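The BufferedReader suggestion from the comments could look roughly like the sketch below. It is an assumption about what the commenter had in mind, not code from the thread: the class name SntReader and the method readSentences are hypothetical, and a StringReader stands in for the file so that the example is self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class SntReader {
    private static final String PREFIX = "Snt:";

    // Reads the source line by line and keeps only the lines that start
    // with "Snt:", with the prefix removed. Delimiter lines are skipped.
    public static List<String> readSentences(Reader source) {
        List<String> sentences = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith(PREFIX)) {
                    sentences.add(line.substring(PREFIX.length()));
                }
            }
        } catch (IOException e) {
            // Wrapped so that callers do not have to declare a checked exception.
            throw new UncheckedIOException(e);
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "Snt:first sentence\n====================\n-----------\n"
                    + "Snt:second sentence\n====================\n-----------";
        System.out.println(readSentences(new StringReader(text)));
    }
}
```

For a real file, a FileReader or Files.newBufferedReader could be passed in place of the StringReader; no regex is needed at all in this variant.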