Java regex: get all the words before a substring in a string


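I have a string containing a sentence, and I want to split it in two based on a word. I have the regex

(\\w+) word

which I thought would give me all the words before "word", plus "word" itself, so that I could then just remove the last four characters.

However, this doesn't seem to work. Any idea what I'm doing wrong?

Thanks.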

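You need to tokenize the parts of the sentence before and after the word:

String[] result = "this is a test".split("\\s"); // replace \\s with your word
for (int x = 0; x < result.length; x++)
    System.out.println(result[x]); // print each part of the split sentence

This seems to work: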
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("([\\w\\s]+) word");
        Matcher m = p.matcher("Could you test a phrase with some word");
        while (m.find()) {
            System.err.println(m.group(1));
            System.err.println(m.group());
        }
    }
}

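The reason is that + is a greedy quantifier: it will match the whole string, including the specified word, and will not give anything back.
If you change it to (\\w+?) word (a reluctant quantifier), it should work. See the Java regex documentation on quantifiers for details of exactly what they do.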
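As a quick check of the patterns discussed so far, the sketch below prints what group 1 captures for each of them on the sample sentence. The class name `QuantifierCheck` and the helper `firstGroup` are made up for this illustration. The run suggests that the quantifier type is not the deciding factor here: `\w` never matches a space, so both `(\\w+) word` and `(\\w+?) word` can only capture the single word directly in front of " word".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierCheck {
    // Returns group 1 of the first match, or null when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "Could you test a phrase with some word";

        // \w cannot cross a space, so only the last word before " word" is captured.
        System.out.println(firstGroup("(\\w+) word", input));      // some
        System.out.println(firstGroup("(\\w+?) word", input));     // some

        // Allowing whitespace inside the group captures the whole prefix.
        System.out.println(firstGroup("([\\w\\s]+) word", input)); // Could you test a phrase with some
        System.out.println(firstGroup("(.+?) word", input));       // Could you test a phrase with some
    }
}
```

The last two patterns differ in one respect: `[\\w\\s]+` is greedy and stops before the last " word" in the input, while `.+?` is reluctant and stops before the first one.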

Using plain string operations:
int idx = sentence.indexOf(word);
if (idx < 0)
  throw new IllegalArgumentException("Word not found.");
String before = sentence.substring(0, idx);
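
Since the question asks for the sentence to be split in two, the same `indexOf` idea can be extended to return both halves. This is only a sketch: the class `SplitAtWord`, the method `splitAt`, and the decision to trim surrounding whitespace are assumptions made for the example, not part of the answer above.

```java
class SplitAtWord {
    // Returns {text before the word, text after the word}.
    // Throws if the word does not occur in the sentence.
    static String[] splitAt(String sentence, String word) {
        int idx = sentence.indexOf(word);
        if (idx < 0)
            throw new IllegalArgumentException("Word not found.");
        String before = sentence.substring(0, idx).trim();
        String after = sentence.substring(idx + word.length()).trim();
        return new String[] { before, after };
    }

    public static void main(String[] args) {
        String[] parts = splitAt("This is a test sentence", "test");
        System.out.println(parts[0]); // This is a
        System.out.println(parts[1]); // sentence
    }
}
```

Note that `indexOf` looks for a plain substring, so "word" would also be found inside "swordfish"; if whole-word matching matters, a regex with a word boundary is probably the safer choice.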

Or, with a regex:
Pattern p = Pattern.compile("(.*?)" + Pattern.quote(word) + ".*");
Matcher m = p.matcher(sentence);
if (!m.matches())
  throw new IllegalArgumentException("Word not found.");
String before = m.group(1);
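
The `Pattern.quote` call in the regex variant matters when the word contains regex metacharacters. The sketch below compares the quoted pattern with an unquoted one; the class `QuoteDemo`, both helper methods, and the sample sentence are invented for the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteDemo {
    // Text before the first literal occurrence of word, or null when absent.
    static String before(String sentence, String word) {
        Matcher m = Pattern.compile("(.*?)" + Pattern.quote(word) + ".*").matcher(sentence);
        return m.matches() ? m.group(1) : null;
    }

    // Same pattern without quoting: the word is interpreted as a regex.
    static String beforeUnquoted(String sentence, String word) {
        Matcher m = Pattern.compile("(.*?)" + word + ".*").matcher(sentence);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String sentence = "We know that 1+1 equals 2";
        // Quoted: "1+1" is matched literally.
        System.out.println(before(sentence, "1+1"));         // "We know that " (with trailing space)
        // Unquoted: "1+1" means "one or more 1s, then a 1", which never occurs here.
        System.out.println(beforeUnquoted(sentence, "1+1")); // null
    }
}
```

One further caveat: `.` does not match line terminators by default, so `matches()` would fail on a multi-line sentence unless the pattern is compiled with `Pattern.DOTALL`.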

Try this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
    public static void main(String[] args) {
        Pattern p = Pattern.compile("^.*?(?= word)");
        Matcher m = p.matcher("Everything before the word");
        while (m.find()) {
            System.out.println(m.group());
        }
    }
}

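It breaks down as follows:
^.*?     everything from the start of the string (reluctant)
(?=      before (opens the lookahead)
 word    the word, preceded by a space
)        end of the lookahead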
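
The lookahead pattern can be parameterized on the word. The sketch below is one possible variant, not the answer's own code: the class `LookaheadDemo`, the method `before`, the use of `\\s+` instead of a single space, and the trailing `\\b` are all assumptions added for the example. The `\\b` keeps the lookahead from stopping in front of a longer word such as "wordy".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // Everything before the first whole-word occurrence of word
    // (preceded by whitespace), or null when there is none.
    static String before(String sentence, String word) {
        Pattern p = Pattern.compile("^.*?(?=\\s+" + Pattern.quote(word) + "\\b)");
        Matcher m = p.matcher(sentence);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(before("Everything before the word", "word")); // Everything before the
        // Without \b the match would stop in front of "wordy".
        System.out.println(before("the wordy word", "word"));             // the wordy
        System.out.println(before("no match here", "word"));              // null
    }
}
```

Because the lookahead does not consume anything, the matched text ends before the whitespace that precedes the word, so no trimming is needed.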
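
Comments:

Code would help more than a description of the problem. Maybe consider the non-greedy quantifier "+?" instead of "+".

"This doesn't seem to work." Huh? What happens? What do you want to happen?

Why not just use the word itself as the pattern? With the matcher's find() you can get its index in the string.

I can help expand my example if needed, but at a quick glance: the parts of the sentence are stored in an array, with the sentence split on the word you are using.

+ is greedy, but it does allow backtracking. The possessive equivalent is ++.

OK, I hadn't really figured out quantifiers at that point. I suppose backtracking means you actually have to specify in the regex where and what to give back?

However, if the input string contains the word he is looking for, you will automatically find 2 matches.

By backtracking I mean that the expression "\\w+\\w" will match "xy". The matcher first matches "\\w+" against "xy", then realizes there is nothing left for the second "\\w" to match. So it backtracks, matching "\\w+" against "x" and the second "\\w" against "y".

Oh yes, silly of me. That sums it up nicely :)

I wasn't being rude here, I was stating a fact...

I didn't think code formatting was necessary, since the question was about the regex itself and I assumed he already knew how to compile the expression. I gave the expression and broke it down to show what each part does. I'll try to be more descriptive in future; I'm brand new to Stack Overflow.

Your edit is much better already, and I have removed my downvote. Have fun!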
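
The backtracking behaviour described in the comments can be checked directly. The sketch below compares the greedy `\\w+` with the possessive `\\w++` on the same input; the class name `BacktrackDemo` and the helper `fullMatch` are invented for the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BacktrackDemo {
    // True when the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Greedy: \w+ first takes "xy", then gives back "y" so the second \w can match.
        System.out.println(fullMatch("\\w+\\w", "xy"));  // true
        // Possessive: \w++ takes "xy" and never gives anything back.
        System.out.println(fullMatch("\\w++\\w", "xy")); // false

        // Capturing groups show where the greedy version ends up after backtracking.
        Matcher m = Pattern.compile("(\\w+)(\\w)").matcher("xyz");
        if (m.matches()) {
            System.out.println(m.group(1)); // xy
            System.out.println(m.group(2)); // z
        }
    }
}
```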