Java: splitting a string when a word has quotes in the middle

I would really appreciate help with Java code that splits the following inputs:

word1 key="value with space" word3 -> [ "word1", "key=\"value with space\"", "word3" ]
word1 "word2 with space" word3 -> [ "word1", "word2 with space", "word3" ]
word1 word2 word3 -> [ "word1" , "word2", "word3" ]

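The first sample input is the hard one: the second word has quotes in the middle of the token rather than at the beginning. I have found several ways to handle the middle example.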

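This can be done by combining a regex with replace. First find the text enclosed in quotes and replace its spaces with a placeholder that contains no spaces. Then split the string on spaces and swap the placeholder back to spaces in each piece.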

    // Requires java.util.ArrayList, java.util.List, java.util.regex.Matcher, java.util.regex.Pattern.
    String s1 = "word1 key=\"value with space\" word3";

    List<String> list = new ArrayList<String>();
    Matcher m = Pattern.compile("\"([^\"]*)\"").matcher(s1);
    while (m.find()) {
        // Replace the spaces between quotes with "||". The matcher keeps scanning
        // the original string, so reassigning s1 here is safe.
        s1 = s1.replace(m.group(1), m.group(1).replace(" ", "||"));
    }

    for (String s : s1.split(" ")) {
        String word = s.replace("||", " "); // switch the placeholder back to a space
        list.add(word);
        System.out.println(word); // just to see the output
    }
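
The snippet above has two soft spots: an input that already contains `||` is altered, and `String.replace` also rewrites any copy of the quoted text that appears outside the quotes. A minimal sketch of one way around both, assuming Java 9 or later (for `Matcher.replaceAll` with a function argument) and assuming the input never contains the control character U+0001. The class name `PlaceholderSplit` and the constant `PLACEHOLDER` are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderSplit {
    // Assumption: this control character never occurs in real input.
    private static final char PLACEHOLDER = '\u0001';
    // A quoted section: opening quote, any non-quote characters, closing quote.
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*\"");

    static List<String> split(String input) {
        // Mask the spaces inside each quoted section only (Java 9+ overload).
        String masked = QUOTED.matcher(input).replaceAll(
                r -> Matcher.quoteReplacement(r.group().replace(' ', PLACEHOLDER)));

        List<String> words = new ArrayList<>();
        for (String part : masked.trim().split(" +")) {
            if (!part.isEmpty()) {
                // Restore the masked spaces in each piece.
                words.add(part.replace(PLACEHOLDER, ' '));
            }
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(split("word1 key=\"value with space\" word3"));
        // prints [word1, key="value with space", word3]
    }
}
```

Because the replacement function only ever sees the matched quoted section, text outside the quotes is left untouched.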

Without using regular expressions at all, you can do a simple iteration over the string:

// Requires java.util.ArrayList and java.util.List.
public static String[] splitWords(String str) {
    List<String> array = new ArrayList<>();
    boolean inQuote = false; // Marker telling us if we are between quotes
    int previousStart = -1;  // The index of the beginning of the current word, or -1 if none
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        if (Character.isWhitespace(c)) {
            if (previousStart != -1 && !inQuote) {
                // end of word
                array.add(str.substring(previousStart, i));
                previousStart = -1;
            }
        } else {
            // possibly new word
            if (previousStart == -1) previousStart = i;
            // toggle state of quote
            if (c == '"')
                inQuote = !inQuote;
        }
    }
    // Add last segment if there is one
    if (previousStart != -1)
        array.add(str.substring(previousStart));
    return array.toArray(new String[0]);
}
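
Note that the loop keeps the quotes around a token that is quoted as a whole, while the second example in the question shows that token without quotes. If that is the behavior you want, here is a minimal sketch of a post-processing step. `QuoteStripper`, `unquote` and `unquoteAll` are hypothetical names introduced for this illustration, and the rule "strip only when the token is one quoted section from start to end" is an assumption read off the question's examples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class QuoteStripper {
    // Removes the enclosing quotes only when the whole token is a single
    // quoted section; tokens such as key="value" are returned unchanged.
    static String unquote(String token) {
        int last = token.length() - 1;
        boolean wrapped = token.length() >= 2
                && token.charAt(0) == '"'
                && token.charAt(last) == '"'
                && token.indexOf('"', 1) == last; // no other quote in between
        return wrapped ? token.substring(1, last) : token;
    }

    static List<String> unquoteAll(List<String> tokens) {
        List<String> result = new ArrayList<>();
        for (String token : tokens) {
            result.add(unquote(token));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("word1", "\"word2 with space\"", "word3");
        System.out.println(unquoteAll(tokens));
        // prints [word1, word2 with space, word3]
    }
}
```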

The split can be done with a lookahead in the regular expression:

String[] words = input.split(" +(?=(([^\"]*\"){2})*[^\"]*$)");
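
The lookahead accepts a run of spaces as a separator only when the rest of the input contains an even number of quotes, which means the spaces are not inside a quoted section. As a reading aid, here is a sketch of the same pattern assembled from named pieces. The class and constant names are hypothetical, and the groups are written as non-capturing `(?:...)`, which does not change what `split` returns.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LookaheadSplit {
    // Any non-quote characters followed by one quote.
    private static final String UP_TO_QUOTE = "[^\"]*\"";
    // Zero or more pairs of such chunks, i.e. an even number of quotes.
    private static final String QUOTE_PAIRS = "(?:(?:" + UP_TO_QUOTE + "){2})*";
    // Spaces followed by an even number of quotes up to the end of the input.
    private static final Pattern OUTSIDE_QUOTES =
            Pattern.compile(" +(?=" + QUOTE_PAIRS + "[^\"]*$)");

    static String[] split(String input) {
        return OUTSIDE_QUOTES.split(input);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("word1 key=\"value with space\" word3")));
        // prints [word1, key="value with space", word3]
    }
}
```

One consequence of counting quotes up to the end of the input: if the quotes are unbalanced, the split points shift. For example, `a "b c` is split into `a "b` and `c`, so it may be worth validating the input first if unbalanced quotes can occur.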

Here is some test code:

String[] inputs = { "word1 key=\"value with space\" word3","word1 \"word2 with space\" word3", "word1 word2 word3"};
for (String input : inputs) {
    String[] words = input.split(" +(?=(([^\"]*\"){2})*[^\"]*$)");
    System.out.println(Arrays.toString(words));
}

Output:

[word1, key="value with space", word3]
[word1, "word2 with space", word3]
[word1, word2, word3]

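This solved my problem! Please see my edit to your code for a few typo fixes. Thanks, Mad Physicist!

Glad this helped solve the problem, and thanks for the fixes. If the answer works for you, please accept it (and perhaps upvote it).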