Java: Tokenizing a string but ignoring delimiters within quotes

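I would like the following string

!cmd 45 90 "An argument" Another AndAnother "Another one in quotes"

to become an array of the following

{ "!cmd", "45", "90", "An argument", "Another", "AndAnother", "Another one in quotes" }

I tried

new StringTokenizer(cmd, "\"")

but this returns "Another" and "AndAnother" together as "Another AndAnother", which is not the desired effect.

Thanks.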
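To make the problem concrete, here is a minimal sketch of what the StringTokenizer attempt above produces. The class name TokenizerProblemDemo and the helper splitOnQuotes are made up for this illustration; only java.util.StringTokenizer is the real API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerProblemDemo {
    // Hypothetical helper: collects the tokens produced when the double quote is the only delimiter.
    static List<String> splitOnQuotes(String cmd) {
        List<String> parts = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(cmd, "\"");
        while (st.hasMoreTokens()) {
            parts.add(st.nextToken());
        }
        return parts;
    }

    public static void main(String[] args) {
        String cmd = "!cmd 45 90 \"An argument\" Another AndAnother \"Another one in quotes\"";
        for (String part : splitOnQuotes(cmd)) {
            System.out.println("[" + part + "]");
        }
        // Prints:
        // [!cmd 45 90 ]
        // [An argument]
        // [ Another AndAnother ]
        // [Another one in quotes]
    }
}
```

The unquoted runs come back as single tokens (with their surrounding spaces), because a space is never treated as a delimiter here.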

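Edit:

I have changed the example yet again. This time I believe it explains the situation best, although it is no different from the second example.

The example here just splits on the double-quote character.

Try the following: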

String str = "One two \"three four\" five \"six seven eight\" nine \"ten\"";
String strArr[] = str.split("\"|\\s"); // the backslash in \s must itself be escaped in a Java string literal

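This is a bit tricky because you need to escape the double quotes. This regular expression should tokenize the string using either whitespace (\s) or a double quote.

You should use String's split method because it accepts regular expressions, whereas the constructor argument for the delimiter in StringTokenizer does not. At the end of what I provided above, you can add the following: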

String s = ""; // must be initialized before it is appended to
for (String k : strArr) {
    s += k;
}
StringTokenizer strTok = new StringTokenizer(s);

Try this:

String str = "One two \"three four\" five \"six seven eight\" nine \"ten\"";
String[] strings = str.split("[ ]?\"[ ]?");

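I don't know the context of what you're trying to do, but it looks like you're trying to parse command-line arguments. In general, this is pretty tricky with all the escaping issues; if that is your goal, I'd personally look at something like JCommander.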
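The quote-based split in this answer leaves unquoted runs such as "One two" as a single element. One possible way to finish the job, sketched here under the assumption that quotes are balanced and never escaped, is to split on the quote character alone and then treat odd-indexed pieces as quoted tokens and even-indexed pieces as whitespace-separated words. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteSplitSketch {
    // Assumes balanced, unescaped double quotes; empty unquoted words are skipped.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        // Limit -1 keeps trailing empty pieces so the odd/even positions stay aligned.
        String[] parts = input.split("\"", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i % 2 == 1) {
                // Odd index: the text was between two quotes, keep it whole.
                tokens.add(parts[i]);
            } else {
                // Even index: ordinary text, split further on whitespace.
                for (String word : parts[i].trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        tokens.add(word);
                    }
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String str = "One two \"three four\" five \"six seven eight\" nine \"ten\"";
        System.out.println(tokenize(str));
        // Prints: [One, two, three four, five, six seven eight, nine, ten]
    }
}
```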

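Do it the old-fashioned way. Make a function that looks at each character in a for loop. If the character is a space, take everything up to that point (excluding the space) and add it as an entry to the array. Note the position, then do the same again, adding the next part to the array after the following space. When a double quote is encountered, set a boolean named "inQuote" to true, and ignore spaces while inQuote is true. When you hit a quote while inQuote is true, set it back to false and go back to breaking things up whenever a space is encountered. You can then extend this as necessary to support escape characters and so on.

Could this be done with a regex? I don't know, I guess so. But the whole function would take less time to write than this reply did.

In an old-fashioned way: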

public static String[] split(String str) {
    str += " "; // To detect last token when not quoted...
    ArrayList<String> strings = new ArrayList<String>();
    boolean inQuote = false;
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        if (c == '"' || c == ' ' && !inQuote) {
            if (c == '"')
                inQuote = !inQuote;
            if (!inQuote && sb.length() > 0) {
                strings.add(sb.toString());
                sb.delete(0, sb.length());
            }
        } else
            sb.append(c);
    }
    return strings.toArray(new String[strings.size()]);
}

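I assume that nested quotes are illegal, and also that empty tokens can be omitted.

In these kinds of scenarios it is much easier to use a Matcher and call find() than to use any kind of split.

That is, instead of defining the pattern for the delimiters between the tokens, you define the pattern for the tokens themselves.

Here's an example: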

    String text = "1 2 \"333 4\" 55 6    \"77\" 8 999";
    // 1 2 "333 4" 55 6    "77" 8 999

    String regex = "\"([^\"]*)\"|(\\S+)";

    Matcher m = Pattern.compile(regex).matcher(text);
    while (m.find()) {
        if (m.group(1) != null) {
            System.out.println("Quoted [" + m.group(1) + "]");
        } else {
            System.out.println("Plain [" + m.group(2) + "]");
        }
    }

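The above prints:

    Plain [1]
    Plain [2]
    Quoted [333 4]
    Plain [55]
    Plain [6]
    Quoted [77]
    Plain [8]
    Plain [999]

The pattern is essentially: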

"([^"]*)"|(\S+)
 \_____/  \___/
    1       2

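There are 2 alternates:

The first alternate matches the opening double quote, a sequence of anything but a double quote (captured in group 1), then the closing double quote.
The second alternate matches any sequence of non-whitespace characters, captured in group 2.
The order of the alternates matters in this pattern: if \S+ came first, it would also consume the quote characters.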
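As a sketch of how this pattern could be applied to the command string from the question, the loop can be wrapped in a small helper that returns an array. The names QuotedTokenizer and tokenize are hypothetical; the regex is the one shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedTokenizer {
    // Group 1: contents of a quoted token; group 2: a plain non-whitespace token.
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    static String[] tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // Exactly one of the two groups is non-null for each match.
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String cmd = "!cmd 45 90 \"An argument\" Another AndAnother \"Another one in quotes\"";
        System.out.println(Arrays.toString(tokenize(cmd)));
        // Prints: [!cmd, 45, 90, An argument, Another, AndAnother, Another one in quotes]
    }
}
```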

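Note that this does not handle escaped double quotes within quoted segments. If you need that, the pattern becomes more complicated, but the Matcher solution still works.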
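One possible shape for such a pattern, assuming a backslash escapes the character that follows it inside a quoted segment, is sketched below. This is an illustration rather than the pattern the answer had in mind; the class name and the unescaping step are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapedQuoteTokenizer {
    // Inside quotes, accept either a character that is not a quote or backslash,
    // or a backslash followed by any character (an escape sequence).
    private static final Pattern TOKEN =
            Pattern.compile("\"((?:[^\"\\\\]|\\\\.)*)\"|(\\S+)");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {
                // Drop the backslash of each escape sequence, keeping the escaped character.
                tokens.add(m.group(1).replaceAll("\\\\(.)", "$1"));
            } else {
                tokens.add(m.group(2));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The raw text is: say "he said \"hi\"" now
        String text = "say \"he said \\\"hi\\\"\" now";
        System.out.println(tokenize(text));
        // Prints: [say, he said "hi", now]
    }
}
```

An unterminated quote is not matched by the first alternate, so it falls through to the plain-token alternate with the quote character still attached.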

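Appendix: note that StringTokenizer is a legacy class. It is recommended to use java.util.Scanner or String.split instead, or of course java.util.regex for the most flexibility.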

/**
 * Splits a command on whitespaces. Preserves whitespace in quotes. Trims excess whitespace between chunks. Supports quote
 * escape within quotes. Failed escape will preserve escape char.
 *
 * @return List of split commands
 */
static List<String> splitCommand(String inputString) {
    List<String> matchList = new LinkedList<>();
    LinkedList<Character> charList = inputString.chars()
            .mapToObj(i -> (char) i)
            .collect(Collectors.toCollection(LinkedList::new));

    // Finite-State Automaton for parsing.

    CommandSplitterState state = CommandSplitterState.BeginningChunk;
    LinkedList<Character> chunkBuffer = new LinkedList<>();

    for (Character currentChar : charList) {
        switch (state) {
            case BeginningChunk:
                switch (currentChar) {
                    case '"':
                        state = CommandSplitterState.ParsingQuote;
                        break;
                    case ' ':
                        break;
                    default:
                        state = CommandSplitterState.ParsingWord;
                        chunkBuffer.add(currentChar);
                }
                break;
            case ParsingWord:
                switch (currentChar) {
                    case ' ':
                        state = CommandSplitterState.BeginningChunk;
                        String newWord = chunkBuffer.stream().map(Object::toString).collect(Collectors.joining());
                        matchList.add(newWord);
                        chunkBuffer = new LinkedList<>();
                        break;
                    default:
                        chunkBuffer.add(currentChar);
                }
                break;
            case ParsingQuote:
                switch (currentChar) {
                    case '"':
                        state = CommandSplitterState.BeginningChunk;
                        String newWord = chunkBuffer.stream().map(Object::toString).collect(Collectors.joining());
                        matchList.add(newWord);
                        chunkBuffer = new LinkedList<>();
                        break;
                    case '\\':
                        state = CommandSplitterState.EscapeChar;
                        break;
                    default:
                        chunkBuffer.add(currentChar);
                }
                break;
            case EscapeChar:
                switch (currentChar) {
                    case '"': // Intentional fall through
                    case '\\':
                        state = CommandSplitterState.ParsingQuote;
                        chunkBuffer.add(currentChar);
                        break;
                    default:
                        state = CommandSplitterState.ParsingQuote;
                        chunkBuffer.add('\\');
                        chunkBuffer.add(currentChar);
                }
        }
    }

    if (state != CommandSplitterState.BeginningChunk) {
        String newWord = chunkBuffer.stream().map(Object::toString).collect(Collectors.joining());
        matchList.add(newWord);
    }
    return matchList;
}

private enum CommandSplitterState {
    BeginningChunk, ParsingWord, ParsingQuote, EscapeChar
}

// Groovy script; @Grab is attached to the first import so the dependency is fetched before use
@Grab(group='org.apache.commons', module='commons-text', version='1.3')
import org.apache.commons.text.StringTokenizer
import org.apache.commons.text.matcher.StringMatcher
import org.apache.commons.text.matcher.StringMatcherFactory

def str = /is this   'completely "impossible"' or """slightly"" impossible" to parse?/

StringTokenizer st = new StringTokenizer( str )
StringMatcher sm = StringMatcherFactory.INSTANCE.quoteMatcher()
st.setQuoteMatcher( sm )

println st.tokenList

private static final AbstractStringMatcher.CharSetMatcher QUOTE_MATCHER = new AbstractStringMatcher.CharSetMatcher(
            "'\"".toCharArray());

public StringTokenizer setQuoteMatcher(final StringMatcher quote) {
        if (quote != null) {
            this.quoteMatcher = quote;
        }
        return this;
}

private int readWithQuotes(final char[] srcChars ...

// If we've found a quote character, see if it's followed by a second quote. If so, then we need to actually put the quote character into the token rather than end the token.
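The doubled-quote rule described in that comment can be illustrated with a small standalone tokenizer. This is a plain-Java sketch of the idea only, not the Commons Text implementation; the class name and the hasToken flag are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class DoubledQuoteTokenizer {
    // Splits on whitespace; inside double quotes, "" stands for one literal quote character.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuote = false;
        boolean hasToken = false; // true once the current token has started (even if still empty)
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inQuote) {
                if (c != '"') {
                    current.append(c);
                } else if (i + 1 < input.length() && input.charAt(i + 1) == '"') {
                    current.append('"'); // doubled quote: keep one and skip the second
                    i++;
                } else {
                    inQuote = false; // single quote: end of the quoted section
                }
            } else if (c == '"') {
                inQuote = true;
                hasToken = true;
            } else if (Character.isWhitespace(c)) {
                if (hasToken) {
                    tokens.add(current.toString());
                    current.setLength(0);
                    hasToken = false;
                }
            } else {
                current.append(c);
                hasToken = true;
            }
        }
        if (hasToken) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The raw text is: say "a ""b"" c" end
        System.out.println(tokenize("say \"a \"\"b\"\" c\" end"));
        // Prints: [say, a "b" c, end]
    }
}
```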

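import java.util.ArrayList;
import java.util.List;

public static void main(String[] args) {

    String text = "One two \"three four\" five \"six seven eight\" nine \"ten\"";
    String[] splits = text.split(" ");
    List<String> list = new ArrayList<>();
    String token = null; // non-null while we are inside a quoted section
    for (String s : splits) {
        if (token == null && s.startsWith("\"")) {
            if (s.length() > 1 && s.endsWith("\"")) {
                list.add(s); // single-word quoted token such as "ten"
            } else {
                token = s; // first word of a multi-word quoted token
            }
        } else if (token != null) {
            token = token + " " + s; // continue the quoted token
            if (s.endsWith("\"")) {
                list.add(token);
                token = null;
            }
        } else {
            list.add(s); // plain word outside quotes
        }
    }
    // Note: the surrounding quote characters are kept in the quoted tokens.
    System.out.println(list);
}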

/opt/jboss-eap/bin/jboss-cli.sh
--connect
--controller=localhost:9990
-c
command="deploy /app/jboss-eap-7.1/standalone/updates/sample.war --force"

private static void findWords(String str) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        if (c != ' ' && c != '"') {
            sb.append(c);
            continue;
        }
        // A delimiter was reached: print the pending word, if there is one.
        if (sb.length() > 0) {
            System.out.println(sb);
            sb.setLength(0);
        }
        if (c == '"') {
            // Consume everything up to the closing quote as a single word.
            i++;
            while (i < str.length() && str.charAt(i) != '"') {
                sb.append(str.charAt(i));
                i++;
            }
            System.out.println(sb);
            sb.setLength(0);
        }
    }
    // Print the last word when the input does not end with a delimiter.
    if (sb.length() > 0) {
        System.out.println(sb);
    }
}

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public final class StringUtilities {
    private static final List<Character> WORD_DELIMITERS = Arrays.asList(' ', '\t');
    private static final List<Character> QUOTE_CHARACTERS = Arrays.asList('"', '\'');
    private static final char ESCAPE_CHARACTER = '\\';

    private StringUtilities() {

    }

    public static String[] splitWords(String string) {
        StringBuilder wordBuilder = new StringBuilder();
        List<String> words = new ArrayList<>();
        char quote = 0;

        for (int i = 0; i < string.length(); i++) {
            char c = string.charAt(i);

            if (c == ESCAPE_CHARACTER && i + 1 < string.length()) {
                wordBuilder.append(string.charAt(++i));
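            } else if (WORD_DELIMITERS.contains(c) && quote == 0) {
                // Skip the empty words that leading or repeated delimiters would otherwise produce.
                if (wordBuilder.length() > 0) {
                    words.add(wordBuilder.toString());
                    wordBuilder.setLength(0);
                }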
            } else if (quote == 0 && QUOTE_CHARACTERS.contains(c)) {
                quote = c;
            } else if (quote == c) {
                quote = 0;
            } else {
                wordBuilder.append(c);
            }
        }

        if (wordBuilder.length() > 0) {
            words.add(wordBuilder.toString());
        }

        return words.toArray(new String[0]);
    }
}