Java: tokenizing a string but ignoring delimiters within quotes


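I would like the following string

!cmd 45 90 "An argument" Another AndAnother "Another one in quotes"
to become an array of the following

{ "!cmd", "45", "90", "An argument", "Another", "AndAnother", "Another one in quotes" }
I tried

new StringTokenizer(cmd, "\"")
but this returns "Another" and "AndAnother" as the single token " Another AndAnother ", which is not the desired effect.

Thanks.

Edit:
I have changed the example yet again. This time I believe it explains the situation best, although it is no different from the second example.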
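
To make the problem in the question concrete, here is a minimal sketch of what the quote-only StringTokenizer attempt returns. The class and method names (QuoteDelimiterDemo, tokenizeOnQuotes) are made up for illustration and are not part of the original post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class QuoteDelimiterDemo {
    // Hypothetical helper: tokenizes using the double quote as the only delimiter.
    static List<String> tokenizeOnQuotes(String cmd) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(cmd, "\"");
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String cmd = "!cmd 45 90 \"An argument\" Another AndAnother \"Another one in quotes\"";
        // Brackets make leading and trailing spaces visible.
        for (String t : tokenizeOnQuotes(cmd)) {
            System.out.println("[" + t + "]");
        }
    }
}
```

This yields four tokens. The unquoted stretches, such as "!cmd 45 90 " and " Another AndAnother ", come back as single tokens with their spaces intact, because the space is never treated as a delimiter.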

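The example you have here would just have to be split by the double quote character.

Try the following:

String str = "One two \"three four\" five \"six seven eight\" nine \"ten\"";
String strArr[] = str.split("\"|\\s");
This is a bit tricky because the double quote needs to be escaped. This regular expression should tokenize the string on either whitespace (\s, written as \\s in a Java string literal) or a double quote.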

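You should use String's split method because it accepts regular expressions, whereas the delimiter argument of StringTokenizer's constructor does not. At the end of what I provided above, you can add the following:

String s = ""; // must be initialized before using +=
for(String k : strArr) {
     s += k + " "; // keep a separator so the tokenizer can tell the pieces apart
}
StringTokenizer strTok = new StringTokenizer(s);
Try this: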

String str = "One two \"three four\" five \"six seven eight\" nine \"ten\"";
String[] strings = str.split("[ ]?\"[ ]?");
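
Splitting on quotes alone leaves the unquoted stretches (such as "One two") in one piece, so a second pass is needed. The following is a minimal sketch of that two-pass idea, assuming quotes are balanced and never escaped; the names TwoPassSplit and tokenize are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class TwoPassSplit {
    // Pass 1 splits on double quotes; pass 2 splits only the unquoted pieces on whitespace.
    // Assumes balanced, unescaped quotes (an assumption, not stated in the answers).
    static List<String> tokenize(String str) {
        List<String> tokens = new ArrayList<>();
        // Limit -1 keeps trailing empty pieces so odd/even positions stay aligned.
        String[] parts = str.split("\"", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i % 2 == 1) {
                // Odd-indexed pieces were inside quotes: keep them whole.
                tokens.add(parts[i]);
            } else {
                // Even-indexed pieces were outside quotes: split on whitespace.
                for (String word : parts[i].trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        tokens.add(word);
                    }
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String cmd = "!cmd 45 90 \"An argument\" Another AndAnother \"Another one in quotes\"";
        System.out.println(tokenize(cmd));
    }
}
```

For the string from the question this produces the seven tokens the asker wanted. An unbalanced quote would silently shift which pieces count as quoted, so real input would need validation first.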

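I don't know the context of what you are trying to do, but it looks like you are trying to parse command-line arguments. In general this is pretty tricky because of all the escaping issues; if that is your goal, I would personally look at something like JCommander.

Do it the old-fashioned way. Write a function that looks at each character in a for loop. If the character is a space, take everything up to that point (excluding the space) and add it as an entry to the array. Note the position, then do the same again, adding the next part to the array after the next space. When a double quote is encountered, set a boolean named "inQuote" to true, and ignore spaces while inQuote is true. When you hit a quote while inQuote is true, set it back to false and go back to breaking things up whenever a space is encountered. You can then extend this as necessary to support escape characters, etc.


Could this be done with a regex? I don't know, I guess so. But the whole function would take less time to write than this reply did.

In an old-fashioned way: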

public static String[] split(String str) {
    str += " "; // To detect last token when not quoted...
    ArrayList<String> strings = new ArrayList<String>();
    boolean inQuote = false;
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        if (c == '"' || c == ' ' && !inQuote) {
            if (c == '"')
                inQuote = !inQuote;
            if (!inQuote && sb.length() > 0) {
                strings.add(sb.toString());
                sb.delete(0, sb.length());
            }
        } else
            sb.append(c);
    }
    return strings.toArray(new String[strings.size()]);
}
I assume that nested quotes are illegal, and also that empty tokens can be omitted.

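In this kind of scenario, using a java.util.regex.Matcher and calling find() is much easier than any kind of split.

That is, instead of defining a pattern for the delimiter between the tokens, you define a pattern for the tokens themselves.

Here is an example: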

    String text = "1 2 \"333 4\" 55 6    \"77\" 8 999";
    // 1 2 "333 4" 55 6    "77" 8 999

    String regex = "\"([^\"]*)\"|(\\S+)";

    Matcher m = Pattern.compile(regex).matcher(text);
    while (m.find()) {
        if (m.group(1) != null) {
            System.out.println("Quoted [" + m.group(1) + "]");
        } else {
            System.out.println("Plain [" + m.group(2) + "]");
        }
    }
The above prints:

    Plain [1]
    Plain [2]
    Quoted [333 4]
    Plain [55]
    Plain [6]
    Quoted [77]
    Plain [8]
    Plain [999]

The pattern is essentially:

"([^"]*)"|(\S+)
 \_____/  \___/
    1       2
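There are two alternates:

  • The first alternate matches the opening double quote, a sequence of anything but a double quote (captured in group 1), then the closing double quote
  • The second alternate matches any sequence of non-whitespace characters, captured in group 2
  • The order of the alternates matters in this pattern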
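
As a small illustration of why the order of the alternates matters, the sketch below runs the pattern from this answer and a swapped variant over the same text. The class and helper names (AlternationOrderDemo, findAll) are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationOrderDemo {
    // Hypothetical helper: collects every whole match of regex in text.
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "1 \"333 4\" 55";
        // Quoted alternate first: the quoted segment is matched as one unit.
        System.out.println(findAll("\"([^\"]*)\"|(\\S+)", text));
        // Plain alternate first: \S+ also accepts the quote character,
        // so the quoted segment is broken at the space inside it.
        System.out.println(findAll("(\\S+)|\"([^\"]*)\"", text));
    }
}
```

With the quoted alternate first there are three matches; with the order swapped there are four, because "333 and 4" are each picked up by \S+ before the quoted alternate is ever tried.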
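Note that this does not handle escaped double quotes within quoted segments. If you need to do that, the pattern becomes more complicated, but the Matcher solution still works.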
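
One possible shape for such a more complicated pattern is sketched below. It assumes a backslash escapes the next character inside quotes, which is a convention chosen here and not something the answer specifies; EscapedQuoteTokenizer and tokenize are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapedQuoteTokenizer {
    // Group 1: quoted body, where a backslash escapes the following character.
    // Group 2: a run of non-whitespace characters.
    private static final Pattern TOKEN =
            Pattern.compile("\"((?:[^\"\\\\]|\\\\.)*)\"|(\\S+)");

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            if (m.group(1) != null) {
                // Drop the backslash of each escape pair: \" becomes ", \\ becomes \.
                tokens.add(m.group(1).replaceAll("\\\\(.)", "$1"));
            } else {
                tokens.add(m.group(2));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The input is: say "he said \"hi\"" now
        System.out.println(tokenize("say \"he said \\\"hi\\\"\" now"));
    }
}
```

The find() loop is unchanged from the simpler version; only the quoted alternate and the unescaping step differ.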


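Appendix: note that StringTokenizer is a legacy class. It is recommended to use String.split or, for the most flexibility, the java.util.regex classes.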


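    • This is an old question, but here is my solution as a finite-state machine.

      Efficient, predictable, and no fancy tricks.

      100% test coverage.

      Drag and drop it into your code.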

      /**
       * Splits a command on whitespaces. Preserves whitespace in quotes. Trims excess whitespace between chunks. Supports quote
       * escape within quotes. Failed escape will preserve escape char.
       *
       * @return List of split commands
       */
      static List<String> splitCommand(String inputString) {
          List<String> matchList = new LinkedList<>();
          LinkedList<Character> charList = inputString.chars()
                  .mapToObj(i -> (char) i)
                  .collect(Collectors.toCollection(LinkedList::new));
      
          // Finite-State Automaton for parsing.
      
          CommandSplitterState state = CommandSplitterState.BeginningChunk;
          LinkedList<Character> chunkBuffer = new LinkedList<>();
      
          for (Character currentChar : charList) {
              switch (state) {
                  case BeginningChunk:
                      switch (currentChar) {
                          case '"':
                              state = CommandSplitterState.ParsingQuote;
                              break;
                          case ' ':
                              break;
                          default:
                              state = CommandSplitterState.ParsingWord;
                              chunkBuffer.add(currentChar);
                      }
                      break;
                  case ParsingWord:
                      switch (currentChar) {
                          case ' ':
                              state = CommandSplitterState.BeginningChunk;
                              String newWord = chunkBuffer.stream().map(Object::toString).collect(Collectors.joining());
                              matchList.add(newWord);
                              chunkBuffer = new LinkedList<>();
                              break;
                          default:
                              chunkBuffer.add(currentChar);
                      }
                      break;
                  case ParsingQuote:
                      switch (currentChar) {
                          case '"':
                              state = CommandSplitterState.BeginningChunk;
                              String newWord = chunkBuffer.stream().map(Object::toString).collect(Collectors.joining());
                              matchList.add(newWord);
                              chunkBuffer = new LinkedList<>();
                              break;
                          case '\\':
                              state = CommandSplitterState.EscapeChar;
                              break;
                          default:
                              chunkBuffer.add(currentChar);
                      }
                      break;
                  case EscapeChar:
                      switch (currentChar) {
                          case '"': // Intentional fall through
                          case '\\':
                              state = CommandSplitterState.ParsingQuote;
                              chunkBuffer.add(currentChar);
                              break;
                          default:
                              state = CommandSplitterState.ParsingQuote;
                              chunkBuffer.add('\\');
                              chunkBuffer.add(currentChar);
                      }
              }
          }
      
          if (state != CommandSplitterState.BeginningChunk) {
              String newWord = chunkBuffer.stream().map(Object::toString).collect(Collectors.joining());
              matchList.add(newWord);
          }
          return matchList;
      }
      
      private enum CommandSplitterState {
          BeginningChunk, ParsingWord, ParsingQuote, EscapeChar
      }
      
      import org.apache.commons.text.StringTokenizer
      import org.apache.commons.text.matcher.StringMatcher
      import org.apache.commons.text.matcher.StringMatcherFactory
      @Grab(group='org.apache.commons', module='commons-text', version='1.3')
      
      def str = /is this   'completely "impossible"' or """slightly"" impossible" to parse?/
      
      StringTokenizer st = new StringTokenizer( str )
      StringMatcher sm = StringMatcherFactory.INSTANCE.quoteMatcher()
      st.setQuoteMatcher( sm )
      
      println st.tokenList
      
      private static final AbstractStringMatcher.CharSetMatcher QUOTE_MATCHER = new AbstractStringMatcher.CharSetMatcher(
                  "'\"".toCharArray());
      
      public StringTokenizer setQuoteMatcher(final StringMatcher quote) {
              if (quote != null) {
                  this.quoteMatcher = quote;
              }
              return this;
      }
      
      private int readWithQuotes(final char[] srcChars ...
      
      // If we've found a quote character, see if it's followed by a second quote. If so, then we need to actually put the quote character into the token rather than end the token.
      
      public static void main(String[] args) {

          String text = "One two \"three four\" five \"six seven eight\" nine \"ten\"";
          String[] splits = text.split(" ");
          List<String> list = new ArrayList<>();
          String token = null; // non-null while inside a quoted section
          for (String s : splits) {
              if (token == null && !s.startsWith("\"")) {
                  list.add(s); // plain word outside quotes
                  continue;
              }
              // Start a quoted token, or append to the one in progress.
              token = (token == null) ? s : token + " " + s;
              // The closing quote may sit on the same piece as the opening one, e.g. "ten".
              if (token.length() > 1 && token.endsWith("\"")) {
                  list.add(token); // the surrounding quotes are kept
                  token = null;
              }
          }
          System.out.println(list);
      }
      
      /opt/jboss-eap/bin/jboss-cli.sh
      --connect
      --controller=localhost:9990
      -c
      command="deploy /app/jboss-eap-7.1/standalone/updates/sample.war --force"
      
      private static void findWords(String str) {
          StringBuilder sb = new StringBuilder();
          for (int i = 0; i < str.length(); i++) {
              char c = str.charAt(i);
              if (c != ' ' && c != '"') {
                  sb.append(c);
                  continue;
              }
              // A space or quote ends the current plain word, if there is one.
              if (sb.length() > 0) {
                  System.out.println(sb);
                  sb = new StringBuilder();
              }
              if (c == '"') {
                  // Consume everything up to the closing quote as one token.
                  i++;
                  while (i < str.length() && str.charAt(i) != '"') {
                      sb.append(str.charAt(i));
                      i++;
                  }
                  System.out.println(sb);
                  sb = new StringBuilder();
              }
          }
          // Flush the last word when the input does not end with a delimiter.
          if (sb.length() > 0) {
              System.out.println(sb);
          }
      }
      
      import java.util.ArrayList;
      import java.util.Arrays;
      import java.util.List;

      public final class StringUtilities {
          private static final List<Character> WORD_DELIMITERS = Arrays.asList(' ', '\t');
          private static final List<Character> QUOTE_CHARACTERS = Arrays.asList('"', '\'');
          private static final char ESCAPE_CHARACTER = '\\';
      
          private StringUtilities() {
      
          }
      
          public static String[] splitWords(String string) {
              StringBuilder wordBuilder = new StringBuilder();
              List<String> words = new ArrayList<>();
              char quote = 0;
      
              for (int i = 0; i < string.length(); i++) {
                  char c = string.charAt(i);
      
                  if (c == ESCAPE_CHARACTER && i + 1 < string.length()) {
                      wordBuilder.append(string.charAt(++i));
                  } else if (WORD_DELIMITERS.contains(c) && quote == 0) {
                      // Skip empty words produced by leading or repeated delimiters.
                      if (wordBuilder.length() > 0) {
                          words.add(wordBuilder.toString());
                          wordBuilder.setLength(0);
                      }
                  } else if (quote == 0 && QUOTE_CHARACTERS.contains(c)) {
                      quote = c;
                  } else if (quote == c) {
                      quote = 0;
                  } else {
                      wordBuilder.append(c);
                  }
              }
      
              if (wordBuilder.length() > 0) {
                  words.add(wordBuilder.toString());
              }
      
              return words.toArray(new String[0]);
          }
      }