Splitting a string in C with the strtok function

c string

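I am trying to split some strings on whitespace. There is a problem with some of the splits, though: I want to split on whitespace, but substrings enclosed in quotes should stay together as one piece.

For example: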

char *pch;
char str[] = "hello \"Stack Overflow\" good luck!";
pch = strtok(str," ");
while (pch != NULL)
{
    printf ("%s\n",pch);
    pch = strtok(NULL, " ");
}

This gives me

hello
"Stack
Overflow"
good
luck!

But what I want is

hello
Stack Overflow
good
luck!

Any suggestions or ideas?

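Try changing your strategy.

Look at the non-whitespace characters, and when you find a quoted string, put the whole thing into a single string value.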

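So you need a function that examines the characters between whitespace. When you find a '"', you change the rules and hoover up everything until the matching '"'. If this function returns a token type and a value (the matched string), then whoever calls it can decide how to produce the right output. At that point you have written a tokeniser. Tools exist to generate these, called "lexers", because they are widely used for implementing programming languages and config files.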

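Assume nextc reads the next character from the string, starting with firstc(str).

You need to define return values for EOS, QUOTE and WORD, and a way to get at the text of each QUOTE or WORD.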

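You need to tokenize twice. The program flow you currently have looks like this:

1) Search for a space

2) Print all characters before the space

3) Search for the next space

4) Print all characters between the previous space and this one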

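You need to start thinking about it differently, as two layers of tokenization:

Search for quotes

On odd-numbered strings, run your original program (search for spaces)

On even-numbered strings, print blindly

In this scheme the even-numbered strings are (ideally) the ones inside quotes. ab"cd"ef gives ab as odd, cd as even, and so on.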

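On the other hand, keep in mind what you actually need: what you are looking for (as a regular expression) is "[a-zA-Z0-9\t\n]*" or [a-zA-Z0-9]+. In other words, the only difference between the two alternatives is whether the text is delimited by quotes. So split on quotes first and classify the pieces from there.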

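Here is working code... in C

The idea is to tokenize on quotes first, because that takes priority (if a string is inside quotes we do not tokenize it, we just print it). Each of the resulting strings is then tokenized on the space character, but only every other one, because the strings alternate between being outside and inside quotes.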

#include <stdio.h>
#include <string.h>
#include <stdbool.h>

int main(void) {
  char *pch1, *pch2, *save_ptr1, *save_ptr2;
  char str[] = "hello \"Stack Overflow\" good luck!";
  // strtok_r skips leading delimiters, so if the input starts with a quote
  // the first chunk is already quoted text.
  bool in = (str[0] == '"');
  // Outer pass: split on quotes; chunks alternate between outside and inside.
  pch1 = strtok_r(str, "\"", &save_ptr1);
  while (pch1 != NULL) {
    if (in) {
      // Inside quotes: print the chunk untouched.
      printf("%s\n", pch1);
    } else {
      // Outside quotes: split the chunk on spaces.
      pch2 = strtok_r(pch1, " ", &save_ptr2);
      while (pch2 != NULL) {
        printf("%s\n", pch2);
        pch2 = strtok_r(NULL, " ", &save_ptr2);
      }
    }
    in = !in;
    pch1 = strtok_r(NULL, "\"", &save_ptr1);
  }
  return 0;
}

Here it is in C++. I'm sure it could be written more elegantly, but it works and it's a start:

#include <iostream>
#include <stdexcept>
#include <vector>
#include <string>

using namespace std;

using Tokens = vector<string>;


Tokens split(string const & sentence) {
  Tokens tokens;
  // indexes to split on
  string::size_type from = 0, to;

  // true if we are inside quotes: we don't split by spaces and we expect a closing quote
  // false otherwise
  bool in_quotes = false;

  while (true) {
    // compute to index
    if (!in_quotes) {
      // find next space or quote
      to = sentence.find_first_of(" \"", from);
      if (to != string::npos && sentence[to] == '\"') {
        // we found an opening quote
        in_quotes = true;
      }
    } else {
      // find next quote (ignoring spaces)
      to = sentence.find('\"', from);
      if (to == string::npos) {
        // no enclosing quote found, invalid string
        throw invalid_argument("missing enclosing quotes");
      }
      in_quotes = false;
    }
    // skip empty tokens
    if (from != to) {
      // get token
      if (to == string::npos) {
        // last token; skip it if the sentence ended on a delimiter
        if (from < sentence.size()) {
          tokens.push_back(sentence.substr(from));
        }
        break;
      }
      tokens.push_back(sentence.substr(from, to - from));
    }
    // move from index
    from = to + 1;
  }
  return tokens;
}

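You are splitting on spaces, and you get exactly what you asked for. A quote is just another character in the string. You have to do the parsing yourself: check whether you are between an opening " and a closing ".

Would it be OK if I wrote a C++ implementation for you? I'm too lazy to write this in C.

@Happington Oh! I've finished editing the question! There were some misunderstandings in how I explained my problem. Anyway, it's interesting to see C++.

strtok is a bit broken; for example, it doesn't handle empty fields.
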
A small driver that runs split() from the C++ listing above on a few sentences, followed by the output it prints:

void splitAndPrint(string const & sentence) {
  Tokens tokens;
  cout << "-------------" << endl;
  cout << sentence << endl;
  try {
    tokens = split(sentence);
  } catch (exception &e) {
    cout << e.what() << endl;
    return;
  }
  for (const auto &token : tokens) {
    cout << token << endl;
  }
  cout << endl;
}

int main() {
  splitAndPrint("hello \"Stack Overflow\" good luck!");
  splitAndPrint("hello \"Stack Overflow\" good luck from \"User Name\"");
  splitAndPrint("hello and good luck!");
  splitAndPrint("hello and \" good luck!");

  return 0;
}

-------------
hello "Stack Overflow" good luck!
hello
Stack Overflow
good
luck!

-------------
hello "Stack Overflow" good luck from "User Name"
hello
Stack Overflow
good
luck
from
User Name

-------------
hello and good luck!
hello
and
good
luck!

-------------
hello and " good luck!
missing enclosing quotes