Search a string for any word that appears in a list of strings

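I want to know how, in C++, to search a string for the first instance of any one of a list of strings. For comparison, std::string::find_first_of() does this for single characters: "Searches the string for the first character that matches any of the characters specified in its arguments."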

c++ arrays string


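I want something that searches a string for the first word that matches any of the words in a supplied list/array. To be clear, I do not want to search an array for an instance of a string. I want to search a string for an instance of something in the array.

My goal is to be able to take a sentence and remove all of the words that are in a list. For example, if I give it the list

{"the", "brown", "over"}
and the sentence "The quick brown fox jumped over the lazy dog",
I want it to output "quick fox jumped lazy dog".
I also want to be able to give it a list of, say, 100 words if I like; this needs to be expandable.
The only solution I can think of is to call std::find(stringArray[0]) in a while loop over my block of text, saving the index at which the word is found, and then to wrap all of that in another for loop that does the same for every word in the array, saving every index into one huge list. I would then sort the list numerically, and finally walk through it and remove the word at each recorded position.
I am really hoping there is a function or a simpler way to do this, because my solution seems difficult and VERY slow, especially since I need to use it many times on many different strings, to go through all the sentences of a 50,000-character block of text. Anything better optimized would be preferred.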
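If you are looking for standard functions, there are some possibilities, provided you dare to store your sentence as a container of strings:
string input="Hello, world ! I wish you all \na happy new year 2016 !";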
vector<string> sentence; 

stringstream sst(input);    // split the string into its pieces 
string tmp; 
while (sst>>tmp) 
    sentence.push_back(tmp); 

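Removing words from the vector should then no longer be a problem. I leave that to you as an exercise.
Once the vector is cleaned, you can rebuild the string: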
stringstream so;
copy(sentence.begin(), sentence.end(), ostream_iterator<string>(so," ")); // copy every remaining word, each followed by a space
string result = so.str(); 
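The removal step that is left as an exercise above could look roughly like the following sketch. It is only one possible way to do it: the names remove_listed_words and stop_words are made up for illustration, matching is case-sensitive, and punctuation stays glued to its word (so "dog." would not match "dog").

```cpp
#include <algorithm>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: drops every whitespace-separated word of `input`
// that appears in `stop_words`, then joins the survivors with single spaces.
std::string remove_listed_words(const std::string& input,
                                const std::set<std::string>& stop_words) {
    // Split the input into whitespace-separated words.
    std::vector<std::string> sentence;
    std::istringstream in(input);
    std::string word;
    while (in >> word)
        sentence.push_back(word);

    // Erase-remove idiom: remove_if moves the kept words to the front,
    // erase then trims the leftover tail.
    sentence.erase(
        std::remove_if(sentence.begin(), sentence.end(),
                       [&](const std::string& w) { return stop_words.count(w) != 0; }),
        sentence.end());

    // Rebuild the string with a space between words, but none after the last.
    std::string result;
    for (std::size_t i = 0; i < sentence.size(); ++i) {
        if (i != 0) result += ' ';
        result += sentence[i];
    }
    return result;
}
```

Called with the list {"the", "brown", "over"} and the all-lowercase sentence "the quick brown fox jumped over the lazy dog", this returns "quick fox jumped lazy dog". A capitalized "The" would be kept, because the comparison is exact.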

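However, this solution will not solve all of your performance problems. For that, you need to analyse further where the bottleneck comes from: are you making a lot of unnecessary copies of objects? Is your own algorithm triggering a lot of inefficient memory allocations? Or is it really just the sheer volume of text?
Some ideas for further work: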

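Build an alphabetical index of the words in the sentence (a map from each word to the unsigned positions at which it occurs)
Consider a trie data structure (trie, not tree!!)
Use regular expressions, as provided by the standard <regex> header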
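As an illustration of the regular-expression idea, here is a minimal sketch that builds one alternation pattern from the word list and deletes every whole-word match. The name remove_words_regex is invented for this example, and the code assumes that each word consists only of letters, digits, or underscores, so that nothing needs escaping. Whether this is faster than the other approaches would have to be measured.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: removes whole-word, case-insensitive occurrences of
// the listed words. Assumes the words contain no regex metacharacters.
std::string remove_words_regex(const std::string& text,
                               const std::vector<std::string>& words) {
    if (words.empty()) return text;

    // Build an alternation such as \b(?:the|brown|over)\b
    std::string pattern = "\\b(?:";
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (i != 0) pattern += '|';
        pattern += words[i];
    }
    pattern += ")\\b";

    const std::regex re(pattern, std::regex::ECMAScript | std::regex::icase);
    // Replace every match with the empty string; surrounding spaces remain.
    return std::regex_replace(text, re, "");
}
```

Because only the words themselves are deleted, the spaces around them stay in place, so the result can contain runs of several spaces. The \b anchors keep "the" from matching inside longer words such as "then".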
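One person's fast is another person's slow, so it is hard to say which kind of fast you mean, and 50,000 characters does not sound so large that one has to do anything extraordinary.
The only thing that should be avoided is manipulating the input string in place (that would lead to O(n^2) running time). Just return a new result string instead. It is probably wise to reserve enough memory for the result string, because this saves a constant factor for some inputs.
My proposal is as follows: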
#include <set>
#include <string>

std::string remove_words(const std::string &sentence, const std::set<std::string> &words2remove, const std::string &delimiters){

    std::string result;
    result.reserve(sentence.size());//the result is never longer than the input

    std::string lastDelimiter;//no delimiter so far...
    std::size_t cur_position=0;
    while(true){
      //next is npos when no further delimiter exists; substr then returns the rest
      std::size_t next=sentence.find_first_of(delimiters, cur_position);
      std::string token=sentence.substr(cur_position, next-cur_position);

      result+=lastDelimiter;
      if(words2remove.find(token)==words2remove.end())
         result+=token;//not forbidden

      if(next==std::string::npos)
        break;

      //prepare for the next iteration:
      lastDelimiter=sentence[next];
      cur_position=next+1;
    }

    return result;
}

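This approach uses a set rather than a list of forbidden words because lookup is faster. Any set of characters can be used as delimiters, for example " " or ",; ".
It runs in O(n*log(k)), where n is the number of characters in the sentence and k is the number of words in the forbidden set.
If you need a more flexible tokenizer and do not want to reinvent the wheel, you may want to look at an existing tokenizer library.
If the number of forbidden words is large, you could consider using std::unordered_set (C++11) instead of std::set, which reduces the expected running time of the algorithm to O(n).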
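To make the hash-set suggestion concrete, here is a sketch of the same tokenizing loop with std::unordered_set for the lookups. The name remove_words_hashed is hypothetical, and the loop is slightly rearranged so that each delimiter is appended as soon as it is found; the output should be the same as with the std::set version.

```cpp
#include <string>
#include <unordered_set>

// Sketch of the tokenizing loop with a hash set for the lookups.
// The name remove_words_hashed is hypothetical.
std::string remove_words_hashed(const std::string& sentence,
                                const std::unordered_set<std::string>& words2remove,
                                const std::string& delimiters) {
    std::string result;
    result.reserve(sentence.size());  // the result is never longer than the input

    std::size_t cur = 0;
    while (true) {
        // npos means no further delimiter; substr then yields the remainder.
        const std::size_t next = sentence.find_first_of(delimiters, cur);
        const std::string token = sentence.substr(cur, next - cur);

        if (words2remove.count(token) == 0)
            result += token;          // not forbidden, keep it

        if (next == std::string::npos)
            break;

        result += sentence[next];     // keep the delimiter itself
        cur = next + 1;
    }
    return result;
}
```

With the set {"the", "brown", "over"}, the delimiter string " ", and the sentence "the quick brown fox jumped over the lazy dog", this yields " quick  fox jumped   lazy dog". The delimiters around removed words are kept, which is why several spaces can appear in a row.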
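When I want to use or search a string, I look at the methods available in the std::string class. I found a function there that looks promising. Please edit your question to include the code you have tried. Then we can help you.
Thanks, I will try these suggestions. Very useful!
Thanks, this is very detailed and useful. I wish I could select more than one best answer...!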