Most elegant and efficient way to remove disallowed characters from a C++ string

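I am using C++11 and would like to know the most elegant way to process an existing C++ string so that it contains only the valid characters listed below. Efficiency is also a concern, but elegance matters most.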

c++ string c++11 text


"0123456789abcdefghijklmnopqrstuvxyzABCDEFGHIJKLMNOPQRSTUVXYZ-"

Thanks, Virgil.

I would go with:

void removeDisallowed(std::string& in) {
    static const std::string allowed = "01234...";
    in.erase(
        std::remove_if(in.begin(), in.end(), [&](const char c) {
            return allowed.find(c) == std::string::npos;
        }),
        in.end());
}

If you want more efficiency, you can build a set:

std::unordered_set<char> allowedSet(allowed.begin(), allowed.end());
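
To make the set suggestion concrete, here is a minimal sketch of the same erase/remove_if approach with the predicate backed by a std::unordered_set. The function name remove_disallowed_with_set and the allowed parameter are assumptions for illustration, not part of the original answer. Note that the benchmark figures quoted further down suggest a hash set is not automatically faster than std::string::find for an allowed set this small.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>

// Hypothetical helper: keeps only the characters of `in` that occur in `allowed`.
void remove_disallowed_with_set(std::string& in, const std::string& allowed) {
    // The set is rebuilt on every call here; a caller filtering many strings
    // would probably build it once and reuse it.
    const std::unordered_set<char> allowed_set(allowed.begin(), allowed.end());
    in.erase(
        std::remove_if(in.begin(), in.end(),
                       [&allowed_set](char c) { return allowed_set.count(c) == 0; }),
        in.end());
}
```

Calling it with the input "a+b c-1!" and the allowed characters "abc1-" leaves "abc-1" in the string.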

[Update] Based on the many good comments and answers, I would suggest writing an erase_if helper:

template <typename F>
void erase_if(std::string& in, F func) {
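    // Pass in.end() as the second argument; the single-iterator overload of
    // erase() would remove only one character.
    in.erase(std::remove_if(in.begin(), in.end(), func), in.end());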
}


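Then actually run it with each of the suggested func predicates and see which one works best for your use case. Dietmar's answer does not fit this pattern, so you will have to try it separately, but they are probably all worth a try.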

The most elegant approach seems to be a regular expression (note the enclosing square brackets):
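
The regex call itself did not survive in this copy of the answer, but the benchmark description further down quotes it as std::regex_replace(text, std::regex("[^" + allowed + "]"), ""). A minimal sketch along those lines follows. The helper name keep_allowed_regex is hypothetical, and the sketch assumes that allowed contains no characters with special meaning inside a bracket expression (such as ], ^ or a backslash). A hyphen is taken literally only in certain positions, for example at the very end.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns a copy of `text` without any character that is
// not listed in `allowed`. `allowed` is pasted into a negated bracket
// expression as-is, so it must be safe to use there (assumption).
std::string keep_allowed_regex(const std::string& text, const std::string& allowed) {
    const std::regex filter("[^" + allowed + "]");
    return std::regex_replace(text, filter, "");
}
```

When many strings are filtered, constructing the std::regex once and reusing it avoids rebuilding it on every call.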

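Based on the comments about performance, I put together a quick benchmark and checked it in. It compares some of the suggestions. Below is a summary of the results from a run on a MacOS notebook with recent compiler versions and high optimization options. The numbers shown are the time taken to process a lengthy text document, in μs: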

benchmark                         gcc      clang
regex (build)                    186131    552697
regex (prebuild)                 177959    566353
use_remove_if_str_find            44802     40644
use_remove_if_find                88377    123237
use_remove_if_binary_search       54091     64065
use_remove_if_ctype               13818     12901
use_remove_if_hash                81341     58582
use_remove_if_table                9033     10203

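The first two benchmarks use the regex approach posted above, while the others implement the predicate inside a lambda in different ways. To clarify the names, here is an outline of what each one does (inside a lambda, combined with erase() etc. as needed):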

regex (build): text = std::regex_replace(text, std::regex("[^" + allowed + "]"), "")
regex (prebuild): text = std::regex_replace(text, filter, "") (constructing the regex is outside the timed section)
remove_if str find: std::remove_if(... a.find(c))
remove_if find: std::remove_if(... std::find(a.begin(), a.end(), c) == a.end())
remove_if binary search: std::remove_if(... std::binary_search(a.begin(), a.end(), c))
remove_if ctype: std::remove_if(... isalnum(c) || c == '-' || c == '_')
remove_if hash: std::remove_if(... unordered_set.count(c))
remove_if table: std::remove_if(... table[c])

For details, see the benchmark source.
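
The benchmark source is not reproduced in this copy, so the following is only a rough sketch of how one of the variants above could be timed, not the harness that produced the table. The names time_filter_us and filter_ctype are hypothetical, and the choice of '-' and '_' as extra allowed characters is an assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <chrono>
#include <string>

// Hypothetical helper: runs `filter(text)` once and returns the elapsed
// wall-clock time in microseconds.
template <typename F>
long long time_filter_us(std::string& text, F filter) {
    const auto start = std::chrono::steady_clock::now();
    filter(text);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// One candidate to measure: a ctype-style predicate in the erase/remove_if idiom.
void filter_ctype(std::string& s) {
    s.erase(std::remove_if(s.begin(), s.end(),
                           [](char c) {
                               // std::isalnum requires a value representable
                               // as unsigned char.
                               const unsigned char uc = static_cast<unsigned char>(c);
                               return !(std::isalnum(uc) || c == '-' || c == '_');
                           }),
            s.end());
}
```

A single run on a short string is too noisy to be meaningful. For numbers comparable to the table, one would filter a long document, repeat the run several times, and compile with optimizations enabled.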
This may be simplistic, but I would consider using a constant-time lookup table that fits in a few cache lines.
void remove_disallowed(std::string &str)
{
    static const char disallowed[] = {
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1,
        1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,
        1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
    };
    str.erase(std::remove_if(str.begin(), str.end(), [&](char c) {
        return disallowed[static_cast<unsigned char>(c)];
    }), str.end());
}

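Comments:

Hint: there is an algorithm for that. Sounds like while ((i = s.find_first_not_of(...)) != npos) s.erase(i, 1) would do.

For long strings with possibly many matches, the most efficient solution is almost certainly not the most elegant one. It is probably copying the characters to keep into a new string and swapping the strings at the end, since erasing characters one by one causes many calls to memmove (or its equivalent).

@Damon Any solution that calls erase more than once fails on both elegance and efficiency.

@Damon std::copy_if followed by swap should be fast, and the code is unlikely to get hairy.

Very nice :-) I'll give it a few hours to see what other answers come in before I accept :)

The predicate (isalnum(c) || c == '_' || c == '-') would probably be faster, but it is less generic. (Oh, and it is locale-aware.)

find -> binary_search if allowed is sorted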
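
The std::copy_if followed by swap idea from the comments could look roughly like the sketch below. The helper name keep_if_by_copy is hypothetical, and its predicate returns true for characters to keep, which is the opposite sense of the remove_if predicates used elsewhere in this thread. Note that the erase/remove_if idiom with a single erase call is also a single pass over the string, so whether the copy is actually faster is something to measure rather than assume.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hypothetical helper: `keep` returns true for characters that should stay.
template <typename Pred>
void keep_if_by_copy(std::string& s, Pred keep) {
    std::string out;
    out.reserve(s.size());  // the result is never longer than the input
    std::copy_if(s.begin(), s.end(), std::back_inserter(out), keep);
    s.swap(out);  // constant-time exchange of the two buffers
}
```

For example, passing a lambda that rejects '+', ' ' and '!' turns "a+b c-1!" into "abc-1".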

#include <array>
#include <string>
#include <limits>
#include <iostream>
#include <algorithm>
#include <unordered_set>

void keep_chars_in_set(std::string &s, const std::unordered_set<char> &chars) {
    s.erase(
        std::remove_if(s.begin(), s.end(), [&chars](const char c) {
            return !chars.count(c);
        }),
        s.end());
}

void keep_sorted_chars(std::string &s, const std::string &sorted_chars) {
    s.erase(
        std::remove_if(s.begin(), s.end(), [&sorted_chars](const char c) {
            return !std::binary_search(sorted_chars.begin(), sorted_chars.end(), c);
        }),
        s.end());
}

// One entry per possible unsigned char value (0..255), hence max() + 1.
using lookup_table = std::array<bool, std::numeric_limits<unsigned char>::max() + 1>;

lookup_table make_lookup_table(const std::string &s) {
    lookup_table t = {};
    for (auto c : s) {
        // Index via unsigned char so that negative char values stay in range.
        t[static_cast<unsigned char>(c)] = true;
    }
    return t;
}

void keep_chars_in_lookup_table(std::string &s, const lookup_table &table) {
    s.erase(
        std::remove_if(s.begin(), s.end(), [&table](const char c) {
            return !table[static_cast<unsigned char>(c)];
        }),
        s.end());
}

int main() {
    using namespace std;

    string s1 = "abcdefxabc";
    string s2 = "abcdefyabc";
    string s3 = "abcdefzabc";

    const unordered_set<char> set_of_chars = {'a', 'b', 'c', 'd', 'e', 'f'};
    keep_chars_in_set(s1, set_of_chars);
    cout << s1 << endl;

    keep_sorted_chars(s2, "abcdef");
    cout << s2 << endl;

    const lookup_table &char_lookup_table = make_lookup_table("abcdef");
    keep_chars_in_lookup_table(s3, char_lookup_table);
    cout << s3 << endl;
}
