C++: using a regular expression to find a substring

c++ regex

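I have a string test, and I want to find part of it (as a substring).

After searching with Dr. Google, I have:

regex e(""); cmatch cm;

to mark the substring I want to find. What should I do next? Am I right to use

regex_match(htmlString, cm, e);

with htmlString encoded as a wchar_t*?

If you want to find all the matching substrings, you need to use regex iterators: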
#include <iostream>
#include <regex>
#include <string>

// example data
std::wstring const html = LR"(

<td><a href="4.%20Functions,%20scope.ppt">4. Functions, scope.ppt</a></td>
<td><a href="4.%20Functions,%20scope.ppt">4. Functions, scope.ppt</a></td>
<td><a href="4.%20Functions,%20scope.ppt">4. Functions, scope.ppt</a></td>

)";

// for convenience
constexpr auto fast_n_loose = std::regex_constants::optimize|std::regex_constants::icase;

// extract href's
std::wregex const e_link{LR"~(href=(["'])(.*?)\1)~", fast_n_loose};

int main()
{
    // regex iterators       
    std::wsregex_iterator itr_end;
    std::wsregex_iterator itr{std::begin(html), std::end(html), e_link};

    // iterate through the matches
    for(; itr != itr_end; ++itr)
    {
        std::wcout << itr->str(2) << L'\n';
    }
}
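
On the regex_match part of the question: std::regex_match only succeeds when the pattern covers the whole input, while std::regex_search finds a match anywhere inside it, which is usually what a substring lookup needs. The following is a minimal sketch of that difference, reusing the href pattern from the code above. The helper names href_pattern, first_href and is_exactly_href are hypothetical and exist only for this illustration.

```cpp
#include <regex>
#include <string>

// Same pattern as e_link above: group 1 is the quote, group 2 the href value.
// href_pattern, first_href and is_exactly_href are hypothetical helper names.
std::wregex const& href_pattern()
{
    static std::wregex const e(LR"~(href=(["'])(.*?)\1)~",
                               std::regex_constants::icase);
    return e;
}

// Returns the first href value found anywhere in html,
// or an empty string when there is no match.
std::wstring first_href(std::wstring const& html)
{
    std::wsmatch m;
    // regex_search accepts a match that starts and ends anywhere in the input
    if (std::regex_search(html, m, href_pattern()))
        return m.str(2);
    return {};
}

// Returns true only if the pattern matches the entire input.
bool is_exactly_href(std::wstring const& text)
{
    // regex_match fails as soon as there is text before or after the match
    return std::regex_match(text, href_pattern());
}
```

Calling first_href on a whole table row returns the attribute value, whereas is_exactly_href returns false for the same row because of the surrounding markup.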

This will match the complete a tag and capture the href attribute value in the second group.

It should be done this way because the href attribute can appear anywhere in the tag.

<a(?=(?:[^>"']|"[^"]*"|'[^']*')*?\shref\s*=\s*(?:(['"])([\S\s]*?)\1))\s+(?:"[\S\s]*?"|'[\S\s]*?'|[^>]*?)+>
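What do you mean by "what should I do next"? Do you want to find all the matching substrings?
Ask Dr. Google for a solution.
Why not use char instead of wchar_t?
To match an exact string, you don't need a regular expression.
+1. If I understand the question, this seems to be exactly what the OP wants.
Can I use string instead of wstring?
@BUICHAUMinhTung Your question mentions wchar_t for the data, so you do need to use std::wstring. But yes, you can do all of this with std::string, std::regex and std::sregex_iterator if you don't need to deal with multibyte characters.
@BUICHAUMinhTung If your source data is UTF-8 in a std::string, you will need to convert it to wide-character Unicode, as in this answer's example. Conversion functions can be found in a separate answer.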
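
The narrow-string variant mentioned in the comments can be sketched as follows, assuming the input contains no multibyte characters that need decoding. It uses the same href pattern as the wide-character answer. The helper name all_hrefs is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collects every href value in html using std::string, std::regex and
// std::sregex_iterator. all_hrefs is a hypothetical helper name.
std::vector<std::string> all_hrefs(std::string const& html)
{
    // group 1 is the opening quote, group 2 the attribute value
    static std::regex const link(R"~(href=(["'])(.*?)\1)~",
                                 std::regex_constants::icase);

    std::vector<std::string> hrefs;
    std::sregex_iterator const end;  // a default-constructed iterator marks the end
    for (std::sregex_iterator it(html.begin(), html.end(), link); it != end; ++it)
        hrefs.push_back(it->str(2));
    return hrefs;
}
```

The pattern operates on individual char values, so UTF-8 input is treated as a plain byte sequence rather than as decoded characters.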
Expanded form of the tag-matching pattern, with comments:

 < a                    # a tag, substitute [\w:]+ for any tag

 (?=                    # Assertion (a pseudo atomic group)
      (?: [^>"'] | " [^"]* " | ' [^']* ' )*?
      \s href \s* = \s* 
      (?:
           ( ['"] )               # (1), Quote
           ( [\S\s]*? )           # (2), href value
           \1 
      )
 )
 \s+ 
 (?: " [\S\s]*? " | ' [\S\s]*? ' | [^>]*? )+
 >
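
The expanded layout above is for reading only: std::regex has no free-spacing mode, so the whitespace and comments have to be removed before the pattern is compiled. A minimal sketch of using it with std::wregex and a regex iterator follows. The helper name tag_hrefs is hypothetical, and the icase flag is an assumption carried over from the first answer.

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns the href value of every <a ...> tag in html, wherever the
// attribute sits inside the tag. tag_hrefs is a hypothetical helper name.
std::vector<std::wstring> tag_hrefs(std::wstring const& html)
{
    // The expanded pattern with whitespace and comments stripped.
    // Group 1 is the quote character, group 2 the href value; both are
    // captured inside the lookahead and remain available after the match.
    static std::wregex const e_tag(
        LR"~(<a(?=(?:[^>"']|"[^"]*"|'[^']*')*?\shref\s*=\s*(?:(['"])([\S\s]*?)\1))\s+(?:"[\S\s]*?"|'[\S\s]*?'|[^>]*?)+>)~",
        std::regex_constants::icase);

    std::vector<std::wstring> hrefs;
    std::wsregex_iterator const end;
    for (std::wsregex_iterator it(html.begin(), html.end(), e_tag); it != end; ++it)
        hrefs.push_back(it->str(2));
    return hrefs;
}
```

Each match covers the whole opening tag, so it->str() gives the tag itself and it->str(2) gives only the attribute value.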