C: How to make strstr efficient so that it does not capture unwanted substrings


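For example, if the string is "Only for Geeks" and I am looking only for the substring "Geek", not "Geeks", then the search should report that the word is not present. That is, strstr("Only for Geeks", "Geek") should be NULL.

How can I solve this kind of problem?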

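You have to handle it by wrapping strstr() in a function, perhaps str_word(), which does the extra checking after finding the word. (The name deliberately avoids str followed directly by a lowercase letter, since C reserves such names for future library functions.) Or, at least, that is probably the most sensible way to deal with it.

Padding the search string with spaces does not work. Leading padding would stop the code finding "Geek" or "(Geek is not pejorative)"; trailing padding would stop it finding "Ozymandias is a Geek". If you want to go overboard, you could consider a powerful regular expression library, but that is overkill for this task (and the POSIX one is not powerful enough; it does not recognize word boundaries). Note that the function finds Geek in "Ozymandias is such a Geeky Geek".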

Note the attempt at const-correctness in this function. You could easily use:

const char *str_word(const char *haystack, const char *needle);

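However, if you are passed a const char *, you cannot return a non-const char * without a cast somewhere along the line to remove the constness. Returning a const char * imposes constness on the calling code. This matters in a case such as: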

char *word = str_word(line, "Geek");

Here you have a modifiable array containing a line of input; you want to search that line for the word and get back a non-const pointer.

Test code:

#include <ctype.h>
#include <stdio.h>
#include <string.h>

extern char *str_word(char *haystack, const char *needle);

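/*
 * Find needle in haystack as a whole word: the match must not have a
 * letter immediately before or after it. Returns NULL if there is no
 * such match.
 */
char *str_word(char *haystack, const char *needle)
{
    char *from = haystack;
    size_t length = strlen(needle);
    char *found;
    while ((found = strstr(from, needle)) != NULL)
    {
        /* Reject a match glued to a preceding letter, e.g. "aGeek". */
        if (found > haystack && isalpha((unsigned char)found[-1]))
            from = found + length;  /* resume after the rejected match */
        /* Reject a match followed by a letter, e.g. "Geeky". */
        else if (isalpha((unsigned char)found[length]))
            from = found + length;  /* resume after the rejected match */
        else
            return found;
    }
    return NULL;
}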

int main(void)
{
    const char search[] = "Geek";
    char haystacks[][64] =
    {
        "Geek",
        "(Geek is not pejorative)",
        "Ozymandias is a Geek",
        "Ozymandias is such a Geeky Geek",
        "No prizes for Geekiness",
        "Only for Geeky people",
        "Howling 'Geek' gets you nowhere",
        "A Geek is a human",
        "Geeky people run the tech world",
    };
    enum { NUM_HAYSTACKS = sizeof(haystacks) / sizeof(haystacks[0]) };

    for (int i = 0; i < NUM_HAYSTACKS; i++)
    {
        char *word = str_word(haystacks[i], search);
        if (word == NULL)
            printf("Did not find '%s' in [%s]\n", search, haystacks[i]);
        else
            printf("Found '%s' at [%s] in [%s]\n", search, word, haystacks[i]);
    }

    return 0;
}

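Do you need to check for your "Geek" with surrounding spaces?

p = strstr(...); if (p && isalpha((unsigned char)p[4])) /* nope */

@MichaelBianconi White space: ' ', '\t', '\n', …

Although you have been a member of SO for a while, you have not yet accepted any answer to any of your questions. The preferred way to say "thanks" here is to up-vote good questions and helpful answers (once you have enough reputation to do so), and to accept the most helpful answer to any question you ask (which also gives your reputation a small boost).

You could also add an assertion such as assert(length > 0 && isalpha((unsigned char)needle[0]) && isalpha((unsigned char)needle[length-1])); at the start of the function. If those conditions are not met, the results could be, well, if not erroneous, at least unexpected. You could also get creative by passing a function pointer; then you could use str_match("01 011010 01", "011010", isdigit) to find numbers as well as str_match("Ozymandias is a Geek", "Geek", isalpha). Have fun.

Can you explain what this code does: if (found > haystack && isalpha((unsigned char)found[-1]))? What does found[-1] mean? Is it valid in C? I have seen this syntax in Python, where it refers to characters in reverse order.

The parameter names assume you are familiar with the phrase "looking for a needle in a haystack". The test found > haystack checks that the needle was not found at the very start of the haystack (the string being searched). This ensures that the index found[-1] refers to part of the string, rather than to a character before the start of the string. Assuming we are not at the start of the haystack, isalpha((unsigned char)found[-1]) checks whether the character before the match is a letter, as it would be in something like "aGeek". If it is, this is not the word being sought, and the code skips the mismatch. Advancing from past the rejected match lets the "Geek" at the end of "Geeky Geek" be found on a later pass. If the character before the needle in the haystack is not a letter, the code checks whether the character after the needle is a letter; if it is, the mismatch is skipped too. If neither the character before the needle (if there is one) nor the character after it is a letter, the word has been found.

The cast (unsigned char) ensures that the correct value is passed to isalpha() even if plain char is a signed type and the character in found[-1] or found[length] is negative. All the isxyz() functions in <ctype.h> require a valid argument: an unsigned char converted to int, or EOF.

@poorniam: and, in case it is not clear, found[-1] means "the character before the one that found points to", which has nothing to do with the Python notation. In C, a[x] == *(a + x), so found[-1] == *(found - 1).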

Output from the test code in the answer:

Found 'Geek' at [Geek] in [Geek]
Found 'Geek' at [Geek is not pejorative)] in [(Geek is not pejorative)]
Found 'Geek' at [Geek] in [Ozymandias is a Geek]
Found 'Geek' at [Geek] in [Ozymandias is such a Geeky Geek]
Did not find 'Geek' in [No prizes for Geekiness]
Did not find 'Geek' in [Only for Geeky people]
Found 'Geek' at [Geek' gets you nowhere] in [Howling 'Geek' gets you nowhere]
Found 'Geek' at [Geek is a human] in [A Geek is a human]
Did not find 'Geek' in [Geeky people run the tech world]