Why does glibc's strlen need to be so complicated to run quickly?

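I was looking through the strlen code and wondering whether the optimizations used in it are really needed. For example, why wouldn't something like the following work equally well or better?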
unsigned long strlen(char s[]) {
    unsigned long i;
    for (i = 0; s[i] != '\0'; i++)
        continue;
    return i;
}

Isn't simpler code easier for the compiler to optimize?
The strlen code on the page behind the link looks like this:
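/* Copyright (C) 1991, 1993, 1997, 2000, 2003 Free Software Foundation, Inc.
   This file is part of the GNU C Library.
   Written by Torbjorn Granlund (tege@sics.se),
   with help from Dan Sahlin (dan@sics.se);
   commentary by Jim Blandy (jimb@ai.mit.edu).

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, write to the Free
   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
   02111-1307 USA.  */

#include <string.h>
#include <stdlib.h>

#undef strlen

/* Return the length of the null-terminated string STR.  Scan for
   the null terminator quickly by testing four bytes at a time.  */
size_t
strlen (str)
     const char *str;
{
  const char *char_ptr;
  const unsigned long int *longword_ptr;
  unsigned long int longword, magic_bits, himagic, lomagic;

  /* Handle the first few characters by reading one character at a time.
     Do this until CHAR_PTR is aligned on a longword boundary.  */
  for (char_ptr = str; ((unsigned long int) char_ptr
                        & (sizeof (longword) - 1)) != 0;
       ++char_ptr)
    if (*char_ptr == '\0')
      return char_ptr - str;

  /* All these elucidatory comments refer to 4-byte longwords,
     but the theory applies equally well to 8-byte longwords.  */

  longword_ptr = (unsigned long int *) char_ptr;

  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
     the "holes."  Note that there is a hole just to the left of
     each byte, with an extra at the end:

     bits:  01111110 11111110 11111110 11111111
     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD

     The 1-bits make sure that carries propagate to the next 0-bit.
     The 0-bits provide holes for carries to fall into.  */
  magic_bits = 0x7efefeffL;
  himagic = 0x80808080L;
  lomagic = 0x01010101L;
  if (sizeof (longword) > 4)
    {
      /* 64-bit version of the magic.  */
      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
      magic_bits = ((0x7efefefeL ...

The comments in the linked file explain this: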
/* Return the length of the null-terminated string STR.  Scan for
   the null terminator quickly by testing four bytes at a time.  */

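In C, it is possible to reason in detail about efficiency.
Iterating over individual characters looking for a null is less efficient than testing more than one byte at a time, as this code does.
The additional complexity comes from two needs. First, the string under test has to be aligned in the right place before testing more than one byte at a time can start (along a longword boundary, as described in the comments). Second, the assumptions about the sizes of the data types must not be violated when the code is used.
In most (but not all) modern software development, this attention to efficiency detail is unnecessary, or not worth the cost of the extra code complexity.
One place where it does make sense to pay attention to efficiency like this is a standard library, like the example you linked.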
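To make the word-at-a-time idea concrete, here is a minimal sketch. It is not glibc's code: the name `wordwise_strlen` and the way the two masks are derived are assumptions made for illustration, and it uses the common `(w - 0x01..01) & ~w & 0x80..80` zero-byte test. It assumes 8-bit bytes, and like the glibc version it may read a few bytes past the terminator inside the same aligned word, which ISO C does not sanction.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration, not the glibc implementation. */
size_t wordwise_strlen(const char *str)
{
    const char *p = str;

    /* Step 1: check single bytes until p sits on an unsigned long boundary. */
    while (((uintptr_t)p & (sizeof(unsigned long) - 1)) != 0) {
        if (*p == '\0')
            return (size_t)(p - str);
        ++p;
    }

    /* Masks with 0x01 and 0x80 repeated in every byte (assumes CHAR_BIT == 8). */
    const unsigned long lomagic = (unsigned long)-1 / 0xFF;
    const unsigned long himagic = lomagic << 7;

    /* Step 2: examine one whole word per iteration. */
    for (;;) {
        unsigned long w;
        memcpy(&w, p, sizeof w);   /* load the word without aliasing problems */
        /* Nonzero exactly when at least one byte of w is zero. */
        if (((w - lomagic) & ~w & himagic) != 0)
            break;
        p += sizeof w;
    }

    /* Step 3: the terminator is somewhere in this word; find it bytewise. */
    while (*p != '\0')
        ++p;
    return (size_t)(p - str);
}
```

Calling `wordwise_strlen("hello world")` should give the same result as the library `strlen`. The alignment loop in step 1 is what keeps each word load inside a single page, which is why the over-read tends to be harmless in practice even though the language does not permit it.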

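Word boundaries (data alignment) are worth reading up on if you want more background.

I also think another answer to this question gives a clearer and more detailed discussion.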
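You don't need to write code like that, and you never should, especially if you're not a C compiler or standard library vendor. It is code used to implement strlen with some very questionable speed hacks and assumptions (that are not tested with assertions or mentioned in the comments):

- unsigned long is either 4 or 8 bytes
- bytes are 8 bits
- a pointer can be cast to unsigned long rather than uintptr_t
- a pointer can be aligned simply by checking that its 2 or 3 lowest-order bits are zero
- a string can be accessed as unsigned longs
- it is possible to read past the end of the array without any ill effects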
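Some of the assumptions listed above can at least be checked at compile time. The following is a sketch using C11 `_Static_assert`; the helper name `is_ulong_aligned` is hypothetical. The last two assumptions (accessing a string through unsigned long and reading past the end of the array) cannot be expressed as assertions at all.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: bytes are 8 bits. */
_Static_assert(CHAR_BIT == 8, "bytes must be 8 bits wide");

/* Assumption: unsigned long is either 4 or 8 bytes. */
_Static_assert(sizeof(unsigned long) == 4 || sizeof(unsigned long) == 8,
               "unsigned long must be 4 or 8 bytes");

/* Assumption: a pointer fits in unsigned long. This fails on targets
   where long is 32 bits but pointers are 64 bits. */
_Static_assert(sizeof(unsigned long) >= sizeof(void *),
               "unsigned long cannot hold a pointer");

/* Hypothetical helper: the low-bits alignment test, going through uintptr_t.
   ISO C leaves the pointer-to-integer mapping implementation-defined, so
   this check is itself an assumption about the platform. */
int is_ulong_aligned(const void *p)
{
    return ((uintptr_t)p & (sizeof(unsigned long) - 1)) == 0;
}
```

If one of these assertions fails, the build stops with the given message instead of producing a strlen that silently misbehaves.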

What is more, a good compiler could even replace code written as
size_t stupid_strlen(const char s[]) {
    size_t i;
    for (i=0; s[i] != '\0'; i++)
        ;
    return i;
}

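(note that the index has to be a type compatible with size_t here) with an inlined version of the compiler's builtin strlen, or vectorize the loop; but a compiler is unlikely to be able to optimize the complex version.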

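The C standard describes the strlen function as follows:

Description
The strlen function computes the length of the string pointed to by s.

Returns
The strlen function returns the number of characters that precede the terminating null character.

Now, if the string pointed to by s is in an array of characters just long enough to contain the string and the terminating NUL, the behaviour is undefined if we access the array past the null terminator, for example in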
char *str = "hello world";  // or
char array[] = "hello world";

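So really the only way to implement this correctly in fully portable, standards-compliant C is the way it is written in your question, apart from trivial transformations.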

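Compiling the word-at-a-time implementation (renamed the_strlen here) with AddressSanitizer and running it on a 12-byte buffer shows the read past the end of the array:

#include <stdio.h>

size_t the_strlen(const char *str);  /* defined elsewhere: the word-at-a-time strlen under test */

int main(void) {
    char buf[12];
    printf("%zu\n", the_strlen(fgets(buf, 12, stdin)));
}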

% ./a.out
hello world
=================================================================
==8355==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffffe63a3f8 at pc 0x55fbec46ab6c bp 0x7ffffe63a350 sp 0x7ffffe63a340
READ of size 8 at 0x7ffffe63a3f8 thread T0
    #0 0x55fbec46ab6b in the_strlen (.../a.out+0x1b6b)
    #1 0x55fbec46b139 in main (.../a.out+0x2139)
    #2 0x7f4f0848fb96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #3 0x55fbec46a949 in _start (.../a.out+0x1949)

Address 0x7ffffe63a3f8 is located in stack of thread T0 at offset 40 in frame
    #0 0x55fbec46b07c in main (.../a.out+0x207c)

  This frame has 1 object(s):
    [32, 44) 'buf' <== Memory access at offset 40 partially overflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow (.../a.out+0x1b6b) in the_strlen
Shadow bytes around the buggy address:
  0x10007fcbf420: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fcbf430: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fcbf440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fcbf450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fcbf460: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x10007fcbf470: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00[04]
  0x10007fcbf480: f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fcbf490: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fcbf4a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fcbf4b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007fcbf4c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==8355==ABORTING

...
#include <string.h>

size_t
strlen(const char *str)
{
    const char *s;

    for (s = str; *s; ++s)
        ;
    return (s - str);
}

DEF_STRONG(strlen);

size_t strlen(const char *char_ptr)
{
  typedef unsigned long __attribute__((may_alias)) aliasing_ulong;

  // handle unaligned startup somehow, e.g. check for page crossing then check an unaligned word
  // else check single bytes until an alignment boundary.
  aliasing_ulong *longword_ptr = (aliasing_ulong *)char_ptr;

  for (;;) {
     // alignment still required, but can safely alias anything including a char[]
     unsigned long ulong = *longword_ptr++;

     ...
  }
}

   unsigned long longword;
   memcpy(&longword, char_ptr, sizeof(longword));
   char_ptr += sizeof(longword);
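The memcpy fragment above can be wrapped in a small helper. This is a sketch, and the name `load_ulong` is hypothetical. Optimizing compilers such as GCC and Clang typically turn a fixed-size memcpy like this into a single load instruction, so the helper usually costs nothing at run time.

```c
#include <string.h>

/* Hypothetical helper: read sizeof(unsigned long) bytes starting at p.
   Copying through memcpy avoids the strict-aliasing violation of
   dereferencing a cast pointer, and also works for unaligned p.
   It does not make reading past the end of an object valid. */
unsigned long load_ulong(const char *p)
{
    unsigned long w;
    memcpy(&w, p, sizeof w);
    return w;
}
```

A loop would then use `longword = load_ulong(char_ptr); char_ptr += sizeof(longword);` in place of dereferencing an `unsigned long` pointer.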