C++ hash function: is there a way to optimize my code further?

c++ hash

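Above is the hash function.

I wrote the code below. I'm not sure whether there is some other clever way to make it more efficient. My understanding is that I don't need to take a mod at all, because unsigned int takes care of that through overflow.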

int myHash(string s)
{
    unsigned int hash = 0;
    long long int multiplier = 1;
    for(int i = s.size()-1;i>-1;i--)
    {
        hash += (multiplier * s[i]);
        multiplier *= 31;
    }
    return hash;
}

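I think you could avoid copying the string on every function call by changing the parameter to const string& s, or by using std::string_view if you happen to be on C++17. Other than that it looks fast, and you should leave the rest to the compiler. Try optimizing it with -O2 or your compiler's equivalent.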
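As a minimal sketch of that suggestion, assuming C++17 is available, the function could take a std::string_view so that neither a std::string nor a string literal is copied at the call site. Two details here are my own assumptions rather than part of the answer: the multiplier is unsigned int instead of long long (a signed 64-bit multiplier would overflow once it reaches 31^13, i.e. for strings of 13 or more characters), and the return type is unsigned int so the value is not converted back to a signed int.

```cpp
#include <cstddef>
#include <string_view>

// Same polynomial as the question's loop, but taking a non-owning view.
// Assumption: characters are treated as unsigned bytes.
unsigned int myHash(std::string_view s)
{
    unsigned int hash = 0;
    unsigned int multiplier = 1;   // unsigned arithmetic wraps; it never overflows
    for (std::size_t i = s.size(); i-- > 0; )
    {
        hash += multiplier * static_cast<unsigned char>(s[i]);
        multiplier *= 31;
    }
    return hash;
}
```

A call such as myHash("abc") or myHash(someStdString) then works without constructing a temporary copy of the characters.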

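Let me preface this by saying that it may not be worth doing: a hash function is unlikely to be the bottleneck in your program, so making it more complicated in order to make it more efficient may only make it harder to understand and maintain, without making your program measurably faster. So don't do this unless you have determined that your program spends a significant fraction of its time computing string hashes, and make sure you have a good benchmark routine that you can run before and after the change to verify that it really does speed things up significantly. Otherwise you may just be chasing rainbows.

That said, one potential way to hash long strings faster is to process them a word at a time instead of a character at a time, like this: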

#include <cstddef>
#include <cstring>
#include <string>

// Note: this produces different hash values than the byte-at-a-time version.
unsigned int aSlightlyFasterHash(const std::string & s)
{
   const std::size_t numWordsInString      = s.size()/sizeof(unsigned int);
   const std::size_t numExtraBytesInString = s.size()%sizeof(unsigned int);

   // Compute the bulk of the hash by reading the string a word at a time.
   // std::memcpy is used rather than dereferencing a reinterpret_cast'ed pointer,
   // because the string's buffer is not guaranteed to be aligned for unsigned int.
   unsigned int hash = 0;
   const char * ptr = s.data();
   for (std::size_t i=0; i<numWordsInString; i++)
   {
      unsigned int word;
      std::memcpy(&word, ptr, sizeof(word));
      hash += word;
      ptr += sizeof(word);
   }

   // Then any "leftover" bytes at the end we will mix in to the hash the old way
   const unsigned char * cptr = reinterpret_cast<const unsigned char *>(ptr);
   unsigned int multiplier = 1;
   for (std::size_t i=0; i<numExtraBytesInString; i++)
   {
       hash += (multiplier * *cptr);
       cptr++;
       multiplier *= 31;
   }
   return hash;
}
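The "benchmark before and after" advice could be sketched roughly as follows. This is only an illustration under stated assumptions: averageSecondsPerCall and byteAtATimeHash are hypothetical names introduced here, not part of the answer, and a serious measurement would also need warm-up runs, realistic input data, and an optimized build.

```cpp
#include <chrono>
#include <cstdint>
#include <string>

// Hypothetical helper: calls hashFn(input) `iterations` times and returns the
// average wall-clock time per call in seconds. The results are accumulated
// into `sink` so the compiler cannot discard the calls as unused work.
template <typename HashFn>
double averageSecondsPerCall(HashFn hashFn, const std::string& input,
                             int iterations, std::uint32_t& sink)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        sink += hashFn(input);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count() / iterations;
}

// Byte-at-a-time polynomial hash (multiplier 31), used here only as a
// candidate to measure; any function with this signature can be passed.
std::uint32_t byteAtATimeHash(const std::string& s)
{
    std::uint32_t hash = 0;
    for (unsigned char c : s)
        hash = hash * 31u + c;
    return hash;
}
```

Running the helper once per candidate on the same long input, and comparing the two averages, gives a first indication of whether the more complicated version is worth keeping.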


I would avoid using long long for the multiplier, at least if you don't know 100% that your processor does a 64-bit multiplication in the same time as a 32-bit one. Really modern, top-end processors probably do; older and smaller processors almost certainly take longer for a 64-bit multiply than for a 32-bit one.

Even on processors that are not good at multiplication, multiplying by 31 can be very fast, because x *= 31 can be rewritten as x = x*32 - x; or x = (x << 5) - x;
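The identity behind that rewrite can be checked with a small sketch. The function name times31 is hypothetical, and a fixed-width std::uint32_t is assumed so that both forms wrap modulo 2^32 in the same way. An optimizing compiler will often perform this kind of strength reduction by itself, so writing it out by hand is mainly useful for seeing why multiplying by 31 is cheap.

```cpp
#include <cstdint>

// x * 31 == x * 32 - x == (x << 5) - x, with all arithmetic modulo 2^32.
constexpr std::uint32_t times31(std::uint32_t x)
{
    return (x << 5) - x;
}

// Compile-time spot checks of the identity, including a value that wraps.
static_assert(times31(1u) == 31u, "1 * 31");
static_assert(times31(1000u) == 31000u, "1000 * 31");
static_assert(times31(0xFFFFFFFFu) == 0xFFFFFFE1u, "wraps modulo 2^32");
```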
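If you link to an actual image, please link to the image itself rather than to the gallery.

Perhaps Code Review would be a better forum for this.

Note that unsigned int is not guaranteed to be exactly 32 bits wide (although it usually is). If you want to rely on the overflow behavior of a 32-bit unsigned variable, your code will be more portable if you use the type uint32_t (made available via an #include) instead of unsigned int.

I would say the answer is: we don't know. It depends on your compiler, the processor it compiles for, the data you are hashing, the implementation of std::string, and maybe other things. The only way to know is to measure. (But I agree with N00byEdge about const string& s.)

The answer also depends on your optimization criteria. Minimizing executable size or memory usage can be very different from minimizing CPU cycle count or a time-based metric. Worry first about whether the function works correctly (i.e. reliably produces the expected output for all feasible inputs). Then test (profile, benchmark, etc.) to gather evidence about its performance. Only then worry about "optimizing". At the moment (assuming your implementation works as intended), all you are doing is premature optimization.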
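To illustrate the comment about uint32_t, here is a minimal sketch (both function names are hypothetical). It assumes the fixed-width types from the <cstdint> header and compares a hash that relies on 32-bit wraparound with a reference that uses a 64-bit accumulator and an explicit reduction modulo 2^32 at every step. The two should agree for any input, which is the sense in which no explicit mod is needed.

```cpp
#include <cstdint>
#include <string_view>

// Relies on unsigned wraparound: std::uint32_t arithmetic is modulo 2^32.
std::uint32_t hash32(std::string_view s)
{
    std::uint32_t hash = 0;
    for (unsigned char c : s)
        hash = hash * 31u + c;
    return hash;
}

// Reference version: same polynomial, but with an explicit "mod 2^32".
std::uint32_t hash32Reference(std::string_view s)
{
    constexpr std::uint64_t modulus = std::uint64_t{1} << 32;
    std::uint64_t hash = 0;
    for (unsigned char c : s)
        hash = (hash * 31u + c) % modulus;   // stays well below 2^64
    return static_cast<std::uint32_t>(hash);
}
```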