Using double hashing in C to resolve string collisions?

c hash

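I have an input file with about 10k strings. I need to count the number of collisions using open addressing with double hashing. However, the code goes into an infinite loop, i.e. it cannot find any empty slot to fill at all! This happens for inputs larger than 1000.

According to the problem I was given, we are supposed to use this. First hash: sum of the ASCII values. Second hash: division, i.e. mod sizeofPrime.

I think this has something to do with my hash functions.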

I have created an array of structs to store the strings, using the following:

struct table{
    char word[50];
};

Then I create the array as follows:

struct table*array=(struct table*)malloc(10000*sizeof(struct table));

Here are the two hash functions I use:

int hash1(char* name) //This function calculates the ascii value
{
    int index=0;int i=0;
    for(;name[i]!='\0';i++)
    {
        index+=name[i];
    }
    return index;

}

int hash2(int index) //Calculates offset (mod with a large prime,Why do we do this?)
{
    int offset=index%1021;
    return offset;
}

Here is my insert function:

insert(char* name)
{
    int offset=0,index=0;
    index=hash1(name);//calculate the index from string
    index=index % 10000;//to keep index between 0-10k range
    while((array[index].word[0]!='\0')) // checking if space in array empty
    {
        collision+=1;    //global counter
        offset=hash2(index); //offset
        index=index+offset;
        index=index%10000;
    }
    strcpy(array[index].word,name);
    return array;

}

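How can I improve the hash functions so that every one of the 10k positions in the array can be reached? Or should I change them completely?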

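If the second hash is just a function of the first hash, then it is not double hashing at all. The second hash must be independent. The first hash is also poor, but it may be good enough for this simple purpose. Try something like FNV. Oh, and make sure your second hash never evaluates to 0; it is even better if it is relatively prime to your table size.

How do I change my second function? I am not allowed to use other hashing algorithms. I have updated the question. Yes, I will make sure the second function does not return 0.

If the first hash is part of the assignment, then you are stuck with it. It will cause short strings to cluster at the low end of the table. You could make the second hash the sum of the first 8 characters of the string, modulo a very small prime such as 97, plus 1. Also, you say the input has about 10,000 strings. So why does the table have only 10,000 entries? Why not give it some room, like 12,000 or 15,000?

OK, thanks, I will do that.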