How to create a hash table in Python using the given classes

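I'm currently working through my computer science assignment and have run into a problem on the last part, so I'm looking for some advice.

Using the following classes: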

class CounterLinkedList:
    __n_comparisons__ = 0

    def __init__(self, head=None):
        self.head = head
        self.__n_accesses__ = 0

    def __repr__(self):
        node = self.head
        string = str(node)
        while node.next_node:
            string += " -> " + str(node.next_node)
            node = node.next_node
        string = "[" + string + "]"
        return string

class MyString:
    '''A wrapped string that counts comparisons of itself
   against strings and delegates all other operations to the
   string itself.'''
    def __init__(self, i):
        self.i = i

    def __repr__(self):
        return repr(self.i)

    def __getattr__(self, attr):
        '''All other behaviours use self.i'''
        # str has no __getattr__ method; the built-in getattr() does the delegation.
        return getattr(self.i, attr)


class CounterNode:
    def __init__(self, word, count=1):
        self.word = MyString(word)
        self.count = count
        self.next_node = None

    def __repr__(self):
        return str(self.word) + ": " + str(self.count)


def _c_mul(a, b):
    """Substitute for c multiply function"""
    return ((int(a) * int(b)) & 0xFFFFFFFF)


def nice_hash(input_string):
    """Takes a string name and returns a hash for the string. This hash value
    will be os independent, unlike the default Python hash function."""
    if input_string is None:
        return 0  # empty
    value = ord(input_string[0]) << 7
    for char in input_string:
        value = _c_mul(1000003, value) ^ ord(char)
    value = value ^ len(input_string)
    if value == -1:
        value = -2
    return value


def hash_word(item, slots):
    return nice_hash(item) % slots
the output should be:

0: ['words': 1 -> 'no': 1 -> 'list': 1]
1: ['repeat': 1]
2: ['with': 1]
3 
My code so far is:

'''test'''
from classes_2 import CounterNode, CounterLinkedList, hash_word


def word_counter_hash(words_list, slots):
    """test"""
    hash_list = [None]*slots
    num_comparisons = 0
    for new_word in words_list:
        if len(words_list) >= 0:
            n = CounterNode(new_word, 1)
            new_list = CounterLinkedList(n)
            hash_value = hash_word(new_word, slots)
            if hash_list[hash_value] == None:
                del hash_list[hash_value]
                hash_list.insert(hash_value, new_list)
            else:
                first_node = new_list.head
                first_node.next_node = CounterNode(words_list[hash_value], 1)
                first_node = (new_list)
                del hash_list[hash_value]
                hash_list.insert(hash_value, new_list)

    return hash_list, num_comparisons  
However, my output differs from the above:

0: ['words': 1 -> 'list': 1]
1: ['repeat': 1]
2: ['with': 1]
0

I'm looking for any advice on how to get on the right track; any help would be much appreciated.

The first thing to note is:

nice_hash("list") % 3
#>>> 0

nice_hash("no") % 3
#>>> 0

nice_hash("words") % 3
#>>> 0
These all collide in the first bucket. So let's try removing every bucket except the first:

slots = 1
counts, comparisons = word_counter_hash(['a', 'b', 'c'], slots)
print(counts[0])
#>>> ['c': 1 -> 'a': 1]
This reproduces the problem. Good. Now we can replace hash_word:

def hash_word(item, slots):
    return 0
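It doesn't matter that this is a poor hash function; it still reproduces the problem. In fact, we don't need the function at all: we can simply hard-code the hash value 0.

With these simplifications, we get: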
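
To make the effect of a constant hash concrete, here is a minimal sketch. The names constant_hash and bucket_sizes are hypothetical helpers introduced only for illustration; they are not part of the assignment code.

```python
def constant_hash(item, slots):
    """Degenerate stand-in for hash_word: every item lands in slot 0."""
    return 0


def bucket_sizes(words, slots, hash_func):
    """Count how many words each slot would receive under hash_func."""
    sizes = [0] * slots
    for word in words:
        sizes[hash_func(word, slots)] += 1
    return sizes


# Every word collides in slot 0, so any chaining bug shows up immediately.
print(bucket_sizes(['a', 'b', 'c'], 3, constant_hash))
#>>> [3, 0, 0]
```

A constant hash is still a valid hash (equal words get equal slots); it just guarantees collisions, which is what we want while debugging.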

def word_counter_hash(words_list):
    hash_list = [None]

    for new_word in words_list:
        new_list = CounterLinkedList(CounterNode(new_word, 1))

        if hash_list[0] == None:
            del hash_list[0]
            hash_list.insert(0, new_list)

        else:
            first_node = new_list.head
            first_node.next_node = CounterNode(words_list[0], 1)
            first_node = (new_list)
            del hash_list[0]
            hash_list.insert(0, new_list)

    return hash_list

counts = word_counter_hash(['a', 'b', 'c'])
print(counts[0])
#>>> ['c': 1 -> 'a': 1]
Note that:

del hash_list[idx]
hash_list.insert(idx, X)
is just a really slow way of writing

hash_list[idx] = X
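
As a quick check of that equivalence, the sketch below compares both forms on a copy of a small list. The helper names replace_slow and replace_fast are made up for this illustration, and the comparison only covers non-negative, in-range indices (with negative indices the delete-then-insert form behaves differently).

```python
def replace_slow(items, idx, value):
    """Delete the element, then insert the new value at the same index."""
    result = list(items)
    del result[idx]
    result.insert(idx, value)
    return result


def replace_fast(items, idx, value):
    """Plain item assignment: same result without shifting elements twice."""
    result = list(items)
    result[idx] = value
    return result


slots = [None, 'x', None]
for idx in range(len(slots)):
    assert replace_slow(slots, idx, 'new') == replace_fast(slots, idx, 'new')

print(replace_fast(slots, 0, 'new'))
#>>> ['new', 'x', None]
```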
So we have

def word_counter_hash(words_list):
    hash_list = [None]

    for new_word in words_list:
        new_list = CounterLinkedList(CounterNode(new_word, 1))

        if hash_list[0] == None:
            hash_list[0] = new_list

        else:
            first_node = new_list.head
            first_node.next_node = CounterNode(words_list[0], 1)
            first_node = (new_list)
            hash_list[0] = new_list

    return hash_list

counts = word_counter_hash(['a', 'b', 'c'])
print(counts[0])
#>>> ['c': 1 -> 'a': 1]
This line does nothing:

first_node = (new_list)
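Since first_node is never read after that point, and before it is just another name for new_list.head, we can rewrite the code like this: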

def word_counter_hash(words_list):
    hash_list = [None]

    for new_word in words_list:
        new_list = CounterLinkedList(CounterNode(new_word, 1))

        if hash_list[0] == None:
            hash_list[0] = new_list

        else:
            new_list.head.next_node = CounterNode(words_list[0], 1)
            hash_list[0] = new_list

    return hash_list

counts = word_counter_hash(['a', 'b', 'c'])
print(counts[0])
#>>> ['c': 1 -> 'a': 1]
Then we can de-duplicate the hash_list[0] = new_list lines:

def word_counter_hash(words_list):
    hash_list = [None]

    for new_word in words_list:
        new_list = CounterLinkedList(CounterNode(new_word, 1))

        if hash_list[0] != None:
            new_list.head.next_node = CounterNode(words_list[0], 1)

        hash_list[0] = new_list

    return hash_list

counts = word_counter_hash(['a', 'b', 'c'])
print(counts[0])
#>>> ['c': 1 -> 'a': 1]
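So we:

  • make a new list

  • if there is already a list in the slot, set the second element of the new list to words_list[0] (originally words_list[hash_value])

  • store the new list in the slot, replacing whatever was there

Now, the second step looks wrong: the existing chain is thrown away, and words_list is being indexed by a hash value that has nothing to do with positions in the word list.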
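
The effect of that second step can be seen in a stripped-down sketch. Node, chain_words and buggy_single_slot are hypothetical stand-ins written for this illustration (one slot, no CounterLinkedList wrapper); they only mirror the logic of the simplified function above.

```python
class Node:
    """Hypothetical minimal stand-in for CounterNode (word and link only)."""
    def __init__(self, word):
        self.word = word
        self.next_node = None


def chain_words(head):
    """Collect the words of a chain, front to back."""
    words = []
    while head is not None:
        words.append(head.word)
        head = head.next_node
    return words


def buggy_single_slot(words_list):
    """Same logic as the simplified function, reduced to a single slot."""
    slot = None
    for new_word in words_list:
        new_head = Node(new_word)
        if slot is not None:
            # The old chain in `slot` is never consulted;
            # only words_list[0] is re-added behind the new word.
            new_head.next_node = Node(words_list[0])
        slot = new_head
    return slot


print(chain_words(buggy_single_slot(['a', 'b', 'c', 'd'])))
#>>> ['d', 'a']
```

However long the input is, the slot ends up holding at most two nodes: the last word and words_list[0]. Everything in between is lost.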

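So you should do something like this:

  • if the slot is None, create a new list

  • if the slot is not None, traverse its list:

    • if you find the word already there, increment its count

    • if not, append the word as a new node at the end

Like this: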

def word_counter_hash(words_list):
    hash_list = [None]

    for new_word in words_list:
        if hash_list[0] == None:
            hash_list[0] = CounterLinkedList(CounterNode(new_word, 1))

        else:
            node = hash_list[0].head

            while True:
                if node.word == new_word:
                    node.count += 1
                    break

                elif not node.next_node:
                    node.next_node = CounterNode(new_word, 1)
                    break

                node = node.next_node

    return hash_list

counts = word_counter_hash(['a', 'b', 'c', 'c', 'a'])
print(counts[0])
#>>> ['a': 2 -> 'b': 1 -> 'c': 2]
I've tried to keep it fairly simple and unimaginative.

I'll leave it to you to change hash_list[0] to hash_list[hash_value].
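
For reference, one way that last change might look is sketched below. It is not the assignment's official solution. Two things are assumptions: the MyString shown in the question defines no __eq__, so node.word == new_word would fall back to identity comparison and never match; this sketch therefore adds a simple __eq__ to a stand-in MyString. The helper bucket_counts is also hypothetical, added only to inspect the result. Comparison counting (num_comparisons) is left out.

```python
class MyString:
    """Stand-in wrapper. __eq__ is an ASSUMPTION, not shown in the question."""
    def __init__(self, i):
        self.i = i

    def __repr__(self):
        return repr(self.i)

    def __eq__(self, other):
        return self.i == other


class CounterNode:
    def __init__(self, word, count=1):
        self.word = MyString(word)
        self.count = count
        self.next_node = None


class CounterLinkedList:
    def __init__(self, head=None):
        self.head = head


def _c_mul(a, b):
    return (int(a) * int(b)) & 0xFFFFFFFF


def nice_hash(input_string):
    """Same algorithm as in the question."""
    if input_string is None:
        return 0
    value = ord(input_string[0]) << 7
    for char in input_string:
        value = _c_mul(1000003, value) ^ ord(char)
    return value ^ len(input_string)


def hash_word(item, slots):
    return nice_hash(item) % slots


def word_counter_hash(words_list, slots):
    """Count words using one chained list per slot."""
    hash_list = [None] * slots
    for new_word in words_list:
        hash_value = hash_word(new_word, slots)
        if hash_list[hash_value] is None:
            hash_list[hash_value] = CounterLinkedList(CounterNode(new_word, 1))
            continue
        node = hash_list[hash_value].head
        while True:
            if node.word == new_word:        # word already in the chain
                node.count += 1
                break
            if node.next_node is None:       # end of chain: append
                node.next_node = CounterNode(new_word, 1)
                break
            node = node.next_node
    return hash_list


def bucket_counts(hash_list):
    """Hypothetical helper: flatten each chain into (word, count) pairs."""
    result = []
    for bucket in hash_list:
        pairs = []
        node = bucket.head if bucket is not None else None
        while node is not None:
            pairs.append((node.word.i, node.count))
            node = node.next_node
        result.append(pairs)
    return result


table = word_counter_hash(['a', 'b', 'c', 'c', 'a'], 3)
for index, pairs in enumerate(bucket_counts(table)):
    print(index, pairs)
```

Each word ends up in the slot given by hash_word(word, slots), and repeated words increment the existing node instead of creating a new one.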

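This code is a bit long. Is it possible to cut it down a little, or is all of it needed?

@Veedrac, unfortunately the bulky class code is needed, since it sets the constraints for the assignment and is the basis on which we formulate our answers.

The assignment may need it, but right now we only need to reproduce the error. Removing whatever is unrelated would help.

@Veedrac Sorry about that, I've cut it down as much as I can.

Thank you so much for this very detailed answer! It makes things much easier to understand.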