Python 3.x: Minimax AI in Python

python-3.x artificial-intelligence

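I'm trying to build a minimax-style AI that searches 4 plies of moves and picks the best move according to some heuristic. The problem is that in my state machine, if I reach a node that corresponds to an illegal move, I return None instead of the normal point value the heuristic function would give. I'm not sure how best to handle this in my minimax function. This is what it looks like so far, and I'd like to know whether it makes sense: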

def ai_min_max(board, ai_mancala, player_mancala, ai_choices, player_choices,
               target_depth, cur_depth, maxTurn, position):
    # base case: call the heuristic function to get the value of this state
    # (it returns None when the move sequence in `position` is illegal)
    if cur_depth == target_depth:
        return first_heuristic(board, ai_mancala, player_mancala, ai_choices, player_choices, position)

    # if we are currently on a level where we are maximizing our function
    if maxTurn:
        # start from negative infinity
        max_eval = float("-inf")
        # go through the 10 possible choices you can make
        for x in range(len(ai_choices)):
            new_position = position + [x]
            my_eval = ai_min_max(board, ai_mancala, player_mancala, ai_choices, player_choices,
                                 target_depth, cur_depth + 1, False, new_position)
            # update the current max only if the move was valid
            if my_eval is not None:
                max_eval = max(max_eval, my_eval)
        # if there were no valid moves
        if max_eval == float("-inf"):
            return float("inf")
        return max_eval

    # if it is the minimizing player's turn
    else:
        min_eval = float("inf")
        for x in range(len(player_choices)):
            new_position = position + [x]
            my_eval = ai_min_max(board, ai_mancala, player_mancala, ai_choices, player_choices,
                                 target_depth, cur_depth + 1, True, new_position)
            # update the current min only if the move was valid
            if my_eval is not None:
                min_eval = min(min_eval, my_eval)
        # if there were no valid moves
        if min_eval == float("inf"):
            return float("-inf")
        return min_eval

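Normally in a minimax implementation you never make a recursive call on an illegal move; those moves are never generated in the first place. In some cases, though, it can be easier (or cheaper) to actually apply the move in order to find out whether it is legal. For example, if you have to run a complex computation to decide whether a move is legal, you don't want to do it twice (once when generating candidate moves and once when searching them). So I'll assume that is the situation here.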

With that in mind, does it make sense to return a special value as in the code above?

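No, there is a better way. At a min node reached by an illegal move, return -inf to the parent; at a max node reached by an illegal move, return inf to the parent. That way an illegal move gets the worst possible value for the player who would choose it, and the rest of the search handles it naturally without any other special cases. This makes the main minimax/alpha-beta loop much simpler.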
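As a rough illustration of that idea, here is a minimal sketch on a toy game rather than the Mancala code from the question. The names `minimax`, `apply_move`, and `evaluate`, and the toy rules (max adds to a total, min subtracts, and the total must stay within 0..5), are all assumptions made up for the example. The point is only that the loop itself contains no `None` checks.

```python
import math

def apply_move(total, move, maximizing):
    """Hypothetical toy rule: max adds, min subtracts; totals outside 0..5 are illegal."""
    new_total = total + move if maximizing else total - move
    return new_total if 0 <= new_total <= 5 else None

def minimax(state, depth, maximizing, moves, evaluate):
    # A state of None means the move that led here was illegal.
    # A min node reports -inf to its max parent, a max node reports +inf
    # to its min parent, so the parent never prefers the illegal move.
    if state is None:
        return math.inf if maximizing else -math.inf
    if depth == 0:
        return evaluate(state)
    values = [
        minimax(apply_move(state, m, maximizing), depth - 1, not maximizing, moves, evaluate)
        for m in moves
    ]
    return max(values) if maximizing else min(values)

MOVES = (1, 2, 3)
# From 4, only +1 is legal for max; min then picks the lowest reachable total.
print(minimax(4, 2, True, MOVES, lambda s: s))
```

Note that a node where every move is illegal still returns the sentinel itself, so the caller at the root has to be prepared for that.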

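The only complication is that if every move for the max player at the root loses, the search may return an illegal move. You can handle this case outside the main search, since testing a single move is very cheap compared to a full search: if the returned move is illegal, just return any legal move instead.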
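That root-level check could look something like the sketch below. `choose_root_move`, `score_move`, and `is_legal` are hypothetical names: `score_move` stands in for whatever runs the search on one root move and returns its value, and `is_legal` for whatever cheaply validates a single move.

```python
import math

def choose_root_move(moves, score_move, is_legal):
    """Pick the best-scoring root move, falling back to any legal one."""
    # max() keeps the first of several equally scored moves, so when every
    # move scores -inf the winner may well be an illegal one.
    best_move = max(moves, key=score_move)
    if is_legal(best_move):
        return best_move
    # Cheap fallback outside the main search: take the first legal move.
    for move in moves:
        if is_legal(move):
            return move
    return None  # no legal move exists at all

# Assumed example data: move 0 is illegal, moves 1 and 2 are legal but losing.
scores = {0: -math.inf, 1: -math.inf, 2: -math.inf}
legal = {1, 2}
print(choose_root_move([0, 1, 2], scores.get, legal.__contains__))
```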

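Thank you so much, all of that makes a lot of sense! That is how I ended up doing it: just as you said, returning -inf and positive inf from my earlier function call!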