Java minimax algorithm bug

I've been trying to learn the minimax algorithm, but I've run into a bug that I can't solve.

If the player chooses this:

[O][ ][ ]    [O][ ][ ]
[ ][x][x] => [O][x][x]
[ ][ ][ ]    [ ][ ][ ]

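then the AI behaves properly. I don't know what's wrong, or even whether I've understood the minimax algorithm correctly.

**EDIT** After adding this code I still have the same problem: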

    private int[] evaluateMove(int[] board, int currentTurn) {
        int bestScore;
        int currentScore;
        int bestMove = -1;
        if (currentTurn == 1) {
            bestScore = Integer.MIN_VALUE;
        } else {
            bestScore = Integer.MAX_VALUE;
        }

        List<Integer> nextMoves = generatemoves(board);
        if (nextMoves.isEmpty()) {
            bestScore = evaluateTheBoard(board);
        } else {
            for (int move : nextMoves) {
                int[] nextBoard = new int[9];
                for (int i = 0; i < nextBoard.length; i++) {
                    nextBoard[i] = board[i];
                }
                nextBoard[move] = currentTurn;
                currentScore = evaluateMove(nextBoard, nextTurn())[0];
                if (currentTurn == 1) {
                    if (currentScore > bestScore) {
                        bestScore = currentScore;
                        bestMove = move;
                    }
                } else {
                    if (currentScore < bestScore) {
                        bestScore = currentScore;
                        bestMove = move;
                    }
                }
            }
        }
        return new int[] {bestScore, bestMove};
    }

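I think you have misunderstood how to look ahead in a game like this. Do not "total up" the values returned by evaluateLine.

Here is pseudocode for the minimax score of a tic-tac-toe board (what evaluateBoard should return). Note that evaluateBoard needs to have a notion of currentTurn.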

function evaluateBoard(board, currentTurn)

// check if the game has already ended:
if WhiteHasWon then return (-10, nomove)
if BlackHasWon then return (+10, nomove)

// WhiteHasWon returns true if there exists one or more winning 3-in-a-row line for white.
// (You will have to scan for all 8 possible 3-in-a-row lines of white pieces)
// BlackHasWon returns true if there exists one or more winning 3-in-a-row line for black

if no legal moves, return (0, nomove) // draw

// The game isn't over yet, so look ahead:
bestMove = notset
resultScore = notset
for each legal move i for currentTurn,
   nextBoard = copy of board
   Apply move i to nextBoard
   score = evaluateBoard(nextBoard, NOT currentTurn).score
   if resultScore is notset, or score is <better for currentTurn> than resultScore, then
      resultScore = score
      bestMove = move i
return (resultScore, bestMove)
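The pseudocode above could be sketched in Java roughly as follows. This is only an illustration under my own assumptions, not your actual code: the class name, the helper names, and the cell encoding (0 = empty, 1 = black, the maximizer, -1 = white, the minimizer) are all made up for the example, and the board is assumed to be an int[9] indexed row by row.

```java
import java.util.ArrayList;
import java.util.List;

final class TicTacToeMinimax {
    // Assumed encoding (not from the question): 0 = empty, 1 = black, -1 = white.
    static final int EMPTY = 0, BLACK = 1, WHITE = -1;

    // The 8 possible 3-in-a-row lines on a board indexed 0..8, row by row.
    static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},   // rows
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},   // columns
        {0, 4, 8}, {2, 4, 6}               // diagonals
    };

    // True if `player` occupies all three cells of at least one line.
    static boolean hasWon(int[] board, int player) {
        for (int[] l : LINES) {
            if (board[l[0]] == player && board[l[1]] == player && board[l[2]] == player) {
                return true;
            }
        }
        return false;
    }

    // Indices of all empty cells.
    static List<Integer> legalMoves(int[] board) {
        List<Integer> moves = new ArrayList<>();
        for (int i = 0; i < board.length; i++) {
            if (board[i] == EMPTY) moves.add(i);
        }
        return moves;
    }

    // Returns {score, bestMove}; bestMove is -1 when the game is already over.
    static int[] evaluateBoard(int[] board, int currentTurn) {
        // Check for a finished game first, even if empty cells remain.
        if (hasWon(board, WHITE)) return new int[] {-10, -1};
        if (hasWon(board, BLACK)) return new int[] {+10, -1};
        List<Integer> moves = legalMoves(board);
        if (moves.isEmpty()) return new int[] {0, -1}; // draw

        int bestMove = -1;
        int resultScore = (currentTurn == BLACK) ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (int move : moves) {
            int[] nextBoard = board.clone();
            nextBoard[move] = currentTurn;
            // The turn is passed down explicitly: -currentTurn is the other player.
            int score = evaluateBoard(nextBoard, -currentTurn)[0];
            boolean better = (currentTurn == BLACK) ? score > resultScore : score < resultScore;
            if (better) {
                resultScore = score;
                bestMove = move;
            }
        }
        return new int[] {resultScore, bestMove};
    }
}
```

Calling TicTacToeMinimax.evaluateBoard(board, TicTacToeMinimax.WHITE) returns the score in element 0 and the move to play in element 1. Note that the next player is derived from the currentTurn argument rather than from any state outside the function, so the recursion cannot get out of step with the real game.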

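One key difference between your version and mine is that mine is recursive. Yours only looks one level deep. Mine calls evaluateBoard from inside evaluateBoard, which would be an infinite loop if we weren't careful (once the board fills up it can't go any deeper, so it is not actually infinite).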

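Another difference is that your code totals things up when it shouldn't. The score in tic-tac-toe is -10, 0, or 10, and it is only determined once the game is over. You should pick the best move available to the player whose turn it is and completely ignore all the other possibilities, because you only care about the "best" line of play. The score of a position is equal to the outcome of best play from that position.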

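Expanding <better for currentTurn> is messy in minimax, which is why negamax is cleaner. White prefers a low score and black prefers a high score, so you need some if statements to make it choose the appropriate preferred score. You already have this part (at the end of your best-move code), but it needs to be evaluated inside the recursion rather than only at the end.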
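To show why negamax tends to come out cleaner, here is a hedged sketch of the same search written negamax-style. The names and the encoding (0 = empty, +1 and -1 for the two players) are assumptions of mine for illustration. The score is always from the point of view of the side to move, so one "greater than" comparison replaces the per-player if statements.

```java
final class TicTacToeNegamax {
    // The 8 possible 3-in-a-row lines on a board indexed 0..8, row by row.
    static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},
        {0, 4, 8}, {2, 4, 6}
    };

    // True if `player` occupies all three cells of at least one line.
    static boolean hasWon(int[] board, int player) {
        for (int[] l : LINES) {
            if (board[l[0]] == player && board[l[1]] == player && board[l[2]] == player) {
                return true;
            }
        }
        return false;
    }

    // Returns {score, bestMove} from the point of view of `player` (+1 or -1),
    // the side to move. Assumes a legally reached position, so only the
    // opponent (who moved last) can already have a completed line.
    static int[] negamax(int[] board, int player) {
        if (hasWon(board, -player)) return new int[] {-10, -1}; // side to move has lost

        int bestScore = Integer.MIN_VALUE;
        int bestMove = -1;
        for (int move = 0; move < board.length; move++) {
            if (board[move] != 0) continue; // occupied cell
            int[] nextBoard = board.clone();
            nextBoard[move] = player;
            // What is good for the opponent is equally bad for us, hence the negation.
            int score = -negamax(nextBoard, -player)[0];
            if (score > bestScore) {
                bestScore = score;
                bestMove = move;
            }
        }
        if (bestMove == -1) return new int[] {0, -1}; // no legal moves: draw
        return new int[] {bestScore, bestMove};
    }
}
```

Because the returned score is relative to the side to move, a result of +10 means "the player to move wins with best play" regardless of which player that is.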

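Minimax is a scoring convention rather than an algorithm. Under minimax, positions that favor one side are negative and positions that favor the other side are positive. What you actually do with the scores is the algorithm's job. The algorithm here is just a brute-force search of future positions, but examples of named algorithms include Alpha-Beta pruning, MTD(f), NegaScout, and so on. None of these are needed for tic-tac-toe, since they are only performance optimizations over classic brute force. Also, in my opinion, negamax scoring is better than minimax scoring and usually results in cleaner code.

Also, since tic-tac-toe usually ends in a draw, especially when you move second, the computer will often see that the best it can do is a draw, which may explain what you are seeing. When the best option is a draw, it will play any move that stops the opponent from actually winning.

When you say "plays weird", does it actually let the player win? Because that would indicate a bug.

I haven't read through the whole thing, but could two moves tie, i.e. have the same score? If so, what do you do?

@shole In this example the player is x, and the game can be drawn with these moves. The problem is that the AI lets the player win.

Thanks for your answer. But I have a few questions. Does the top of the pseudocode only check whether black or white has won, or whether it is a draw? Or do I still need the evaluateLine function? I also don't know how to extract the correct move from the evaluateBoard function, if you could clarify that.

Maybe I misread, but I thought evaluateLine was looking for three-in-a-row lines that indicate a win. The place where you need to check for the end of the game is where you look for any 3-in-a-row line (which wins the game). If white has one (or more), the score is -10, and likewise +10 for black. I updated the answer to show how to extract the best move from evaluateBoard, if you want to get it from that function. You could also go a step further and build up the full expected sequence of moves, so you can see the line of play the computer prefers.

I've had a go at your solution. A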
