C++ Tic-Tac-Toe and Minimax - Creating an imperfect AI on a microcontroller

c++ artificial-intelligence microcontroller tic-tac-toe minimax

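I have created a Tic-Tac-Toe game on a microcontroller, including a perfect AI (perfect meaning that it never loses). I did not use a minimax algorithm for that, just a small state machine with all possible and optimal moves. My problem now is that I want to implement different difficulty levels (easy, medium and hard). The AI I have so far would be the hard one.

So I have been thinking about the best way to do this, and I ended up wanting to use the minimax algorithm, but in a way that calculates the scores for all possible game positions, so that I can sometimes pick the second-best score instead of the best one. Since I can't always do all of these calculations on the microcontroller itself, I want to write a small program that runs on my computer and gives me arrays of all possible board states (taking symmetry etc. into account, to minimize the storage used) together with their scores.

To that end, I first tried to implement the minimax algorithm itself, taking depth into account so that the score of each state is calculated properly. For now, it should just return all the best moves in an array. However, it doesn't seem to work that well. I tried to debug it with some printf lines. Here is the code so far for the minimax function and my main function: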
static int minimax(int *board, int depth)
{
    int score;
    int move = -1;
    int scores[9];
    int nextDepth;

    printf("\n----- Called Minimax, Depth: %i -----\n\n", depth);

    if(depth%2 ==1){
        player = -1;
    } else {
        player = 1;
    }

    printf("Player: %i\n---\n", player);

    if(isWin(board) != 0){
        score = (10-depth)*winningPlayer;

        printf("Player %i won on depth %i\n", winningPlayer, depth);
        printf("Resulting score: (10-%i)*%i = %i\nScore returned to depth %i\n---\n", depth, winningPlayer, score, depth-1);

        return score;
    }

    score = -20;
    nextDepth = depth+1;

    printf("Next depth is %i\n---\n", nextDepth);

    int i;
    for(i=0; i<9; i++){
        if(board[i] == 0) {

            if(nextDepth%2 ==0) {
                player = -1;
            } else {
                player = 1;
            }

            printf("Found vacant space at position %i\n", i);
            printf("Set value of position %i to %i\n---\n", i, player);

            board[i] = player;
            int thisScore = minimax(board, nextDepth);

            printf("Value of the move at position %i on next depth %i is %i\n---\n", i, nextDepth, thisScore);

            scores[i] = thisScore;
            if(thisScore > score){

                printf("New score value is greater than the old one: %i < %i\n---\n", thisScore, score);

                score = thisScore;
                move = i;
                g_moves[nextDepth-1] = move;

                printf("Score was set to %i\n", thisScore);
                printf("Remembered move %i\n---\n", move);

            }
            board[i] = 0;

            printf("Value of position %i was reset to 0 on next depth %i\n---\n", i, nextDepth);

        }
    }

    if(move == -1) {

        printf("Game ended in a draw.\n Returned score: 0\n---\n");

        return 0;
    }

    printf("Move at position %i was selected on next depth %i\n", move, nextDepth);
    printf("Returning score of %i to depth %i\n---\n", score, depth);


    return score;
}

In addition, I have a few variables:
//1  = Beginning Player
//-1 = second Player
static int player;
static int winningPlayer = 0;
static int g_moves[9];

/* 0 1 2
 * 3 4 5
 * 6 7 8
 */
int initBoard[9] = {
    0, 0, 0,
    0, 0, 0,
    0, 0, 0,
};

int board[9];

And my win-check function:
int isWin(int *board)
{
    unsigned winningBoards[8][3] = {
        {board[0], board[1], board[2],},
        {board[3], board[4], board[5],},
        {board[6], board[7], board[8],},
        {board[0], board[3], board[6],},
        {board[1], board[4], board[7],},
        {board[2], board[5], board[8],},
        {board[0], board[4], board[8],},
        {board[2], board[4], board[6],},
    };

    int i;
    for(i=0; i<8; i++){
        if( (winningBoards[i][0] != 0) &&
            (winningBoards[i][0] == winningBoards[i][1]) &&
            (winningBoards[i][0] == winningBoards[i][2])){
                winningPlayer = winningBoards[i][0];
                return winningPlayer;
            }
    }
    return 0;
}
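
A side note on the win check: isWin copies the cell values into an unsigned array, so a -1 cell is stored as a large unsigned value and only becomes -1 again when it is assigned back to winningPlayer. A variant that stays in int and walks a constant table of line indices avoids that round trip and the global. The following is only a sketch; the names winnerOf and LINES are made up for illustration and are not part of the posted code.

```c
#include <assert.h>
#include <stddef.h>

/* Index triples of the 8 winning lines for a board laid out as
 *   0 1 2
 *   3 4 5
 *   6 7 8
 */
static const int LINES[8][3] = {
    {0, 1, 2}, {3, 4, 5}, {6, 7, 8},   /* rows */
    {0, 3, 6}, {1, 4, 7}, {2, 5, 8},   /* columns */
    {0, 4, 8}, {2, 4, 6},              /* diagonals */
};

/* Hypothetical replacement for isWin: returns 1 or -1 for the player
 * who owns a complete line, or 0 if nobody has won yet.
 * It does not touch any global state. */
int winnerOf(const int *board)
{
    int i;

    assert(board != NULL);

    for (i = 0; i < 8; i++) {
        int first = board[LINES[i][0]];
        if (first != 0 &&
            first == board[LINES[i][1]] &&
            first == board[LINES[i][2]]) {
            return first;
        }
    }
    return 0;
}
```

Because the result is returned instead of stored in winningPlayer, a caller would write something like `int w = winnerOf(board);` and use w directly.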

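If you need any other information to help me, I will gladly provide it if I have it myself.
Thanks in advance.
EDIT:
So, I rewrote my minimax function so that it now prints all possible board states to a .txt file via the console (cmd: ./NAME_OF_FILE > DEST_NAME.txt, in the respective folder). The code is as follows: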
int minimax(int *board, int depth)
{
    g_node++;
    int player;
    int move = -1;
    int score = -20;
    int thisScore = -20;
    int i;

    if(isWin(board) != 0){
        printf("\nNode: %i\n", g_node);
        printf("Board state:");
        for(i=0;i<9;i++) {
            if((i%3) == 0)
                printf("\n");
            printf("%2i ", board[i]);
        }
        printf("\n");
        printf("has a score of %i\n", (10-depth)*winningPlayer);
        return (10-depth)*winningPlayer;
    }


    if(depth%2 ==1){
        player = -1;
    } else {
        player = 1;
    }
    for(i=0; i<9; i++){
        if(board[i] == 0){
            board[i] = player;
            thisScore = minimax(board, depth+1);
            if(thisScore > score){
                score = thisScore;
                move = i;
            }
            board[i] = 0;
        }
    }

    printf("\nNode: %i\n", g_node);
    printf("Board state:");
    for(i=0;i<9;i++) {
        if((i%3) == 0)
            printf("\n");
        printf("%2i ", board[i]);
    }
    printf("\n");

    if(move == -1){
        printf("has a score of 0\n");
        return 0;

    }
    printf("has a score of %i\n", score);
    return score;
}
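
For the lookup table mentioned at the start (board states stored once per symmetry class), one possible approach is to give every board a canonical key: encode the board in base 3 and take the smallest code over its 8 rotations and reflections. The sketch below assumes the same cell values as the posted code (1, -1 and 0); canonicalKey, ROT and MIR are illustrative names, not something from the original program.

```c
#include <assert.h>
#include <stddef.h>

/* dst[i] = src[ROT[i]] rotates the board by 90 degrees clockwise. */
static const int ROT[9] = {6, 3, 0, 7, 4, 1, 8, 5, 2};
/* dst[i] = src[MIR[i]] mirrors the board left to right. */
static const int MIR[9] = {2, 1, 0, 5, 4, 3, 8, 7, 6};

/* Apply a permutation in place, using a temporary copy. */
static void permute(int *cells, const int *perm)
{
    int tmp[9];
    int i;
    for (i = 0; i < 9; i++) tmp[i] = cells[perm[i]];
    for (i = 0; i < 9; i++) cells[i] = tmp[i];
}

/* Base-3 code of a board: each cell value -1/0/1 becomes digit 0/1/2.
 * The largest possible code is 3^9 - 1 = 19682. */
static int encode(const int *cells)
{
    int key = 0;
    int i;
    for (i = 8; i >= 0; i--) key = key * 3 + (cells[i] + 1);
    return key;
}

/* Smallest code among the 8 symmetric variants of the board.
 * Two boards that are rotations or reflections of each other
 * get the same key, so only one table entry is needed for them. */
int canonicalKey(const int *board)
{
    int cur[9];
    int best, pass, r, i;

    assert(board != NULL);

    for (i = 0; i < 9; i++) cur[i] = board[i];
    best = encode(cur);

    for (pass = 0; pass < 2; pass++) {
        /* Four quarter turns visit all rotations and end where they began. */
        for (r = 0; r < 4; r++) {
            int key;
            permute(cur, ROT);
            key = encode(cur);
            if (key < best) best = key;
        }
        /* Second pass repeats the rotations on the mirrored board. */
        permute(cur, MIR);
    }
    return best;
}
```

On the PC side, the generator could then store one score per canonical key, and the microcontroller would compute the same key for the current board before looking it up. Whether that trade of flash for CPU time pays off on the target is something to measure.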

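EDIT 2:
I have now added another function called printScoredBoards, which is basically supposed to do what I described earlier, but there is a problem with it.
Since after the fifth move it is always possible to win if your opponent plays badly enough, and since minimax tries all possibilities, including those, with the following code I get a scored board of all 15s for the empty board: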
void printScoredBoards(int *board, int depth)
{
    int player;
    int scoredBoard[9] = {0,0,0,0,0,0,0,0,0,};
    int i;
    if(isWin(board) == 0){
        if(depth%2 ==1){
            player = -1;
        } else {
            player = 1;
        }

        for(i=0; i<9; i++){
            if(board[i] == 0){
                board[i] = player;
                scoredBoard[i] = minimax(board, depth+1)+10;
                printScoredBoards(board, depth+1);
                board[i] = 0;
            }
        }
        printf("Scored board:");
        dumpTable(scoredBoard);
        printf("\n");
    }
}
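
A possible explanation for the all-15 board, based only on reading the code above: minimax keeps the largest child score on every level, including the levels where the second player (-1) moves. The search therefore always finds the line where the opponent helps player 1 win as fast as possible, which is a win at depth 5, scored (10-5)*1 = 5, plus the offset of 10. In minimax the two players should pull in opposite directions: player 1 takes the maximum, player -1 the minimum. The sketch below shows that alternation; minimaxValue and winner are illustrative names, and the scoring formula is kept from the posted code.

```c
#include <assert.h>
#include <stddef.h>

static const int LINES[8][3] = {
    {0, 1, 2}, {3, 4, 5}, {6, 7, 8},
    {0, 3, 6}, {1, 4, 7}, {2, 5, 8},
    {0, 4, 8}, {2, 4, 6},
};

/* Returns 1 or -1 if that player owns a complete line, else 0. */
static int winner(const int *board)
{
    int i;
    for (i = 0; i < 8; i++) {
        int a = board[LINES[i][0]];
        if (a != 0 && a == board[LINES[i][1]] && a == board[LINES[i][2]])
            return a;
    }
    return 0;
}

/* Value of the position when `depth` moves have been played.
 * Player 1 moves on even depths and maximizes the score,
 * player -1 moves on odd depths and minimizes it. */
int minimaxValue(int *board, int depth)
{
    int player = (depth % 2 == 0) ? 1 : -1;
    int w, best, i;
    int moved = 0;

    assert(board != NULL && depth >= 0);

    w = winner(board);
    if (w != 0) {
        return (10 - depth) * w;   /* same scoring as the posted code */
    }

    best = (player == 1) ? -20 : 20;
    for (i = 0; i < 9; i++) {
        if (board[i] == 0) {
            int s;
            board[i] = player;
            s = minimaxValue(board, depth + 1);
            board[i] = 0;
            moved = 1;
            if (player == 1 ? (s > best) : (s < best)) {
                best = s;
            }
        }
    }
    return moved ? best : 0;       /* full board without a winner: draw */
}
```

With this alternation the empty board evaluates to 0 (a draw under best play from both sides), and the per-move scores start to differ from each other once one side has made a mistake.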

I think trying to get the second-best move out of a full-depth analysis is overkill. Either don't explore the whole tree, by limiting the depth of your minimax (looking 2 moves ahead still lets the player win, but the AI remains strong), or simply use random moves for a truly imperfect AI.
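
The depth limit suggested above could look roughly like this. The sketch is written in negamax form (a compact way of writing minimax where each side maximizes its own score and the sign is flipped between levels). The scoring is an assumption made for the example: +1 for a win found within the horizon, -1 for a loss, 0 for a draw or an undecided position. limitedBestMove and limitedValue are illustrative names.

```c
#include <assert.h>
#include <stddef.h>

static const int LINES[8][3] = {
    {0, 1, 2}, {3, 4, 5}, {6, 7, 8},
    {0, 3, 6}, {1, 4, 7}, {2, 5, 8},
    {0, 4, 8}, {2, 4, 6},
};

/* Returns 1 or -1 if that player owns a complete line, else 0. */
static int winner(const int *board)
{
    int i;
    for (i = 0; i < 8; i++) {
        int a = board[LINES[i][0]];
        if (a != 0 && a == board[LINES[i][1]] && a == board[LINES[i][2]])
            return a;
    }
    return 0;
}

/* Score from the point of view of `player`, who is about to move,
 * looking at most `plies` half-moves ahead. */
static int limitedValue(int *board, int player, int plies)
{
    int w = winner(board);
    int best = -2;                 /* lower than any real score */
    int i;

    if (w != 0) return (w == player) ? 1 : -1;
    if (plies == 0) return 0;      /* horizon reached: treat as undecided */

    for (i = 0; i < 9; i++) {
        if (board[i] == 0) {
            int s;
            board[i] = player;
            s = -limitedValue(board, -player, plies - 1);
            board[i] = 0;
            if (s > best) best = s;
        }
    }
    return (best == -2) ? 0 : best;   /* no empty cell left: draw */
}

/* Index of the best move for `player` within the horizon,
 * or -1 if the board is full. Ties go to the lowest index. */
int limitedBestMove(int *board, int player, int plies)
{
    int bestMove = -1;
    int bestScore = -2;
    int i;

    assert(board != NULL && plies >= 1);

    for (i = 0; i < 9; i++) {
        if (board[i] == 0) {
            int s;
            board[i] = player;
            s = -limitedValue(board, -player, plies - 1);
            board[i] = 0;
            if (s > bestScore) {
                bestScore = s;
                bestMove = i;
            }
        }
    }
    return bestMove;
}
```

With plies set to 2 this AI takes an immediate win and blocks an immediate threat, but it does not see forks that need three or more half-moves to play out, which is where a human player can beat it.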
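For a state space as small as tic-tac-toe's, you can simply keep the perfect AI and "roll the dice" on every turn to decide whether you play the best move or some other random move. In addition, medium could have a check that prevents the other player from winning whenever that is possible.
Actually, that's not a bad idea. I think I'll do that if this turns out to be a dead end, but I would still like to know where I made my mistake. It is most likely something I keep overlooking.
I'd also second 洛罗's suggestion. I just thought the computing power would be enough to do minimax, or at least minimax with alpha-beta pruning. What kind of microcontroller is it? I'd like to get a rough idea of its computing power.
It has an MB91F464AB chip on it.
~edited my original post~
What exactly do you mean by limiting the depth?
You don't explore the whole game tree, you only look two moves ahead. (int depth = getDepth(board); becomes int depth = 2;)
Then what criteria should I use to score each move?
@litimlin: you can simply do Lose < (notFinished/Draw)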
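
The "roll the dice" idea from the comments could be sketched like this. The function receives the move the existing perfect AI would play (bestMove) and decides whether to use it. The percentages, the Difficulty enum and the name pickMove are assumptions made for the example, and rand() stands in for whatever random source the target actually provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum Difficulty { EASY = 0, MEDIUM = 1, HARD = 2 };

/* Chance (in percent) of ignoring the perfect move. Made-up values. */
static const int RANDOM_PERCENT[3] = {60, 25, 0};

static const int LINES[8][3] = {
    {0, 1, 2}, {3, 4, 5}, {6, 7, 8},
    {0, 3, 6}, {1, 4, 7}, {2, 5, 8},
    {0, 4, 8}, {2, 4, 6},
};

/* Returns 1 or -1 if that player owns a complete line, else 0. */
static int winner(const int *board)
{
    int i;
    for (i = 0; i < 8; i++) {
        int a = board[LINES[i][0]];
        if (a != 0 && a == board[LINES[i][1]] && a == board[LINES[i][2]])
            return a;
    }
    return 0;
}

/* Index of a cell that completes a line for `who`, or -1 if none. */
static int immediateWin(int *board, int who)
{
    int i;
    for (i = 0; i < 9; i++) {
        if (board[i] == 0) {
            int w;
            board[i] = who;
            w = winner(board);
            board[i] = 0;
            if (w == who) return i;
        }
    }
    return -1;
}

/* Chooses a move for `player`. `bestMove` is the move of the perfect AI
 * and must point at an empty cell. */
int pickMove(int *board, int player, int bestMove, enum Difficulty level)
{
    int empties[9];
    int n = 0;
    int i;

    assert(board != NULL);
    assert(bestMove >= 0 && bestMove < 9 && board[bestMove] == 0);

    /* Most of the time (always on HARD) play the perfect move. */
    if (rand() % 100 >= RANDOM_PERCENT[level]) {
        return bestMove;
    }
    /* MEDIUM still blocks an immediate win of the opponent. */
    if (level == MEDIUM) {
        int block = immediateWin(board, -player);
        if (block >= 0) return block;
    }
    /* Otherwise pick any empty cell; bestMove guarantees n >= 1. */
    for (i = 0; i < 9; i++) {
        if (board[i] == 0) empties[n++] = i;
    }
    return empties[rand() % n];
}
```

A nice property of this split is that the perfect state machine stays untouched; only the thin wrapper knows about difficulty levels, so tuning the levels means changing three numbers.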