Java: Implementing alpha-beta pruning in a Tic-Tac-Toe minimax algorithm

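In my method newminimax49 I have a minimax algorithm that makes use of general improvements suggested to me in another post. The method uses a simple heuristic board evaluation function. My question is basically about alpha-beta pruning, namely whether or not my minimax method uses it. As far as I can tell it does, but what I used to achieve it seems too simple to be true. Furthermore, others have suggested that I use alpha-beta pruning, which, as I said, I thought my minimax method already did, and that leads me to believe that what I'm doing here is something else. Here is my newminimax49: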

//This method returns a 2 element int array containing the position of the best possible 
//next move and the score it yields. Utilizes memoization and supposedly alpha beta 
//pruning to achieve better performance. Alpha beta pruning can be seen in lines such as:
/*if(bestScore==-10)
     break;*/
//This basically means that if the best score achieved is the best possible score
//achievable then stop exploring the other available moves. Doing this, I believe
//I'm applying the same principle as alpha beta pruning.
public int[] newminimax49(){
    int bestScore = (turn == 'O') ? +9 : -9;    //X is minimizer, O is maximizer
    int bestPos=-1;
    int currentScore;
    //boardShow();
    String stateString = "";                                                
    for (int i=0; i<state.length; i++) 
        stateString += state[i];                        
    int[] oldAnswer = oldAnswers.get(stateString);                          
    if (oldAnswer != null) 
        return oldAnswer;
    if(isGameOver2()!='N'){
        //s.boardShow();
        bestScore= score();
    }
    else{
        //s.boardShow();
        int i=0;
        for(int x:getAvailableMoves()){
            if(turn=='X'){  //X is minimizer
                setX(x);
                //boardShow();
                //System.out.println(stateID++);
                currentScore = newminimax49()[0];
                revert(x);
                if(i==0){
                    bestScore = currentScore;
                    bestPos=x;
                    if(bestScore==-10)
                        break;
                }
                else if(currentScore<bestScore){
                    bestScore = currentScore;
                    bestPos=x;
                    if(bestScore==-10)
                        break;
                }
            }
            else {  //O is maximizer
                setO(x);
                //boardShow();
                //System.out.println(stateID++);
                currentScore = newminimax49()[0];
                revert(x);
                //boardShow();
                if(i==0){
                    bestScore = currentScore;
                    bestPos=x;
                    if(bestScore==10)
                        break;
                }

                else if(currentScore>bestScore){
                    bestScore = currentScore;
                    bestPos = x;
                    if(bestScore==10)
                        break;
                }
            }
            i++;
        }
    }
    int[] answer = {bestScore, bestPos};                                    
    oldAnswers.put (stateString, answer);                                   
    return answer;
}
The seed setters:

//Sets an X at a certain location and updates the turn, countX and lastAdded variables
public void setX(int i){
    state[i]='X';
    DDState[RowCol.get(i)[0]][RowCol.get(i)[1]]='X';
    turn='O';
    countX++;
    lastAdded=i;
}

//Sets an O at a certain location and updates the turn, countO and lastAdded variables
public void setO(int i){
    state[i]='O';
    DDState[RowCol.get(i)[0]][RowCol.get(i)[1]]='O';
    turn='X';
    countO++;
    lastAdded=i;
}
revert simply undoes a move. For example, if an X was placed at position 0, revert(0) sets a '-' in its place and updates the variables that setX changed:

public void revert(int i){
    state[i]='-';
    DDState[RowCol.get(i)[0]][RowCol.get(i)[1]]='-';
    if(turn=='X'){
        turn = 'O';
        countO--;
    }
    else {
        turn = 'X';
        countX--;
    }
}

So, does this look like alpha-beta pruning? If not, how can I make it so?

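You are already using a kind of "simplified" alpha-beta: at the moment, you prune whenever a player finds a winning position.

A proper alpha-beta passes itself an alpha and a beta value, which hold the minimum and the maximum score that the respective players are already sure to reach. With those, you prune whenever a score is worse than or equal to the opposing player's current "worst case".

In your case, you would be able to prune not only on winning scores (as you currently do), but also on some scores of 0.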
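
As a rough illustration of the alpha/beta parameters described above, here is a minimal, self-contained sketch on a plain 3x3 board. It is not the asker's class: the names AlphaBetaSketch, winner, alphaBeta and nodes are hypothetical, the board is a bare char array instead of the state/DDState fields, and memoization is left out on purpose, because a value returned after a cutoff is only a bound and cannot be cached as an exact score without extra bookkeeping. As in the question, X is the minimizer, O is the maximizer, and the result is {score, position}.

```java
class AlphaBetaSketch {
    // Counts visited positions, only to make the effect of pruning visible.
    static int nodes = 0;

    // Returns 'X' or 'O' if that player owns a full line, 'D' for a draw,
    // and 'N' if the game is still running (same codes as isGameOver2).
    static char winner(char[] b) {
        int[][] lines = {{0,1,2},{3,4,5},{6,7,8},{0,3,6},{1,4,7},{2,5,8},{0,4,8},{2,4,6}};
        for (int[] l : lines)
            if (b[l[0]] != '-' && b[l[0]] == b[l[1]] && b[l[1]] == b[l[2]])
                return b[l[0]];
        for (char c : b)
            if (c == '-') return 'N';
        return 'D';
    }

    // alpha: score the maximizer (O) is already guaranteed further up the tree.
    // beta:  score the minimizer (X) is already guaranteed further up the tree.
    static int[] alphaBeta(char[] b, char turn, int alpha, int beta) {
        nodes++;
        char w = winner(b);
        if (w == 'X') return new int[]{-10, -1};
        if (w == 'O') return new int[]{+10, -1};
        if (w == 'D') return new int[]{0, -1};

        int bestScore = (turn == 'O') ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        int bestPos = -1;
        for (int x = 0; x < b.length; x++) {
            if (b[x] != '-') continue;
            b[x] = turn;                                   // make the move
            int score = alphaBeta(b, turn == 'O' ? 'X' : 'O', alpha, beta)[0];
            b[x] = '-';                                    // undo the move
            if (turn == 'O') {
                if (score > bestScore) { bestScore = score; bestPos = x; }
                alpha = Math.max(alpha, bestScore);
            } else {
                if (score < bestScore) { bestScore = score; bestPos = x; }
                beta = Math.min(beta, bestScore);
            }
            // The opponent already has an alternative at least this good,
            // so the remaining moves here cannot change the result.
            if (alpha >= beta) break;
        }
        return new int[]{bestScore, bestPos};
    }

    public static void main(String[] args) {
        char[] board = "---------".toCharArray();
        // The initial window is the full score range used by score(): -10..+10.
        int[] answer = alphaBeta(board, 'X', -10, +10);
        System.out.println("score=" + answer[0] + " pos=" + answer[1] + " nodes=" + nodes);
    }
}
```

Starting with the window -10..+10 means that finding a win still triggers an immediate break, as in the bestScore==10 and bestScore==-10 checks of the question. The additional cutoffs come from positions where one side has already secured a 0 elsewhere in the tree.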

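Yes, that seems to be the general word on the internet. The problem is that there isn't much information on how to implement it for a minimax method that uses a simple evaluation function. Do you have any idea how I would do something like what you're suggesting here for my newminimax49 method?

@Omar Just implement it the same way you would for a more complex evaluation: add two integer parameters (alpha/beta) to the method, update those values in the proper alpha-beta manner, and you'll be fine. For more information, I still recommend the chessprogramming wiki...
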
public char isGameOver2(){
    char turnOpp;
    int count;
    if(turn=='X'){
        count=countO;
        turnOpp='O';
    }
    else {
        count=countX;
        turnOpp='X';
    }
    if(count>=n){ 
        //^No win available if each player has less than n seeds on the board

        //Checking begins
                //DDState[RowCol.get(lastAdded)[0]][RowCol.get(lastAdded)[1]]=turn;

                //Check column for win
                for(int i=0; i<n; i++){
                    if(DDState[i][RowCol.get(lastAdded)[1]]!=turnOpp)
                        break;
                    if(i==(n-1)){
                        //DDState[RowCol.get(x)[0]][RowCol.get(x)[1]]='-';
                        return turnOpp;
                    }
                }

                //Check row for win
                for(int i=0; i<n; i++){
                    if(DDState[RowCol.get(lastAdded)[0]][i]!=turnOpp)
                        break;
                    if(i==(n-1)){
                        //DDState[RowCol.get(x)[0]][RowCol.get(x)[1]]='-';
                        return turnOpp;
                    }
                }

                //Check diagonal for win
                if(RowCol.get(lastAdded)[0] == RowCol.get(lastAdded)[1]){

                    //we're on a diagonal
                    for(int i = 0; i < n; i++){
                        if(DDState[i][i] != turnOpp)
                            break;
                        if(i == n-1){
                            //DDState[RowCol.get(x)[0]][RowCol.get(x)[1]]='-';
                            return turnOpp;
                        }
                    }
                }

                //check anti diagonal 
                for(int i = 0; i<n; i++){
                    if(DDState[i][(n-1)-i] != turnOpp)
                        break;
                    if(i == n-1){
                        //DDState[RowCol.get(x)[0]][RowCol.get(x)[1]]='-';
                        return turnOpp;
                    }
                }

                //check for draw
                if((countX+countO)==(n*n))
                    return 'D';
            }
    return 'N';
}
public void boardShow(){
    if(n==3){
        System.out.println(stateID);
        for(int i=0; i<=6;i+=3)
            System.out.println("["+state[i]+"]"+" ["+state[i+1]+"]"+" ["+state[i+2]+"]");
        System.out.println("***********");
    }
    else {
        System.out.println(stateID);
        for(int i=0; i<=12;i+=4)
            System.out.println("["+state[i]+"]"+" ["+state[i+1]+"]"+" ["+state[i+2]+"]"+" ["+state[i+3]+"]");
        System.out.println("***********");
    }   
}
public int score(){
    if(isGameOver2()=='X')
        return -10;
    else if(isGameOver2()=='O')
        return +10;
    else 
        return 0;
}