This is the IPython notebook you should use as a template for your agent. Your task for this assignment is to implement a winning AI for the game of Isolation, as specified in the assignment PDF you have been issued.
The following random agent simply selects a move from the set of legal moves. Note that when your agent is asked for a move, it is already provided with the set of moves available to it; this is done for your convenience. If your agent attempts to perform an illegal move, it will lose, so please refrain from doing so. Your agent is also provided with a function that, when invoked, returns the amount of time left for it to make its move. If your agent fails to make a move in the allotted time, it will lose.
In [98]:
from random import randint
class RandomPlayer():
"""Player that chooses a move randomly."""
def move(self, game, legal_moves, time_left):
if not legal_moves: return (-1,-1)
return legal_moves[randint(0,len(legal_moves)-1)]
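The random agent above never consults time_left. As a rough illustration of how an agent might budget its time, consider the sketch below (SkeletonPlayer and its 50 ms safety margin are illustrative choices, not part of the assignment skeleton):

# Minimal time-aware agent skeleton (illustrative sketch only).
# It shows the move(game, legal_moves, time_left) contract: keep a legal
# fallback move ready, and stop working when the clock runs low.
class SkeletonPlayer():
    SAFETY_MARGIN_MS = 50  # arbitrary buffer before the deadline; tune as needed

    def move(self, game, legal_moves, time_left):
        if not legal_moves:
            return (-1, -1)  # no legal moves left: return the forfeit sentinel
        best_move = legal_moves[0]  # always have a legal fallback ready
        for candidate in legal_moves:
            if time_left() < self.SAFETY_MARGIN_MS:
                break  # running low on time: play the best move found so far
            # ... evaluate `candidate` here and update best_move ...
        return best_move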
The following are functions that might be useful to you in developing your agent (a short usage sketch follows this list):
game.get_legal_moves(): Returns a list of legal moves for the active player.
game.get_opponent_moves(): Returns a list of legal moves for the inactive player.
game.forecast_move(move): This returns a new board, whose state is the result of making the move specified on the current board.
game.get_state(): This returns a 2D array containing a copy of the explicit state of the board.
game.is_winner(player): Returns whether your player agent has won.
game.is_opponent_winner(player): Returns whether your player's opponent has won.
game.print_board(): Returns a string representation of the game board. This should be useful for debugging.
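For instance, these helpers compose as in the sketch below, which assumes a Board instance named game already exists; the snippet is illustrative only:

# Illustrative sketch of the board helpers above (assumes an existing Board
# instance named `game`; everything except the documented methods is made up).
my_moves = game.get_legal_moves()        # moves open to the active player
opp_moves = game.get_opponent_moves()    # moves open to the inactive player
if my_moves:
    next_board = game.forecast_move(my_moves[0])  # successor state after one move
    print(next_board.print_board())               # string form, handy for debugging
state = game.get_state()                 # a 2D-array copy of the board's state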
In [99]:
class HumanPlayer():
"""Player that chooses a move according to
user's input."""
def move(self, game, legal_moves, time_left):
print('\t'.join(['[%d] %s'%(i,str(move)) for i,move in enumerate(legal_moves)] ))
valid_choice = False
while not valid_choice:
try:
index = int(raw_input('Select move index:'))
valid_choice = 0 <= index < len(legal_moves)
if not valid_choice:
print('Illegal move! Try again.')
except ValueError:
print('Invalid index! Try again.')
return legal_moves[index]
This is the first part of the assignment you are expected to implement. It is the evaluation function we've been using in class. The score of a specified game state is just the number of moves open to the active player.
In [100]:
class OpenMoveEvalFn():
    """Evaluation function: the number of moves open to the player.

    maximizing_player defaults to True so the sanity test at the bottom of
    this notebook can call score() with just a board."""
    def score(self, game, maximizing_player=True):
        # From the maximizing player's perspective our player is active, so
        # count its legal moves; otherwise count the inactive player's moves.
        if maximizing_player:
            return len(game.get_legal_moves())
        return len(game.get_opponent_moves())
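Before wiring an evaluation function into a full search, you can sanity-check it with a greedy one-ply player. The sketch below is illustrative only (OneStepPlayer is not part of the assignment skeleton); it scores each forecast board from the minimizing perspective, since the opponent becomes the active player after our move:

# Greedy one-ply player: picks the move whose successor board scores best.
# Illustrative sketch; OneStepPlayer is not part of the assignment skeleton.
class OneStepPlayer():
    def __init__(self, eval_fn=OpenMoveEvalFn()):
        self.eval_fn = eval_fn

    def move(self, game, legal_moves, time_left):
        if not legal_moves:
            return (-1, -1)
        # After we move, the opponent is active on the forecast board, so
        # score it with maximizing_player=False to count our own mobility.
        return max(legal_moves,
                   key=lambda m: self.eval_fn.score(game.forecast_move(m), False))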
The following is a custom evaluation function that acts however you think it should. This is not required, but it is highly encouraged if you want to build the best AI possible.
In [101]:
class CustomEvalFn():
    def score(self, game, maximizing_player=True):
        # Maximize your own mobility while minimizing the opponent's.
        if maximizing_player:
            my_moves = len(game.get_legal_moves())
            opponents_moves = len(game.get_opponent_moves())
        else:
            opponents_moves = len(game.get_legal_moves())
            my_moves = len(game.get_opponent_moves())
        # The difference is largest when we are mobile and the opponent is not.
        return my_moves - opponents_moves
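The difference my_moves - opponents_moves rewards states that keep your mobility high while starving the opponent. A common refinement is to weight the opponent's mobility more heavily so the agent plays aggressively; the weight of 2 in the sketch below is an arbitrary illustration, not something prescribed by the assignment:

# Variant of the custom eval: penalize opponent mobility more heavily.
# The weight of 2 is an arbitrary illustration, not an assignment requirement.
class AggressiveEvalFn():
    def score(self, game, maximizing_player=True):
        if maximizing_player:
            my_moves = len(game.get_legal_moves())
            opp_moves = len(game.get_opponent_moves())
        else:
            opp_moves = len(game.get_legal_moves())
            my_moves = len(game.get_opponent_moves())
        return my_moves - 2 * opp_moves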
Implement a Player below that chooses a move using your evaluation function and a depth-limited minimax algorithm with alpha-beta pruning. You must finish and test this player to make sure it properly uses minimax and alpha-beta to return a good move in less than 500 milliseconds.
In [102]:
class CustomPlayer():
    def __init__(self, search_depth=15, eval_fn=CustomEvalFn(), threshold=50):
        self.eval_fn = eval_fn
        self.search_depth = search_depth
        self.thresh = threshold  # time (ms) below which we stop searching

    def move(self, game, legal_moves, time_left):
        if not legal_moves:
            return (-1, -1)
        # Opening moves carry little information, so just play randomly.
        if game.move_count < 2:
            return legal_moves[randint(0, len(legal_moves) - 1)]
        # Otherwise run iterative-deepening alpha-beta search.
        best_move = self.alphabeta_id(game, time_left, self.search_depth)
        return best_move
    def utility(self, game, maximizing_player):
        # Large terminal values ensure a won/lost game dominates any heuristic.
        if game.is_winner(self):
            return 50000
        if game.is_opponent_winner(self):
            return -50000
        return self.eval_fn.score(game, maximizing_player)
    def minimax(self, game, time_left, depth=float("inf"), maximizing_player=True):
        # Terminal states: the game is already decided at this node
        # (at a max node the opponent has won; at a min node we have won).
        if (maximizing_player and game.is_opponent_winner(self)) or \
           (not maximizing_player and game.is_winner(self)):
            return ((-1, -1), self.utility(game, maximizing_player))
        # Cut off when the depth limit is reached or time is nearly up
        # (self.thresh is an arbitrary safety margin in milliseconds).
        if depth == 0 or time_left() < self.thresh:
            return ((-1, -1), self.utility(game, maximizing_player))
        actions = game.get_legal_moves()
        best_move = (-1, -1)
        if maximizing_player:
            # Choose the move whose minimax value is largest.
            best_val = float("-inf")
            for a in actions:
                _, score_of_action = self.minimax(game.forecast_move(a), time_left, depth - 1, False)
                if score_of_action >= best_val:
                    best_move, best_val = a, score_of_action
                if time_left() < self.thresh:
                    return (best_move, best_val)
        else:
            # Choose the move whose minimax value is smallest.
            best_val = float("inf")
            for a in actions:
                _, score_of_action = self.minimax(game.forecast_move(a), time_left, depth - 1, True)
                if score_of_action <= best_val:
                    best_move, best_val = a, score_of_action
                if time_left() < self.thresh:
                    return (best_move, best_val)
        return (best_move, best_val)
    def alphabeta(self, game, time_left, depth=float("inf"), alpha=float("-inf"), beta=float("inf"), maximizing_player=True):
        # Terminal states: the game is already decided at this node
        # (at a max node the opponent has won; at a min node we have won).
        if (maximizing_player and game.is_opponent_winner(self)) or \
           (not maximizing_player and game.is_winner(self)):
            return ((-1, -1), self.utility(game, maximizing_player))
        # Cut off when the depth limit is reached or time is nearly up
        # (self.thresh is the same millisecond safety margin used above).
        if depth == 0 or time_left() < self.thresh:
            return ((-1, -1), self.utility(game, maximizing_player))
        actions = game.get_legal_moves()
        best_move = (-1, -1)
        if maximizing_player:
            # Maximize over child values; prune once alpha meets or exceeds beta.
            best_val = float("-inf")
            for a in actions:
                _, score_of_action = self.alphabeta(game.forecast_move(a), time_left, depth - 1, alpha, beta, False)
                if score_of_action > best_val:
                    best_move, best_val = a, score_of_action
                alpha = max(alpha, best_val)
                if alpha >= beta:
                    return (best_move, best_val)  # beta cutoff
                if time_left() < self.thresh:
                    return (best_move, best_val)
        else:
            # Minimize over child values; prune once beta drops to or below alpha.
            best_val = float("inf")
            for a in actions:
                _, score_of_action = self.alphabeta(game.forecast_move(a), time_left, depth - 1, alpha, beta, True)
                if score_of_action < best_val:
                    best_move, best_val = a, score_of_action
                beta = min(beta, best_val)
                if beta <= alpha:
                    return (best_move, best_val)  # alpha cutoff
                if time_left() < self.thresh:
                    return (best_move, best_val)
        return (best_move, best_val)
    def alphabeta_id(self, game, time_left, max_depth):
        # Iterative deepening: search progressively deeper until the depth
        # limit is reached or time runs low, keeping the best move found.
        best_move = (-1, -1)
        for depth in range(1, max_depth + 1):
            best_move, _ = self.alphabeta(game, time_left, depth)
            if time_left() < self.thresh:
                break
        return best_move
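Before running full games, you may want to verify that a single search call respects the 500 ms budget. The harness below is a sketch only: it fakes the time_left() callback that play_isolation normally supplies, and assumes a fresh Board from the tests below is searchable as-is.

# Rough timing harness (illustrative sketch; assumes a fresh Board behaves
# like a mid-game one for search purposes).
import time
from isolation import Board

player, opponent = CustomPlayer(), RandomPlayer()
board = Board(player, opponent)
start = time.time()
time_left = lambda: 500 - (time.time() - start) * 1000  # milliseconds remaining
chosen, value = player.alphabeta(board, time_left, depth=5)
print('alphabeta chose %s (value %s) with %.0f ms to spare.'
      % (str(chosen), str(value), time_left()))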
The following are some basic tests you can use to sanity-check your code. You will also be provided with a test server to which you will be able to submit your agents later this week. Good luck!
In [103]:
"""Example test you can run
to make sure your AI does better
than random."""
from isolation import Board
if __name__ == "__main__":
r = RandomPlayer()
h = CustomPlayer()
game = Board(h,r)
game.play_isolation(500)
In [104]:
"""Example test you can run
to make sure your basic evaluation
function works."""
from isolation import Board
if __name__ == "__main__":
sample_board = Board(RandomPlayer(),RandomPlayer())
# setting up the board as though we've been playing
sample_board.move_count = 3
sample_board.__active_player__ = 0 # player 1 = 0, player 2 = 1
# 1st board = 16 moves
sample_board.__board_state__ = [
[0,2,0,0,0],
[0,0,0,0,0],
[0,0,1,0,0],
[0,0,0,0,0],
[0,0,0,0,0]]
sample_board.__last_player_move__ = [(2,2),(0,1)]
# player 1 should have 16 moves available,
# so board gets a score of 16
h = OpenMoveEvalFn()
print('This board has a score of %s.'%(h.score(sample_board)))