minmax-wt-alpha-beta-pruning
A JavaScript library implementing
a generic minmax engine with alpha-beta pruning that can work with any game supplied by a client programmer.
Sections: Installation, Githu repo, Entry Point, Features, Main Concepts, How to use and Implementation details.
Installation
npm install minmax-wt-alpha-beta-pruning
Github repo
If you clone the github repo for experimentation, run make first. Look at the top-level Makefile
which installs the dependencies, builds and runs Flow (for static type checking) and runs the tests (Mocha).
For quick examples on how to use the library look at the test/ directory and in particular the files:
- minmax-test-silly-letter-game.js and
- minmax-test-number-sequence-game.js
… which implement two trivial games.
Entry Point
The library exports a single function:
import {minmax} from 'minmax-wt-alpha-beta-pruning';
Features
This library implements a general-purpose minmax algorithm (with alpha-beta pruning).
It can work with any two-player game that a client
programmer may define.
The client programmer is expected to plug a number of functions to the library and the engine comes up with the best move.
The functions supplied by the client programmer describe the game rules
and provide some hints of "strategy".
The concepts of the
min-max algorithm do not leak out to client programmers who only concern themselves with the rules of the game.
Well, actually the only concept that does leak is the number of plies to look ahead but this is hardly
intrinsic to the min-max algorithm. In what follows I am using the terms "library" / "engine" interchangeably.
The basic min-max algorithm is rather obvious and straightforward, any coder is capable of independently
inventing it even if they have read nothing about it. The clever part is the performance optimization
afforded by the alpha-beta pruning idea. I am not going to describe that here. Refer to online resources
or some textbook. But note that, at the time of this writing, the algorithm as presented in the eponymous Wikipedia
article is wrong. Once you read about alpha-beta pruning it seems simple at first but it's tricky to get it
right. This being said, despite the bug in the algorithm, the Wikipedia article does a passable job at
explaining the intuition behind the alpha-beta pruning idea. A clearer explanation is given in this Cornell course note.
Main Concepts
The engine imagines every game supplied to it as consisting of the following things:
- some opaque data structure representing the game state
- some opaque data structure representing a possible move
- a function that accepts a single argument: an object representing the state of the game and returns the list of moves that are possible at that game state (according to the rules of the game)
- a function that accepts two arguments: (a) a game state object, and (b) a move object, and returns a new game state (representing the new state of the game once the move is made)
- a function that accepts a single argument: an object representing the state of the game and returns a value that indicates whether the game has finished (and pronounces the winner or declares a tie or some other more nuanced outcome — e.g. in scoring games)
The engine makes no assumptions at all as to the data structure representing the game state. It is assumed
to be an opaque object whose properties the library does not try to access in any way. It is also assumed that this
object encodes which player is to move next (this is, after all, part of the game state so it should
come as no surprise). How the game state object, implicitly or explicitly, encodes which player is to move
next is up to the client programmer.
Similarly, the engine makes no assumptions as to the data structure representing moves in the game.
Again, it is treated as an opaque object that the library simply passes around.
Finally, the engine is totally ignorant of whatever data structure the client programmer uses to represent
the two players or the two "sides" and is not even dealing with such objects. The only objects it deals with
are, as explained:
… and they are both treated opaquely by the library. I.e. the library does not create them or access them: it simply accepts them and passes them around.
To invoke the engine, i.e. to call the minmax function, the client programmer has to supply four functions:
- Three of these functions describe the rules of the game.
- The fourth function provides some strategy or "cleverness" for the engine to use. How simple or nuanced this fourth function will be, is up to the client programmer.
The four functions provided by the client programmer have no concept of the minmax algorithm.
The client programmer does not need to have any understanding
of the minmax algorithm. In particular, the concept of a maximizing player and a minimizing player is kept
internal to the library and is not leaked to the client programmer.
This being said, some familiarity with the algorithm is helpful in order to have some mental model of how the
engine uses these functions, how they work together and also to appreciate the performance implications of
increasing the search depth ("number of plies" — explained later).
The four functions that the client programmer supplies are described in the subsections that follow:
terminalStateEval
This function serves a dual purpose: it pronounces whether the game has ended and, if so, the
outcome / score of the game (e.g. win or lose for a player or tie or some other score).
This function should accept an object representing the game state and return either:
- null if the game has not ended at this state according to the rules of the game.
… or:
- a number signifying the score / outcome of the game if the game has indeed ended on that state. I.e. if we are
on a state in which no more moves are possible. We call such a state, a "terminal state".
So it should look like this:
function terminalStateEval(gameState) {
}
PAY EXTRA ATTENTION TO THE FOLLOWING POINT:
The game engine expects the terminalStateEval function to always report the outcome, from the perspective
of the player who is to move next. Obviously, once the game has reached a terminal state there's going to be no
more moves to make, so no player will move next but it is still clear to which player we are referring when
we say "the player who moves next".
If this wording bothers you, think of it as "the player who would normally move next if this were not
a terminal state" or "the player other than the player who just made their move".
To provide an example, in a game with only win/lose outcomes, suppose a player just made a move
that won the game for them. The terminalStateEval function should then return negative infinity or some other very low
value (see similar discussion in the description of the evaluate function) as the game was lost for the other player
(who would normally get to move next if this were not a terminal state).
Depending on the game, it is quite possible that your
terminalStateEval function may return a non infinity value for a terminal state. This could be the case, e.g. if
the game allows draws or uses some numerical scoring mechanism. Finally, you are not even required to
return infinity on a terminal state that results in
victory or defeat for one of the players. Any sufficiently large (or small) value will do. The only requirements
the engine has are that:
- for non-terminal states you return null
- for terminal states you return a number with the understanding that the greater the number, the better the
situation is assumed to be from the perspective the player who would normally get to make the next move (if this
were not a terminal state).
That's all.
Note that the terminalStateEval is not meant to be some general evaluation function for non-terminal states (that's the
evaluate function which is described later). For non-terminal states simply return null and be done with it. You
should be able to implement terminalStateEval such that it is very fast. In most games that I can think
of (except Go), it is very easy to determine if the game has ended and who's the winner or what's the score.
Clever heuristics
and AI stuff belong in the evaluate function, not in terminalStateEval.
As a last note, if you already know a bit or two about the min-max algorithm, you may have come across
the concept of "leaf nodes".
Leaf nodes are game states in the game tree in which either the game has ended or analysis cannot proceed
any further due to the fact that the maximum analysis depth has been reached. You may then be wondering
how terminalStateEval relates to the concept of leaf nodes in the minmax algorithm. The answer is that
it is unrelated. Do not be confused by the minmax implementation-specific concept of "leaf nodes". The
terminalStateEval is simply required to report whether the game has ended under the rules
of the game, not whether this game state is a leaf node in the game tree created by the minmax algorithm (to which
it has no visibility to begin with). Obviously a terminal game state will always be a leaf in the game tree
(the converse is not true), but that doesn't need to concern you.
listMoves
This function accepts an object representing a game state and returns a non-empty list of possible (valid)
moves according to the rules of the game. It's as simple as that.
So it should look like this:
function listMoves(gameState) {
const returnValue = []
return returnValue;
}
The game engine will never call listMoves on a terminal game state. As such, listMoves should always return a
non-empty list of valid moves (the engine actually makes that assertion internally). If there are
no valid moves to make, then we are by definition on a terminal game state and we have the guarantee that listMoves
is never called on such a state.
nextState
This function should accept two arguments:
- an object representing the state of the game
- an object representing a (valid) move
… and should return the new state of the game after the move is made. That's all. The previous game state object should not be mutated.
So it should look like this:
function nextState(gameState, moveToMake) {
}
evaluate
This is the most challenging function that the client programmer has to provide.
This function accepts an object representing the game state and returns a number that expresses how good the
position is from the perspective of the moving player, i.e. from the perspective of the player who moves next
(not from the perspective of the player who just finished moving). The greater the returned value, the better
the position is understood to be from the perspective of the moving player. Recall that the exact same
contract applies to the terminalStateEval function. The engine will never call evaluate on a terminal game
state, i.e. a state on which terminalStateEval has returned a non-null value; it will simply use the value
returned from terminalStateEval as the evaluation of such a state.
So your evaluate function should look like this:
function evaluate(gameState) {
}
The engine is using terminalStateEval, not evaluate, to realize if the game has ended. Therefore the
library will not automatically assume that a call to evaluate that returns positive of negative infinity necessarily translates to a terminal
state.
This means that you are free to use positive or negative infinities as valid return values of evaluate
to simply denote hugely favorable or hugely unfavorable
non-terminal states during the course of the game. In other words, as long as terminalStateEval returns null, then
the state is understood to be non-terminal, regardless of the return value of evaluate.
Also, you are not required to use negative values for game states that are unfavorable
to the moving player. Simply returning a low value in such a case is enough. In short, you have total freedom in deciding
what's the numerical range of your evaluate function. Much like you have total liberty in deciding what's the range
of your terminalStateEval function. This being said, I think a reasonable approach would be that
evaluate always returns values in an interval that lies strictly within the interval defined by the lowest possible
and the highest possible value that terminalStateEval may return.
By having the contract that evaluate and terminalStateEval always evaluate from
the perspective of the moving player we dispense with the minmax algorithm-specific notions of "maximizing"
and "minimizing" players. Naturally, the internal implementation of the algorithm uses these concepts,
but the client programmer is not exposed to them. Also note that since all that's passed to the evaluate
or the terminalStateEval functions is a gameState object, you should structure game state objects in such a way
that they encode the information of which player moves next.
The reason the engine accepts two functions that deal with evaluation is that these two functions serve different purposes
and should have different execution profiles:
- terminalStateEval is used to establish if a game state is terminal or not, and (if it's terminal) the score of the game
- evaluate is used to evaluate a non-terminal game state.
There's nothing subjective or AI-ish about terminalStateEval, hence it should be very fast and be considered part of the rules
of the game.
In contrast, evaluate is where heuristics, subjective evaluation of positions and AI-ish stuff come into play. You would want to make
evaluate reasonably smart but not too heavy as it is better to have a more lightweight evaluate function and be able to descend into
a higher ply depth (ply is described in the next section), than to have an exhaustive but slow evaluate that will lead you to reduce the ply depth (so as to keep the total minmax
running time reasonable).
Before we move onto the next section that shows how to actually use the library, observe again that of the four functions
supplied by the client programmer, three correspond to the rules of the game (terminalStateEval, listMoves
and nextState) and one (evaluate) is more or less subjective. I am pointing out this distinction yet again because
it is reflected in the API which is described next.
How to use
The library exports a single function minmax which the client programmer invokes by supplying four mandatory
arguments:
- the present state of the game
- a game rules object that bundles together the terminalStateEval, listMoves and nextState functions
discussed in the Main Concepts section
- the evaluate function (also discussed in the Main Concepts section)
- the number of plies to look ahead
The game rules object collects the three functions that fully describe the rules of the game from the
perspective of the library. For any particular game, e.g. for the game of Chess, you only have to write the
three game rules functions once, but you may have multiple evaluate functions with different degrees of
sophistication or running time requirements. In other words, function evaluate supplies the "strategy"
or the "cleverness".
After the mandatory arguments, come three optional arguments which you're unlikely to need and which
we'll describe later.
For the time being, you can consider the minmax function to have the following signature:
function minmax(gameState, gameRules, evalute, numOfPlies) {..}
You import function minmax like this:
import {minmax} from 'minmax-wt-alpha-beta-pruning';
Function minmax returns an object with two properties 'bestMove' and 'evaluation':
-
bestMove: this is (drum roll) the best move to make (always from the perspective of the moving player). This can be
null in the following two cases:
- if the game state is a terminal state (there are no more moves to be made), or
- if the given number of plies is 0.
In the latter case, i.e. when called with 0 plies, the minmax function simply evaluates the game state without trying to find any moves.
-
evaluation: this is the evaluation of the game state from the perspective of the moving player (the greater, the better).
It is always non-null.
For most use cases you don't care about the evaluation returned and are just interested in the best move to make.
Note that the library treats the game state and move objects opaquely: it does not access them, nor does
it directly construct them. The library obtains move objects by invoking the
listMoves function supplied by the client programmer. Similarly, it obtains additional "game state" objects by invoking
the nextState function (also supplied by the client programmer).
mandatory arguments to the minmax function
The four arguments needed in every invocation of the minmax function are described below:
gameState
An opaque object which the library doesn't access but simply passes to the functions supplied
in the next arguments.
gameRules
An object that bundles together the three functions listMoves, nextState, terminalStateEval
described in the Main Concepts section:
{
listMoves : function (gameState) {}
nextState : function (gameState, move) {}
terminalStateEval: function (gameState) {}
}
These three functions fully define the rules of the game from the perspective of the engine.
evaluate
This is simply the evaluate function described in the Main Concepts section. This function supplies the "strategy" of the game.
numOfPlies
This is the number of plies that the algorithm is instructed to look ahead. I.e. it's the depth of the
game tree constructed by the algorithm. A ply is a move by a single player.
The term 'ply' is used for clarity, since different games define 'turn' and 'move' quite differently. E.g. a move in
Chess consists of each player taking a turn, i.e. a move in Chess consists of two plies. The numOfPlies argument
must be an integer greater than or equal to 0. Calling the minmax function with a ply argument of 0
essentially results in the engine simply evaluating the game state without trying to find the best move.
Depending on the branching factor of the game, the worst case
size of the game tree that the algorithm will need to evaluate is the average branching number raised to the
number of plies. The alpha-beta pruning optimization will serve to reduce this worst-case size.
optional arguments to the minmax function
In addition to the four mandatory arguments that the minmax function takes:
function minmax(gameState, gameRules, evalute, numOfPlies) {..}
… there's three more optional arguments that should normally not concern a client programmer.
These are:
I am not going to explain what alpha and beta are. Refer to some textbook. There is no reason why you
might consider passing arguments other than the default for these two parameters (negative and positive
infinity respectively), except for experimentation or to make the algorithm dumber. But you can always
make the algorithm dumber by decreasing the number of plies.
The statisticsHook is only provided for, you guessed it, statistics and I've only used it during active
development and testing. It allowed me to accumulate statistics such as number of nodes visited, number of
pruning incidents, etc. It is used in the testing code (directory /test). It served to provide me with
some visibility and assurance that the algorithm operated in the way I expected it to operate. I don't
think it's of any utility to the client programmers.
Implementation details
There is a common further optimization in alpha-beta pruning which is to arrange the children nodes in the
game tree in such a way so as to maximize the possibility of pruning.
For the client programmer to activate
this kind of optimization they only thing they have to do is to return the list of valid moves in the listMoves
function in a way that places stronger moves, as much as possible, towards the beginning of the list of valid moves.
Note that
returning the list of moves
from listMoves in the suggested order (stronger moves first) is not necessary for alpha-beta pruning to occur.
Alpha-beta pruning
will very likely happen even if you return the moves in no particular order. It is simply that by returning the moves
in this manner, you maximize the number of pruning incidents that are likely to occur (the more nodes / sub-trees are pruned, the better the
space requirements and the running time of the algorithm).
An obvious caveat is that there is a trade-off involved in this, and so the client programmer wouldn't
want to spend too much time evaluating and sorting possible moves inside listMoves
(or perhaps no time at all).
For client programmers who wish to use Flow for static type checking
NOTE: DON'T READ THIS SECTION IF YOU ARE NOT USING Flow.
The library is statically typed using Flow.
Of course, you, as a client programmer, don't have to use Flow.
However, if it so happens that you are familiar with Flow, then you may wish to have a look
at the following two files (in addition to reading this documentation):
- minmax-interface.js: type definitions
- index.js: the single exported function (minmax) and some exported Flow definitions
The library type-checks under Flow with no errors.
It is not a requirement for the client programmer to use Flow but if they do, then they are able to
import a number of exported types or look at the type definitions in file minmax-interface.js.
These type definitions communicate clearly
the API of the library and can be consulted alongside this documentation.
E.g. the type of the minmax function is defined in minmax-interface.js thus:
export type MinMaxFT<GameStateGTP, MoveGTP> =
(gameState : GameStateGTP
, gameRules : IGameRules<GameStateGTP, MoveGTP>
, evaluate: EvaluateFT<GameStateGTP>
, plies: number
, alpha?: number
, beta?: number
, statisticsHook?: TMinMaxStatistics<GameStateGTP>
) => TMinMaxResult<MoveGTP>;
You can see from the above type definition that function minmax uses generic type parameters (as
both GameStateGTP and MoveGTP are generic type parameters that parametrize the MinMaxFT
type — incidentally, "GTP" stands for "Generic Type Parameter") and is agnostic as to
the actual structures the client programmer uses to represent game state or game moves.
All Flow types defined in minmax-interface.js that are likely to be found useful by a
client programmers using Flow are imported and re-exported from the index.js file which
defines the public interface.