0:14

In this video I'm going to talk about Minimax.

A Minimax strategy is a game playing strategy

in which you try to minimize the maximum loss.

All right?

Now at first this might seem counter-intuitive.

Right?

When you're playing a game you want to do the best that you can.

But the problem here is your opponent.

Right?

Your opponent's not going to just let you do whatever you want.

So the basic idea here is that you look at if I were to

make this move, what are all the possible counters that my opponent can make.

All right?

When you look at all those counter moves, all right, what you want to do.

Minimize your maximum loss, right?

Because I'm going to do the thing that makes you lose

the most, and you want to pick the move that

minimizes that maximum loss, and that ends up being your

best move and your best strategy for playing the game.

So let's take a look at how this works.

So the first thing I want to do is represent my game as a tree.

Okay?

And so what I'm doing here is I'm saying

that this is the initial state, or the current state.

Okay, it doesn't really matter.

All right.

And that could be a board position, so

if I'm playing tic-tac-toe, this is a board position.

All right?

And, then, this is the first move.

Okay, so let's say that player A is the first to go.

So this is A's move, all right?

And this is the next move.

So these are B's moves.

Okay, so from the current board position what I'm saying

here is A could go to here, here, or here.

So A has three possible moves, and after A has made those moves,

B has two possible counter moves to each move that A could have made.

1:45

Now, let's imagine that this is just a two game, and we're done.

At that point, I want to score these bottom boards here.

And let's just give them numbers; let's say that

this is one, zero, minus one, zero, zero, one.

Okay, and let's look at this print from the perspective of a, all right?

Player A is the first to move so a

positive one means, you know, something good for A.

A negative one means something bad for A and

good for B and a zero perhaps means a draw.

Okay, so, perhaps this means a one is a wins.

So this is A wins,

2:24

this is a draw, this is A loses.

Okay?

All right, so what's actually happening here now

is that this first level, all right, okay.

It's a maximum level.

I want to choose the maximum score of my three possibilities.

This is a minimum level.

Okay?

So what's going to happen here is I picked the minimum of what's

going to happen, so I've got zero here, minus one here, and zero here.

2:55

Okay?

And now at this top level it's a maximum level so I

pick the maximum of zero minus one, zero and here is a zero.

And so what that basically means is, A tried to

minimize his maximum loss and ended up with a zero.

Okay, [LAUGH], all right, fine.

So what does that mean?

He draw, he ended up with a draw.

And let's intuitively figure out what that means.

Okay, so, so if A were to move down here.

Okay, there's two thing that could happen.

He could win or he could draw.

So, B is not dumb.

B is going to make, pick the, make the move this way to draw.

Okay, so the best I can do if I go over here is a draw.

All right?

If I go here, all right if this is the move I make.

All right I've got two possibilities all right.

And B has a winning move now.

So B is going to pick this move and so B wins the game all right.

So if I go here B is going to win.

So this is a bad move.

All right if I go over here all right.

B has two choices, B can force a draw.

All right, or B can lose at the game.

So B's obviously going to force a draw.

All right so there's no move that A can make

in this game, that allows A to win the game.

The best that A can do, is force a draw.

Okay, and if B actually screws up, I guess A could win the game.

All right?

And so this is the basic idea, you

alternate back and forth taking the max and

the min, and when you get down to

the bottom you have to actually score the board.

This is easy if you can build the tree all the way

out to the end of the game, then the score becomes obvious.

Yes, right.

You assign some value to winning, some value to losing and some value to a draw.

All right, if you can't go all the way

then you have to have some juristic function that

just sor, sort of gives you some notion of

how good it is to be at a particular position.

So we're trying to do this for chess for instance.

You're not going to be able to go all the way to the end of the game.

You're going to have to evaluate particular chess board positions and

say this is a good position, this is a bad position.

And then you can run minimax on that.

4:42

Let's make this more concrete by looking at tic-tac-toe, okay.

I could draw the tree for tic-tac-toe starting at the empty

board but that's going to get way too big for the slide.

So instead, we're going to start the tree from here.

Okay, this is the current state of the game.

Okay, and I can run minimax from any state okay.

So if this is where we're at, then I

can run minimax here to decide what to do next.

Okay and if you look at this board it's now X's term, so these are all the

possible moves that X could make, these are

all the moves that O could make in response.

And these are the final moves that X could make.

All right.

And we're to try, we're trying to decide what X should do here.

So we have to figure out how do we actually score these boards, all right.

And I'm going to make the arbitrary decision to decide that

if X wins, we're going to score the board plus one.

If it's a draw.

5:33

We're going to square the board zero.

And, if O wins, we're going to score the board minus one.

All right, and what this means basically is that

every time it's X's term, a we're going to

try and maximize the score, and every time it's

O's term we are going to minimize the score.

All right, and this is going to allow us to

choose the best moves for X the best moves for O.

6:20

Now let's move over to the far right and think about

what's happening at the right hand side of this tree, okay?

So if we're at this board position, okay?

It's now X's turn to move, right?

X only has one choice but this is a maximizing

level, so we try to maximize what we can get.

Our only choice gives us a score of zero, so the max of zero is zero.

All right, that does actually look like a zero, all right.

Over here on this board right next to it, again we only have

one move, we're trying to maximize, we end up with a zero, okay.

Now, let's consider the parent node here.

This node right here, okay, it's O's turn.

So, this is a minimizing node.

O have two choices, O can either move here or here.

All right?

Unfortunately both of these moves you know, have a

score of zero, so minimizing them, I get zero here.

O has no, you know, winning move or no better move than either, you know, each

move is equally good or equally bad depending

on the way you want to look at it.

Okay, so let's move over to the middle know.

All right?

So, if I look at the bottom, it has a score of zero.

If I look over here.

All right, O actually won the game.

All right, so this has a score of minus one.

All right, let's look at this node here, okay.

It is you know, X's turned so, this is going to be you know, maximizing level.

So, I look down there, and the score is zero, so I maximize that by taking zero.

Okay.

Now, let's step up here.

7:53

Okay, if we're here, it's O's turn.

All right?

So it's a minimizing level.

So I look at the two scores beneath me,

minus one and zero, the answer here is minus one.

This is a board position by which O can win the game.

All right, so this has a score minus one.

All right, let's move back over to the right here.

8:10

Okay.

This game is over.

Okay so there is three in a row for O,

O wins the game, this has a score of minus one.

Over here it is X's turn, X has one move,

it's maximizing at maximum of zero is zero, okay fine, zero.

Okay, we look over here up at the parent.

Okay, it is X's turn, so we are going, or I

mean, I'm sorry, it is O's turn, so we're going to

minimize, all right, we minimize, we look at minus one and

zero, okay, we give this board a score of minus one.

So if X chooses to make this move, O has a winning move.

If X chooses to make the center move, all right.

So, I guess this is the move here.

If X choose to make this move O can win the game.

If instead X chooses to make this move, O can win the game, or finally

if X blocks that winning spot here, okay, the game is going to end up in a draw.

All right.

So, we go back up to the top and say, what is the the score here.

Well, all right.

We want to maximize because it's X's turn, so the

maximum of minus one minus one and zero is zero.

Which indicates that X should move here, right

doesn't matter what O does after that, okay?

All right and this is the basic idea behind Minimax.

It not only tells you what the score is, it actually tells you what move to make.

Okay, I can trace back through here.

Right.

If I go this way, if X does make a bad move and put us in the left.

Then O should move this way, right?

Because that's the minimum node.

Similarly O should move this way because that's the minimum node.

All right, so you should move along if you're X,

along the maximum path if you're O among the minimum path.

Okay?

9:47

So key idea of Minimax here is to think about the game as a tree, right?

Where the root of the tree is the current state of the game.

And then each successive level of the tree is a move by you

or a set of moves that are possibly by either you or your opponent.

Right, and then you try to evaluate that tree in such

a way that you figure out what you're best move is.

Assuming that you're going to do the best thing at you can at each level

and your opponent is going to do the best thing that they can at each level.

So how do I define best?

Well, the idea here is that best is defined as minimizing your maximum loss.

All right, so you want to force your opponent

into a situation where they can do the

least damage so to speak, because you know they're trying to, you know, win as well.

So they're trying to give you the maximum loss that they can.

And so by restricting them into minimizing the maximum that

they can do to you, you've made your best move.

And they are going to be doing exactly the same thing at each level.

All right, and so you alternate back

and forth between minimizing and maximizing, all right.

And you end up picking the best move that you can.