Adversarial Search 1 – Game Trees

a previous video featured an enemy agent with a known deterministic behavior in such situations the standard search methods we learned about previously work fine because the enemy is simply part of the Environment State however an agent does not know how its opponents will behave then a more appropriate method is needed this video deals with adversarial search using the game of tic-tac-toe also known as knots and crosses as an example now in this game players take turns placing their symbol in one of these nine spaces either X or an O the goal is to get three of your symbol in a row either horizontally vertically or diagonally now the players take turns like this you'll see that oh now as a chance of getting three in a row an appropriate move for X would be to place here to block that but if X were to make a mistake and play there then o could play in this position get three in a row and win now how will we search in this state space to make proper decisions we can do it with an adversarial game tree that looks something like this now portions of this tree are missing because the full game tree for tic-tac-toe is too expansive to show in its complete detail but as you can see we start from an empty board state and because we assume that X goes first all of these board states are the different moves that X can make the upper left the upper center upper right and so on all the way to the lower right now from each of these nine states o or the circle can make a follow-up move in eight more States now I've used ellipses to indicate that there are several missing states here but you can see that there are many possibilities from here all the way down to there from this state and then from this state X's position is still the upper center but ou can fill in any of these other spots all the way down at the lower right and then each of these states would also have its own eight children in this tree all the way to the end now I've expanded particular sections of this tree to make some interesting points and to show you how we can search this tree in order to make wise move choices now if we start from here and branch out the next move belongs to X and all of these empty positions are available therefore I've depicted all of them in the spots shown below and then I've expanded two of them now very quickly you can see that if we branch out down this route and I will go ahead and zoom in here so you can see this better all of Oz available moves from this state are now shown now we can already see that victory is near specifically x is only one spot away from getting three in a row here and if o makes any move other than to block that position then X can immediately win so these are all of the moves that o can make do not block that position and although I've not shown it X can win one step beneath all of these states simply by playing in this upper left corner so all of these states are victories for X and we'll see later how this information is used as we work our way back up the tree but let's go deeper for now so if a goes there than it is X's turn again and here all the available moves for x and one step down it is o's turn again and once again i've only expanded select portions of the tree which will the reason for that will become clear as we work our way back up in a minute and one more level down here I've expanded two branches as we near the end partly because as we go deeper and deeper in this particular game the number of options decreases that's not true of all games but it is true in this scenario and then from this one position going down it is Oh's move again and then finally X now in practice you would need to expand the game tree entirely and reach all of the in States to then use the method I'm going to show you now so in this state X wins that's three in a row now it was X's turn to make a move here X didn't really have any option as to what move it would make but it certainly likes this outcome so that is a good state for X it's a state where X wins if I go back up one level to position where o is making the moves if I am player oh I would not like this state X guaranteed wins in this state so I would prefer not to make that move in fact o wants to make this move where it gets three in a row and wins so this is a state where a wins now regardless of which player our algorithm is controlling the game truer be made the same way but if say we are controlling player X we would model O's moves assuming that o is an optimal player so X would love to make this move and X knows that if oh we're at this point o would love to make this move therefore we go back up one level to where it is X's turn because X knows Oh will favor this move X will not choose that move because it will lead to Oh winning even though there's a chance of X winning two layers down X assumes that Oh will make the smart move instead and therefore avoids that state X would also not make this move because o could play there and win as in this state but X would make that move to win immediately therefore because X is and control this is a level where X wins similarly at this level of the tree X can play there in win immediately so in both this set of states and this set of states X can win back up one level Oh is making the decision here X was guaranteed to win here so Oh will not go that way X is guaranteed to win here so Oh will not go that way either unfortunately for Oh X can win easily in all of these states because if o goes to either of these states X is still one move away from winning so OH doesn't like any of these states that means that X wins at this level no matter what happens therefore X loves this state doesn't even really need to look at the others but we'll get into those details later therefore X wins at this level of the tree Oh does not like that state because it leads to X's victory and as I demonstrated earlier X is actually only one step away from winning in all of these other states so Oh actually doesn't like any of these states which once again if we go up one level means that X loves this state X is guaranteed to win from there now let's see what happens if instead of going down this path we had gone down this other path here so we're going to go all the way to the bottom and work our way up just like before so if we get all the way down this far this is a draw now that's not great but at least it's not a loss and if you get down this far X really has no option but to make that move anyway if you go up one layer there are two options one where there's a draw and then this one where a wins so since oh is making the decision here Oh would favor this move not that one so this is a state where a wins now similarly at this level Oh would win in that state so Oh would favor that position Oh would not want to go this route which actually ends up in this same state there's a draw state there so Oh wins in both of those states and because o is making the decision there it wins automatically if X makes the mistake of going either of these directions so X doesn't like either of these states but actually oh is only one step away from winning here that would be down one level so X isn't like that state either Oh wins at this level no matter what so let's go up again here o is making the decision since o wins in all of those states o likes that position now in each of these if I go one move down X could win there there or there but since oh is making the decision it won't pick either of those moves so o will win and that means that X going up again does not like this state it leads to a situation where Oh will win however if X doesn't play here then O's next move down will once again be an immediate win so Oh actually wins in all of these states all of these states are bad for X that means going up that Oh loves this state if Oh makes this move in the center here then Oh will win no matter what happens that lower down in the tree assuming optimal play from both players in contrast one move down from each of these states where X makes a move X could win but since oh is making the decision it will avoid all of those states so this is a level where a wins and if we go back up because o wins this level no matter what that means that X does not like this state now it would take some work to fully expand the rest of this it's not fully necessary we know that because X has a guaranteed path to victory here it can win at this level it would just make that move and then everything beneath that guarantees X will win we go up one level from there that means that o does not like this state it would not go in that direction if it can avoid it but to go any further we'd have to fully examine all of these other states which we don't have room for here but we can see that if the board were already in this state so Oh had made the mistake of moving here then of all of these moves X could create this game tree and figure out that it should take this action it should make the move that puts the game in this state if we carry this reasoning all the way up at the top which we could if we had fully expanded the game tree then each of these states would be labeled as a guaranteed win a guaranteed loss or a guaranteed tie or uncertain and this early in the game pretty much everything is still uncertain if there are uncertainties X would simply favor the path that led to the most opportunities for victory and that is how an agent can make decisions in an adversarial world using a game tree

As found on YouTube