In the cat and mouse game, played on an n×m grid, the cat and mouse start in opposite corners. The cat wins by capturing the mouse, while the mouse wins by evading capture for a set number of moves. The game’s outcome depends on the board size and on who moves first. An 8×7 board has 56 cells, so there are 56 × 56 = 3,136 possible combinations of cat and mouse positions, and each player needs a policy with that many entries: the cat’s policy specifies a move for every combination of positions, and the mouse’s policy does the same. Using hill-climbing optimization, both players can search for optimal strategies. Over 100,000 iterations, both players refine their strategies, with the cat winning 50,000 games and the mouse 50,000. For larger boards, like 20×20 or 30×35, a full table becomes impractical (a 20×20 board already has 400 × 400 = 160,000 combinations), so simpler, more general policies are necessary, reducing the complexity of the policy space.
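To make the table-based policy and the hill-climbing loop concrete, here is a minimal Python sketch. It is not the article's implementation. The move limit (`MAX_MOVES`), the move set (stay or step orthogonally), the cat moving first from the top-left corner, the rule for accepting a mutation, and all function names are assumptions made for illustration.

```python
import random

ROWS, COLS = 8, 7   # board size from the text
MAX_MOVES = 30      # assumed move limit; the text does not give one
# Assumed move set: stay in place or step one cell orthogonally.
MOVES = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

CELLS = [(r, c) for r in range(ROWS) for c in range(COLS)]
# Every (cat position, mouse position) pair: 56 * 56 = 3,136 states.
STATES = [(cat, mouse) for cat in CELLS for mouse in CELLS]


def random_policy(rng):
    """Build a policy table with one move per state (3,136 entries)."""
    return {state: rng.choice(MOVES) for state in STATES}


def step(pos, move):
    """Apply a move, clamping the result to the board edges."""
    r = min(max(pos[0] + move[0], 0), ROWS - 1)
    c = min(max(pos[1] + move[1], 0), COLS - 1)
    return (r, c)


def play(cat_policy, mouse_policy):
    """Return the number of rounds until capture, or MAX_MOVES if the mouse survives."""
    cat, mouse = (0, 0), (ROWS - 1, COLS - 1)  # opposite corners
    for turn in range(MAX_MOVES):
        cat = step(cat, cat_policy[(cat, mouse)])  # assumed: cat moves first
        if cat == mouse:
            return turn + 1
        mouse = step(mouse, mouse_policy[(cat, mouse)])
        if cat == mouse:
            return turn + 1
    return MAX_MOVES


def hill_climb(cat_policy, mouse_policy, iterations, rng):
    """Alternately mutate one table entry per player; revert changes that hurt that player."""
    for i in range(iterations):
        cat_turn = i % 2 == 0
        policy = cat_policy if cat_turn else mouse_policy
        state = rng.choice(STATES)
        old_move = policy[state]
        before = play(cat_policy, mouse_policy)
        policy[state] = rng.choice(MOVES)
        after = play(cat_policy, mouse_policy)
        # The cat wants a faster capture (lower score); the mouse wants a higher one.
        worse = after > before if cat_turn else after < before
        if worse:
            policy[state] = old_move  # revert the mutation
    return cat_policy, mouse_policy
```

Scoring a game by the number of rounds until capture gives hill climbing a graded signal rather than a bare win or loss. Because each policy is a table over all 3,136 states, most single-entry mutations touch states the current game never visits, which is one reason a more compact, general policy becomes attractive on larger boards.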
Source: towardsdatascience.com
