Hi, I'm Tommy Thompson, this is AI and Games and welcome to part 3 of the AI of Total War. As the core systems of Total War have been established and redefined in the franchise - a point I have discussed in the first two parts of this series - there is always a need to strive for better. RTS games continue to be one of the most demanding domains for AI to operate within and as such we seek new inspiration from outside of game AI practices. With this in mind, I will be taking a look at 2013's Total War: Rome II - one of the most important games in the franchise when it comes to the design and development of AI practices. So let's take a look at what happened behind the scenes and what makes Rome II such a critical and vital step in Total War's future progression.
In part 2 of this series we concluded with an overview of the dramatic changes to the underlying AI systems in Total War with the release of Empire, followed by Napoleon, in 2009 and 2010 respectively. What was once a simpler and more manageable state-driven and reactive AI system had made way for an adoption of the Goal Oriented Action Planning system, a technique popularised by First Encounter Assault Recon. The GOAP implementation within Total War was ambitious but struggled on launch with Empire, requiring patching and updating both post-launch as well as in the following year's Napoleon. The same AI tech was adopted in 2011's Total War: Shogun 2, where it proved to be a less challenging experience for the systems involved. Shogun 2 returned to Japan, which provided a much more balanced mix of ranged combat and melee, with less emphasis on gun-driven combat. Even the campaign AI didn't struggle with the same problems as Empire and Napoleon, with a smaller and less chaotic structure. But while it seems Creative Assembly was becoming content with the combat systems, the campaign AI still needed more work.
This resulted in some significant changes under the hood during the Fall of the Samurai DLC for Shogun 2, which among other things includes the naval warfare of Empire and Napoleon. One of the new problems this created for players was that the army and naval logic were until that point separate, meaning the AI needed to be rewritten to consider how naval strategy could influence ground troops, such as being bombarded on the coastline. At that point, the campaign AI's planning approach couldn't foresee these issues well enough and was often stuck being reactive in its planning process rather than deliberative and forging ahead on its own ambitions. To resolve this, a new campaign AI system was prototyped in Shogun 2, which was later expanded to create some rather seismic changes in Total War: Rome II.
2013's Total War: Rome II was a return to one of the most well-known entries in the franchise, but with it came a major change for the campaign AI under the hood. The drive for a more deliberative system that could consider the overlap between mechanics resulted in a growing number of sub-systems responsible for individually managing the budgeting of money, conducting diplomacy, selecting tasks for attacking and defending - be they attacking enemy forces or laying siege to settlements - deciding what issues take high priority, figuring out how to navigate an army safely across the map, not to mention managing construction and taxes. All of these require the AI to consider the overall suite of resources it has at its disposal and how best to utilise them. The system is still reliant on the belief, desire and intention system mentioned in part 2, but now the sheer number of combinations here is staggering. Even if the system has decided on a smaller subset of tasks it wants to complete in a given turn, there are still tens of thousands of different possible outcomes for that one turn.
The map for military deployment is quoted to have around 800,000 individual hex points alone. How can the system hope to approach this sort of task at this scale? The answer comes in the form of Monte Carlo Tree Search: an AI algorithm that had recently taken academic research by storm and is making big waves in general intelligence AI research. MCTS allows the system to consider all of the different possibilities, explore the ones that seem the most fruitful, but also continue to consider alternatives. In time, those alternatives might yield some strong outcomes, so this system is able to keep doing things it knows are good for it, but also consider other opportunities along the way.
Now before we get into the meat of how the campaign AI in Rome II is managed through MCTS, I need to take a moment to talk about how the algorithm works. Monte Carlo Tree Search is a type of reinforcement learning algorithm: a branch of machine learning algorithms that look at a problem and find good decisions by considering all possibilities, while largely focussing on the ones they find to be most useful. This is really useful when you have a problem that is incredibly large and has a large number of possibilities, given we might find a good decision to make, but we can't say with any certainty it's the best decision. In order to have a better understanding of whether there are better options to take, we need to consider alternatives periodically and see if they would be more useful. This is known in reinforcement learning as the exploration/exploitation trade-off. We want to exploit the actions and strategies we have found to be the best, but must also continue to explore the local space of alternative decisions and see whether they could replace the current best.
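One common way to strike this balance is the UCB1 rule, which is what the UCT flavour of MCTS uses. The transcript doesn't say which rule Rome II relies on, so treat the following as a minimal sketch under that assumption: the function names, the exploration constant `c` and the example statistics are all made up for illustration.

```python
import math

def ucb1_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1: average reward (exploitation) plus a bonus for rarely tried options (exploration)."""
    if visits == 0:
        return math.inf  # an option that has never been tried is always explored first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def pick_option(stats, c=math.sqrt(2)):
    """stats maps an option name to (total_reward, visits); returns the option to try next."""
    parent_visits = sum(visits for _, visits in stats.values())
    return max(stats, key=lambda name: ucb1_score(*stats[name], parent_visits, c))

# Hypothetical statistics: "attack" looks best so far, "ally" has barely been tried.
stats = {"attack": (60.0, 100), "defend": (45.0, 90), "ally": (1.0, 2)}
next_option = pick_option(stats)           # the exploration bonus favours "ally"
greedy_option = pick_option(stats, c=0.0)  # with no bonus, pure exploitation picks "attack"
```

Setting `c` to zero turns the rule into pure exploitation, while larger values push the search towards options it has tried less often.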
This is a difficult process to resolve, given that sometimes we need to really explore a series of decisions to discover that an action that might look bad now might actually prove to be a really good idea somewhere down the line. This is what MCTS does best: it explores all potential options for a given decision point, isolates the best ones and then dictates which one is the best, considering both its short- and long-term ramifications.
The key component of MCTS is the ability to run a playout: where the AI effectively plays the game from a given starting point, all the way to the end by making random decisions. Now it can't actually play the game to the end, so MCTS uses what's called a forward model: an abstract approximation of the game logic that allows it to consider the outcome of playing action X in state Y, resulting in outcome Z. The algorithm gathers up all the decisions it can make in a given state of the game, then runs thousands of random playouts across them in a structured and intelligent fashion. It gathers data from each of these rollouts and concludes the process by selecting the action that had the best rollout score. It's both incredibly powerful and strangely stupid in its execution.
The smart part comes in how each rollout is decided upon and executed. To do this, it relies on four key steps: selection, expansion, simulation and backpropagation. Selection takes the current state of the game and selects decisions down the tree to a future state a fixed depth down the tree. Next up comes expansion: provided the state we reached didn't end the game (either as a win or a loss), we expand it one step down and simulate the outcome. Simulation is the random playout phase: it plays a game of completely random decisions from this point until it reaches either a terminal state (where it wins or loses) or a simulation cap is reached. It then gives back a result of how well it performed as a score. This is passed to the backpropagation phase.
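The four steps just named can be sketched in a few dozen lines. This is not the Rome II implementation: a toy stone-taking game stands in for the forward model, and the class and function names are hypothetical. UCB1 selection and picking the most visited move at the end are common choices assumed here, not details from the transcript. Selection in this sketch also descends until it finds a node with untried moves rather than stopping at a fixed depth.

```python
import math
import random

# Toy forward model (an assumption for illustration, not Total War):
# players alternate taking 1-3 stones; whoever takes the last stone wins.
def legal_moves(stones):
    return [n for n in (1, 2, 3) if n <= stones]

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones                 # stones left after `move` was played
        self.parent = parent
        self.move = move
        self.children = []
        self.untried = legal_moves(stones)   # moves not yet expanded from this state
        self.visits = 0
        self.wins = 0.0                      # wins for the player who played `move`

def select(node, c=1.4):
    # Selection: walk down through fully expanded nodes, scoring children with UCB1.
    while not node.untried and node.children:
        node = max(node.children, key=lambda ch: ch.wins / ch.visits
                   + c * math.sqrt(math.log(node.visits) / ch.visits))
    return node

def expand(node):
    # Expansion: add one child for an untried move; terminal nodes are returned as-is.
    if node.untried:
        move = node.untried.pop()
        child = Node(node.stones - move, parent=node, move=move)
        node.children.append(child)
        return child
    return node

def simulate(stones, rng):
    # Simulation: random playout to the end of the game. Returns 1.0 if the
    # player who moved into this state ends up taking the last stone, else 0.0.
    just_moved_wins = True
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        just_moved_wins = not just_moved_wins
    return 1.0 if just_moved_wins else 0.0

def backpropagate(node, result):
    # Backpropagation: update every node on the path back to the root,
    # flipping the result at each level because the players alternate.
    while node is not None:
        node.visits += 1
        node.wins += result
        result = 1.0 - result
        node = node.parent

def mcts(stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(iterations):
        leaf = expand(select(root))
        backpropagate(leaf, simulate(leaf.stones, rng))
    # Final choice: the most visited move from the root.
    return max(root.children, key=lambda ch: ch.visits).move

best_move = mcts(3)  # with 3 stones left, taking all 3 wins on the spot
```

The `iterations` argument plays the role of the playout limit: a larger budget refines the statistics, but the loop can be cut short at any point and still return its current best guess.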
In backpropagation, we update the perceived value of a given state, not just for the state we ran the rollout from, but for every state that led to it. So any score - be it positive or negative - works its way back up the tree to the starting point. Through those four phases, we can take decisions to a fixed point in the tree, simulate their outcome and then propagate back the perceived value of it. Now doing this once isn't enough: you have to do it thousands of times and balance which playouts to make. Different MCTS algorithms balance it out so they shift focus to different parts of the tree periodically to ensure there are no better solutions to be found that they didn't otherwise spot. But once the playout limit is reached, it's done and takes the action leading to the best scoring state.
What makes this system even more powerful is that it's what we call an anytime algorithm: meaning that it will always give an answer regardless of how many playouts we let it take. So in a context like a game, where CPU and memory resources are pretty tight, if it needs to stop evaluating the game at a moment's notice, it will still give the best answer it could within that time. Despite this, giving it a massive amount of CPU resource won't result in godlike AI, given the knowledge accrued from repeatedly running playouts eventually levels out.
Alright, with all the science out of the way, how does this all work in Rome II? First I need to explain how the Rome II campaign AI manages itself. It's broken down into three chunks: pre-movement, task allocation and post-movement.
- Pre-movement identifies threats and areas of opportunity for the player. It also budgets resources, conducts diplomacy and selects skills for armies.
- Task allocation is conducted by a highly complex Task Management System - which is the focus of the MCTS. The task system handles armies, navies, agents and actions related to diplomacy.
- Lastly there is post-movement: once all units and such are moved and decisions made, the AI will then focus on construction of buildings, setting taxes and technology research.
MCTS is responsible for managing two critical components of the task allocation systems: the distribution of resources such that the AI can approach different tasks it wants to complete, and the execution of specific tasks. The tasks themselves are driven by a variety of different task generation systems with their own focus or perspective. So while there is a task generator for armies, there is also one for navies, diplomacy actions and much more. The thing is that there are often way more valid tasks to execute than there are available resources: the actual units on the map and money to spend. As such, the system then prioritises which tasks it would complete by selecting the most viable and then allocating resources to them. In addition, task viability also carries some filtering to stop it trying to do anything too stupid, such as removing actions that could cause diplomatic tensions, filtering actions that could impact long-term strategies and also factoring in what it had done recently so it avoids contradicting itself. Once filtered, the tasks are then assessed using the MCTS algorithm to grade their effectiveness and priority, with the best and more desirable looking opportunities graded a higher priority.
After this, the MCTS is called on again in order to run resource coordination: or rather, now that it knows what it wants to do, it still needs to figure out how exactly to do it. As such, once the system has made some approximations of appropriate targets and their locations on the map, it will run more MCTS approaches on army movement and army recruitment, factoring in the makeup of its own forces as well as the opponent's in order to determine where best to move current forces, as well as what types to recruit for future turns.
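To make the filter, prioritise and allocate flow a little more concrete, here is a deliberately simplified sketch. Everything in it is hypothetical: the `Task` fields, the task names and the greedy allocation rule are illustrative assumptions, and the fixed `score` values merely stand in for the priority that an MCTS evaluation would assign in the real system.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    score: float                 # stand-in for a priority graded by MCTS
    units_needed: int
    gold_needed: int
    breaks_treaty: bool = False  # example viability filter: avoid diplomatic tension

def allocate(tasks, units, gold):
    """Drop unviable tasks, then fund the best-scoring ones that still fit the budget."""
    viable = [t for t in tasks if not t.breaks_treaty]
    chosen = []
    for task in sorted(viable, key=lambda t: t.score, reverse=True):
        if task.units_needed <= units and task.gold_needed <= gold:
            chosen.append(task.name)
            units -= task.units_needed
            gold -= task.gold_needed
    return chosen

# More candidate tasks than the available units and gold can cover.
tasks = [
    Task("siege_settlement", score=0.9, units_needed=3, gold_needed=500),
    Task("raid_ally_border", score=0.8, units_needed=1, gold_needed=100, breaks_treaty=True),
    Task("defend_capital", score=0.7, units_needed=2, gold_needed=200),
    Task("patrol_coast", score=0.4, units_needed=2, gold_needed=100),
]
plan = allocate(tasks, units=5, gold=800)
# plan == ["siege_settlement", "defend_capital"]: the raid is filtered out,
# and no units are left for the patrol.
```

A greedy pass like this is only the simplest possible allocator; the point is the shape of the pipeline, where filtering happens before grading and resources run out before the task list does.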
In each case, the MCTS is limited such that it doesn't search all the way to the goal, given that Total War as a game is so large that it would take too long for it to simulate completing the game. In addition, the game is so complex that simulating that far out won't yield any useful outcome. In fact, it was quoted that the system is only capable of looking one turn ahead before starting random playouts due to the complexity of the game. Given the nature of Total War, the MCTS can only exhaustively search the entire state space for the best action during the opening turns of the game. Over time the number of possible states grows exponentially, to a point that it is simply beyond the algorithm's reach. Despite this, the anytime property of the algorithm ensures we will still get a useful and intelligent decision from the system.
Rome II launched in September of 2013 to a largely positive response, but with a few problems. Most notably, the campaign AI took quite a long time to make its decisions in the launch build: taking several minutes to conduct campaign movements that most players conduct in a minute or two, resulting in aggressive patching of the game for several weeks after launch. In time this led to a noted improvement in campaign decision making that was received favourably (though not universally) among fans and critics.
Revolutions aren't easy, nor are they clean, and the legacy of Total War: Rome II is no exception. But it is nonetheless a major milestone for the development of AI systems and practices in commercial video games and has led the way for many a successor that is seeking to adopt MCTS as part of its own AI toolchain.
MCTS is a hot topic in contemporary AI research and has shown many useful applications in fields of expert play and general intelligence. To learn more about how it all works, be sure to check out the AI 101 on MCTS here on AI and Games. Thanks for watching this third entry in the AI of Total War.
In part four, I'll be looking at how the MCTS implementation was improved in Total War: Attila, combined with a deep dive into just how exactly the diplomacy AI works in more recent iterations of the game.
Behind the Campaign AI of Total War: Rome II (Part 3 of 5) | AI and Games. Posted by wei on 14 January 2021.