[MUSIC PLAYING] ANKUR KOTWAL: Today I want to talk to you about machine learning for game development. But before we dive straight into that, I want to take a step back and show you some of the magical use cases we've seen in consumer apps. So for those of you that use Gmail, you may be aware that the spam filter in Gmail is actually built on an ML model. And this model evolves over time, because users can tag something as spam or not spam. And the model actually adapts over time based on that. One of my favorite use cases is actually Google Photos. So if you wanted to fire up Google Photos right now and go into the Search field, and type in the word car, it's going to give you back all the images in your personal library that have cars in them. That's not because you went and labeled them. It's not because someone at Google went and labeled them. It's because there's an ML model behind it that's actually able to recognize objects, landmarks, locations, and even your pets. So it turns out that games and game development are actually a rich area for machine learning research. And in 2014, DeepMind joined us and started using games as a way to do their machine learning research. So in 2015 they talked about how they used some classic video games for ML research. So what we see on the left hand side there is "Breakout." And what they did is they used a form of machine learning called reinforcement learning, where the model itself only had access to the inputs that it could provide the game, the visuals, so the screen, and the score. And the goal was to try and get as high a score as possible. So it didn't know how to play the game. It was just working out its own way. And within a few hours, it was one of the best "Breakout" players in the world. Actually, if you look at the strategy it employs, it creates a gap on the left hand side, lets the ball go through the gap, and lets it bounce off the top wall and clear the bricks itself.
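The reinforcement learning loop described above can be shown in miniature. The sketch below is purely illustrative and nothing like DeepMind's actual deep Q-network: a tabular Q-learning agent on a made-up six-cell corridor world, which knows only the actions it can take and the score it receives, and still learns to walk to the goal.

```python
import random

# Toy environment: a corridor of 6 cells; reward 1.0 for reaching the last cell.
N_STATES = 6
ACTIONS = [-1, +1]  # move left / move right

def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning: learn action values purely from observed rewards."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

    def greedy(s):
        best = max(q[(s, a)] for a in ACTIONS)
        return rng.choice([a for a in ACTIONS if q[(s, a)] == best])

    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit current knowledge, sometimes explore.
            a = rng.choice(ACTIONS) if rng.random() < epsilon else greedy(s)
            nxt, r, done = step(s, a)
            best_next = max(q[(nxt, x)] for x in ACTIONS)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = nxt
    return q

q = train()
# After training, the greedy policy moves right (+1) from every non-terminal state.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

The agent is never told the rules; the right-moving policy emerges purely from the reward signal, which is the same idea, at toy scale, as learning "Breakout" from pixels and score.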
A year later, DeepMind surprised everybody by building an ML model that was actually able to play the game of Go and defeat some world champions, the world champions at the time. And more recently in the last few months, DeepMind and Blizzard have been talking about the work that they've been doing together with "StarCraft II," and building an AI that can play "StarCraft II" competitively. I recommend you go and check it out. Actually, again, they pitted the AI, AlphaStar, against some pro esports players and were able to defeat them. So what we're seeing is games are a great place for ML research. And we want to be able to find a way that we can use ML in our own applications, in our own game development. But not everybody has a team of machine learning experts like DeepMind does. And as you can see from the numbers here, less than 2% of all developers have any machine learning expertise. And fewer still are deep learning researchers. But at Google Cloud, what we want to do is democratize machine learning. We want to find a way to make ML available to everybody so that you can innovate and find use cases where it's useful. So today, I'm going to be talking to you about how you can use machine learning for specific areas of game development. We're going to start with player experience, move into data analytics, and talk about game development. But the important aspect here is, we're kind of going to go from easy mode to hard mode. So think of these as difficulty in your games. So let's get started with player experience. We've been doing ML research for years. And what we have done with that research is exposed it as a set of APIs that you can readily use today. Now, these are pre-built models based on our vast data sets. And we've just exposed them as REST API endpoints where you can consume them either on your server, or directly through your clients as well. But these are generic APIs. And I'll cover a few of them.
Let's look at some specific examples where game developers could benefit from them. So we're living in a world where, increasingly, we have a global gaming audience. We have players that are able to connect with each other from vastly different parts of the world. And language is, frankly, a challenge. Some game developers have used techniques like emotes to try and get around this, where we limit the type of vocabulary that can be used, and that makes it easy to translate. But when you're in the thick of battle, in a battle royale game, and one of your squad members speaks a different language to the other folks, it's really hard to coordinate. So we can do better here. Now, you may have heard of Google Translate. It's a consumer application where we can translate languages from a source language to a destination language. You may have seen it in Google Chrome, where if you go to a website that's in a language that's not your default language, Google Chrome offers to translate that for you. But we've exposed that as an API. And we call it the Google Translate API. I'm going to switch to a demo just to show you how it works. So can we switch to the demo machine, please? There we are. So here is actually just the landing page for the API, cloud.google.com/translate. And when we scroll down, you'll see that we've actually got a demo that you can see. So first thing I'm going to do is click the reCAPTCHA. And switch languages. So the type of thing that we might have players say is, oh, good game. Have fun. And we can do it across different languages. So let's say we choose Dutch. You'll see its responsiveness is incredibly fast. We can say something. Just good luck. Oops. And, again, we can switch to any language. What you'll see, though, when I expand this is that request URL is all that's needed. We have a couple of parameters, query parameters, where we say, here's what the source text is, what the source language is, the destination language.
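Those few query parameters really are the whole request. As a sketch, here is how you might assemble the same call in Python; the endpoint and the q/source/target/key parameter names follow the public Translate v2 REST API, and the API key value is a placeholder, not a real credential.

```python
from urllib.parse import urlencode

# Assemble the Translate REST call shown in the demo.
def build_translate_url(text, source, target, api_key="YOUR_API_KEY"):
    base = "https://translation.googleapis.com/language/translate/v2"
    params = {"q": text, "source": source, "target": target, "key": api_key}
    return base + "?" + urlencode(params)

url = build_translate_url("good game, have fun", "en", "nl")
print(url)
# Fetching this URL (e.g. with urllib.request) returns JSON shaped like:
# {"data": {"translations": [{"translatedText": "..."}]}}
```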
And then we provide our API key for billing purposes. That's it. And what we get back is a nice little bit of JSON that gives us the translated text. So if you have any sort of chat messaging in your games, this is a way that you could translate between languages. All right. Let's switch back to the slides, please. So even though we're enabling people to talk to each other, people aren't always friendly to each other. You may have noticed that on the internet. And what this does is it actually creates a bad experience for your other players. When you've got one person dominating the conversation, or a group of people that are being hurtful to others, it really causes problems. And the way developers treat this sort of scenario these days is that they provide a mechanism for players to report other players, report bad behavior. At which stage, you gather a bunch of diagnostic information, maybe some chat logs, maybe some in-game recordings, and so on, and you pass it off to a team that has to triage it. That's a manual effort. Triaging that sort of work takes a lot of time. We have an API called the Cloud Natural Language API, which can actually detect sentiment in individuals' chat messages. So as a way of quickly triaging through reports, you can quickly identify where you may have some problem areas in your logs, and make that triage process a lot simpler. Increasingly what we're seeing in games these days is that developers are starting to adopt things like AR, where they're using the camera. And they need to be able to detect what types of objects are in the scene, or what type of locations they're in, maybe landmarks. Another example that we see is lots of game developers are providing ways for your players to create user generated content. It might be items. It might be custom images. It might even be maps.
Turns out that if you give people the ability to upload whatever they like, they can, again, upload things that are probably not appropriate for everyone there. Now, we have an API called the Vision API. And it is able to do things like object detection, and also able to flag explicit content. Now, Vision API is really cool. Because it's giving you the kind of power that we have in Google Photos, that example I used earlier, but giving it to you as an API that you can readily call. Now, before I switch to the demo, I just want to get some answers. Can anybody tell me what that is? I'm hearing Eiffel Tower. Any other thoughts? All right. So we'll switch to the demo machine. We're going to look at that image right now. This is our Vision API landing page, cloud.google.com/vision. And what we're going to do is drop an image on there, hit the reCAPTCHA again. This is, again, just a demo. But what we're doing is uploading this image. And if you said Eiffel Tower, you were wrong. This is actually the Paris Hotel and Casino in Las Vegas. Now, when that API returned us a response, it actually told us what part of the image it used to recognize what this landmark was. And so you can see there's a green highlight, or a green bounding box around that image. And the way that it knew it wasn't the Eiffel Tower? The Eiffel Tower doesn't have a building below it, unlike the Paris Hotel and Casino. Now, because this is a real place in the world, we're able to get some useful information. We're able to see that this is a real landmark, where it is. We're able to get links to the web that says what kind of information this is. And we get object detection. We get labels. We can see that this is a landmark, that this is a tourist attraction, and actually, that we even have a lot of the sky in this image. And then we get Safe Search. So we can see what kind of image it was. I was going to give you an example of an explicit image, but legal said no. So we'll have to move on.
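That demo also maps to a single REST call. As a sketch, assuming the public Vision API images:annotate endpoint, a request asking for landmark detection, labels, and Safe Search in one go could be assembled like this; the image bytes below are a stand-in, and a real call would POST this JSON with an API key.

```python
import base64
import json

# Build a Vision API images:annotate request body combining the features
# from the demo: landmarks, labels, and Safe Search.
def build_annotate_request(image_bytes):
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [
                {"type": "LANDMARK_DETECTION"},
                {"type": "LABEL_DETECTION"},
                {"type": "SAFE_SEARCH_DETECTION"},
            ],
        }]
    }

payload = build_annotate_request(b"\x89PNG...stand-in image bytes")
print(json.dumps(payload)[:80])
```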
All right. Let's move back to the slides, please. So that's the Vision API. And when we look at some of these use cases, you can see that just by using our APIs, you're able to solve kind of low-hanging-fruit, quality-of-life problems. It can really improve your player experience. But for you, you're getting the benefit of an ML model without having any ML expertise yourself, because we've done that work for you, and we've exposed it. Now, even though we only looked at three of these APIs, we looked at Translation, Natural Language, and Vision, we have a number of others. We have Cloud Speech, which will do speech to text and text to speech. And we also have Video Intelligence, where we can look at videos and tell you where objects are in different scenes. So we can identify cars at the start, buses, planes, all sorts of objects. And we can transcribe videos as well. So you get actual captions for your videos with specific timestamps. So when you've got any sort of recording in your game, where your players are able to record their last few minutes of gameplay and so on, Video Intelligence can be a good way to kind of index that if you choose. But you may look at that and say, hold on. These are very fixed use cases. I have more needs than just the ones that the ML APIs provide. And we have some solutions for you there as well. I'm going to look at an example game. We're just going to make it up. Let's pretend we have a flight simulator. And it's a very realistic flight simulator. Look at my graphics. It almost looks like it's a photo. Right? It is a photo. But this simulator is so realistic. We've used procedural weather generation. And what we need to do is, as the weather conditions change, we need to be able to know how we fly, what type of planes can we use? How should the player do his or her thing? Well, it turns out that in order to predict weather, looking at clouds is an important way to do it.
And it turns out that we have a few categories of clouds, just a bit over 10. And when we look at those clouds, the type of cloud is a great indicator of weather patterns. But the problem is, if we use the Vision API that I just showed you right now, the Vision API gives us some useful information. Hey, you've got some clouds there. Hey, there's a sky. But it doesn't go to that level of detail that says what type of cloud it is, because these are general purpose machine learning models. Now, traditionally you would have said, oh, your APIs don't work for me. I'm going to have to go and custom train an entire ML model myself. Now, for those of you that don't have ML expertise, that is a huge learning curve. It's not an insurmountable problem. But you have to suddenly learn ML. And data science is actually a profession on its own. So it's not something that you can easily pick up. Wouldn't it be great if we could get the benefits of having some ready to use model, but with our custom data set? Well, for that we have something called Cloud AutoML. This is huge, because it means that you can get custom built models for your use cases, but you don't need any ML expertise. And we do all of this through a simple graphical user interface. So when we look at AutoML Vision, the way that it works is that you provide a labeled set of images, your photo data set. You pass and upload it through our user interface to AutoML Vision. It trains up a model. And then it deploys it and serves it to you as an API that you can readily consume. Now, that's huge. Because one, there's a lot of work to just train a model. But it's also a lot of work to scale out and deploy a model, such that you can depend on it. We're going to look at that in a little bit. So the way machine learning, in general, works is something along these lines. Now, this is not meant to be intimidating. This is actually a screenshot from the TensorFlow Playground. 
TensorFlow is an open source framework that Google built for machine learning. It's the most popular ML framework in the world. On the TensorFlow website, this playground is a way for you to visually try and get some experience with building ML models. And what you'll see straight away is across the top there's a number of parameters. We call these hyperparameters. Things like the learning rate and activation function. And then we have these columns of squares. Each column is called a layer. And in those layers, we're trying to extract certain features to be able to identify something for our use case. The goal here is actually in this output field. So on this output image, these dots actually represent the test data. So when we train an ML model, we have a huge data set. The bigger the data set, the better. So you split it: the training data set is used for your ML model to learn. And then you have a test data set, which is what you use to validate that the model is working as expected. And in this example, we have our data. And each of these dots that you see in the output actually represents the test data. That's the right answer. So you see blue dots and orange dots. And what you want to see is that the background behind the dots matches the color of the dot. So that's where our ML model is predicting. And so what we've done here is we have a number of layers. And you'll see this spiral-shaped data pattern that we have. That's pretty good. It does its job. But it's complicated. And the process of machine learning is such that what you do is you start with a set of parameters. You train it. You test it. You look at whether you're happy with it. And then you tweak it, and you go back. Train it. Test it. Tweak it. Train it. Test it. Rinse and repeat, until you end up with a model that you're happy with. Now, training with huge data sets actually takes a lot of time. So you need a lot of computing resources.
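That train, test, tweak cycle can be sketched end to end without any framework at all. The toy below is purely illustrative: a plain-Python logistic regression on a made-up two-class dataset. Split the data, fit on the training set, score on the held-out test set, and adjust the learning rate or epoch count if the score disappoints.

```python
import random
from math import exp

# Made-up 2D dataset: class 1 if the point lies above the line y = x.
rng = random.Random(42)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
data = [((x, y), 1 if y > x else 0) for x, y in points]

split = int(0.8 * len(data))                 # 80/20 train/test split
train_set, test_set = data[:split], data[split:]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))             # clamp for numerical safety
    return 1.0 / (1.0 + exp(-z))

def fit(samples, lr=0.5, epochs=200):        # lr and epochs are our "hyperparameters"
    w0 = w1 = b = 0.0
    for _ in range(epochs):
        for (x, y), label in samples:
            err = sigmoid(w0 * x + w1 * y + b) - label  # log-loss gradient
            w0 -= lr * err * x
            w1 -= lr * err * y
            b -= lr * err
    return w0, w1, b

def accuracy(params, samples):
    w0, w1, b = params
    return sum((sigmoid(w0 * x + w1 * y + b) > 0.5) == (label == 1)
               for (x, y), label in samples) / len(samples)

params = fit(train_set)                                   # train...
print("held-out accuracy:", accuracy(params, test_set))   # ...test, then tweak
```

The held-out score, not the training score, is what tells you whether to keep tweaking, which is exactly the loop the playground visualizes.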
But it actually still takes a lot of real world time. So when it came to building AutoML, what we thought was, hey, rather than do this process serially one after the other, wouldn't it be great if we could do it in parallel? We could just come up with a whole set of ML models. And our AutoML controller could farm them off to different machines on Google Cloud. And we can use the highest spec machines that are available, the fastest CPUs, the fastest GPUs if we want, because GPUs are a great way to train ML models and use them for inferencing. But we go one step further. Google Cloud also offers TPUs. Now, TPUs stand for Tensor Processing Units. And this is custom silicon that Google has built, and we make available exclusively on Google Cloud. So where GPUs are orders of magnitude faster than CPUs, TPUs are several times faster than GPUs too. Now, to give you an example, yesterday at the I/O Developer Keynote, we talked about an example of a company called Recursion Pharmaceuticals. Their training time with TPUs went from 24 hours down to 15 minutes. A whole day down to 15 minutes. So you imagine that you're trying to build an ML model. And you're constantly training and tweaking. That cycle that you have to do, you want to condense it as much as possible. So AutoML says, ah, don't worry about all this. We'll just send them all off. And you decide how long it trains for. So the training side of the billing works based on training time. We give you an hour for free. And so at the end of that time, we look at the model that performs the best. And we serve it. Based on our data, what we found is that AutoML produces ML models that are more complicated than what humans produce. But they also perform better. So let's have a look at an example of AutoML Vision, if we can just switch to the machine. Thank you. So here is the console for AutoML Vision. And what we have is a labeled set of images. So remember our cloud example, the flight simulator?
What I've done is I've uploaded a ton of images. There's about 1,800 images in there. And these images have been labeled. On the left hand side there's a label. So I said there's 10 types of clouds. We're only doing five labels for this example. And you'll see that the cirrus, cumulonimbus and cumulus have about 500 images. I didn't have as many for the altocumulus and altostratus. So there are only about 200 and 135. But when I go through the training, what you can see is that I said do it for 12 hours. And it came up with a model that's at almost 90% accuracy. When I look at how it evaluated against the test data, what we found is that this thing is called a confusion matrix, which basically says, how well did this model do at identifying against the test data that we provide? So if you look at the ones where we provided 500 images, you get about 90% accuracy. The one where we had fewer images, it's only in the 60s. Anyways, we're going to now go ahead and predict what these two clouds look like. Now, typically, this is just our console. So we're just doing some sample data here. You would have an API endpoint. And I'll show you that in a second. But anyways, you'll see here with 99% accuracy, or confidence, it said that this is a cumulus, which is correct. And then you can see that with 73.7% confidence, it says that this is a cumulonimbus. And it's far more confident about that one than the next one down. So our model's actually done a really great job. Now, in terms of an API endpoint, we actually provide you some example code. Whether you want to use the curl statement, or even some sample Python code on how you can call your API yourself. So let's switch back to the slides. So now we've solved our flight simulator problem. Right? We're going to use clouds to detect weather. And we're done. Now, AutoML Vision meant that we had to write zero code.
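Calling your own trained model is the same kind of single request as the pre-built APIs. As a sketch, assuming the AutoML REST prediction endpoint, with placeholder project and model IDs and stand-in image bytes (a real call is sent with an OAuth bearer token):

```python
import base64

PROJECT_ID = "my-project"   # placeholder
MODEL_ID = "ICN1234567890"  # placeholder

# Build the URL and JSON body for an AutoML Vision prediction request.
def build_predict_request(image_bytes):
    url = (f"https://automl.googleapis.com/v1/projects/{PROJECT_ID}"
           f"/locations/us-central1/models/{MODEL_ID}:predict")
    body = {"payload": {"image": {
        "imageBytes": base64.b64encode(image_bytes).decode("ascii")}}}
    return url, body

url, body = build_predict_request(b"...stand-in bytes of a cloud photo...")
print(url)
# The response lists labels with confidences, e.g. cumulus at 0.99.
```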
And we didn't even need to worry about how this thing is going to be deployed on infrastructure and served globally. But you might be saying, ah, that's just a Vision example. Give me some other normal, like regular examples. I want to talk to you about a new product, one that we just launched a few weeks ago. It's called AutoML Tables. And pardon the pun, but it's a game changer. With AutoML Tables, you take your structured data. Now, that data can be a spreadsheet. It can be from your database. And you pump it into AutoML Tables, where it will train an ML model for you, and deploy it at scale. So you may be saying, well, what are some examples that I could use for AutoML Tables? Well, actually, your imagination is the limit here. You could look at tailoring game difficulty to match the types of players that you have in any given encounter. So you in real time change the difficulty of maybe that boss that your players are fighting against. You could look at ways to convert players into paying players through in-app conversions. Look at your historical data to train AutoML Tables. You could use it to identify fraud. This is a very common challenge that game developers tell us about: hey, we constantly get these fraudulent transactions. And, of course, you can also help it identify cheating in your games based on, again, historical data. So AutoML Tables is huge. And Cloud AutoML in general, again like our ML APIs, is broken into several categories. We have the sight-type things, Vision and Video Intelligence. We have Natural Language and Translate. I do want to call out, Natural Language and Translate, in particular, are great for games. So I gave you an example earlier of Translate. And we said, good luck. Have fun. We said, good game. Most gamers don't talk like that. They use shorthand. Right? And especially when you look at your own games, you're going to have a set of custom inventory items, characters. If you build card games, maybe card decks.
With AutoML, you can train those models to learn your custom terminology and adapt to it. And so then you can provide translations, and natural language sentiment detection as well. And then, of course, we talked about AutoML Tables, which is a really big deal. So so far we've looked at kind of off the shelf products that we offer you. Now, I want to talk about where ML is really useful for game development. We offer a number of products that really help you with building these ML models. But ML, as I said earlier, is that initial process of training where we tweak some parameters. We train a model. We test it. And then we go back to the start and do it again and again until we find a model that we're happy with. The second half of it is just as important. Once you have a model that you're happy with, how do you deploy it at scale? Do you deploy it locally as part of your game if it's a mobile game? Do you send it down to the console or desktop? Or do you run your AI in your cloud where you need to be able to have your AI simulated there? How do you get the reach across the world, depending on where your players are? Well, for that we have what we call the Google AI platform. And the Google AI platform offers training and prediction services. We have powerful and flexible tools for ML and data science, and really a scalable and robust infrastructure that Google runs its own services on. The other thing we've done is we've made available a virtual machine image that's specifically useful for deep learning. And you can run it on our product called Compute Engine, which is where we host VMs. But let me give you a hypothetical example of a game. This game itself is actually real. This is a mini game called "Snowball Storm." And it's part of Google Santa Tracker. If you've never heard of Santa Tracker, please check it out at SantaTracker.google.com. But in this game we built a little bit of an homage to the battle royale genre. 
So at the start of it, you're a little elf. You parachute down onto an island made of ice. And you have to try and throw snowballs at other players to eliminate them. The ice melts around the edges over time. And so we're constricting players into a location. Currently, the AIs and the bots in this game are not built on any ML model. They're just very simple. If we were to build something like this for our own game, we could build a pipeline akin to this. So I'll walk you through this. On the left hand side what we'll have is some players playing against one another, some player versus player matches. What we'd want to do is record and log as much of that information as we can. Because the way we're going to train this bot is using a technique called supervised learning. So earlier I said DeepMind used reinforcement learning, where the AI kind of taught itself. With supervised learning, we're going to capture all this data, and train our bot based on how other players play the game. So it's going to learn off other information. So what we'll do is capture all of that. We'll log that. And we're going to store it in a database that we call Google BigQuery. And BigQuery is a petabyte-scale analytics database. So it's incredibly performant. And when I say petabyte scale, it can scan through petabytes of data within seconds. So we're going to store all of our logs, player logs into BigQuery. We're going to export that and make it available on Cloud Storage, which is really just a place to store your files. You could think about it as private storage. Kind of like Google Drive, but not with a user interface or anything like that. It's literally for storage of your private files for your enterprise. So you export from BigQuery, store into Cloud Storage. And then we're going to build a model using TensorFlow in our training phase. And we're going to use Compute Engine here, which is what I said earlier about where we run virtual machine images.
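Stepping back to the first stage of that pipeline, the record-and-log step could be sketched as newline-delimited JSON, one (state, action) record per frame, ready for loading into BigQuery. The field names here are a hypothetical schema invented for illustration, not the actual game's logging format.

```python
import json

# Serialize one frame of what a human player saw (state) and did (action).
# One JSON object per line is a convenient shape for BigQuery ingestion.
def log_frame(match_id, state, action):
    """Return one newline-delimited JSON record of a (state, action) pair."""
    return json.dumps({
        "match_id": match_id,
        "state": state,    # e.g. positions, remaining ice radius
        "action": action,  # the move the player actually made
    })

record = log_frame(
    "match-001",
    {"player_pos": [3, 7], "nearest_enemy": [5, 7], "ice_radius": 40},
    {"move": "east", "throw": True},
)
print(record)
```

Supervised learning then treats each record as a labeled example: the state is the input, and the human's action is the label the bot learns to imitate.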
We'll use that deep learning image that I talked about earlier to train an AI model that learns from our historical player data. And then, once we have that model, we'll export that model, and store it in Cloud Storage. Now, I've, of course, glossed over the really hard part of this, which is the training. But this is where you do require some ML expertise. So when you or your organization are ready to kind of step up and attempt this, just know that it's more than just running TensorFlow. You need to have an entire pipeline of ingesting this data. You need to be able to do it at scale. And ideally, with the training phase, you want to do it as fast as possible, so you can get really quick cycles of iteration. The next side of it is inference. So we have our AI model. And what we want to be able to do is make predictions. Now, our AI might be running server side. And in this case, it would. So it's easy for us in Cloud ML Engine to just be making inference API calls, to say, hey, what should my next move be for this bot and that bot? And then when it comes to running on our client, all we're doing is asking for predictions. So you might say, well, how can I use machine learning for game development? What are some use cases for it? The most obvious one is actually for QA and testing. So today when we have to make changes to level design or game balance, we have to spend lots and lots of time testing it to make sure we haven't broken that balance, we haven't introduced new bugs. And today we rely on people to do it. If you're able to build an ML bot that's able to play the game for you, you can do testing across the board in parallel. You could farm it out to thousands of machines. You don't even need to have the game playing back at real-time speed. You could simulate it as fast as possible. So again, in terms of your process for QA and testing, you can get rapid turnaround cycles for it.
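The inference call described above, where the server asks the deployed model for each bot's next move, could be sketched as an online prediction request. The project and model names are placeholders, the instance fields are the same hypothetical schema as the logged game state, and the URL and `{"instances": [...]}` body shape follow the AI Platform prediction REST API.

```python
# Build an AI Platform online prediction request: one instance per bot,
# each instance describing that bot's current game state.
def build_bot_move_request(bot_states, project="my-project", model="snowball_bot"):
    url = f"https://ml.googleapis.com/v1/projects/{project}/models/{model}:predict"
    body = {"instances": bot_states}
    return url, body

url, body = build_bot_move_request([
    {"player_pos": [3, 7], "nearest_enemy": [5, 7], "ice_radius": 40},
])
print(url)
# A response would contain one prediction per instance, e.g. a suggested move.
```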
Imagine a future where we have smarter non-player characters, where our players are able to have realistic dialogue with them, and make realistic choices. Those NPCs could even be used to guide players. So we see this a lot with some of our partners, where they have this really complex, let's say, card game, where people have to build a really well-balanced deck. But it's really hard for a new player to know what kind of choices to make. Perhaps it's a battle royale game where you're trying to build a squad that's well balanced. For someone that's new to the game or is a casual player, they're not able to make those right choices. ML can help us build great NPCs or recommendation engines to help our players through that process, and help them make the right choices to stay competitive. Now, if we have bots that can help us with QA and testing, those bots can also be used against real players. So in a squad based game, isn't it frustrating when one of our players drops, and now we're left with an AI that's either really bad at the game, or exceptionally good? I guess it's not so bad if they're exceptionally good and they're on your team. But when they're exceptionally good on somebody else's team, it doesn't work out very well. If we're able to build a good comprehensive set of bots, you can provide bots that are tailored to the skill level, or complement that team particularly well in those events. So we've covered a few scenarios today of where machine learning, I think, is really going to revolutionize what happens with game development. So if you're looking to just get started with applying ML in your games, and you don't have any ML expertise, the APIs are a great way to get started, and capture some of those quality of life improvements for your games. Because your players will notice. And integration with those APIs is really, really simple. It's literally an API call.
AutoML is able to help you build custom models for your scenarios with your data, again, with zero ML expertise needed. And then when you're ready to take the step up to building your own AI that's able to solve problems, like do automated QA testing, or have smarter NPCs, then step up to something like the AI Platform, which will not only help you train your models at speed, but it'll help you serve them and deploy them out at scale. So I want to say thank you for joining me today. Your time is really valuable to us. We have a couple of links here. The first one is our full set of AI products at cloud.google.com/products/ai. These are not game specific, but have uses within games. And then we have a set of gaming solutions at cloud.google.com/solutions/gaming as well. Please reach out to me on Twitter if you want to get in touch. And have a great I/O. Thank you, everybody. [MUSIC PLAYING]
Machine Learning for Game Developers (Google I/O '19)