Mikelis' Game Blog

Designing, Building, Testing and Shipping AAA-Style Video Game AI

A massive conspectus of industry knowledge I’ve learned as a video game AI Programmer & Systems Designer.

Mikelis' Game Devlog


This article has been co-authored with Giacomo Salvadori, a game design professor at The Sign Academy. Here we talk about our personal insights, views, and opinions.


Designing and producing excellent video game AI systems can be a daunting task. We can all tell when a AAA game has great AI companions or enemies. Still, even if we ask senior game designers, it’s more likely than not that we will hear about one or two aspects but not a holistic description of “good” AI.

They might tell us that a great AI is believable, realistic, or challenging. But we can’t honestly call headcrabs or soldiers from Half-Life any of these things, and yet they are often praised as some of the best-designed AIs of their time. We might hear that fluency of motion, low error rates, and coherence with the world are essential. But while this works for companion AIs in games like The Last of Us, it’s hard to say that nuke-happy Gandhi in the Civilization games or geometry-clipping Minecraft NPCs make for any less fun AI.

Perhaps it is the fun factor that we should maximize? I think we’re on the right track, but it’s easier said than done! With a variety of video game genres on the market, “fun” is a somewhat subjective term. What is fun and engaging for one fanbase can be much less so for another. What is fun in focus tests is not necessarily fun in the market. And even AI that was critically acclaimed in one franchise title can meet lukewarm reviews in the next.

But I believe that some elements are shared among the most critically lauded AI designs, implementations, and production processes. And today, I will try to do my best and summarize them. There are probably more, but this is not meant to be a complete and exhaustive resource— it’s a beginner's guide.

Glossary

Some terms used in this article might not be familiar to some AI specialists, as some of this industry knowledge is not standardized. As such, I’m providing a glossary below:

  1. AI agent. The object encapsulating one AI instance in the game scene, including its brain, body, data, and processes. In Unreal Engine, this is often split up into a behavior tree, a blackboard, an AI controller, and a character. But as this article is aimed at a broader audience, and indeed this can be implemented as a single class, we consider it the AI agent.
  2. AI variables. AI variables are fields of data stored in the AI agent at run-time, helping it make decisions.
  3. Idiom, paradigm. Ways of approaching a problem.
  4. State machines. A computer science data structure made of states. At any time, a state machine can be executing one or more states; these are called active states. A state machine can transition from its set of active states to another set of states, which then become the active ones. The rule governing such a transition is known as a transition function or condition. State machines can nest other state machines inside them and execute them like an active state. Such a data structure is called a nested state machine.
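To make the state machine entry concrete, here is a minimal sketch in Python; the state and event names are invented for illustration:

```python
# Minimal finite state machine sketch; state and event names are invented.
class StateMachine:
    def __init__(self, initial, transitions):
        self.state = initial
        self.transitions = transitions  # maps (state, event) to the next state

    def handle(self, event):
        # Stay in the current state if no transition matches.
        self.state = self.transitions.get((self.state, event), self.state)

guard = StateMachine("Idle", {
    ("Idle", "player_seen"): "Chase",
    ("Chase", "player_lost"): "Search",
    ("Search", "timeout"): "Idle",
})
guard.handle("player_seen")
assert guard.state == "Chase"
```

A hierarchical version would simply let a state own an inner StateMachine and forward events to it while that state is active.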

Pre-production Decisions

Before any work starts on AI design, and before production schedules or GDDs are built around it, we need to make some essential decisions with our team.

1. AI Autonomy

AI autonomy, also known as the AI level of autonomy, is an industry term broadly describing how independently all of the AI agents in our game, or groups of them, make their decisions.

For every group of AIs (generally: filler NPCs, companion characters, enemies, or abstract AIs like the Left 4 Dead director or a chess AI), try to assign one of the following levels of autonomy (lowest to highest):

  1. Scripted AI. This is the lowest level of AI autonomy. Most commonly, companion AIs are heavily scripted, and they switch to a higher level of autonomy when the gameplay requires it. For example, you will find that most Half-Life 2 story AIs will either guide you through the story or fight members of the enemy team. Bagley in Watch Dogs: Legion is an even more scripted AI. All of its appearances are pre-recorded and coordinated with gameplay events.
  2. Rule-based AI. These AIs are rare in AAA titles but more common in visual novels and games like chess. When the AI recognizes a game state matching a particular set of rules, it will perform a specific action. For example, every time a chess game is in a given configuration, the AI will make the same chess move. This rule-outcome relationship is pre-programmed. Sometimes it is determined by structures like decision trees or Markov chains, or even ones produced with the help of neural nets. Rule-based AIs offer substantially more variety in the gameplay than scripted AIs, but still less variety than the average AAA player expects today.
  3. State-based AI. There are several names for this level of AI autonomy, but it always uses some form of state machine: a finite state machine, a hierarchical finite state machine, or a behavior tree. Often, these AIs will use several state machines. One of them could describe how they move (run, crouch, walk), and another could describe their actions (idle, perform a level-themed action, chase the player, talk with another AI, hide from the player, attack the player, be alert and look for the player, and similar). Another state machine could describe the AI’s gameplay mode (war, peace, diplomacy, anger, benevolence). Yet another state machine might describe the AI’s emotional state (happy, sad, crying, scared) and feed its state into the animation state machine for the face. It is common for AI companions to switch between a scripted mode and a state-based behavior model. With this level of autonomy, we can already see how the AIs become a lot more dynamic and unpredictable.
  4. Utility-based AI. Utility-based AIs are very common in tycoon-style games and appear in some RPGs and action titles as well. Utility AIs make decisions based on utility functions. For example, an AI in Bioshock might have a healing utility function (among many others) that takes its current health as an input. As health goes lower, the AI will consider healing to have more and more utility. Generally, a utility-based AI always switches to the task that it finds to have the highest utility value. With well-designed utility functions, it can be highly rational.
  5. Planning AI. This level of autonomy is already much rarer in the game industry, but it is still sometimes used. The primary trait of planning AI is that it can generate a plan to achieve a particular goal. For example, if a cim (citizen) in Cities: Skylines must traverse to a different part of a city, they might come up with a plan that looks like this: spawn on the sidewalk, spawn a car, drive to a train station, wait for a train, take a train, exit a train station, go to a bus stop, wait for a bus, take a bus, get off at a particular bus stop, and go to the final destination on foot, then despawn. While these AIs can strategize very well, they are often used as filler agents, like pedestrians in Grand Theft Auto-style games.
  6. Academic AI. This level of autonomy is the highest one we will mention. It involves AIs that are based on machine learning and are generally too unpredictable and unreliable to be used in video games at runtime. Very large or highly qualified teams can make them viable for games. Still, usually, these will be experimental titles. As a side note - academic AI might be used in game production to dynamically generate or upscale content like textures, sounds, videos, and similar. But offline use like this, where all AI-generated content will be vetted by game developers before it goes into the game, is very different from allowing AI to generate it when the game runs. Only the latter of these is commonly referred to as “game AI”.
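As a concrete illustration of the utility-based level described above, here is a minimal Python sketch of an agent choosing the highest-utility task. The tasks and utility curves are invented placeholders, not taken from any shipped game:

```python
# Utility-AI sketch: score every candidate task, then pick the best.
# The utility curves below are invented placeholders, not real game tuning.

def heal_utility(agent):
    # Healing becomes more attractive as health drops (1.0 at zero health).
    return 1.0 - agent["health"] / agent["max_health"]

def attack_utility(agent):
    # Attacking is attractive only while the agent is reasonably healthy.
    return 0.6 if agent["health"] > 0.3 * agent["max_health"] else 0.1

def choose_task(agent):
    tasks = {"heal": heal_utility, "attack": attack_utility}
    return max(tasks, key=lambda name: tasks[name](agent))

agent = {"health": 20, "max_health": 100}
# Health is low: heal utility 0.8 beats attack utility 0.1.
assert choose_task(agent) == "heal"
```

In a real game each task would have a tuned response curve, but the selection rule stays the same: take the task with the highest score.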

Low-autonomy AIs are generally favored by game development companies as of the time of writing this article. There are many tools for designing and implementing finite state machines, behavior trees, and scripted AI in games. Unless a project has many dedicated AI designers and programmers, it will often not use AIs of higher autonomy than utility-based.

On the other hand, high-autonomy AIs tend to be easier to scale. Iterating on scripted AIs is a very time-intensive process, while changing what a utility AI does usually only involves programmers or designers tweaking weights in its utility functions.

2. AI Complexity

AI complexity is another thing that is good to know early on in the pre-production cycle. This corresponds directly with how much time our team will spend building and designing the AI.

In most AAA games, AI complexity typically relates to the amount of time the player will spend interacting with the AI agents. Suppose the player will see an AI agent for no more than two minutes. In that case, a basic implementation may be good enough. Now suppose we are talking about companion AIs that accompany our player through the entire game. In that case, a much higher level of complexity and nuanced behavior should be considered.

Complexity also tends to increase with higher AI autonomy. Scripted AIs will always be less complex when they ship to the market than planning AIs. More complex AIs will require more iterations, more bug fixes, and better qualified AI programmers and designers. Be realistic when considering this. Remember that a lot of the industry does tend to choose lower autonomy AIs.

Do not fall into the assumption that a more complex AI is better. Studies have shown that players are not very good at identifying what strategies AIs employ. Hence, a simpler AI is often the smarter choice. When the team starts focus testing, make sure to ask the testers to describe how the AI thinks. You might be surprised to see that players tend to attribute more autonomy to an AI than it generally has.

Finally, it is not uncommon for us, the architects of game AI, to think our AI is simplistic while the players see it differently. In Halo: Combat Evolved, AI agents do not even have sight or hearing senses. They simply move to a location within range of the player and fire at characters on a different team. And this is not seen as bad AI by the players.

3. Sense-think-act

The sense-think-act paradigm is commonly used in game AI technical systems design, robotics, and many real-world AI applications. Early in pre-production, we should try to generate two lists of items — one for things that the AI will sense and one for actions that it should perform. Then, connect them with thoughts of our chosen level of autonomy and complexity.

Let’s quickly do a sample analysis of the sense-think-act pattern of an AI agent archetype for a Half-Life headcrab. The headcrab (rule-based low-complexity enemy AI) should sense when a player is near and perform a jumping attack action. Otherwise, it should wander on a navmesh. For thinking, it should distinguish the player by its character belonging to a different team. It should have a range within which it will jump to attack. Once the jump is triggered, it can succeed or fail based on whether the headcrab character gets close to the player during the action. After the jump is activated, there is a cooldown before another one can begin.
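The headcrab analysis above can be sketched as a single sense-think-act tick. The attack range and cooldown values here are assumptions for illustration:

```python
# Sense-think-act sketch for a headcrab-like agent.
# ATTACK_RANGE and COOLDOWN are assumed values, not Half-Life's tuning.

ATTACK_RANGE = 3.0   # jump when the player is this close
COOLDOWN = 2.0       # seconds between jump attacks

def headcrab_tick(distance_to_player, same_team, cooldown_left, dt):
    """Return the chosen action and the updated cooldown."""
    cooldown_left = max(0.0, cooldown_left - dt)   # think: track the cooldown
    if not same_team and distance_to_player <= ATTACK_RANGE and cooldown_left == 0.0:
        return "jump_attack", COOLDOWN             # act: attack, reset cooldown
    return "wander", cooldown_left                 # act: default navmesh wander

action, cd = headcrab_tick(distance_to_player=2.0, same_team=False,
                           cooldown_left=0.0, dt=0.016)
assert action == "jump_attack"
```

The "sense" inputs arrive as parameters, the "think" step is the condition check, and the "act" step is the returned action, which mirrors the matrix below.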


An example matrix for the sense-think-act paradigm of Half-Life headcrab AI.

A sense-think-act matrix is one of the most powerful communication techniques between AI programmers and AI designers. It clearly describes both what kind of experience the AI is supposed to give to the player, as well as a concrete list of items or deliverables to be implemented. But even only using it as a design tool can help us stay organized.

Demonstration of Half-Life headcrabs.

4. Direct and Summary Variables

Most game AI agents will react to gameplay rules and the world around them in two principal ways: directly and through an overall understanding of the game and their own state. For this, we need to get a bunch of data into our AI agents. From a data model perspective, we store immediate information (AI senses like touch, sight, and hearing) in direct variables, and general information (AI traits, awareness of enemies, health, tiredness, awareness of interactions in the world, and others) in summary variables.

Before finalizing the AI design, defining what direct and summary variables our AI will have is crucial. Direct variables tend to be more involved with the “sense” part of sense-think-act, and summary variables tend to modify the thinking and actions of the AI brain.

For example, an AI enemy might have the capacity to know when the player is near. In this case, we might choose to store a reference to the player’s character as a direct variable in the AI agent — a simple pointer to a player object is fine. With this information, the AI can think and act based on how the distance between the characters changes.

On the other hand, we might have a stamina summary variable for the AI with a maximum bound to which it returns over time. But as the AI chases the player, it depletes. And when it reaches zero, the AI will end the chase. Notice that this variable summarizes an aspect of the game instead of simply and directly referring to an object in memory. Scalar summary variables can also be used the other way around — by increasing to trigger certain actions rather than decreasing. For example, many games like Far Cry or Assassin’s Creed will have a player awareness summary variable that will go from 0 to a high bound as the player stays in AI’s field of view, only triggering a chase after this summary variable reaches its high bound.
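The stamina example can be sketched as a small update function; the drain and recovery rates are assumed values:

```python
# Stamina summary-variable sketch. DRAIN and RECOVER are assumed values.

STAMINA_MAX = 100.0
DRAIN = 20.0     # stamina per second while chasing
RECOVER = 10.0   # stamina per second otherwise

def update_stamina(stamina, chasing, dt):
    if chasing:
        stamina = max(0.0, stamina - DRAIN * dt)
    else:
        stamina = min(STAMINA_MAX, stamina + RECOVER * dt)
    keep_chasing = chasing and stamina > 0.0  # the chase ends at zero stamina
    return stamina, keep_chasing

stamina, chasing = STAMINA_MAX, True
ticks = 0
while chasing:                      # chase until the summary variable drains
    stamina, chasing = update_stamina(stamina, chasing, 1 / 60)
    ticks += 1
assert stamina == 0.0               # fully drained...
assert abs(ticks / 60 - 5.0) < 0.1  # ...after roughly 5 seconds of chasing
```

The same shape (a clamped scalar plus a threshold) also covers the awareness variable described next, just with the trigger at the high bound instead of zero.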

Summary variables are a way to introduce persistence into an AI, which is to say, they enable it to organically change its behavior based on circumstances that persist over time, whereas direct variables tend to trigger relatively immediate actions. Direct variables tend to point directly to other objects in the game space, while summary variables describe an abstract aspect of the game that can be influenced by many of the game’s elements. However, in some cases, the distinction can be a bit blurry, so it can be helpful to look at the duality as a spectrum.

4.1. A Note on Summary Variables

Summary variables are powerful for creating an illusion of complex AI when combined with random-walk and genetic algorithms. They are a fantastic tool for building more well-rounded AI agents without expanding the scope of work and the team competency requirements typically associated with complex AIs.

For example, suppose an awareness bar is shown for each enemy AI agent. We can make awareness go up and down at different speeds for all AIs by introducing a random deviation of up to 3% of the current awareness value every second. To the player, this will appear as if the AIs have different personalities or there are complex factors affecting their awareness change. In reality, it’s just a random-walk algorithm we implemented in 20 minutes.
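A minimal version of that random-walk deviation might look like this (the 3% bound follows the text; everything else is an assumption):

```python
import random

# Random-walk sketch: nudge awareness by up to 3% of its current value
# once per simulated second. Pure noise that reads as personality.

def wander_awareness(awareness, rng):
    deviation = rng.uniform(-0.03, 0.03) * awareness
    return max(0.0, awareness + deviation)  # never drop below zero

rng = random.Random(42)  # seeded so the sketch is reproducible
values = [50.0]
for _ in range(10):      # ten one-second steps
    values.append(wander_awareness(values[-1], rng))
# Every step stays within the 3% bound of the previous value.
assert all(abs(b - a) <= 0.03 * a + 1e-9 for a, b in zip(values, values[1:]))
```

Each enemy would own its own generator (or seed), so their bars drift apart over time.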

Genetic summary variables allow for quick customization of AI agents based on hashes pre-set by game designers or generated dynamically when the play session begins. For example, we can configure the following summary variables using a numeric hash: height, speed, awareness decay rate, stamina, weapons in possession, and similar. To the unsuspecting player, this will make it seem like we have built several AI agent archetypes, but all we did was implement another easy algorithm. Moreover, if we use normal distribution probability curves in our genetic algorithm, the subtle changes in AI agent configurations will build an overall more diverse and genuine-feeling set of enemies.
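Here is one possible sketch of such a genetic configuration, seeding a random generator with the hash so the same hash always yields the same agent. The stat names, means, and deviations are invented:

```python
import random

# Genetic summary-variable sketch: derive an agent's stats from one numeric
# hash, using normal distributions so most agents cluster near the mean.
# All stat names, means, and deviations below are invented.

def configure_agent(seed_hash):
    rng = random.Random(seed_hash)  # same hash always yields the same "genes"
    return {
        "height": rng.gauss(1.80, 0.07),          # meters
        "speed": rng.gauss(4.5, 0.4),             # meters per second
        "awareness_decay": rng.gauss(10.0, 1.5),  # points per second
        "stamina": rng.gauss(100.0, 12.0),
    }

a = configure_agent(1337)
assert a == configure_agent(1337)   # deterministic: the hash defines the agent
assert a != configure_agent(7)      # a different hash yields a different agent
```

Because the configuration is a pure function of the hash, designers can pin a memorable enemy by saving a single integer.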

For a better illustration, here is an example of a set of AI summary and direct variables for a Far Cry-like enemy:

In particular, let’s have a look at how a player awareness summary variable could work. Over time, it could decay naturally, and it would go up only when the player is seen. Once a threshold value is reached, the AI would go from an idle state into an attack state. However, in addition to this, we can add an arbitrary value to it at random times. This would make it seem to the player that there is more going on to the awareness summary variable calculation than there actually is.


The awareness indicators can vary their decay rate to simulate complexity.

Pre-production Summary

To recap, in the pre-production of game AIs, we should consider doing the following:

  1. Thinking about AI autonomy: What kind of game is this? How often will the player see or interact with the AI, and for how long? What is the lowest level of autonomy we can use?
  2. Thinking about AI complexity: Is it possible to build fewer AI agent archetypes and distill them into their essential features?
  3. Thinking about AI senses and actions: What senses will the AI agent possess? What actions will they be able to perform? What is the cardinality or order of actions that it will do?
  4. Thinking about AI thinking: How will we connect the AI senses and actions with our chosen level of autonomy?
  5. Thinking about the direct and summary variables the AI will use: What direct and generic information is relevant to the AI? How will it affect the AI agent’s decision-making?
  6. Thinking about the big picture: What can be cut? What can be replaced with cheating? What can be simplified? Is the autonomy, number of senses and actions, and overall complexity appropriate for our team size? Look at other games and team sizes if necessary.

By now, we have discussed a series of significant decisions that will impact our production timelines, the scope of work, team competency requirements, and even several dirty tricks on how to do more with what we have.

The following section will provide a considerable overview of different design tools at our disposal to make AI generally liked by our players.

Good Game AI

Elusive as “good” game AI is, we have some clues on what is generally appreciated by players and what isn’t necessarily so. But before we dive into that, let’s start by quickly getting some common misconceptions about game AI out of the way.

Common Misconceptions


Players will attribute randomness to strategy.

Generally Liked AI Features


AI with good persistency spends a reasonable amount of time in each state. It does not switch states too often.


Latency: great AIs have reaction times.


A pre-defined AI patrol path is a very evident strategy from which the player can hide in interesting, safe spaces.


You know what to do — this doesn’t require an explanation.


AI using the same resources as the player in the game world.

Throughout Left 4 Dead 2, the survivors constantly talk about the story and gameplay events happening around them.


This AI has a token to attack the player while others must wait.

Anthropomorphisation


Emotional AI communicating through barks — a common way to anthropomorphize AI.

Anthropomorphisation is the effort through which we make AI appear more human. It is by far the most effective way to build rapport between AIs and the player. There are many tools at our disposal to do so; below are some examples:

It is important to note that anthropomorphized AIs are always more relatable and generally more liked. But they might not be considered realistic; realism is more a by-product than the purpose of anthropomorphization, which is instead building rapport and a meaningful connection between the player and the AI agent.

Hustling

Hustling is the AI's ability, and its deliberate choice, to cheat in order to put the player at an advantage.


You can think of it as bad aim, but really it’s great hustling!

Attack tokens are a common way enemy AIs will hustle when attacking the player. Usually, the AI agent that is closest to the player, more vulnerable to the player, or one that hasn’t taken damage in a long time is the one that gets assigned the attack token. Then, only this AI agent (or in some cases, several) can attack the player until the token is reassigned after a while. Sometimes, if the player is doing poorly, no enemy might get an attack token, or one enemy AI might get it but be instructed to take more damage from the player or cause less damage to them. This AI behavior is frequently seen in the Assassin’s Creed and Far Cry game series.
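A minimal attack-token assignment could look like the sketch below, using "closest to the player" as the only heuristic; real games blend several of the criteria mentioned above:

```python
# Attack-token sketch: only the chosen agent may attack the player.
# "Closest enemy wins" is a simplification of the real heuristics.

def assign_attack_token(enemies, player_pos):
    """Give the token to the enemy closest to the player; others must wait."""
    def dist(enemy):
        return abs(enemy["pos"] - player_pos)  # 1-D positions keep it short
    holder = min(enemies, key=dist)
    for enemy in enemies:
        enemy["has_token"] = enemy is holder
    return holder

enemies = [{"name": "a", "pos": 10.0}, {"name": "b", "pos": 4.0}]
holder = assign_attack_token(enemies, player_pos=0.0)
assert holder["name"] == "b"
assert enemies[0]["has_token"] is False
```

Reassigning the token on a timer, or skipping assignment entirely when the player is struggling, produces the mercy behavior described above.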

It’s not trivial to hide AI agents' attempts at hustling, but here are some starting tips that we can use to make an AI enemy hustle:

Suppose the AI agent is a companion of the player. In that case, the AI is expected to be invincible, or it might bark about being hurt or running out of ammo but not actually suffer any ill effects.

Liked AI Features Summary

This section has elaborated on many generally (and nearly universally) liked AI features that we can employ in our game designs, along with some common examples of their application.

The following section will discuss some of the AI systems that could inform our technical design.

Game Systems

Building a triple-A-style AI involves many game systems. And as we have seen, the scope of these projects can be daunting and extensive. So in this section, I want to discuss some of the game systems we can use in our technical AI systems design to make our work easier. Just in case we have not thought about using these tools yet.

Hierarchical Finite State Machines

Hierarchical finite state machines (HFSMs) are nested state machines, often authored as visual graphs, that generally take care of the most difficult part of the sense-think-act paradigm: the “think” step. The AI agent typically owns an HFSM and feeds information about what happens in the world into its buffers or sinks. The state machine then checks these sinks in its transition functions/conditions and calls actions.

Depending on the purpose of the state machine, it might be helpful to define individual states as action blocks that can be reused in an abstract and encapsulated way. For example, some AI action blocks might be: Go To World Location, Attack Character, and Play Action Sequence (for purpose). Scripts built of these action blocks can be nested in more general state machines denoting an overall AI state, like Idle, Sleep, Investigate, and similar.
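The action-block idea can be sketched as scripts built from reusable blocks; the block and state names follow the examples in the text, while the wiring is an invented simplification:

```python
# HFSM action-block sketch: reusable blocks nested inside top-level states.
# Block and state names follow the article's examples; the wiring is invented.

def go_to(location):
    return f"moving to {location}"

def play_action(purpose):
    return f"playing {purpose} animation"

# Each top-level state is a script: an ordered list of action blocks.
STATES = {
    "Investigate": [lambda ctx: go_to(ctx["last_known_pos"]),
                    lambda ctx: play_action("look around")],
    "Idle":        [lambda ctx: play_action("idle fidget")],
}

def run_state(name, ctx):
    return [block(ctx) for block in STATES[name]]

steps = run_state("Investigate", {"last_known_pos": "warehouse door"})
assert steps == ["moving to warehouse door", "playing look around animation"]
```

Because the blocks are encapsulated, the same Go To or Play Action block can appear in any number of states without duplication.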

But hierarchical finite state machines can handle more than just the “think” responsibility of the sense-think-act pattern. They can control character movement, emotions, animations (by default, Unreal Engine already handles character animations with state machines), and many other things. It is not uncommon for an AI agent to be running multiple state machines concurrently, creating a robust overall AI agent experience.

Many Unity and Unreal Engine plugins implement this functionality, as HFSMs are a common way to implement AI brains in AAA games.

Token Director

A token director is a game system that issues and withdraws various tokens that AIs may need in order to unlock and perform states and actions. For example, if the AI wishes to heal, it might request a health pick-up token, and one would be issued to it by the token director. Then the AI might perform an action involving navigating to the health pick-up and consuming it.
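A token director might be sketched as follows, with a configurable limit per token type (the limit values are assumptions):

```python
# Token-director sketch: a central system limits how many agents may hold
# each token type at once. The limit values are assumed for illustration.

class TokenDirector:
    def __init__(self, limits):
        self.limits = limits  # e.g. {"health_pickup": 1}
        self.holders = {}     # token type -> set of holding agent ids

    def request(self, agent_id, token):
        held = self.holders.setdefault(token, set())
        if len(held) < self.limits.get(token, 0):
            held.add(agent_id)
            return True       # token issued
        return False          # agent must wait or pick another plan

    def release(self, agent_id, token):
        self.holders.get(token, set()).discard(agent_id)

director = TokenDirector({"health_pickup": 1})
assert director.request("ai_1", "health_pickup")        # issued
assert not director.request("ai_2", "health_pickup")    # at the limit
director.release("ai_1", "health_pickup")
assert director.request("ai_2", "health_pickup")        # now free again
```

The same structure covers attack tokens: set the limit to one (or a few) and reassign periodically.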

Event Manager, Director or Messaging System

These are systems that inform all AI agents of what is going on in gameplay or the world so that they can react as they choose. You are invited to use my own event messaging plugin for Unreal Engine for this purpose or just to see how this system can work.
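A generic publish/subscribe sketch of such a messaging system follows; this is an illustration, not the Unreal plugin mentioned above:

```python
# Minimal publish/subscribe sketch of an AI event messaging system.

class EventBus:
    def __init__(self):
        self.listeners = {}  # event name -> list of callbacks

    def subscribe(self, event, callback):
        self.listeners.setdefault(event, []).append(callback)

    def publish(self, event, payload):
        # Every subscribed AI agent reacts (or ignores) as it chooses.
        for callback in self.listeners.get(event, []):
            callback(payload)

bus = EventBus()
heard = []
bus.subscribe("explosion", lambda payload: heard.append(payload))
bus.publish("explosion", {"pos": (3, 4)})
assert heard == [{"pos": (3, 4)}]
```

The decoupling matters: gameplay code publishes "explosion" once, and any number of AI agents can independently decide whether to flee, investigate, or bark.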

Last Known Position Director

A last known position system is often used as a shared bit of memory between all or some AI agents. When at least one AI agent sees a character on an enemy team for a set period, it can report that character's last known position to the director. All AIs can then access the last known position of any character. After a while, the last known positions registered within the director expire. This is similar to how many games like Assassin’s Creed and Far Cry coordinate all enemies’ responses to the player. Barks are usually used to ground this information sharing in the game world. For example, an AI agent might say, “I see something over there!” before reporting a player’s position to the last known position director.
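One possible sketch of a last known position director with expiring reports; the 30-second lifetime is an assumed value:

```python
import time

# Last-known-position sketch: a shared memory of sightings that expire.
# The 30-second lifetime is an assumed value.

class LastKnownPositionDirector:
    LIFETIME = 30.0  # seconds before a report expires

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.reports = {}  # target id -> (position, report time)

    def report(self, target, position):
        self.reports[target] = (position, self.clock())

    def query(self, target):
        entry = self.reports.get(target)
        if entry is None:
            return None
        position, seen_at = entry
        if self.clock() - seen_at > self.LIFETIME:
            del self.reports[target]  # expired: all agents forget it
            return None
        return position

now = [0.0]  # injectable fake clock so the sketch is deterministic
director = LastKnownPositionDirector(clock=lambda: now[0])
director.report("player", (12, 7))
assert director.query("player") == (12, 7)
now[0] = 31.0                        # 31 simulated seconds later
assert director.query("player") is None
```

Injecting the clock keeps the system testable; in-engine, the game's world time would be passed in instead.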

Measuring AI

We have talked a lot about designing and implementing AI and some of the systems that might be involved. I am sure that by now, we are well-covered on the technical systems design front. But how can we know if we’re doing well when we are producing our AI? There is a lot of work to be done before showing it to focus group testers and getting some feedback. And honestly, AI design is really a world of its own, so no one can blame us for being a bit lost.

Luckily, there are sets of very concrete questions we can ask ourselves at all stages of production to evaluate whether our AI is turning out well, although some of these questions do not have universal answers and will depend on your goals as a development team.

Early pre-production

The goal of pre-production for AI design and R&D is to discover a feasible implementation of AI within a production schedule. In addition to the questions mentioned in the Pre-production Summary above, some more complex ones need to be answered to measure if we are doing pre-production well:

  1. Can we reduce the level of complexity of our AI? Some people like Elon Musk recommend reducing the complexity of technical design by 10% every few months. Doing it up-front is an advantage.
  2. Are we reminding ourselves of the differences in perspective between players and AI developers? Are we not falling for the common misconceptions?

Late pre-production and production

In late pre-production, the R&D should be complete, and we should have a little bit of experience building the new AI systems with the team. So it’s an excellent time to evaluate how the systems are turning out on a technical and design level. Here are some of the questions we can ask:

  1. What is the technical debt ratio of our AI systems? It should be less than 10% if we calculate it as the ratio of remediation cost (estimated time to fix all defects) to redevelopment cost (estimated time to rewrite the entire system). In particular, the formula I propose is TechnicalDebtRatio = RemediationCost / RedevelopmentCost.
  2. Are the programmers given enough time to refactor AI systems? At least 20% of the time should be spent refactoring systems while iterating in a rapid prototyping way, as high-complexity AI systems can get out of control quickly.
  3. Are we allocating quality assurance and focus group resources for qualitative AI assessments? In other words, are we asking our players enough open-ended questions?
  4. Is our AI feature scope growth (number of features and feature requests now compared to the number of features last month) sustainable?
  5. How is the AI using relevant surrounding information? This includes stimuli sources, game mechanics, and objects of importance.
  6. Is the AI using direct and summary information as designed, and is this acceptable?
  7. Can the AI complexity be reduced? Perhaps it is possible to replace features with cheats? Is it possible to reduce the complexity of our systems without notably changing how they are perceived by the players?
  8. Is the AI acting predictably? Is its strategy obvious?
  9. How persistent is the AI? Is it not switching states or actions too often? Would a person change their mind as often as our AI does?
  10. How is our effort to implement hustling going?
  11. Have we built all debugging tools necessary for resolving AI bugs? How much extra time do we spend to fix issues after feature requests? Suppose we see that programmers spend more than 10% of their time fixing problems that pop up after feature requests. In that case, it is essential to consider how this will impact the production schedule and if it is a matter of insufficient debugging tools, expertise, high technical debt, or iterating too fast.

Quantitative AI assessments

It is already possible to run quantitative AI assessments with focus groups or quality assurance staff during the production period. Here are some questions we should ask our testers:

  1. How competent is AI at achieving the goals we have set up for it?
  2. What is the visible error rate? This can be measured in AI errors seen by the player per minute of playtime spent observing AI agents.
  3. What is the visible error severity? Error severity can be broadly categorized as follows:
    • Critical: “This AI agent freezes, crashes, or does not perform basic functions.”
    • High: “This AI agent is idiotic, but it can perform basic functions.”
    • Medium: “I thought the AI was going to do something better than what it did.”
    • Low: “I have an idea on how to improve the AI agent in this circumstance.”
    • Notable: “I know the AI agent made a mistake, but it was no less fun and still met my expectations.”


AI error of medium severity: “I thought the AI was going to do something better than what it did”.

On the other hand, it is also important to ask our play-testers open-ended, qualitative questions like:

  1. What are the areas where our AI fell short of industry-standard implementations?
  2. What was fun in interacting with the AI?
  3. What was confusing or unpredictable when interacting with the AI?

Conclusion

Whew! That was a lot. But I hope you found it helpful. We learned about starting with a strong video game AI design from pre-production, what AI features are generally well-received by players, what kind of tools we have as programmers to implement them, and how to quality-assure our artificial intelligence agents.

Compiling and reviewing this information took several months. If you liked this article and want to let us know it helped you, don’t forget to hold down the clap button for +50 claps and share it with your industry friends!

Further Reading

If you’d like to learn more about video game AI design and programming, a fantastic textbook series “Game AI Pro” by Steve Rabin can be a great continuation of this article. It discusses many of the topics touched on in this article but takes them to an entirely new level of technical depth.

#Cpp #Design #Game AI #Game Development #Longform #Unreal Engine