Mikelis' Game Blog

Designing, Building, Testing and Shipping AAA-Style Video Game AI

A massive conspectus of industry knowledge I’ve learned as a video game AI Programmer & Systems Designer.

Mikelis' Game Devlog


This article has been co-authored with Giacomo Salvadori, a game design professor at The Sign Academy. Here we talk about our personal insights, views, and opinions.


Designing and producing excellent video game AI systems can be a daunting task. We can all tell when a AAA game has great AI companions or enemies. Still, even if we ask senior game designers, it’s more likely than not that we will hear about one or two aspects but not a holistic description of “good” AI.

They might tell us that a great AI is believable, realistic, or challenging. But we can’t honestly call headcrabs or soldiers from Half-Life any of these things, and yet they are often praised as some of the best-designed AIs of their time. We might hear that fluency of motion, low error rates, and coherence with the world are essential. But while this works for companion AIs in games like The Last of Us, it’s hard to say that nuke-happy Gandhi in the Civilization games or geometry-clipping Minecraft NPCs make for any less fun AI.

Perhaps it is the fun factor that we should maximize? I think we’re on the right track, but it’s easier said than done! With a variety of video game genres on the market, “fun” is a somewhat subjective term. What is fun and engaging for one fanbase can be much less so for another. What is fun in focus tests is not necessarily fun in the market. And even AI that was critically acclaimed in one franchise title can meet lukewarm reviews in the next.

But I believe that some elements are shared among the most critically lauded AI designs, implementations, and production processes. And today, I will try to do my best and summarize them. There are probably more, but this is not meant to be a complete and exhaustive resource— it’s a beginner's guide.

Glossary

Some terms used in this article might not be familiar to some AI specialists, as some of this industry knowledge is not standardized. As such, I’m providing a glossary below:

  1. AI agent. The object encapsulating one AI instance in the game scene, including its brain, body, data, and processes. In Unreal Engine, this is often split up into a behavior tree, a blackboard, an AI controller, and a character. But as this article is aimed at a broader audience, and indeed this can be implemented as a single class, we consider it the AI agent.
  2. AI variables. AI variables are fields of data stored in the AI agent at run-time, helping it make decisions.
  3. Idiom, paradigm. Ways of approaching a problem.
  4. State machines. A computer science data structure made of states. At any time, a state machine can be executing one or more states; these are called active states. A state machine can transition from its set of active states to another set of states, which then become the active ones. The rule governing such a transition is known as a transition function or condition. State machines can nest other state machines inside them and execute them like an active state. Such a data structure is called a nested state machine.
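To make the state machine entry concrete, here is a minimal sketch in Python; the state and event names are invented for illustration:

```python
# Minimal finite state machine sketch; state and event names are invented.
class StateMachine:
    def __init__(self, initial, transitions):
        self.state = initial
        self.transitions = transitions  # maps (state, event) to the next state

    def handle(self, event):
        # Stay in the current state if no transition matches.
        self.state = self.transitions.get((self.state, event), self.state)

guard = StateMachine("Idle", {
    ("Idle", "player_seen"): "Chase",
    ("Chase", "player_lost"): "Search",
    ("Search", "timeout"): "Idle",
})
guard.handle("player_seen")
assert guard.state == "Chase"
```

A hierarchical version would simply let a state own an inner StateMachine and forward events to it while that state is active.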

Pre-production Decisions

Before any work starts on AI design, and before production schedules or GDDs are built around it, we need to make some essential decisions with our team.

1. AI Autonomy

AI autonomy, also known as the AI level of autonomy, is an industry term broadly describing how independently all of the AI agents in our game, or groups of them, make their decisions.

For every group of AIs (generally: filler NPCs, companion characters, enemies, or abstract AIs like the Left 4 Dead director or a chess AI), try to assign one of the following levels of autonomy (lowest to highest):

  1. Scripted AI. This is the lowest level of AI autonomy. Most commonly, companion AIs are heavily scripted, and they switch to a higher level of autonomy when the gameplay requires it. For example, you will find that most Half-Life 2 story AIs will either guide you through the story or fight members of the enemy team. Bagley in Watch Dogs: Legion is an even more scripted AI. All of its appearances are pre-recorded and coordinated with gameplay events.
  2. Rule-based AI. These AIs are rare in AAA titles but more common in visual novels and games like chess. When the AI recognizes a game state matching a particular set of rules, it will perform a specific action. For example, every time a chess game is in a given configuration, the AI will make the same chess move. This rule-outcome relationship is pre-programmed. Sometimes it is determined by structures like decision trees or Markov chains, or even ones produced with the help of neural nets. Rule-based AIs offer substantially more variety in the gameplay than scripted AIs, but still less variety than the average AAA player expects today.
  3. State-based AI. There are several names for this level of AI autonomy, but it always uses some form of state machine: a finite state machine, a hierarchical finite state machine, or a behavior tree. Often, these AIs will use several state machines. One of them could describe how they move (run, crouch, walk), and another could describe their actions (idle, perform a level-themed action, chase the player, talk with another AI, hide from the player, attack the player, be alert and look for the player, and similar). Another state machine could describe the AI’s gameplay mode (war, peace, diplomacy, anger, benevolence). Yet another state machine might describe the AI’s emotional state (happy, sad, crying, scared) and feed its state into the animation state machine for the face. It is common for AI companions to switch between a scripted mode and a state-based behavior model. With this level of autonomy, we can already see how the AIs become a lot more dynamic and unpredictable.
  4. Utility-based AI. Utility-based AIs are very common in tycoon-style games and appear in some RPGs and action titles as well. Utility AIs make decisions based on utility functions. For example, an AI in Bioshock might have a healing utility function (among many others) that takes its current health as an input. As health goes lower, the AI will consider healing to have more and more utility. Generally, a utility-based AI always switches to the task that it finds to have the highest utility value. With well-designed utility functions, it can be highly rational.
  5. Planning AI. This level of autonomy is already much rarer in the game industry, but it is still sometimes used. The primary trait of planning AI is that it can generate a plan to achieve a particular goal. For example, if a cim (citizen) in Cities: Skylines must traverse to a different part of a city, they might come up with a plan that looks like this: spawn on the sidewalk, spawn a car, drive to a train station, wait for a train, take a train, exit a train station, go to a bus stop, wait for a bus, take a bus, get off at a particular bus stop, and go to the final destination on foot, then despawn. While these AIs can strategize very well, they are often used as filler agents, like pedestrians in Grand Theft Auto-style games.
  6. Academic AI. This level of autonomy is the highest one we will mention. It involves AIs that are based on machine learning and are generally too unpredictable and unreliable to be used in video games at runtime. Very large or highly qualified teams can make them viable for games. Still, usually, these will be experimental titles. As a side note - academic AI might be used in game production to dynamically generate or upscale content like textures, sounds, videos, and similar. But offline use like this, where all AI-generated content will be vetted by game developers before it goes into the game, is very different from allowing AI to generate it when the game runs. Only the latter of these is commonly referred to as “game AI”.
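As a concrete illustration of the utility-based level described above, here is a minimal Python sketch of an agent choosing the highest-utility task. The tasks and utility curves are invented placeholders, not taken from any shipped game:

```python
# Utility-AI sketch: score every candidate task, then pick the best.
# The utility curves below are invented placeholders, not real game tuning.

def heal_utility(agent):
    # Healing becomes more attractive as health drops (1.0 at zero health).
    return 1.0 - agent["health"] / agent["max_health"]

def attack_utility(agent):
    # Attacking is attractive only while the agent is reasonably healthy.
    return 0.6 if agent["health"] > 0.3 * agent["max_health"] else 0.1

def choose_task(agent):
    tasks = {"heal": heal_utility, "attack": attack_utility}
    return max(tasks, key=lambda name: tasks[name](agent))

agent = {"health": 20, "max_health": 100}
# Health is low: heal utility 0.8 beats attack utility 0.1.
assert choose_task(agent) == "heal"
```

In a real game each task would have a tuned response curve, but the selection rule stays the same: take the task with the highest score.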

Low-autonomy AIs are generally favored by game development companies as of the time of writing this article. There are many tools for designing and implementing finite state machines, behavior trees, and scripted AI in games. Unless a project has many dedicated AI designers and programmers, it will often not use AIs of higher autonomy than utility-based.

On the other hand, high-autonomy AIs tend to be easier to scale. Iterating on scripted AIs is a very time-intensive process, while changing what a utility AI does usually only involves programmers or designers tweaking weights in its utility functions.

2. AI Complexity

AI complexity is another thing that is good to know early on in the pre-production cycle. This corresponds directly with how much time our team will spend building and designing the AI.

In most AAA games, AI complexity typically relates to the amount of time the player will spend interacting with the AI agents. Suppose the player will see an AI agent for no more than two minutes. In that case, a basic implementation may be good enough. Now suppose we are talking about companion AIs that accompany our player through the entire game. In that case, a much higher level of complexity and nuanced behavior should be considered.

Complexity also tends to increase with higher AI autonomy. Scripted AIs will always be less complex when they ship to the market than planning AIs. More complex AIs will require more iterations, more bug fixes, and better qualified AI programmers and designers. Be realistic when considering this. Remember that a lot of the industry does tend to choose lower autonomy AIs.

Do not fall into the assumption that a more complex AI is better. Studies have shown that players are not very good at identifying what strategies AIs employ. Hence, a simpler AI is often the smarter choice. When the team starts focus testing, make sure to ask the testers to describe how the AI thinks. You might be surprised to see that players tend to attribute more autonomy to an AI than it generally has.

Finally, it is not uncommon for us, the architects of game AI, to think our AI is simplistic while the players see it differently. In Halo: Combat Evolved, AI agents do not even have sight or hearing senses. They simply move to a location within range of the player and fire at characters on a different team. And this is not seen as bad AI by the players.

3. Sense-think-act

The sense-think-act paradigm is commonly used in game AI technical systems design, robotics, and many real-world AI applications. Early in pre-production, we should try to generate two lists of items — one for things that the AI will sense and one for actions that it should perform. Then, connect them with thoughts of our chosen level of autonomy and complexity.

Let’s quickly do a sample analysis of the sense-think-act pattern of an AI agent archetype for a Half-Life headcrab. The headcrab (rule-based low-complexity enemy AI) should sense when a player is near and perform a jumping attack action. Otherwise, it should wander on a navmesh. For thinking, it should distinguish the player by its character belonging to a different team. It should have a range within which it will jump to attack. Once the jump is triggered, it can succeed or fail based on whether the headcrab character gets close to the player during the action. After the jump is activated, there is a cooldown before another one can begin.
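The headcrab analysis above can be sketched as a single sense-think-act tick. The attack range and cooldown values here are assumptions for illustration:

```python
# Sense-think-act sketch for a headcrab-like agent.
# ATTACK_RANGE and COOLDOWN are assumed values, not Half-Life's tuning.

ATTACK_RANGE = 3.0   # jump when the player is this close
COOLDOWN = 2.0       # seconds between jump attacks

def headcrab_tick(distance_to_player, same_team, cooldown_left, dt):
    """Return the chosen action and the updated cooldown."""
    cooldown_left = max(0.0, cooldown_left - dt)   # think: track the cooldown
    if not same_team and distance_to_player <= ATTACK_RANGE and cooldown_left == 0.0:
        return "jump_attack", COOLDOWN             # act: attack, reset cooldown
    return "wander", cooldown_left                 # act: default navmesh wander

action, cd = headcrab_tick(distance_to_player=2.0, same_team=False,
                           cooldown_left=0.0, dt=0.016)
assert action == "jump_attack"
```

The "sense" inputs arrive as parameters, the "think" step is the condition check, and the "act" step is the returned action, which mirrors the matrix below.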


An example matrix for the sense-think-act paradigm of Half-Life headcrab AI.

A sense-think-act matrix is one of the most powerful communication techniques between AI programmers and AI designers. It clearly describes both what kind of experience the AI is supposed to give to the player, as well as a concrete list of items or deliverables to be implemented. But even only using it as a design tool can help us stay organized.

Demonstration of Half-Life headcrabs.

4. Direct and Summary Variables

Most game AI agents will react to gameplay rules and the world around them in two principal ways: directly and through an overall understanding of the game and their own state. For this, we need to get a bunch of data into our AI agents. From a data model perspective, we store immediate information (AI senses like touch, sight, and hearing) in direct variables, and general information (AI traits, awareness of enemies, health, tiredness, awareness of interactions in the world, and others) in summary variables.

Before finalizing the AI design, defining what direct and summary variables our AI will have is crucial. Direct variables tend to be more involved with the “sense” part of sense-think-act, and summary variables tend to modify the thinking and actions of the AI brain.

For example, an AI enemy might have the capacity to know when the player is near. In this case, we might choose to store a reference to the player’s character as a direct variable in the AI agent — a simple pointer to a player object is fine. With this information, the AI can think and act based on how the distance between the characters changes.

On the other hand, we might have a stamina summary variable for the AI with a maximum bound to which it returns over time. But as the AI chases the player, it depletes. And when it reaches zero, the AI will end the chase. Notice that this variable summarizes an aspect of the game instead of simply and directly referring to an object in memory. Scalar summary variables can also be used the other way around — by increasing to trigger certain actions rather than decreasing. For example, many games like Far Cry or Assassin’s Creed will have a player awareness summary variable that will go from 0 to a high bound as the player stays in AI’s field of view, only triggering a chase after this summary variable reaches its high bound.
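The stamina example can be sketched as a small update function; the drain and recovery rates are assumed values:

```python
# Stamina summary-variable sketch. DRAIN and RECOVER are assumed values.

STAMINA_MAX = 100.0
DRAIN = 20.0     # stamina per second while chasing
RECOVER = 10.0   # stamina per second otherwise

def update_stamina(stamina, chasing, dt):
    if chasing:
        stamina = max(0.0, stamina - DRAIN * dt)
    else:
        stamina = min(STAMINA_MAX, stamina + RECOVER * dt)
    keep_chasing = chasing and stamina > 0.0  # the chase ends at zero stamina
    return stamina, keep_chasing

stamina, chasing = STAMINA_MAX, True
ticks = 0
while chasing:                      # chase until the summary variable drains
    stamina, chasing = update_stamina(stamina, chasing, 1 / 60)
    ticks += 1
assert stamina == 0.0               # fully drained...
assert abs(ticks / 60 - 5.0) < 0.1  # ...after roughly 5 seconds of chasing
```

The same shape (a clamped scalar plus a threshold) also covers the awareness variable described next, just with the trigger at the high bound instead of zero.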

Summary variables are a way to introduce persistence into an AI, which is to say, they enable it to organically change its behavior based on circumstances that persist over time, whereas direct variables tend to trigger relatively immediate actions. Direct variables tend to point directly to other objects in the game space, while summary variables describe an abstract aspect of the game that can be influenced by many of the game’s elements. However, in some cases, the distinction can be a bit blurry, so it can be helpful to look at the duality as a spectrum.

4.1. A Note on Summary Variables

Summary variables are powerful for creating an illusion of complex AI when combined with random-walk and genetic algorithms. They are a fantastic tool for building more well-rounded AI agents without expanding the scope of work and the team competency requirements typically associated with complex AIs.

For example, suppose an awareness bar is shown for each enemy AI agent. We can make awareness go up and down at different speeds for all AIs by introducing a random deviation of up to 3% of the current awareness value every second. To the player, this will appear as if the AIs have different personalities or there are complex factors affecting their awareness change. In reality, it’s just a random-walk algorithm we implemented in 20 minutes.
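A minimal version of that random-walk deviation might look like this (the 3% bound follows the text; everything else is an assumption):

```python
import random

# Random-walk sketch: nudge awareness by up to 3% of its current value
# once per simulated second. Pure noise that reads as personality.

def wander_awareness(awareness, rng):
    deviation = rng.uniform(-0.03, 0.03) * awareness
    return max(0.0, awareness + deviation)  # never drop below zero

rng = random.Random(42)  # seeded so the sketch is reproducible
values = [50.0]
for _ in range(10):      # ten one-second steps
    values.append(wander_awareness(values[-1], rng))
# Every step stays within the 3% bound of the previous value.
assert all(abs(b - a) <= 0.03 * a + 1e-9 for a, b in zip(values, values[1:]))
```

Each enemy would own its own generator (or seed), so their bars drift apart over time.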

Genetic summary variables allow for quick customization of AI agents based on hashes pre-set by game designers or generated dynamically when the play session begins. For example, we can configure the following summary variables using a numeric hash: height, speed, awareness decay rate, stamina, weapons in possession, and similar. To the unsuspecting player, this will make it seem like we have built several AI agent archetypes, but all we did was implement another easy algorithm. Moreover, if we use normal distribution probability curves in our genetic algorithm, the subtle changes in AI agent configurations will build an overall more diverse and genuine-feeling set of enemies.
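Here is one possible sketch of such a genetic configuration, seeding a random generator with the hash so the same hash always yields the same agent. The stat names, means, and deviations are invented:

```python
import random

# Genetic summary-variable sketch: derive an agent's stats from one numeric
# hash, using normal distributions so most agents cluster near the mean.
# All stat names, means, and deviations below are invented.

def configure_agent(seed_hash):
    rng = random.Random(seed_hash)  # same hash always yields the same "genes"
    return {
        "height": rng.gauss(1.80, 0.07),          # meters
        "speed": rng.gauss(4.5, 0.4),             # meters per second
        "awareness_decay": rng.gauss(10.0, 1.5),  # points per second
        "stamina": rng.gauss(100.0, 12.0),
    }

a = configure_agent(1337)
assert a == configure_agent(1337)   # deterministic: the hash defines the agent
assert a != configure_agent(7)      # a different hash yields a different agent
```

Because the configuration is a pure function of the hash, designers can pin a memorable enemy by saving a single integer.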

For a better illustration, here is an example of a set of AI summary and direct variables for a Far Cry-like enemy:

In particular, let’s have a look at how a player awareness summary variable could work. Over time, it could decay naturally, and it would go up only when the player is seen. Once a threshold value is reached, the AI would go from an idle state into an attack state. However, in addition to this, we can add an arbitrary value to it at random times. This would make it seem to the player that there is more going on to the awareness summary variable calculation than there actually is.


The awareness indicators can vary their decay rate to simulate complexity.

Pre-production Summary

To recap, in the pre-production of game AIs, we should consider doing the following:

  1. Thinking about AI autonomy: What kind of game is this? How often will the player see or interact with the AI, and for how long? What is the lowest level of autonomy we can use?
  2. Thinking about AI complexity: Is it possible to build fewer AI agent archetypes and distill them into their essential features?
  3. Thinking about AI senses and actions: What senses will the AI agent possess? What actions will they be able to perform? What is the cardinality or order of actions that it will do?
  4. Thinking about AI thinking: How will we connect the AI senses and actions with our chosen level of autonomy?
  5. Thinking about the direct and summary variables the AI will use: What direct and generic information is relevant to the AI? How will it affect the AI agent’s decision-making?
  6. Thinking about the big picture: What can be cut? What can be replaced with cheating? What can be simplified? Is the autonomy, number of senses and actions, and overall complexity appropriate for our team size? Look at other games and team sizes if necessary.

By now, we have discussed a series of significant decisions that will impact our production timelines, the scope of work, team competency requirements, and even several dirty tricks on how to do more with what we have.

The following section will provide a considerable overview of different design tools at our disposal to make AI generally liked by our players.

Good Game AI

Elusive as “good” game AI is, we have some clues on what is generally appreciated by players and what isn’t necessarily so. But before we dive into that, let’s start by quickly getting some common misconceptions about game AI out of the way.

Common Misconceptions


Players will attribute randomness to strategy.

Generally Liked AI Features


AI with good persistency spends a reasonable amount of time in each state. It does not switch states too often.


Latency: great AIs have reaction times.


A pre-defined AI patrol path is a very evident strategy from which the player can hide in interesting, safe spaces.


You know what to do — this doesn’t require an explanation.


AI using the same resources as the player in the game world.

Throughout Left 4 Dead 2, the survivors constantly talk about the story and gameplay events happening around them.


This AI has a token to attack the player while others must wait.

Anthropomorphisation


Emotional AI communicating through barks — a common way to anthropomorphize AI.

Anthropomorphisation is the effort through which we make AI appear more human. It is by far the most effective way to build rapport between AIs and the player. There are many tools at our disposal to do so; below are some examples:

It is important to note that anthropomorphized AIs are always more relatable and generally more liked. But they might not be considered realistic; realism is more a by-product than the purpose of anthropomorphization, which is instead building rapport and a meaningful connection between the player and the AI agent.

Hustling

Hustling is the AI's ability, and its deliberate choice, to cheat in order to put the player at an advantage.


You can think of it as bad aim, but really it’s great hustling!

Attack tokens are a common way enemy AIs will hustle when attacking the player. Usually, the AI agent that is closest to the player, more vulnerable to the player, or one that hasn’t taken damage in a long time is the one that gets assigned the attack token. Then, only this AI agent (or in some cases, several) can attack the player until the token is reassigned after a while. Sometimes, if the player is doing poorly, no enemy might get an attack token, or one enemy AI might get it but be instructed to take more damage from the player or cause less damage to them. This AI behavior is frequently seen in the Assassin’s Creed and Far Cry game series.
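A minimal attack-token assignment could look like the sketch below, using "closest to the player" as the only heuristic; real games blend several of the criteria mentioned above:

```python
# Attack-token sketch: only the chosen agent may attack the player.
# "Closest enemy wins" is a simplification of the real heuristics.

def assign_attack_token(enemies, player_pos):
    """Give the token to the enemy closest to the player; others must wait."""
    def dist(enemy):
        return abs(enemy["pos"] - player_pos)  # 1-D positions keep it short
    holder = min(enemies, key=dist)
    for enemy in enemies:
        enemy["has_token"] = enemy is holder
    return holder

enemies = [{"name": "a", "pos": 10.0}, {"name": "b", "pos": 4.0}]
holder = assign_attack_token(enemies, player_pos=0.0)
assert holder["name"] == "b"
assert enemies[0]["has_token"] is False
```

Reassigning the token on a timer, or skipping assignment entirely when the player is struggling, produces the mercy behavior described above.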

It’s not trivial to hide AI agents' attempts at hustling, but here are some starting tips that we can use to make an AI enemy hustle:

Suppose the AI agent is a companion of the player. In that case, the AI is expected to be invincible, or it might bark about being hurt or running out of ammo but not actually suffer any ill effects.

Liked AI Features Summary

This section has elaborated on many generally (and nearly universally) liked AI features that we can employ in our game designs, along with some common examples of their application.

The following section will discuss some of the AI systems that could inform our technical design.

Game Systems

Building a triple-A-style AI involves many game systems. And as we have seen, the scope of these projects can be daunting and extensive. So in this section, I want to discuss some of the game systems we can use in our technical AI systems design to make our work easier. Just in case we have not thought about using these tools yet.

Hierarchical Finite State Machines

Hierarchical finite state machines (HFSMs) are nested state machines, often authored as visual graphs, that generally take care of the most difficult part of the sense-think-act paradigm: the “think” step. The AI agent typically owns an HFSM and feeds information about what happens in the world into its buffers or sinks. The state machine then checks these sinks in its transition functions/conditions and calls actions.

Depending on the purpose of the state machine, it might be helpful to define individual states as action blocks that can be reused in an abstract and encapsulated way. For example, some AI action blocks might be: Go To World Location, Attack Character, and Play Action Sequence (for purpose). Scripts built of these action blocks can be nested in more general state machines denoting an overall AI state, like Idle, Sleep, Investigate, and similar.
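The action-block idea can be sketched as scripts built from reusable blocks; the block and state names follow the examples in the text, while the wiring is an invented simplification:

```python
# HFSM action-block sketch: reusable blocks nested inside top-level states.
# Block and state names follow the article's examples; the wiring is invented.

def go_to(location):
    return f"moving to {location}"

def play_action(purpose):
    return f"playing {purpose} animation"

# Each top-level state is a script: an ordered list of action blocks.
STATES = {
    "Investigate": [lambda ctx: go_to(ctx["last_known_pos"]),
                    lambda ctx: play_action("look around")],
    "Idle":        [lambda ctx: play_action("idle fidget")],
}

def run_state(name, ctx):
    return [block(ctx) for block in STATES[name]]

steps = run_state("Investigate", {"last_known_pos": "warehouse door"})
assert steps == ["moving to warehouse door", "playing look around animation"]
```

Because the blocks are encapsulated, the same Go To or Play Action block can appear in any number of states without duplication.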

But hierarchical finite state machines can handle more than just the “think” responsibility of the sense-think-act pattern. They can control character movement, emotions, animations (by default, Unreal Engine already handles character animations with state machines), and many other things. It is not uncommon for an AI agent to be running multiple state machines concurrently, creating a robust overall AI agent experience.

Many Unity and Unreal Engine plugins implement this functionality, as HFSMs are a common way to implement AI brains in AAA games.

Token Director

A token director is a game system that issues and withdraws various tokens that AIs may need in order to unlock and perform states and actions. For example, if the AI wishes to heal, it might request a health pick-up token, and one would be issued to it by the token director. Then the AI might perform an action involving navigating to the health pick-up and consuming it.
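A token director might be sketched as follows, with a configurable limit per token type (the limit values are assumptions):

```python
# Token-director sketch: a central system limits how many agents may hold
# each token type at once. The limit values are assumed for illustration.

class TokenDirector:
    def __init__(self, limits):
        self.limits = limits  # e.g. {"health_pickup": 1}
        self.holders = {}     # token type -> set of holding agent ids

    def request(self, agent_id, token):
        held = self.holders.setdefault(token, set())
        if len(held) < self.limits.get(token, 0):
            held.add(agent_id)
            return True       # token issued
        return False          # agent must wait or pick another plan

    def release(self, agent_id, token):
        self.holders.get(token, set()).discard(agent_id)

director = TokenDirector({"health_pickup": 1})
assert director.request("ai_1", "health_pickup")        # issued
assert not director.request("ai_2", "health_pickup")    # at the limit
director.release("ai_1", "health_pickup")
assert director.request("ai_2", "health_pickup")        # now free again
```

The same structure covers attack tokens: set the limit to one (or a few) and reassign periodically.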

Event Manager, Director or Messaging System

These are systems that inform all AI agents of what is going on in gameplay or the world so that they can react as they choose. You are invited to use my own event messaging plugin for Unreal Engine for this purpose or just to see how this system can work.
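A generic publish/subscribe sketch of such a messaging system follows; this is an illustration, not the Unreal plugin mentioned above:

```python
# Minimal publish/subscribe sketch of an AI event messaging system.

class EventBus:
    def __init__(self):
        self.listeners = {}  # event name -> list of callbacks

    def subscribe(self, event, callback):
        self.listeners.setdefault(event, []).append(callback)

    def publish(self, event, payload):
        # Every subscribed AI agent reacts (or ignores) as it chooses.
        for callback in self.listeners.get(event, []):
            callback(payload)

bus = EventBus()
heard = []
bus.subscribe("explosion", lambda payload: heard.append(payload))
bus.publish("explosion", {"pos": (3, 4)})
assert heard == [{"pos": (3, 4)}]
```

The decoupling matters: gameplay code publishes "explosion" once, and any number of AI agents can independently decide whether to flee, investigate, or bark.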

Last Known Position Director

A last known position system is often used as a shared bit of memory between all or some AI agents. When at least one AI agent sees a character on an enemy team for a set period, it can report that character's last known position to the director. All AIs can then access the last known position of any character. After a while, the last known positions registered within the director expire. This is similar to how many games like Assassin’s Creed and Far Cry coordinate all enemies’ responses to the player. Barks are usually used to ground this information sharing in the game world. For example, an AI agent might say, “I see something over there!” before reporting a player’s position to the last known position director.
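One possible sketch of a last known position director with expiring reports; the 30-second lifetime is an assumed value:

```python
import time

# Last-known-position sketch: a shared memory of sightings that expire.
# The 30-second lifetime is an assumed value.

class LastKnownPositionDirector:
    LIFETIME = 30.0  # seconds before a report expires

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.reports = {}  # target id -> (position, report time)

    def report(self, target, position):
        self.reports[target] = (position, self.clock())

    def query(self, target):
        entry = self.reports.get(target)
        if entry is None:
            return None
        position, seen_at = entry
        if self.clock() - seen_at > self.LIFETIME:
            del self.reports[target]  # expired: all agents forget it
            return None
        return position

now = [0.0]  # injectable fake clock so the sketch is deterministic
director = LastKnownPositionDirector(clock=lambda: now[0])
director.report("player", (12, 7))
assert director.query("player") == (12, 7)
now[0] = 31.0                        # 31 simulated seconds later
assert director.query("player") is None
```

Injecting the clock keeps the system testable; in-engine, the game's world time would be passed in instead.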

Measuring AI

We have talked a lot about designing and implementing AI and some of the systems that might be involved. I am sure that by now, we are well-covered on the technical systems design front. But how can we know if we’re doing well when we are producing our AI? There is a lot of work to be done before showing it to focus group testers and getting some feedback. And honestly, AI design is really a world of its own, so no one can blame us for being a bit lost.

Luckily, there are sets of very concrete questions we can ask ourselves at all stages of production to evaluate whether our AI is turning out well, although some of these questions do not have universal answers and will depend on your goals as a development team.

Early pre-production

The goal of pre-production for AI design and R&D is to discover a feasible implementation of AI within a production schedule. In addition to the questions mentioned in the Pre-production Summary above, some more complex ones need to be answered to measure if we are doing pre-production well:

  1. Can we reduce the level of complexity of our AI? Some people like Elon Musk recommend reducing the complexity of technical design by 10% every few months. Doing it up-front is an advantage.
  2. Are we reminding ourselves of the differences in perspective between players and AI developers? Are we not falling for the common misconceptions?

Late pre-production and production

In late pre-production, the R&D should be complete, and we should have a little bit of experience building the new AI systems with the team. So it’s an excellent time to evaluate how the systems are turning out on a technical and design level. Here are some of the questions we can ask:

  1. What is the technical debt ratio of our AI systems? It should be less than 10% if we calculate it as the ratio of remediation cost (estimated time to fix all defects) to redevelopment cost (estimated time to rewrite the entire system). In particular, the formula I propose is TechnicalDebtRatio = RemediationCost / RedevelopmentCost.
  2. Are the programmers given enough time to refactor AI systems? At least 20% of the time should be spent refactoring systems while iterating in a rapid prototyping way, as high-complexity AI systems can get out of control quickly.
  3. Are we allocating quality assurance and focus group resources for qualitative AI assessments? In other words, are we asking our players enough open-ended questions?
  4. Is our AI feature scope growth (number of features and feature requests now compared to the number of features last month) sustainable?
  5. How is the AI using relevant surrounding information? This includes stimuli sources, game mechanics, and objects of importance.
  6. Is the AI using direct and summary information as designed, and is this acceptable?
  7. Can the AI complexity be reduced? Perhaps it is possible to replace features with cheats? Is it possible to reduce the complexity of our systems without notably changing how they are perceived by the players?
  8. Is the AI acting predictably? Is its strategy obvious?
  9. How persistent is the AI? Is it not switching states or actions too often? Would a person change their mind as often as our AI does?
  10. How is our effort to implement hustling going?
  11. Have we built all debugging tools necessary for resolving AI bugs? How much extra time do we spend to fix issues after feature requests? Suppose we see that programmers spend more than 10% of their time fixing problems that pop up after feature requests. In that case, it is essential to consider how this will impact the production schedule and if it is a matter of insufficient debugging tools, expertise, high technical debt, or iterating too fast.

Quantitative AI assessments

It is already possible to run quantitative AI assessments with focus groups or quality assurance staff during the production period. Here are some questions we should ask our testers:

  1. How competent is AI at achieving the goals we have set up for it?
  2. What is the visible error rate? This can be measured in AI errors seen by the player per minute of playtime spent observing AI agents.
  3. What is the visible error severity? Error severity can be broadly categorized as follows:
    • Critical: “This AI agent freezes, crashes, or does not perform basic functions.”
    • High: “This AI agent is idiotic, but it can perform basic functions.”
    • Medium: “I thought the AI was going to do something better than what it did.”
    • Low: “I have an idea on how to improve the AI agent in this circumstance.”
    • Notable: “I know the AI agent made a mistake, but it was no less fun and still met my expectations.”


AI error of medium severity: “I thought the AI was going to do something better than what it did”.

On the other hand, it is also important to ask our play-testers open-ended, qualitative questions like:

  1. What are the areas where our AI fell short of industry-standard implementations?
  2. What was fun in interacting with the AI?
  3. What was confusing or unpredictable when interacting with the AI?

Conclusion

Whew! That was a lot. But I hope you found it helpful. We learned about starting with a strong video game AI design from pre-production, what AI features are generally well-received by players, what kind of tools we have as programmers to implement them, and how to quality-assure our artificial intelligence agents.

Compiling and reviewing this information took several months. If you liked this article and want to let us know it helped you, don’t forget to hold down the clap button for +50 claps and share it with your industry friends!

Further Reading

If you’d like to learn more about video game AI design and programming, a fantastic textbook series “Game AI Pro” by Steve Rabin can be a great continuation of this article. It discusses many of the topics touched on in this article but takes them to an entirely new level of technical depth.

#Cpp #Design #Game AI #Game Development #Longform #Unreal Engine