Latest official blog posts

Recurring Nightmares

12th August 2019

132-0.jpg

Almost every project I've worked on has a bit of code that always causes chronic problems. Some code gets written and it seems to work well in development and testing. Then someone on the team creating real world data causes it to malfunction.

So then the programmers fix the issue, and everyone continues on their merry way until some different real world data causes it to malfunction.

This cycle continues until the game ships. I'm not sure this problem comes from poor development - it's just that code can be so complex, no one can envision all the ways it will be used or all the ways it can break.

I've even seen this happen across multiple projects that reuse a game engine. A feature might be used flawlessly on one game, and unchanged, it breaks on the next game in development because it's being used differently or with different requirements.

I've seen this happen across all fields. Graphics, physics, collision, AI, audio, animation, controller input, even movie playback!

Sometimes, once true real world requirements of a system are known, a full rewrite helps the chronic bugs. Or eventually all the edge cases are discovered. Sometimes not.

Unfortunately I've got an issue like this now. A while back I decided I wanted to support arbitrary placement of objects. Any shape, any rotation, with potential intersections, added and removed dynamically to the play field. I ended up implementing an algorithm from a paper called Fully Dynamic Constrained Delaunay Triangulations. It's a bit complicated, but was fun to implement and works really well. The paper covers all the cases needed to be robust, but there are really tricky issues that come up.

Right now it handles pretty much whatever I throw at it. It makes nice big triangular maps of pathable space that I can run A-Star on for pathfinding, and the map can be analyzed quickly to allow different sized units.

Here's some pretty pictures of it, and some test paths that have a non-zero unit radius.

132-1.jpg

132-2.jpg

So every few weeks, the random level generator ends up placing objects in configurations that break my code. Arghghgh. Sometimes it breaks building a map. Sometimes it breaks when quitting and a specific object is removed. The worst is if it breaks when I place an object manually, because it's really hard to find and recreate the exact placement that caused the issue.

And so I stop whatever task I was doing, and start debugging the triangulation code. This is not easy - some maps have 20,000 objects, and 175,000 triangles in the pathing mesh. I end up doing a mix of visual and code debugging. By drawing the pathing mesh at the point the error occurs, as well as before and after, and stepping slowly through the code, I can figure out what's causing the bug. It usually takes me several hours to determine the problem. Sometimes more. Finding a quality fix is typically hard. So I take a break, sleep the night, and generally have a good idea for a solution in the morning. Implement and test.

Then I wonder, "Okay, what was I doing before I ran into this bug??"

For this triangulation system, the culprit is floating point math. Every time. The algorithm is good. I haven't had to change the major details of it a single time since I got the initial implementation working. But because math on a computer is an approximation using a fixed number of bits, math that works out on paper does not always behave the same way on a computer.

For example, one of my issues dealt with computing the circumcircle of a triangle. The algorithm just didn't work at far distances from the origin until I wrote the math three different ways to find the most numerically stable and accurate implementation. On paper the math should have resulted in exactly the same result for all three methods!
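As an illustration only, here is a minimal sketch of one commonly used formulation: translate the triangle so one vertex sits at the origin before doing any multiplication, so the intermediate values stay small even when the triangle is far from the world origin. This is an assumption about what a stable version can look like, not necessarily any of the three versions described above; `Vec2` and `circumcenter` are hypothetical names.

```cpp
#include <cmath>

struct Vec2 { double x, y; };

// Circumcenter and radius of triangle (a, b, c), computed in coordinates
// relative to vertex a. Returns false when the triangle has zero area.
bool circumcenter(Vec2 a, Vec2 b, Vec2 c, Vec2& center, double& radius)
{
    // Translate so that a is the origin. The differences stay small even
    // when the absolute coordinates are large.
    const double bx = b.x - a.x, by = b.y - a.y;
    const double cx = c.x - a.x, cy = c.y - a.y;

    // d is four times the signed area of the triangle.
    const double d = 2.0 * (bx * cy - by * cx);
    if (d == 0.0) return false;

    const double b2 = bx * bx + by * by;   // squared length of edge a->b
    const double c2 = cx * cx + cy * cy;   // squared length of edge a->c

    // Offset of the circumcenter from vertex a.
    const double ux = (cy * b2 - by * c2) / d;
    const double uy = (bx * c2 - cx * b2) / d;

    radius = std::sqrt(ux * ux + uy * uy);
    center = { a.x + ux, a.y + uy };       // translate back to world space
    return true;
}
```

The same math written with absolute coordinates multiplies large numbers together and then subtracts them, which is where precision tends to get lost.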

Another issue arose because I was testing a large circle against a point which lay exactly on its perimeter, but the test failed because of a lack of numerical precision. I've also had failures due to nearly degenerate triangles. And other crazy things that are hard to describe concisely.

I'm pretty sure some of the worst bugs to fix properly in my programming career are due to floating point imprecision. We have a long history, and we are not friends.

When I started making games professionally the fix would be to test using an epsilon. For example instead of

if (x == 0.0) { ... }

I would write something like

if (abs(x) < 0.00001) { ... }

In the right case this can be good. But in most cases it is very bad, because without the right epsilon, and without knowing what x is and always will be, you are potentially creating false positives in addition to fixing the original problem. I avoid this whenever possible.
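To make the false positive risk concrete, here is a small sketch comparing a fixed absolute epsilon with a tolerance that scales with the values being compared. The function names are hypothetical, and the relative test is only one common alternative, not a fix for every case (it is not useful when comparing against exactly zero, for instance).

```cpp
#include <algorithm>
#include <cmath>

// Absolute test: only meaningful when the magnitude of a and b is known.
bool nearlyEqualAbs(double a, double b, double eps)
{
    return std::fabs(a - b) < eps;
}

// Relative test: the tolerance scales with the larger of the two magnitudes.
bool nearlyEqualRel(double a, double b, double relEps)
{
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= relEps * scale;
}
```

With an epsilon of 0.00001, the absolute test reports 0.000001 and 0.000002 as equal even though one is twice the other, while it reports 1.0e9 and 1.0e9 + 1.0 as different even though they agree to nine digits.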

My go-to solution now is to use geometry analysis to determine an answer that needs high precision. Can I make an inference of the result using vertices, edges, and faces, and their relation to each other? Can I write the algorithm to be fault tolerant of values slightly under or over the desired one? If not, can I rewrite the math such that I'm never using values orders of magnitude away from each other?

Having fixed so many small cases - about one a month - I do consider going back to a simple grid for pathfinding. But I purposefully chose this route as the most flexible. I do wonder if I'll ever get this piece of code to be fully stable. At least it works today and it generates lovely paths for units to follow.

132-3.jpg

Only time will tell if I get all the kinks worked out - at least until the next project that uses it in a different way.


Art Test

30th January 2019

I've been tired of looking at grey flat land with test objects, so I spent a bit of time doing some art tests. It's hard for me to have any feelings about a game without visuals. Yes, the gameplay can be in place, and it might be fun, but for me it misses something undefinable without the visuals - partly charm, warmth, completeness, but something more as well.

At the same time, since I'm still very much in a prototype phase of development, I don't want to spend days on each art asset, or hours making shaders.

So I arrived at a prototype art style that is certainly subject to change, but for now gives me something more concrete to build on where I can add detail later.

It looks like this:

131-0.jpg

So now I can quickly model general shapes, play with color and shading, without spending a ton of time getting high resolution assets built.

A bunch of other things had to be done to make this happen. I had to start a proper terrain generation system specific to the game that incorporates gameplay features instead of just being specific to terrain. This was fairly quick to implement due to previous work.

I also had to rewrite the shadowing system I had from Banished - previously the camera couldn't look up to the horizon, but now I'm allowing a much more flexible camera, so I need to render to the horizon and handle shadows in any configuration. Not hard, but a bit of a change from a single shadow map to cascaded shadow maps.

Also due to the camera changes I now have some new rendering systems to handle more objects and need to create LOD objects for things far away - another thing that I didn't have in Banished. And another reason to have a quick prototype art style. Each asset now needs multiple models for close, mid, and far view distances.

I'm allowing a large zoom out distance, so you can look over a fairly large area. Here's the same terrain as the previous image from far away.

131-1.jpg

Obviously it still needs ocean and sky - but usually I'm testing close up in the middle of the island looking down, so it's not an immediate need.

I may end up with something in-between this style and Banished's style, or something completely different. Or I may stick with it - I've got plans for far more assets than the previous game required, so faster art creation might be a really good thing.


Game Code Design

9th November 2018

I don't quite know how to write game code properly. It's a bit of a mystery because I'm still new at it.

My experience working professionally on games for the last 17 or so years has mostly been writing systems. My main job used to be writing graphics engines for consoles, but I also wrote collision, physics, and tool chains for content creators. And because part of my job was getting the entire game engine running on consoles, I had exposure to the audio, networking, i/o, asset management, and input systems. I like systems. They all have definable expected results.

Despite the complexity of any one of the major systems in a game engine, it mostly takes some data, does some stuff to it, and outputs a result. The systems make the artwork appear on screen, make the physics behave correctly and perform well, make the audio sound just so. Given requirements for what you want a system to do, it's usually easy(ish) to design and implement. Especially when there are clear lines of separation between systems.

Before starting my own games, I never saw the game code. It was someone else's job to implement it. Sure I knew the basics of what was in the code - state machines, entities, various channels of communication, and scripts that made things happen. And there was also AI code, responding to input, user interface code, and more. And it was all very tightly tied together. And the games tended to be iterative - after initial implementation, features got tweaked until the game shipped.

How do you organize that sort of code and make it easy to refactor? And is it possible to keep the entire game design in mind when implementing things initially?

I'd never done it before, so for Banished, the answer was no. The design of my last game was iterative until near completion, so I just kept adding things to the code base as needed without an overall plan. Things could have been implemented in a cleaner and more productive way, had I been able to stand back and look at the whole picture. Sure I occasionally refactored things when it got hard or messy to continue forward, but a full rearchitect of the game level code wasn't in the cards.

So now that I get to mostly start over, I'm trying to take a more systems level approach to the game code. And while my design isn't fully written out and all the little details aren't set, I know the overall shape and size of the game's features. Luckily, since the new project I'm working on is similar to Banished in that there's indirect control over some people, I can take a lot of what I learned and apply it.

Entities

I had the concept of entities in Banished. In other game engines these are known as objects, pawns, actors, things, etc. It's basically just something that does something. It could be the player character, a torch, a chest with treasure, a tree, or a monster manager that is spawning zombies. But by itself, it does nothing. In my implementation, I add components to entities which add functionality - a model to display, audio to play, an AI and movement controller, and many other things. Basically each component gets a chance to do something on creation, update, and removal.
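A minimal sketch of that idea, assuming a simple virtual interface: the entity itself does nothing except forward creation, update, and removal to whatever components were added. All class and method names here are hypothetical and are not taken from the Banished code base.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// A component gets a chance to act on creation, update, and removal.
class Component {
public:
    virtual ~Component() = default;
    virtual void onCreate() {}
    virtual void onUpdate(double /*dt*/) {}
    virtual void onRemove() {}
};

// An entity has no behavior of its own; it only forwards to its components.
class Entity {
public:
    void add(std::unique_ptr<Component> c) { components_.push_back(std::move(c)); }
    void create()          { for (auto& c : components_) c->onCreate(); }
    void update(double dt) { for (auto& c : components_) c->onUpdate(dt); }
    void remove()          { for (auto& c : components_) c->onRemove(); }
private:
    std::vector<std::unique_ptr<Component>> components_;
};

// Example component that records the lifecycle events it receives.
class LogComponent : public Component {
public:
    explicit LogComponent(std::vector<std::string>& log) : log_(log) {}
    void onCreate() override         { log_.push_back("create"); }
    void onUpdate(double) override   { log_.push_back("update"); }
    void onRemove() override         { log_.push_back("remove"); }
private:
    std::vector<std::string>& log_;
};
```

A model, audio, or movement component would follow the same shape as `LogComponent`, each overriding only the hooks it needs.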

I've kept this for my new project as it's one of the things I got right - but I'm extending it further. In Banished there were a lot of things that weren't entities - the terrain, the sunlight, camera, object selections, object placement, player toolbars, the map data, clock, menus, minimap, and the weather system. And they all required extra manual code (and sometimes repeated code!) to use them in a bunch of places. (If you're clever you'll notice those are the things that are global to the game, and there's only one of them at a time.)

In my new project things like that are now entities as well, since they only do things on creation, update and removal. Having a unified system for all game objects also makes writing other things easier, like save games, since everything fits into the same mold. I also had separate game loop code for loading screens vs the main menu vs the main game. Now it's just one game loop that can do anything based on the entities used.

Hardcoded Data

Early on in Banished's development I got things working too quickly. Things like professions and types of raw material were coded in C++, rather than configured as data. As you can imagine, late in the game adding a new profession or item type was painful! I had to touch many source files and make sure everything worked and nothing broke. I eventually made professions configurable through data, but the item types were so ingrained in the code that changing them to data instead of code was a task I didn't want to take on.

This is not a mistake I'll make again - any game concept is now made generic and configurable. My rule of thumb is if I think of creating variable names with nouns or descriptors, (Pot, Clay, Bronze, Edible, etc), it's probably something that should be data, not code.
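As a rough sketch of that rule of thumb, item types and their descriptors can be loaded from text instead of being written as C++ identifiers. The one-line-per-item format, the tag names, and the functions `loadItemTypes` and `hasTag` are all assumptions made up for this example.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>

struct ItemType {
    std::set<std::string> tags;   // descriptors such as "Edible"
};

// Parses text where each line is an item name followed by zero or more tags.
std::map<std::string, ItemType> loadItemTypes(const std::string& text)
{
    std::map<std::string, ItemType> types;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream words(line);
        std::string name, tag;
        if (!(words >> name)) continue;      // skip blank lines
        ItemType& item = types[name];
        while (words >> tag) item.tags.insert(tag);
    }
    return types;
}

// Game code asks about descriptors instead of naming specific items.
bool hasTag(const std::map<std::string, ItemType>& types,
            const std::string& name, const std::string& tag)
{
    const auto it = types.find(name);
    return it != types.end() && it->second.tags.count(tag) > 0;
}
```

Adding a new item then means adding a line of data, and code that checks for "Edible" never needs to know which items exist.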

Hierarchical State Machines

State machines in general are useful. The idea is some object is in some state, and while in that state only does certain things. Some event may occur that causes a transition to another state, which has its own things it does. And so on. But they caused me a lot of headaches in Banished. With a hierarchical state machine, you can override a state with something new. So let's say you have an entity that is a Box of Treasure. It's normally in the closed state, and when you interact with it, it transitions to the opening state, which plays an animation, and when that's done, there's a transition to the open state, and it gives you some gold. Now let's say I want a Trapped Treasure Box. I only have to override the open state, and instead of giving treasure, I write code to shoot an arrow at the player. I didn't have to rewrite code for the closed state, or the opening state.
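The treasure box example can be sketched as follows. This version uses a plain C++ virtual function as a stand-in for overriding a state, which is an assumption made for brevity; the real system layers partial state machines from components, and the class and method names here are hypothetical.

```cpp
#include <string>
#include <vector>

class TreasureBox {
public:
    enum class State { Closed, Opening, Open };
    virtual ~TreasureBox() = default;

    // Interacting only has an effect while the box is closed.
    void interact() { if (state_ == State::Closed) state_ = State::Opening; }

    // Called when the opening animation finishes.
    void animationDone()
    {
        if (state_ == State::Opening) {
            state_ = State::Open;
            onEnterOpen();
        }
    }

    State state() const { return state_; }
    const std::vector<std::string>& events() const { return events_; }

protected:
    // Behavior of the open state; derived boxes may override only this.
    virtual void onEnterOpen() { events_.push_back("give gold"); }
    std::vector<std::string> events_;

private:
    State state_ = State::Closed;
};

// Reuses the closed and opening states unchanged, overrides the open state.
class TrappedTreasureBox : public TreasureBox {
protected:
    void onEnterOpen() override { events_.push_back("shoot arrow"); }
};
```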

In Banished this worked well for adding components to the building. Each component added could implement a state machine, or override various states. There were lots of these - partial state machines for gathering resources, building, handing out jobs, being on fire, being diseased, being destroyed, etc. The problem was not all of these were on each building or field, so I had to make them work regardless of which were present or not. It made it exponentially hard to write - should the parent state be called? Should it not? If there's no transition function between states, should it call the transition function in parent states? Does it work if the components are in different orders? This was a huge source of bugs that took a long time to work out.

In my new code, I've got state machines being used, but I'm building them more carefully, and avoiding hierarchies, especially deep ones. Once state machines get too large, they're hard to manage and think about.

Character AI

Ah, this was a mess in Banished. Some hand written code made overall decisions about what to do. It was prone to breaking. What to do if a character is hungry, diseased, and his house is on fire all at the same time? Which has priority? Does it depend on what everyone else is doing? Have I handled all the permutations? I'd add a new concept like being cold or being happy, and that would break something (like not starving) that had been working because the decision process changed or I didn't add the proper checks in the right places. Once the decision was made, a list of actions for the character to carry out would be generated - sometimes by a different chunk of code, like the global general work list, or a building state machine that was handing out jobs. Something like - walk to storage, pick up logs, walk back to workplace, drop logs, etc. Which might be interrupted at any time because the AI decided to do something else important. The code was spread out over too many places, which made it even more fragile.

This time I'm implementing a system that's a bit more configurable and unified. What I'm building now is an overall priority system that weights each character's needs. So things like food, water, warmth, sleep, shelter, companions, possessions, daily schedule, working, helping others, needs of the village, emergencies, and special events will be weighted based on the current situation, and the best one will be selected.
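One simple way such a selection could work is to score every need and pick the highest. The scoring rule below (a tuned weight multiplied by a situation-dependent urgency) is purely an assumption for illustration, as are the `Need` struct and `selectNeed` function.

```cpp
#include <string>
#include <vector>

struct Need {
    std::string name;
    double weight;    // tuned importance of this need (hypothetical)
    double urgency;   // 0 = satisfied, 1 = critical, from the current situation
};

// Returns the index of the need with the highest score, or -1 if there are none.
// Ties are resolved in favor of the need listed first.
int selectNeed(const std::vector<Need>& needs)
{
    int best = -1;
    double bestScore = 0.0;
    for (int i = 0; i < static_cast<int>(needs.size()); ++i) {
        const double score = needs[i].weight * needs[i].urgency;
        if (best < 0 || score > bestScore) {
            best = i;
            bestScore = score;
        }
    }
    return best;
}
```

Because every need goes through the same scoring, adding a new concept such as being cold becomes one more entry in the list rather than new checks scattered through the decision code.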

For each of those, a behavior tree will drive how each character achieves each need, to allow many ways to solve an issue. For example, there should be many ways to find food. Is it in my inventory? Is prepared food available nearby? Is preserved food available in my home? Can I ask my neighbors for some? Do I have to get it from storage? Can I ask a hunter to prioritize getting food? Should I walk out into the woods looking for mushrooms and berries myself? Can't find any after a while? Hmm, time to leave the village for a better one that has food.
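The list of food questions maps naturally onto a behavior tree selector node, which tries its children in order and stops at the first one that succeeds. Below is a minimal sketch of just that node; a full behavior tree has more node types, and the `Option` struct and `runSelector` function are hypothetical names.

```cpp
#include <functional>
#include <string>
#include <vector>

struct Option {
    std::string name;
    std::function<bool()> attempt;   // returns true if this approach worked
};

// Selector node: tries each option in order and returns the name of the
// first one that succeeds, or an empty string if every option fails.
std::string runSelector(const std::vector<Option>& options)
{
    for (const Option& option : options) {
        if (option.attempt()) return option.name;
    }
    return std::string();
}
```

The ordering of the options encodes preference: cheap local solutions come first, and drastic ones such as leaving the village come last.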

This sort of decision making and planning will be mostly data driven, so that adding new behaviors requires little to no code. At least I can hope so.

User Interfaces

I'm pretty happy with the way the user interface in Banished turned out. My only issue with it was that there was a lot of code to make it work. Something like 20% of the game code for Banished was UI. If I designed a UI with a button on it, I had to write some UI code to find the button by name, configure it to receive an event, and then when the event occurred, call some function on an entity.

So I've rewritten the UI code to remove the need for the in-between code that manages the UI widgets - I can just create a UI layout with a button widget that binds itself directly to the function on the entity. The intermediate control code isn't needed. This works for all sorts of widgets and values, as well as text and sprites that appear on the UI. While this won't reduce UI code to nothing, it should help to reduce the amount of code to manage.
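A rough sketch of the direct binding idea, assuming the widget simply stores a callable. In the engine the binding would presumably come from the UI layout data rather than from code; the `Button` and `Townhall` classes here are invented for the example.

```cpp
#include <functional>
#include <utility>

// A button widget that holds the function to call when clicked.
class Button {
public:
    void bind(std::function<void()> onClick) { onClick_ = std::move(onClick); }
    void click() { if (onClick_) onClick_(); }   // unbound buttons do nothing
private:
    std::function<void()> onClick_;
};

// Hypothetical entity with a function the UI should trigger.
class Townhall {
public:
    void hireWorker() { ++workers_; }
    int workers() const { return workers_; }
private:
    int workers_ = 0;
};
```

Binding is then a single call such as `button.bind([&hall] { hall.hireWorker(); });`, with no separate controller that looks up the button by name and relays the event.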

Another UI change I've made is that I've separated the way the UI looks from the way it behaves. This way I can easily restyle the UI and also create widgets and layouts in code and have them styled the same way as everything else.

Performance

For the new project, I'm planning farther ahead to deal with performance issues and use more CPU cores. When I started my game engine, I consciously chose to limit multithreading to keep things simple - after all I was working solo and wanted to get initial implementations running quickly. Some things did end up in different threads, like fine grained pathfinding, but everything else - updating entities, drawing, coarse pathfinding, searching for locations, etc. - was done sequentially.

This needs to change this time around to make a more scalable game.

While a lot of other current engines are running entity updates in parallel, I'm not choosing this route. Threading updates where multiple entities depend on each other is hard. Really hard. The goal is not to use any locks or thread synchronization. This requires really breaking up updates into small chunks, limited or no direct access to other entity data, sending messages to other entities, and waiting for responses. This makes designing the code hard, makes it hard to debug, and hard to modify later if you forget what's going on.

I'm also not sure that updating the entities will be the bottleneck this time around. Entities are now just decision makers and controllers for engine level systems, and most of them don't update every frame. It will require profiling once it's in place to know for sure, but I know other things are going to show up with significant time use on the profiler first.

I'm preparing for moving all the heavy lifting into systems that can be easily parallelized. All animation, character movement, particle systems, pathfinding, ray casts, spatial searches, spatial subdivision updates, and more, are all fully separate systems and can run on different CPUs easily without dependencies. If the AI needs to wait on a search for nearby objects, or how to get from A to B, or some other expensive operation, it can just idle a bit until the result comes back from a lower level system.
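The request-and-idle pattern can be sketched without any threading at all: the AI submits a request, keeps polling on later updates, and continues once the answer is available. In this sketch `process()` runs inline, whereas in a real engine it would be the part handed to worker threads. `SearchSystem`, its one-dimensional positions, and the ticket scheme are simplifying assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical low level system that counts objects near a point.
class SearchSystem {
public:
    explicit SearchSystem(std::vector<double> positions)
        : positions_(std::move(positions)) {}

    // Queues a request and returns a ticket used to poll for the result.
    std::size_t submit(double center, double radius)
    {
        requests_.push_back({center, radius, false, 0});
        return requests_.size() - 1;
    }

    // Does the expensive work for every pending request. A real engine
    // could run this step on other CPU cores.
    void process()
    {
        for (Request& r : requests_) {
            if (r.done) continue;
            for (double p : positions_) {
                if (std::fabs(p - r.center) <= r.radius) ++r.count;
            }
            r.done = true;
        }
    }

    // Returns false while the result is not ready, so the caller can idle.
    bool tryGet(std::size_t ticket, int& count) const
    {
        const Request& r = requests_[ticket];
        if (!r.done) return false;
        count = r.count;
        return true;
    }

private:
    struct Request { double center, radius; bool done; int count; };
    std::vector<double> positions_;
    std::vector<Request> requests_;
};
```

The entity never touches the search data directly; it only holds a ticket, which is what keeps the system free to run elsewhere.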

Additionally the entire rendering pipeline can run start to finish on a different thread if it runs a frame behind the updates. As I don't plan on making games requiring quick twitch input response, I don't believe this will ever be an issue or even be noticed. I've done this before on console engines, and it can free up a ton of frame time, making it available for updates. If required I can parallelize culling and command buffer generation within the rendering, but I'm not sure there will be huge gains there unless I'm also supporting DX12 and/or Vulkan.

The Plan

Anyway, that's the plan for this time around - I'm sure that these changes to the way I structure the game code will help to make the code easier to use and update, make a more extensible game engine, and teach me new things as I go along. But I'm also sure that at about 80% complete on the game, the code is going to start to get messy again in the push to finish as I implement all the small items I didn't consider ahead of time. Which, obviously, I'll fix in the game after this one.
