Experiences of optimizing graphics, physics and lighting (Unity, HDRP)
I’m working on a stealth game set in open environments that are sometimes 0.5 km wide, with big rocking sailing ships and vast cavern networks. The levels are populated by patrolling guards, other NPCs and carriable objects, and lit by dynamic light sources, since real-time shadows are important for the mechanic of hiding in the shadows.
After releasing the game in Early Access half a year ago, I’ve spent much of my effort trying to make it run better. However, most of the optimizations in this post were done before that release.
In this blog post I’ll be focusing on the asset and graphics side (where I have a much longer background than in coding), and the approach stays at a surface level without delving very deep into profiling.
Here’s a laundry list of things to keep in mind for optimization:
Draw calls (CPU)
- Combine clusters of static mesh renderers using something like Mesh Baker, Static Batching, or a 3d modeling software
- Use GPU instancing for the materials of static meshes (when you’re not using Static Batching for them)
- Disable groups of game objects that aren’t needed at the time
- Consider atlasing for combined groups of objects (although I steered away from it for workflow reasons)
Tri count (GPU)
- Use LOD Groups for characters, especially ones that are numerous, like basic enemies
- Use LODs also for environment objects with a higher poly count, whenever you aren’t going to include them in clusters of mesh renderers to be combined with Mesh Baker (which would combine all the LOD versions)
- For something like a huge cavern, you can split it into parts so that the whole cavern mesh doesn’t have to be visible at once
Textures (GPU)
- Don’t make them bigger than they need to be (or adjust the max size in import settings)
- Use power of 2 dimensions (like 512×1024) for textures, since they’re optimal for compression and generating mipmaps
- Keep compression and bilinear (or trilinear) filtering on from the import settings
Lights (GPU)
- Minimize the number of light sources casting shadows at a given time (lights that don’t cast shadows aren’t nearly as much of an issue in deferred rendering)
- Adjust the range of the shadow casting light sources not to be bigger than needed
- Adjust the resolution of the shadows from the light sources
- For your game, baked lighting (and light probes) may also be an option. I didn’t use those because dynamic lights were important for gameplay reasons. Light baking also tends to take a long time, so the effect on the overall speed of working is quite big.
Reflection probes (GPU when baked)
- Bake reflection probes rather than having it done in real time
Particle effects (GPU/CPU)
- Visual Effect Graph (VFX Graph) is GPU-based, unlike the legacy particle system, and tends to perform better. However, it has its drawbacks as well, like particle collisions being more difficult to do.
Physics (CPU)
- Make sure your Fixed Timestep isn’t smaller than it needs to be
- Don’t have Rigidbodies as dynamic when they don’t need to be
- Make the colliders of dynamic Rigidbodies as simple as possible (use box, sphere or capsule collider whenever you can, and for more complex objects make a separate low-poly mesh for the collider)
- The number of trigger colliders with scripts calling OnTriggerEnter can also become an issue
Animators (CPU)
- Try not to have animators on for NPCs that don’t need them at the moment
- Limiting the number of bones is apparently a good thing (though I haven’t tried it), as is the “Optimize Game Objects” checkbox in the import settings, if you’re not using the same rig for a ragdoll.
Post Effects (mostly GPU)
- Some post effects like Motion Blur or Screen Space Global Illumination can be heavy
Scripts (CPU)
1. Characters
LODs
The first optimization I did for characters was using LOD Groups to make the characters have fewer polygons when they’re far from the player. I didn’t do them for all characters, mainly just focusing on ones that are shown the most often, have the highest polycount, or are the most numerous. Here are some sets of LODs, done in Blender by removing edge loops, welding vertices, etc.:
Hiding characters
The number of moving vertices isn’t the only culprit. There’s the animator component and the hierarchy of the animated skeleton to consider too. While the complexity of the animator controller attached to the animator component doesn’t seem to be much of an issue (I use one controller with over 900 animation clips for all the characters in the game), hiding the character when it isn’t needed seemed to improve performance a bit. That’s done in two situations:
- When the NPC is in a different room than the player
- When the NPC is far from the player
- (I also tried hiding the characters when not in the camera’s field of view, but enabling them caused lag spikes, which were so annoying that it didn’t feel like an improvement)
It should be noted that disabling the animator (or the game object it’s attached to) does make things a bit tricky, because when you re-enable it, its state gets reset. I solved this by having a function for each animator state change (like SetFloat) which also adds the state change to a list. That way I can restore the state from those lists.
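The record-and-replay idea above can be sketched roughly like this. It is an engine-agnostic Python illustration, not Unity code: `FakeAnimator`, `RecordingAnimator` and `set_param` are hypothetical names, and the sketch keeps only the latest value per parameter in a dictionary, which is a simplification of the lists described above.

```python
class FakeAnimator:
    """Hypothetical stand-in for an engine animator that forgets its state on re-enable."""

    def __init__(self):
        self.params = {}
        self.enabled = True

    def set_param(self, name, value):
        # A disabled animator ignores parameter changes.
        if self.enabled:
            self.params[name] = value

    def disable(self):
        self.enabled = False

    def enable(self):
        # Re-enabling resets the state, which is the problem being worked around.
        self.enabled = True
        self.params = {}


class RecordingAnimator:
    """Routes every parameter change through one place and remembers it."""

    def __init__(self, animator):
        self.animator = animator
        self.recorded = {}  # latest value per parameter name

    def set_param(self, name, value):
        self.recorded[name] = value
        self.animator.set_param(name, value)

    def hide(self):
        self.animator.disable()

    def show(self):
        self.animator.enable()
        # Replay the remembered values so the character resumes where it left off.
        for name, value in self.recorded.items():
            self.animator.set_param(name, value)


npc = RecordingAnimator(FakeAnimator())
npc.set_param("speed", 1.5)
npc.hide()
npc.set_param("alerted", True)  # changed while hidden, still recorded
npc.show()
restored = dict(npc.animator.params)
```

Changes made while the character is hidden are remembered too, so they take effect as soon as it is shown again.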
Static mesh (another kind of LOD)
When an NPC is hidden due to being far away (which is defined by a big trigger collider attached to the NPC), I also replace its animator parent along with the skinned mesh renderers with a static, approximate version of the character mesh. This is what they look like:
As you can see, carried light sources have their LOD versions too (because they must be visible from far away and not hidden like other weapon models). It required some tinkering to get it to work, since carried objects are normally linked to the hand bone of the animated character. This LOD is just an extremely simple particle effect with an Additive material.
Ragdolls
This touches on the next topic already: since the characters use ragdolls on death, there was also physics to consider. To minimize the number of dynamic Rigidbodies in the scene at a given time, I set them to turn kinematic shortly after death.
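As a rough, engine-agnostic sketch of that timer (plain Python rather than Unity code; the `Ragdoll` class and the 3-second `settle_delay` are assumptions for illustration, not values from the game):

```python
class Ragdoll:
    """Hypothetical ragdoll that stops simulating a while after death."""

    def __init__(self, settle_delay=3.0):
        self.settle_delay = settle_delay  # assumed delay in seconds
        self.time_since_death = None      # None while the character is alive
        self.kinematic = True             # driven by animation while alive

    def die(self):
        # Hand the body over to the physics simulation.
        self.kinematic = False
        self.time_since_death = 0.0

    def update(self, dt):
        # Nothing to do while alive or once the body has already settled.
        if self.time_since_death is None or self.kinematic:
            return
        self.time_since_death += dt
        if self.time_since_death >= self.settle_delay:
            self.kinematic = True  # freeze the body so it no longer costs physics time


guard = Ragdoll()
guard.die()
for _ in range(5):
    guard.update(0.5)
still_simulating = not guard.kinematic  # 2.5 s after death
guard.update(0.5)
settled = guard.kinematic               # 3.0 s after death
```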
2. Physics objects
Use simple colliders
For rigidbodies that can be dynamic, you should use colliders that are as simple as possible. If the shape of the object can be approximated with sphere, capsule and box colliders, use those. However, if it’s a weirder shape, rather than using the mesh itself to make a convex mesh collider in Unity, you should make a simple collider mesh with the smallest number of vertices possible. Below you can see some colliders I made for pushable barrels as an example:
Don’t keep too many objects as dynamic
To ensure that Rigidbodies aren’t dynamic without needing to be, I added a sphere trigger collider to my character prefab (or rather an empty game object linked to it), along with a script that toggles physics on for the entering physics objects, and toggles it off for the exiting ones. Here’s what it looks like in use:
The same Room script that hides characters in different rooms also hides carriable objects.
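The proximity toggle for physics objects can be sketched as follows. This is a simplified Python model rather than Unity code: instead of trigger enter and exit callbacks it checks the distance directly, and `Body`, `update_physics_bubble` and the radius of 10 are made-up names and values.

```python
import math


class Body:
    """Hypothetical physics object that is only simulated when `dynamic` is True."""

    def __init__(self, name, position):
        self.name = name
        self.position = position
        self.dynamic = False


def update_physics_bubble(player_pos, bodies, radius):
    """Make bodies dynamic inside the sphere around the player and non-dynamic outside it."""
    for body in bodies:
        body.dynamic = math.dist(player_pos, body.position) <= radius


barrel = Body("barrel", (3.0, 0.0, 4.0))   # 5 units from the origin
crate = Body("crate", (30.0, 0.0, 0.0))    # 30 units from the origin
bodies = [barrel, crate]

update_physics_bubble((0.0, 0.0, 0.0), bodies, radius=10.0)
near_start = (barrel.dynamic, crate.dynamic)

update_physics_bubble((30.0, 0.0, 0.0), bodies, radius=10.0)
near_crate = (barrel.dynamic, crate.dynamic)
```

In an engine the trigger callbacks avoid looping over every object each frame, which is why the trigger collider approach scales better than this direct distance check.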
3. Environment
Dividing a large cavern model into sections
One of the optimizations I made quite early in the project was splitting a huge cavern mesh. The second mission of this game is mostly set in a network of tunnels with 226k tris. Of course, only a small portion of that geometry can be seen at once, so I split the cavern into 11 pieces to be able to hide the unneeded parts.
I did this quite late in the process when I was reasonably happy with the layout of the tunnels, since splitting the mesh also requires applying all the modifiers before that, which makes later modifications laborious. In the cavern, there are also trigger colliders for showing and hiding the separate pieces. They have to be scaled a lot bigger than the pieces they show, so that the player can’t see the edges of the section. Having bends in the corridors also helps with that.
Hiding parts of the cavern
Instead of disabling the game objects of the cavern pieces, I only disabled their renderers. Keeping their colliders in place was necessary not to have physics objects fall through. Here you can see sections being enabled and disabled when the player enters and exits some of the triggers:
Since the cavern is below a terrain, I also disabled the terrain renderer while the player is not outside, as you can see in the clip below. The weird pattern consists of fog particles which also get switched on and off.
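Here is a minimal Python sketch of the section toggling, under some assumptions of mine: the class names are hypothetical, and because the oversized triggers overlap, the sketch counts how many active triggers want a section shown so that leaving one trigger doesn’t hide a piece another trigger still needs. The post doesn’t say how overlaps are actually handled.

```python
class CavernSection:
    """Hypothetical cavern piece: the renderer toggles, the collider never does."""

    def __init__(self, name):
        self.name = name
        self.collider_enabled = True  # kept on so physics objects don't fall through
        self.viewers = 0              # number of active triggers showing this piece

    @property
    def renderer_enabled(self):
        return self.viewers > 0


class SectionTrigger:
    """Trigger volume that shows its sections while the player is inside it."""

    def __init__(self, sections):
        self.sections = sections

    def on_player_enter(self):
        for section in self.sections:
            section.viewers += 1

    def on_player_exit(self):
        for section in self.sections:
            section.viewers -= 1


a, b, c = CavernSection("a"), CavernSection("b"), CavernSection("c")
first = SectionTrigger([a, b])
second = SectionTrigger([b, c])

first.on_player_enter()
second.on_player_enter()  # player is in the overlap of both triggers
first.on_player_exit()    # player moves on into the second trigger only

visible = [s.name for s in (a, b, c) if s.renderer_enabled]
colliders_on = all(s.collider_enabled for s in (a, b, c))
```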
Combining groups of static environment models
One of the ways to reduce draw calls is to combine groups of mesh renderers that don’t need to move in relation to each other (like ships and buildings for this game). I did this using the Mesh Baker plugin. However, with a slower workflow, I suppose you could export the meshes from Unity with the FBX exporter and do the combining in a 3d modeling software.
I opted for combining the meshes by other means than Static Batching, because there are structures that may need to move, like the ships, and it seemed simplest to rely on toggling groups of objects on and off rather than relying on something whose effect isn’t as easy to keep track of.
For this brigantine, I combined 1424 static meshes like floorboards, walls, hull pieces and furniture, into 3 big ones. Here you can see the combined meshes separately:
Here’s some footage of the environment where the ship is situated, before and after combining the meshes:
Replacing the hierarchy of distant structures with simple LOD versions
Even the combined mesh renderers don’t have to be visible with their full polycount when the ship is seen at a distance, so I also made simplified LOD versions of the ships. When the player exits a trigger attached to the ship, the whole hierarchy of the ship gets disabled and replaced with the LOD hierarchy. They look like this:
To make LOD versions of terrains in the same vein, I exported a height map from Unity, and used it as a displacement map for a plane with 7 subdivisions / ~32k tris in Blender, then exported that plane with its displacement back to Unity.
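The ~32k figure follows from the subdivision count, assuming the plane starts as a single quad and each subdivision level splits every quad into four (that is my reading of “7 subdivisions”, and the helper name is made up):

```python
def subdivided_plane_counts(levels):
    """Geometry counts for a single quad subdivided `levels` times (each quad becomes four)."""
    quads_per_side = 2 ** levels
    quads = quads_per_side ** 2
    return {
        "quads": quads,
        "triangles": quads * 2,               # every quad is rendered as two triangles
        "vertices": (quads_per_side + 1) ** 2,
    }


counts = subdivided_plane_counts(7)
# 128 x 128 quads = 16384 quads = 32768 triangles, i.e. the ~32k tris mentioned above
```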
As the player rows towards the island, reaching a trigger collider swaps the terrain mesh from the LOD version to the actual one.
For some groups of buildings, I just set up a trigger to hide them completely, when the player is on the other side of the island. Here you can see the building, particle waterfall, and a small bridge at the bottom getting hidden.
4. Lighting
With deferred rendering, the number of light sources has very little impact, but the number of shadow-casting ones should certainly be kept in check. To keep them to a minimum, I’ve done two things:
Lerping the intensity of lights in other rooms to zero
Firstly, the lights in rooms other than the one the player is currently in get hidden with a fade-out. The intensity gets set to zero instead of the light sources getting disabled completely, since according to my own testing and this discussion about the subject, there’s little to no difference between the two.
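A fade like this can be sketched in a few lines. The Python below is an engine-agnostic illustration with hypothetical names (`RoomLight`, `move_towards`) and made-up values; it steps the intensity linearly towards its target each frame, which is one simple way to get the fade and not necessarily the exact interpolation used in the game.

```python
def move_towards(current, target, max_delta):
    """Step `current` towards `target` by at most `max_delta` without overshooting."""
    if abs(target - current) <= max_delta:
        return target
    return current + max_delta if target > current else current - max_delta


class RoomLight:
    """Hypothetical light that fades out when the player is in another room."""

    def __init__(self, full_intensity, room):
        self.full_intensity = full_intensity
        self.room = room
        self.intensity = full_intensity

    def update(self, player_room, dt, fade_seconds=1.0):
        target = self.full_intensity if self.room == player_room else 0.0
        # Step size chosen so a full fade takes `fade_seconds`.
        step = self.full_intensity * dt / fade_seconds
        self.intensity = move_towards(self.intensity, target, step)


light = RoomLight(800.0, room="cellar")
for _ in range(2):
    light.update(player_room="hall", dt=0.25)
halfway = light.intensity
for _ in range(2):
    light.update(player_room="hall", dt=0.25)
faded = light.intensity
```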
Only the closest carriable light sources have shadows on
Secondly, since the game has carriable light sources, in some situations lots of torches may get clumped into a small area (like when a bunch of enemies with torches is swarming around the player), so it was necessary to have something capping the number of shadow-casting lights dynamically. Certain shadow-casting light sources have a trigger collider attached, which adds them to a list in the playable character’s movement script. The list gets reordered once a second and shadows are left enabled for the closest light sources. The maximum number of light sources with shadows on is defined by the quality setting in the game’s Options menu.
The video below has the Medium quality setting on, which limits the number of these shadow casting light sources to 3.
When the player is outside the trigger, the shadows are always off. This clip isn’t the clearest possible, but you can perhaps see the further spotlight getting slightly dimmer when its shadows get enabled as the player enters the trigger.
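The shadow cap described in this section can be sketched as below. This is plain Python with hypothetical names (`TorchLight`, `ShadowBudget`), not the game’s script: lights register through trigger enter and exit, the list is re-sorted by distance once per interval, and only the closest ones keep their shadows. The limit of 3 mirrors the Medium setting mentioned above.

```python
import math


class TorchLight:
    """Hypothetical carriable light; shadows are off unless the budget enables them."""

    def __init__(self, name, position):
        self.name = name
        self.position = position
        self.shadows = False


class ShadowBudget:
    """Keeps shadows on only for the closest registered lights."""

    def __init__(self, max_shadowed, interval=1.0):
        self.max_shadowed = max_shadowed  # would come from the quality setting
        self.interval = interval          # seconds between re-sorts
        self.candidates = []
        self._timer = 0.0

    def on_trigger_enter(self, light):
        if light not in self.candidates:
            self.candidates.append(light)

    def on_trigger_exit(self, light):
        if light in self.candidates:
            self.candidates.remove(light)
            light.shadows = False  # outside the trigger, shadows are always off

    def update(self, player_pos, dt):
        self._timer += dt
        if self._timer < self.interval:
            return
        self._timer = 0.0
        # Closest lights first, then enable shadows for the first max_shadowed entries.
        self.candidates.sort(key=lambda l: math.dist(player_pos, l.position))
        for index, light in enumerate(self.candidates):
            light.shadows = index < self.max_shadowed


torches = [
    TorchLight("a", (4.0, 0.0, 0.0)),
    TorchLight("b", (1.0, 0.0, 0.0)),
    TorchLight("c", (3.0, 0.0, 0.0)),
    TorchLight("d", (2.0, 0.0, 0.0)),
]
budget = ShadowBudget(max_shadowed=3)
for torch in torches:
    budget.on_trigger_enter(torch)
budget.update((0.0, 0.0, 0.0), dt=1.0)
shadowed = sorted(t.name for t in torches if t.shadows)
```

Sorting only once per interval keeps the cost negligible, at the price of shadows switching up to a second late when torches move around.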
Light range
In HDRP you can adjust the resolution of the shadow for the specific light source. It’s also important to make the range (and the angle in the case of a spotlight) as small as possible. Below you can see how the number of batches and shadow casters changes as I adjust those settings (I used a non-combined version of the ship with all the separate mesh renderers for the demo).
Comparisons of half a year ago vs now
To get a good picture of how much I’ve managed to improve the game’s performance in the roughly 6 months since the EA release, I went back to that time in version control to capture some “before” clips, and captured the same situation in the current version. Mind you, the first EA version already had most of the optimizations described.
There’s some extra overhead in these due to having been captured in the editor with the profiler on, but they still give a good idea of the relative difference in frame rate, which I think is mainly due to the reduction of batches.
My system:
CPU: Ryzen 5 3600
GPU: GeForce 3070
Memory: 64 GB DDR4
As the graphs in the videos show, this situation of climbing in the first mission (as well as the whole game) is clearly CPU-capped on my system. After reading that this was possible, I checked the frame time for CPU and GPU separately by stopping the game, choosing a frame in the profiler, and checking the numbers in the grey bar below the graphs. This frame is from a typical moment while climbing in the same scene.
It shows 20.97ms for CPU and 10.21ms for GPU. I checked this about a week after capturing the “after” video, and at that point I had already attempted some CPU optimizations, which explains the difference between the CPU time and the frame time in the video.
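To put those two numbers in frame rate terms, here is a small Python calculation. It assumes the CPU and GPU work largely in parallel, so the slower of the two roughly sets the frame time; that is a simplification, and the variable names are mine.

```python
def fps_from_ms(frame_ms):
    """Convert a frame time in milliseconds to frames per second."""
    return 1000.0 / frame_ms


cpu_ms = 20.97
gpu_ms = 10.21

cpu_fps = round(fps_from_ms(cpu_ms), 1)  # what the CPU alone could sustain
gpu_fps = round(fps_from_ms(gpu_ms), 1)  # what the GPU alone could sustain
bound_by = "CPU" if cpu_ms > gpu_ms else "GPU"
```

With these values the CPU side allows about 47.7 fps while the GPU could manage about 97.9 fps, so further GPU-side savings wouldn’t raise the frame rate in this scene until the CPU time comes down.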
This scene was a particular pain in my neck for a long time, and I perhaps went a bit overboard with the changes. Though there ended up being a big improvement in frame time, I do think the “before” clip looks much more intriguing, and the new version kind of lacks some life. I’ll have to try to trace my steps back towards the “before” version a bit by adjusting the placements of the characters.
The most effective and least laborious change in the situation above was disabling a script that moved the parent of the ship with a bobbing motion. Apparently deep hierarchies under a moving parent result in costly CPU calculations. The article below talks about this in depth:
Spotlight Team best practices: Optimizing the Hierarchy
In the “before” version there was a conversation happening right next to the ship when the player gets to the area. I moved that conversation elsewhere for optimization reasons, although I think the visuals of this environment would’ve made it a nice setting for eavesdropping on a somewhat long conversation.
As you can see, there’s a laggier moment when going outside, as a bunch of NPCs, the ship (in its LOD form) and the terrain renderer get enabled.
I wondered why the difference in batches wasn’t bigger, until I disabled the terrain and noticed it accounts for about 2/3 of the batches. So that’s one thing I’ll still have to look into. The terrain has shrubs and trees that have their LOD versions, but no impostors/billboards made for them yet.
This third mission of the game includes a terrain, two big ships, and a bunch of buildings, so getting the LOD versions and disabling structures not currently needed was especially important.
The interior of this huge frigate has been another big hurdle to get to run well and be nice to navigate through. However, as the graph and timeline show, the physics still seem especially heavy.
Although I was able to cut down the number of batches by quite a lot and improve the framerate in all situations, physics, scripts and (probably) rendering still seem to take up too much time on the CPU side. In any case, coming up with the materials for this blog post, and learning more about profiling in the process, has provided a good overview of the current situation.
I’ll keep optimizing this alongside working on new content. If you’re interested in the game, it’s available in Early Access on Steam.
Thanks for reading!