GPU Instancer:BestPractices

From GurBu Wiki
Revision as of 02:37, 2 December 2018 by GurBu Admin (talk | contribs) (Using a No-GameObject Workflow Where Possible)
Jump to: navigation, search

About | Features | Getting Started | Terminology | Best Practices | F.A.Q.


GPU Instancer provides an out of the box solution for improving performance and it comes with a very easy to use interface to get started. Although GPUI internally makes use of quite complicated optimization techniques, it is designed to make it very easy to use these techniques in your project without going through their extensive learning curves (or development times). However, by following certain rules of thumb, you can use these techniques as intended and get the most out of GPUI.


In this page, you will find best practices for using GPUI that will help you get better performance.



Instance Counts

The More Instance Counts, the Better

The rule of thumb to follow here is to have only the prefabs that have high instance counts rendering with GPUI, while minimizing the amount of the defined GPUI prototypes on the managers as much as possible.


When using the Prefab Manager in your scene, the best practice is to add the distinctively repeating prefabs in the scene to the manager as prototypes. Using GPUI for prototypes with very low instance counts is not recommended. GPUI uses a single draw call for every mesh/material combination, and does all culling operations in the GPU. While these operations are very fast and cost efficient, it is unnecessary to use GPU resources if the instance counts are too low and the performance gain from instancing them will not be noticeable.


Furthermore, since prefabs with low instance counts will not gain a noticeable performance boost from GPU Instancing, it is usually better to let Unity handle their rendering. Unity uses draw call batching techniques on the background (such as dynamic batching). These techniques depend on the CPU to run and tax their operations on the CPU memory. When there are many instances of the same prefabs, these operations turn out to be too costly and the reduction in batch counts dwarf in comparison to GPU Instancing. This is where GPUI shines the most. But where the instance counts are noticeably low, the cost on the CPU when using these techniques becomes trivial - yet they will still reduce batch counts and thus draw calls. While using GPU instancing, on the other hand, since meshes are not combined, every mesh/material combination will always be one draw call.


In short, GPU Instancing will help you the most when there are many instances of the same prototype, and it is not recommended to add prototypes with low instance counts.

Scene Prefab Importer

GPUI-v097-ScenePrefabImporter.jpg

GPUI comes with a Scene Prefab Importer tool that can help you with easily managing this. You can access this from:


GPU Instancer --> Tools --> Show Scene Prefab Importer


The Scene Prefab Importer tool will show you all the GameObjects in the open scene that are instances of a Prefab - and their instance counts in the scene.

You can use the Select Min. Instance Count slider and button to select prefabs with a given minimum amount.

You can then Import Selected Prefabs and GPUI will create a Prefab Manager in your scene with selected prefabs as prototypes.

Registered Prefabs in the Prefab Manager

GPUI-v097-RegisteredPrefabs.jpg

When your Prefabs are defined in the Prefab Manager, the manager will show you the instance counts that are registered on it.


If you add any new prefab instances in your scene (or remove any), the instance counts must be registered again in the manager.

You can simply click the Register Prefabs in Scene button to update the instance counts in the manager.

A Single Container Prefab vs Many prefabs

One important thing to notice is about the structure of the prefabs. If you have a prefab that hosts many GameObjects that you created for organizational reasons, Unity does not identify the GameObjects that are actually the same under it as instances of the same object. What this means is that when you add such a prefab to the Prefab Manager, GPUI has no way of knowing that the some of the GameObjects under this prefab can actually be instanced in the same draw call. This results in creating a draw call for each mesh/material combination under this prefab and therefore beats the purpose of GPU Instancing.


To exemplify this, think of a prefab which represents a building. Maybe you have windows under this building prefab as children which use the same mesh and materials. Maybe you also have smaller building blocks, doors, tables, etc. that share the same mesh and materials. If you have this huge prefab, however, from the perspective of Unity (and therefore GPUI), you effectively have a prefab with as many meshes and materials in it that as the total sum of meshes and materials it hosts. So if you add this prefab to GPUI as a prototype, you will also see as many draw calls.


Instead, the better way of having this building in your scene (at least as far as GPUI is concerned) is to have a host building GameObject - where all its child windows, doors, tables, etc. are instances of their own prefabs. When you add these prefabs as prototypes to GPUI, you will see that all the windows, doors, etc. will shadere a single draw call instead.


Please note, however, that this issue is not the case starting with the Nested Prefabs support in Unity 2018.3 and later. Using Nested Prefabs, you can have all your windows, doors, etc. as prefabs and still save the building as a container prefab. GPUI supports Nested Prefabs, so if you create a prototype from the building prefab, the Prefab Manager will recognize the children as prototypes as well.


Nevertheless, this is one of the most important issues concerning prototypes and instance counts; and it is usually one of the biggest pitfalls while using GPU Instancer.

Using Occlusion Culling

The rule of thumb to follow here is that you should turn occlusion culling off if your scenes do not have a sensible amount of occluders.


The occlusion culling solution that GPUI implements is extremely easy to use: you literally don't have to do anything to use this feature. You do not need to bake any maps, to add additional scripts nor use Layers. Furthermore since it works in the GPU, it also is extremely fast. As such, you might be tempted to use this feature even where you probably won't need it.


However, please note that the Hi-Z occlusion culling solution introduces additional operations in the compute shaders. Although GPUI is optimized to handle these operations efficiently and fast, it would still create unnecessary overhang in scenes where the game world is setup such that there is no gain from occlusion culling. A good example of this would be strategy games with top-down cameras where almost everything is always visible and there are no obvious occluders.


GPUI makes it possible to use extreme numbers of objects in your scenes. And in higher numbers, the cost of testing for occlusion can be higher than desired in the GPU if the scene is not designed in such a way that this cost of testing is compensated by the average amount of geometry that is culled. In these scenarios, you will get a better performance boost out of GPUI than having occlusion culling on.


In short, it is recommended to design your scenes in such a way that there will be obvious culling that GPUI's occlusion culling will take advantage of. Examples of this are elevations that slightly block your view in a terrain, walls/buildings that a player walks in front, etc. Or, if your game is so that there will never be enough occluders (e.g. a top-down strategy), than it is recommended to turn occlusion culling off.

Using GPUI for Unity Terrain Details

GPUI-DetailInstancing001.png

The rule of thumb here is to aim to balance the visual quality you introduce to the terrain details with the performance you can get out of the GPUI Detail Manager.


When using the Detail Manager, one thing to keep in mind is that most of the options in the Detail Manager interface serve the purpose of increasing visual quality over that of the default Unity detail shader. GPUI uses instancing techniques for backing this up performance-wise, and the result is always a stable FPS with no spikes.


However, in some cases, the FPS would result in a number that is lower than the Unity default especially if you are using features that don't exist in the Unity terrain details. The most impacting of these are shadows, and it gets heavier with the amount of Shadow Cascades that are defined on your quality settings. Also, depending on your the cross-quadding settings, this option effectively doubles, triples or quadruples the amount of geometry you have for texture details in comparison with the Unity details. The expected result when increasing visual quality in the Detail Manager, therefore, is not that it would always be faster than the Unity terrain, but that it is fast while still using these features.


GPUI also offers culling solutions that help increase the FPS, but they require you to design your scenes with having them in mind. For example the occlusion culling feature will help you get better results in terrains where you have hills that block your view. Not necessarily giant hills, but slight elevations to hide some instances as in the included DetailInstancingDemo scene. In scenes with wide plains where nothing is culled, the occlusion culling system effectively becomes obsolete - if you wish to have scenes like this, it is recommended to cut back on mainly the shadows and the cross-quadding options and the maximum detail distance to see better performance.


Also, please have in mind that GPUI lightens the CPU load by moving all rendering related things to the GPU. If you test only the terrain with a GPUI Detail Manager, you are always testing the GPU since there are no scripts running at all. In a real game scenario, you will be using your CPU for other things, and GPUI's instancing solution would be more apparent in the FPS.


In short, using the Detail Manager effectively requires a bit of thought in the design process. You need to optimize the quality settings on the manager according to the terrain you have. The final FPS would always scale for the better directly with a better graphics card since it's all done in the GPU, but since you're adding features that don't exist in the Unity Terrain, it would not always be faster then the default terrain. On this point, the recommended action to take would be to create different quality levels in your game and use different settings for these. For an example of this usage, you can take a look at the included DetailInstancingDemo scene.


Using a No-GameObject Workflow where Possible

This is an advanced topic, and currently a no-game object workflow in GPUI is only available through the GPU Instancer API.