Friday, 24 May 2019

Entity Component System, Why? (Part 1)

During the last few years CPUs have gained speed and core count, but memory has been lagging behind; especially in a multi-processor environment, memory access can be really slow.
So certain programming patterns have become more important (they have always been there). Before continuing with this article, I must recommend reading and understanding as much as possible about cache coherence in multi-processor systems and data-oriented design.

https://en.wikipedia.org/wiki/Cache_coherence
https://en.wikipedia.org/wiki/Data-oriented_design

Conclusion:

- If you bring a cache line close to the CPU (L1), always use all the data in it.
- Avoid writing to cache lines shared between processors: all processors can read a cache line, but if you write into one, make sure no other processor is reading or writing it.
- Help the pre-fetcher: always try to access data in a linear pattern.
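A minimal sketch of the second rule (the 64-byte line size and the `PaddedCounter` name are assumptions for illustration, not from the article): giving each worker its own padded counter guarantees that two processors never write into the same cache line.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64 bytes is the cache line size on common x86 hardware.
constexpr size_t kCacheLineSize = 64;

// Each counter occupies a full cache line, so increments from
// different workers never touch the same line (no false sharing).
struct alignas(kCacheLineSize) PaddedCounter
{
    uint64_t value = 0;
};

// One counter per worker thread; workers[0] and workers[1] live in
// different cache lines.
PaddedCounter workers[4];
```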

Object oriented?

If you try to follow these rules, OO seems the worst fit, as all the data of an object is mixed together in memory without any thought for cache lines.
- A full cache line gets fetched just to read a single value or bit. The classic "bool IsActive()".
- Reads and writes happen in cache lines without any coordination between processors.
- Accesses cannot be pre-fetched; virtual functions are a really good example of that.
- Virtual functions jump into a lot of different code, which can produce a lot of misses in the instruction cache.
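A hypothetical layout illustrating the "full cache line for one bool" point (this `GameObject` is invented for the sketch, not taken from any real engine): reading the flag drags in all the unrelated data around it.

```cpp
#include <cassert>

// Invented example object: checking is_active pulls in a cache line
// full of data the visibility loop does not care about.
struct GameObject
{
    float transform[16];   // 64 bytes of matrix
    float bounding[6];     // bounds data
    bool  is_active;       // the single byte IsActive() actually reads
};

// Culling N objects this way touches sizeof(GameObject) bytes per
// object just to read one flag; packing the flags into a separate
// array would touch roughly N bytes in total instead.
bool IsActive(const GameObject& obj) { return obj.is_active; }
```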

Are there "solutions"?
- Rearranging the data inside the objects, so data accessed together ends up in the same cache line; but this is usually difficult to maintain.
- Moving common data outside of the object and accessing it there. The classic example is moving the bounding box and the visibility flag out of the object, so you can calculate the visibility in a data-oriented pattern.

Entity Component System:

The idea is to implement the last "solution" in a more generic way: everything inside the object is stored outside of it in a linear pattern, and data of the same type shares the same cache lines (the classic vector of structs vs struct of vectors).

An object is defined by a list of components, and each component type is stored in contiguous memory. This approach not only gives us the data-oriented layout; it also gives us a component-based way of defining the behaviour of our objects, avoiding classic object-oriented inheritance and removing the need for runtime polymorphism and virtual functions.
Instead of looping over the objects and calling virtual functions, we write a kernel function that processes the data of a list of components; that kernel function is executed only for the objects that have all the components it needs.
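A deliberately minimal sketch of that idea (a real implementation would use sparse sets or archetypes instead of a per-entity flag; all names here are invented): components live in parallel linear buffers, and the kernel runs over the entities that have every component it needs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Position { float x, y; };
struct Velocity { float dx, dy; };

// Index i is entity i. The has_velocity flag stands in for a real
// component-membership structure, just to keep the sketch short.
struct World
{
    std::vector<Position> positions;
    std::vector<Velocity> velocities;
    std::vector<bool>     has_velocity;
};

// Kernel: integrate positions for every entity that owns both
// components. No virtual calls, and both buffers are walked linearly.
void IntegrateKernel(World& world, float dt)
{
    for (size_t i = 0; i < world.positions.size(); ++i)
    {
        if (!world.has_velocity[i]) continue;
        world.positions[i].x += world.velocities[i].dx * dt;
        world.positions[i].y += world.velocities[i].dy * dt;
    }
}
```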

Without going into the implementation details, it is easy to see the benefits of this approach:
- The kernel only accesses the components it needs.
- Components are stored in linear buffers, so the kernel accesses the data in a linear pattern (the next component is in the same cache line or the next one, which helps the pre-fetcher a lot).
- It does not just help data cache coherency; it helps the instruction cache as well, because kernel functions are small pieces of code executed over a lot of components, without changing per entity type.
- It can be distributed over multiple processors, as the kernel function only needs to access the components, and the work can be split into jobs really easily (without sharing cache lines between jobs).
- If you write into a component, only your processor needs write access to that cache line.
- Virtual functions are not needed, as you don't need runtime polymorphism in the components.

String Hashes (Part 2)

Strings are evil... but they are human friendly.
The common implementation of strings is heavy on dynamic memory and slow, especially for simple comparisons. There are several alternatives, like pooled strings and string hashes.

With modern C++ you can calculate the hashes of literal strings at compile time, which gives the string hash solution a lot of potential. All literal strings get converted into hashes, memory stays under control, and comparisons have a fixed cost.
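A compile-time hash along these lines can be sketched with a `constexpr` FNV-1a function (the function name and the choice of FNV-1a are my assumptions; the implementation linked at the end may differ):

```cpp
#include <cassert>
#include <cstdint>

// FNV-1a, written in C++11 single-return constexpr style so that a
// literal string collapses to a 32-bit constant at compile time.
constexpr uint32_t HashFNV1a(const char* str, uint32_t hash = 2166136261u)
{
    return (*str == '\0')
        ? hash
        : HashFNV1a(str + 1, (hash ^ static_cast<uint32_t>(*str)) * 16777619u);
}

// Evaluated by the compiler: the comparison below costs nothing at
// runtime, and the literals need not survive into the executable.
static_assert(HashFNV1a("hello") == HashFNV1a("hello"), "deterministic");
static_assert(HashFNV1a("hello") != HashFNV1a("world"), "distinct strings differ");
```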

But it is not perfect:
- Collisions: you need a way to detect them, as a collision in your implementation could be disastrous.
- Converting back to strings for logging.
- Debugging: you need to be able to see the string value while debugging.

All these issues can be fixed by maintaining a map from hash to string. It allows detecting collisions and converting back. It solves the debugging issue if you access the map from a natvis file. And in the release configuration you can just strip this map out; as an extra, no more literal strings inside the executable.
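The debug-only map could look something like this (a sketch under my own naming; the natvis and stripping details are out of scope): registering every hashed string lets you catch two different strings landing on the same hash, and look hashes up for logging.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical debug registry: hash -> original string.
class HashRegistry
{
public:
    // Returns false when a *different* string already produced this
    // hash, i.e. a real collision.
    bool Register(uint32_t hash, const std::string& str)
    {
        auto it = m_map.find(hash);
        if (it != m_map.end())
            return it->second == str;
        m_map.emplace(hash, str);
        return true;
    }

    // For logging: convert a hash back to its string, if known.
    const std::string* Lookup(uint32_t hash) const
    {
        auto it = m_map.find(hash);
        return (it != m_map.end()) ? &it->second : nullptr;
    }

private:
    std::unordered_map<uint32_t, std::string> m_map;
};
```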

String hashes also solve a pattern that I always fight against in big projects.
Usually there is a group of tools for processing data, for example shaders, materials and meshes. Those tools are (usually) independent executables, but they need some dependencies between them. For example, the shader defines the pass where it needs to be rendered, and the pass is an enum; usually you create a common include file shared between the tools, so you can serialise the correct index of the enum. But this pattern has a lot of issues: a change to this include file invalidates all the tools and the data, wasting a lot of cached data.

So why not serialise the string instead of an integer? That would break the dependency, making all the serialised data more data driven and not based on a fixed enum. Thanks to string hashes you can just serialise the hash, as hashes are the same between executables.
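The cross-tool pattern in miniature (the pass name, function names, and FNV-1a hash are all invented for this sketch): the tool and the runtime each hash the same literal independently and arrive at the same value, with no shared enum header between them.

```cpp
#include <cassert>
#include <cstdint>

// Same deterministic hash in both executables, no common include.
constexpr uint32_t HashString(const char* str, uint32_t hash = 2166136261u)
{
    return (*str == '\0')
        ? hash
        : HashString(str + 1, (hash ^ static_cast<uint32_t>(*str)) * 16777619u);
}

// Tool side: serialise the hash of the pass name into the data.
uint32_t SerialisePass() { return HashString("opaque_pass"); }

// Runtime side: compare against its own hash of the same literal.
bool IsOpaquePass(uint32_t serialised)
{
    return serialised == HashString("opaque_pass");
}
```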

Extra:

My string hash base class has two template parameters:
- Namespace: hashes from different namespaces are allowed to collide, and assignments between string hashes of different namespaces are blocked (strongly typed string hashes).
- Size: thanks to the namespace, you can reduce the size of the hash; for example, for the pass name you can use a 16-bit integer (or 8 bits).
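A sketch of how those two template parameters might fit together (the tag types and aliases are invented; the real class in the linked repo differs): the namespace tag makes the hash a strong type, and the size parameter picks the storage width.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Strongly typed string hash: Namespace is a tag type that makes
// hashes from different namespaces incompatible, Size is the
// underlying unsigned storage.
template <typename Namespace, typename Size = uint32_t>
struct StringHash
{
    static_assert(std::is_unsigned<Size>::value, "hash storage must be unsigned");
    Size value = 0;

    constexpr bool operator==(const StringHash& other) const
    {
        return value == other.value;
    }
};

// Tag types used only as namespaces.
struct PassNameTag {};
struct MaterialNameTag {};

using PassName     = StringHash<PassNameTag, uint16_t>; // 16 bits is enough here
using MaterialName = StringHash<MaterialNameTag>;       // default 32 bits

// The strong typing in action: a MaterialName cannot be assigned to
// a PassName, even though both are "just" integers underneath.
static_assert(!std::is_assignable<PassName&, MaterialName>::value,
              "namespaces are incompatible");
```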


If your project is big, with a lot of different tools using string hashes, you can use an external database instead of a global map inside each tool.
Benefits:
- Better collision detection: you will detect a collision in the first tool that introduces the duplicated hash.
- In the shipping configuration, you can still convert hashes to strings from the logs without leaking the strings into the executable, and you can still use natvis for debugging (you need to implement a custom natvis visualizer, which in VS2019 changed to the UIVisualizers plugin model).

Code: https://github.com/JlSanchezB/Cute/blob/master/engine/core/string_hash.h