Today I'll start with something that Mike Acton, Engine Director at Insomniac Games, often talks about: "The Three Lies".
Lie 1: Software is a platform
This is something that is often taught in schools. It revolves around the idea that you shouldn't care about what hardware you are running on. But software is not a platform: software always runs on hardware, and even if the hardware may differ, it's hardware that we can understand and optimize for. If you don't know, for example, the memory restrictions, you have no idea how the code will perform in the real world as opposed to an idealized "vacuum" world.
Hardware dictates choices for the data and data dictates choices for the code.
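To make "hardware dictates choices for the data" a bit more concrete, here is a minimal sketch that counts how many cache lines a pass over the positions of a set of objects would span for two layouts. The 64-byte cache line, the `FatEntity` and `Position` structs, and the function names are all assumptions made up for this illustration; they do not come from Mike Acton's talk.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: 64-byte cache lines (common on desktop CPUs, not universal).
const std::size_t kCacheLineBytes = 64;

// Hypothetical "object" layout: the position sits next to data the pass never reads.
struct FatEntity
{
    float position[3];
    float velocity[3];
    char  name[32];
    int   health;
    int   flags;
};

// Hypothetical packed layout: positions only, stored back to back.
struct Position
{
    float xyz[3];
};

// Cache lines spanned by `count` contiguous elements of `stride` bytes,
// assuming the array starts on a cache-line boundary.
std::size_t cacheLinesSpanned(std::size_t count, std::size_t stride)
{
    assert(stride > 0);
    return (count * stride + kCacheLineBytes - 1) / kCacheLineBytes;
}

// Lines pulled in by a pass that reads the position of every element.
std::size_t fatLayoutLines(std::size_t count)
{
    return cacheLinesSpanned(count, sizeof(FatEntity));
}

std::size_t packedLayoutLines(std::size_t count)
{
    return cacheLinesSpanned(count, sizeof(Position));
}
```

On a typical platform where `float` and `int` are 4 bytes, `FatEntity` is 64 bytes and `Position` is 12 bytes, so a pass over 10,000 elements spans 10,000 cache lines in the fat layout and 1,875 in the packed one. The code that reads the positions is the same in both cases; only the data layout changed.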
Lie 2: Code should be designed around a model of the world
In typical C++ and OOP code there is an idea that you need to model your code around the real world. For example, if there is a Missile in the game there will be a Missile class. Usually game updates look like this:
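class Entity { public: virtual void update() = 0; };
class Missile : public Entity { public: void update() override; };
// One virtual call per entity, whatever its concrete type happens to be.
for (uint i = 0; i < m_entities.size(); ++i)
    m_entities[i]->update();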
The typical problem with code like this is that, above all, it doesn't scale. Today all code that is used at run-time needs to be designed with data layout and multicore in mind.
It is also likely to thrash both the instruction cache and the data cache, depending on what kinds of entities we have.
A better way is to separate out each step on its own.
1. First do the collision calculation on the given type (let's say spheres).
2. After step 1 is done, update all objects for each type without virtual updates.
CalcSphereCollisions(outputData, inputData, count)
CalcSphereCollisions(sphereHits, collisionSpheres, count)
MoveableUpdate(moveableObjects, sphereHits, objectsCount)
If the code is written this way we can run all the sphere calculations in parallel first, and then update all the objects that need the collision information. This code also does one thing at a time (but in parallel), so we reduce the amount of jumping around in the code and are more likely to stay in the cache. The typical "Missile" update has been removed and replaced with a "MoveableUpdate", as this function only needs to know how to process one thing: the given data. Whether it's a missile or something else is of no importance.
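The two steps could be sketched roughly as follows. Only the function names and argument order come from the calls above; the `Sphere` and `Moveable` structs, the brute-force overlap test, the "stop on hit" rule and the assumption that hit `i` belongs to object `i` are all invented here to keep the example small.

```cpp
#include <cstddef>

// Hypothetical data layouts: plain structs in linear arrays.
struct Sphere   { float x, y, z, radius; };
struct Moveable { float x, y, z, vx, vy, vz; };

// Step 1: writes 1 to outputData[i] if sphere i overlaps any other sphere, else 0.
// Brute force (every pair is tested) to keep the sketch short.
void CalcSphereCollisions(unsigned char* outputData, const Sphere* inputData, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        unsigned char hit = 0;
        for (std::size_t j = 0; j < count; ++j)
        {
            if (i == j)
                continue;
            const float dx = inputData[i].x - inputData[j].x;
            const float dy = inputData[i].y - inputData[j].y;
            const float dz = inputData[i].z - inputData[j].z;
            const float r  = inputData[i].radius + inputData[j].radius;
            if (dx * dx + dy * dy + dz * dz <= r * r)
                hit = 1;
        }
        outputData[i] = hit;
    }
}

// Step 2: one non-virtual update over all objects.
// Assumptions: sphereHits[i] belongs to moveableObjects[i], velocity is in
// units per update, and an object that was hit simply stays where it is.
void MoveableUpdate(Moveable* moveableObjects, const unsigned char* sphereHits, std::size_t objectsCount)
{
    for (std::size_t i = 0; i < objectsCount; ++i)
    {
        if (sphereHits[i])
            continue;
        moveableObjects[i].x += moveableObjects[i].vx;
        moveableObjects[i].y += moveableObjects[i].vy;
        moveableObjects[i].z += moveableObjects[i].vz;
    }
}
```

Neither function knows what kind of game object owns the data. Each one walks a contiguous array from start to end, which is what makes the loops easy to split up or to feed from a DMA stream.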
Lie 3: Code is more important than data
I touched on this in the point above as well. This is the big one. We spend far too much time talking about code when it really doesn't matter. When doing time-critical programming one can think of it as a complex DSP: we get data in, modify it and send it out. We spend very little time actually talking about how the data flows in an application. Data is what should dictate all decisions in your application. If you don't know the data, how will you know how it will perform, where the bottlenecks will be, and so on? You have no idea.
If the data is designed in a correct (and often simple) way, there should only be one way to write the code.
Take the CalcSphereCollisions function from above. It's very clear what this function will do, and it does only one thing over a range of items. There are several benefits to having the code like this.
1. It's easy to test in isolation. The input and output are clear.
2. It's easy to write unit tests for, as it can be tested in isolation.
3. The input can be verified early: we can test for bad things at the collision stage on all items instead of inside some virtual Missile update.
4. Cache-friendly. The input and output are just linear arrays in memory, so cache performance will be good. On Cell/SPU it will be easy to DMA the stream in and hide pretty much all memory latency. On other architectures one can use cache functions to prefetch ahead.
5. Can run in parallel. It's possible to just divide the work on the list between x number of CPUs/SPUs to do the processing in parallel.
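As a rough sketch of the last point, a function that does one thing over a linear range can be split into contiguous slices with one worker per slice. The `IntegratePositions` kernel and the `IntegratePositionsParallel` wrapper below are hypothetical names for this example, and spawning a `std::thread` per slice is just the simplest thing that shows the idea; a real engine would more likely hand the slices to a job system or to SPUs.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical kernel: does one thing over a contiguous range of items.
void IntegratePositions(float* positions, const float* velocities, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
        positions[i] += velocities[i];
}

// Divides [0, count) into at most workerCount contiguous slices and runs the
// kernel on each slice in its own thread. Slices never overlap, so no locking
// is needed.
void IntegratePositionsParallel(float* positions, const float* velocities,
                                std::size_t count, std::size_t workerCount)
{
    if (count == 0)
        return;

    // Clamp to the range [1, count] so every worker gets at least one item.
    workerCount = std::max<std::size_t>(1, std::min(workerCount, count));
    const std::size_t chunk = (count + workerCount - 1) / workerCount;

    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < count; begin += chunk)
    {
        const std::size_t n = std::min(chunk, count - begin);
        workers.emplace_back(IntegratePositions, positions + begin, velocities + begin, n);
    }

    for (std::thread& worker : workers)
        worker.join();
}
```

The same kernel is also what a unit test would call directly with a small input array, which is the point of items 1 and 2: the serial function and the parallel wrapper can be checked against each other on identical data.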