Tuesday, September 15, 2009

Ps3 Engine

So I started to work on a PS3 engine. (No, the PS2 engine is not dead. I'm just getting a bit sidetracked here.)

The purpose of it is mainly to get more familiar with the Ps3 but also to tackle general concurrency problems.

My basic idea for one frame looks something like this. (Note that each block does not represent the actual time it may take during a real frame. This is just a high-level concept, but it will still be interesting to see where I end up with it.)


I will mostly be coding using the pro devtools at work, so I can't post every single detail here, but I can talk about most of the concepts. Currently I'm implementing a high-level job manager that will just execute jobs for whole systems (so large granularity) and let each system do its own load balancing. This is possible as everything should be running on the SPUs.
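
To make the "one large job per system" idea a bit more concrete, here is a minimal sketch in plain C++. All names (SystemJob, JobManager, runFrame) are hypothetical and not the actual engine code. To stay self-contained it runs the jobs one after another on the calling thread, where a real version would hand them to the SPUs.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not the engine's actual API.
struct SystemJob
{
    std::string name;
    // The system is told how many workers exist and balances its own work.
    std::function<void(std::size_t workerCount)> run;
};

class JobManager
{
public:
    explicit JobManager(std::size_t workerCount) : m_workerCount(workerCount) {}

    void add(SystemJob job) { m_jobs.push_back(std::move(job)); }

    // Runs one coarse job per system, in submission order.
    // Assumption: jobs run on the calling thread here; a real version
    // would dispatch each job to the SPUs instead.
    void runFrame() const
    {
        for (const SystemJob& job : m_jobs)
            job.run(m_workerCount);
    }

private:
    std::size_t m_workerCount;
    std::vector<SystemJob> m_jobs;
};
```

The manager itself stays tiny because it knows nothing about how a system splits its work. It only passes along how many workers there are.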

This presentation has been a good inspiration when deciding the direction for the JobManager.

GDC2009 SPU Wrangling

The prime goal is to test out scenarios that are hard to try in a production environment with tons of code. Here I can just go wild and crazy and test stuff out. For example, let's say we want to run all gameplay code on the SPUs. How should the data be designed? What kind of functionality is needed to, for example, create a Battlefield-type vehicle that has lots of components and parts that interact with each other? And so on.

Hopefully some of this will also be good to bring back to the engine at work.

Watch this space for more updates.

Also, does anyone know how to post images on Blogger without the annoying rescale?

Tuesday, August 11, 2009

Deja VU

Renamed the blog to 'Deja VU' (which is a reference to the VUs on ps2 :)

Thanks to Gilligan for the name idea!

In other news, I hope to add the next part of the PS2 series during the week/weekend.

Tuesday, July 28, 2009

PS2 hardware explained : Overview

Today I start with giving the answer to the question in my first post "Why Ps2?"

I actually can't remember when I started programming on the PS2, but it was likely 5-6 years ago. In the beginning it was hard and the learning curve was steep for sure. The access you have to the hardware is really low-level, making coding for the PS2 very unforgiving (yet fun!). If you happen to get some number wrong in your DMA chain(s), for example, the DMA will lock up and all you know is that something is wrong somewhere.

Nowadays I have written my own code that verifies a DMA chain when the DMA locks up. It catches some errors but not all. The PS2, for example, has no cache snooping, meaning that the data you send to the DMA must actually be in main memory (not just sitting in the CPU cache), and this of course can lead to lots of 'funny' problems.
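
As a rough illustration of what such a verifier can check, here is a minimal sketch. It is not my actual verification code: the DmaTag struct, the three tag kinds and verifyChain are simplified, hypothetical stand-ins, and the chain is modeled as a flat list of tags rather than the real hardware tag format. It checks only alignment, range and termination.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-in for a source-chain DMA tag. The fields and the three
// tag kinds are assumptions for illustration, not the full hardware format.
enum class TagKind { Cnt, Ref, End };

struct DmaTag
{
    TagKind       kind;  // how the data for this tag is located
    std::uint16_t qwc;   // number of 128-bit quadwords to transfer
    std::uint32_t addr;  // data address (only checked for Ref tags here)
};

// Returns an empty string if the chain looks sane, otherwise a description
// of the first problem found.
std::string verifyChain(const std::vector<DmaTag>& chain, std::uint32_t memorySize)
{
    for (std::size_t i = 0; i < chain.size(); ++i)
    {
        const DmaTag& tag = chain[i];
        const std::string where = "tag " + std::to_string(i) + ": ";

        if (tag.kind == TagKind::Ref)
        {
            // Transfers happen in quadwords, so addresses must be 16-byte aligned.
            if (tag.addr % 16 != 0)
                return where + "address is not 16-byte aligned";

            const std::uint64_t end = std::uint64_t(tag.addr) + std::uint64_t(tag.qwc) * 16;
            if (end > memorySize)
                return where + "transfer runs past the end of memory";
        }

        if (tag.kind == TagKind::End)
        {
            if (i + 1 != chain.size())
                return where + "tags follow the end tag";
            return "";
        }
    }
    return "chain has no end tag";
}
```

A check like this only sees the tags themselves. It cannot tell whether the data a tag points at has actually been written back from the cache, which is one reason a verifier catches some errors but not all.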

One of the things that makes me still want to code for it is that no one has really pushed the hardware to its limits. Some games (from Insomniac and Naughty Dog) are likely the ones that pushed the hardware the most, but I know that there is still much more to get out of it in terms of cool stuff one can do.

In the article series I will show more of how you can do creative things on the PS2. Even if the hardware is quite simple, it's very flexible, and it's all up to the programmer to sort out the problems. There is for sure a whole range of annoying problems/drawbacks, but that's the same on all hardware anyway. So the short answer is that I like the hardware and I think it can be pushed further.

The long answer I hope to tell with this series.

The PlayStation 2 hardly needs any introduction, and those of you who have lived under a rock for the past ten years can read up here: http://en.wikipedia.org/wiki/Playstation_2

Let's start with a picture and some simple facts about the hardware. (I will go into detail about each unit/part later on.)


These are the basic components that make up the hardware (text taken from Wikipedia with some changes):
  • CPU: MIPS R5900 64-bit with 128-bit MMI instructions "Emotion Engine" (I hate that name so "EE Core" or just EE from now on) clocked at 294.912 MHz (299 MHz on newer versions)
  • GS: "Graphics Synthesizer" (Hate that name as well, GS from now on) clocked at 147 MHz
  • Vector Units: VU0 and VU1 (Floating Point Multiply Accumulator × 9, Floating Point Divider × 1), 32-bit, at 150 MHz.
  • DMA: Main bus running at 150 MHz, giving a total bandwidth of 2.4 GB/s. The DMAC controls all data transfers in the system and transfers in parallel with the CPU
  • IPU: Image Processing Unit (used when doing block decoding of MPEG data for movies)
  • IOP: Input Output Processor. Handles all communication with I/O devices (gamepad, CD/DVD, network, USB, etc.). This is also the same CPU that is used in the PS1, and it is used for backwards compatibility with PS1 games on the PS2
  • SPU2: Sound Processor (No, it's not the same kind of SPU that is used in Cell/PS3 :)
I haven't gone into detail yet, as this post would become very long. Some things that I haven't touched on yet are the VIF and the GIF. They will be covered in much more detail later on, but these are the interfaces to the VUs and the GS that you use when you 'talk' with them.
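
As a quick sanity check of the 2.4 GB/s bus figure in the list above, the arithmetic can be written out as a couple of constants. The 128-bit (16-byte) bus width is an assumption added here and is not stated in the list. The constant names are made up for this sketch.

```cpp
#include <cstdint>

// The 150 MHz clock is the figure from the list above.
constexpr std::uint64_t kBusClockHz = 150000000ULL;
// Assumption: the main bus is 128 bits, i.e. 16 bytes, wide.
constexpr std::uint64_t kBusWidthBytes = 16;
// One bus-width transfer per clock cycle.
constexpr std::uint64_t kBytesPerSecond = kBusClockHz * kBusWidthBytes;

static_assert(kBytesPerSecond == 2400000000ULL, "150 MHz x 16 bytes = 2.4 GB/s");
```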

Next post I will go into a bit more detail about the GS and the DMA, and also show a very small code example of how to actually get some rendering going on the PS2.

Monday, July 27, 2009

The Three Lies

So in this blog I will mostly write about progress on my 3D engine for the PlayStation 2 (Why PS2, you may ask? That will likely be covered in a later post :) along with other rants about code with a focus on performance, and at times something totally unrelated to code.

Today I start with something that Mike Acton, Engine Director at Insomniac Games, often talks about: "The Three Lies"

Lie 1: Software is a platform

This is something that is often taught at schools. It revolves around the idea that you shouldn't care about what hardware you are running on. Software is not a platform. Software always runs on hardware, and even if it may be different hardware, it's hardware that we can understand and optimize for. If you don't know, for example, the memory restrictions, you have no idea how the code will perform in the real world as opposed to this idealized "vacuum" world.
Hardware dictates choices for the data and data dictates choices for the code.

Lie 2: Code should be designed around a model of the world

In typical C++ and OOP code there is an idea that you need to model your code around the real world. For example, if there is a Missile in the game there will be a Missile class. Usually game updates look like this:


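class Entity
{
public:
    virtual ~Entity() {}
    virtual void update() = 0;
};

class Missile : public Entity
{
public:
    void update()
    {
        if (!hasCollided())
            moveInTraceDirection();
    }
};

// One virtual call per entity, whatever its concrete type is.
for (uint i = 0; i < m_entities.size(); ++i)
    m_entities[i]->update();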

The typical problem with code like this is that, above all, it doesn't scale. Today all code that is used at run-time needs to be designed with data layout and multicore in mind.

It is also likely to thrash both the instruction cache and the data cache, depending on what kind of entities we have.

A better way is to try to separate out each step on its own.

1. First do the collision calculation on the given type (let's say spheres).
2. After step 1 is done, update all objects for each type without virtual updates.

Pseudo code:

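// Processes 'count' items from inputData and writes one result per item to outputData.
CalcSphereCollisions(outputData, inputData, count)
{
}

// Per frame: first all the collisions, then all the updates.
CalcSphereCollisions(sphereHits, collisionSpheres, count)
MoveableUpdate(moveableObjects, sphereHits, objectsCount)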

If the code is written this way we can run all the sphere calculations in parallel first, and then update all the objects that need the collision information. This code also does one thing at a time (but in parallel), so we reduce the amount of jumping around in the code and will likely stay in the cache. The typical "Missile" update has been removed and replaced with a "MoveableUpdate", as this function only needs to know how to process one thing, the given data. Whether it's a missile or something else is of no importance.
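
For readers who want something they can compile, here is a minimal C++ sketch of the two steps. The Sphere and Moveable layouts, the single obstacle sphere that everything is tested against, and the dt parameter are all assumptions made up for the example. A real engine would test against real collision data.

```cpp
#include <cstddef>

// Hypothetical data layouts: plain structs stored in linear arrays.
struct Sphere   { float x, y, z, radius; };
struct Moveable { float x, y, z, vx, vy, vz; };

// Step 1: test every sphere against one obstacle sphere (a stand-in for
// real collision work). Writes 1 to outputData[i] on overlap, otherwise 0.
void CalcSphereCollisions(unsigned char* outputData, const Sphere* inputData,
                          std::size_t count, const Sphere& obstacle)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        const float dx = inputData[i].x - obstacle.x;
        const float dy = inputData[i].y - obstacle.y;
        const float dz = inputData[i].z - obstacle.z;
        const float r  = inputData[i].radius + obstacle.radius;
        outputData[i] = (dx * dx + dy * dy + dz * dz <= r * r) ? 1 : 0;
    }
}

// Step 2: move every object that did not collide. No virtual calls and no
// knowledge of whether an object is a missile or something else.
void MoveableUpdate(Moveable* objects, const unsigned char* sphereHits,
                    std::size_t count, float dt)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        if (sphereHits[i])
            continue;
        objects[i].x += objects[i].vx * dt;
        objects[i].y += objects[i].vy * dt;
        objects[i].z += objects[i].vz * dt;
    }
}
```

Both functions only read and write contiguous arrays, so either loop can be cut into index ranges and handed to different cores without touching the functions themselves.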

Lie 3: Code is more important than data

I touched a bit on this in the point above as well. This is the big one. We spend way, way too much time talking about code when it really doesn't matter. When doing time-critical programming one can think of the application as a complex DSP: we get data in, modify it, and send it out. We spend very little time actually talking about how the data flows in an application. Data is what should dictate all decisions in your application. If you don't know the data, how will you know how the code will perform, where the bottlenecks will be, etc.? You have no idea.
If the data is designed in a correct (and often simple) way, there should be only one way to write the code.

Take the CalcSphereCollisions function from above. It's very clear what this function will do, and it does only one thing over a range of items. There are several benefits to having the code like this.

1. It's easy to test in isolation. The input and output are clear.
2. It's easy to write unit tests for, as it can be tested in isolation.
3. The input can be verified early. We can test for bad things at the collision stage on all items instead of being inside some virtual Missile update.
4. It's cache-friendly. Input and output are just linear arrays in memory, so cache performance will be good. On Cell/SPU it will be easy to DMA the stream in and pretty much hide all memory latency.
On other architectures one can use cache functions to prefetch ahead.
5. It can run in parallel. It's possible to just divide the work on the list between x number of CPUs/SPUs to do the processing in parallel.
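
The last point is simple enough to sketch. Below is one possible way to cut a list of items into contiguous ranges, one per worker. Range and SplitWork are hypothetical names, and handing each range to an actual CPU or SPU is left out.

```cpp
#include <cstddef>
#include <vector>

struct Range
{
    std::size_t begin;  // first index in the range
    std::size_t end;    // one past the last index, so the range is [begin, end)
};

// Splits 'count' items into at most 'workers' contiguous ranges whose sizes
// differ by at most one item.
std::vector<Range> SplitWork(std::size_t count, std::size_t workers)
{
    std::vector<Range> ranges;
    if (count == 0 || workers == 0)
        return ranges;
    if (workers > count)
        workers = count;  // never create empty ranges

    const std::size_t base  = count / workers;
    const std::size_t extra = count % workers;  // the first 'extra' ranges get one more item

    std::size_t begin = 0;
    for (std::size_t w = 0; w < workers; ++w)
    {
        const std::size_t size = base + (w < extra ? 1 : 0);
        ranges.push_back({begin, begin + size});
        begin += size;
    }
    return ranges;
}
```

Each worker would then run the same function on its own range, for example on outputData + range.begin and inputData + range.begin with range.end - range.begin items. The ranges never overlap, so no locking is needed on the output.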