Why visualise data?

This excellent short video from Column Five demonstrates the difference that even basic data visualisation techniques can make.

Optimisation Workflow

I believe the key to optimising any program is measurement, not writing l33t code, which seems to be what a lot of programmers think optimisation is! The optimisation process is all about finding the slow parts of your program and speeding them up by refactoring your source code until it meets your target performance goals. Without measurements or target performance goals, optimisation is a waste of time, as programmers are very bad at guessing where the slow parts of a program are and very good at optimising pieces of the program that do not need it.

This diagram emphasises the importance of measurement in the optimisation process: you cannot begin the process, evaluate your optimisations, discard your optimisations or honestly finish the process without first measuring the program's performance against your target performance. As I have mentioned before, the initial optimisations to a program tend to yield the largest returns, and then returns drop off until much more time is invested. These initial easy optimisations are not usually the sort of changes new programmers expect: they expect sexy l33t code like inline assembler for big performance wins, not tweaking compiler flags or removing virtual method calls made through a base class pointer in an inner loop.
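The measure, change, re-measure loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two functions, the `TARGET_SECONDS` goal and the `should_keep` rule are all hypothetical names invented for this sketch, and a real project would measure with a proper profiler on representative workloads.

```python
import timeit

# Hypothetical target: the whole timed batch should finish within this budget.
TARGET_SECONDS = 0.01


def sum_of_squares_loop(n):
    """Baseline: add up 0^2 + 1^2 + ... + (n-1)^2 one term at a time."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_of_squares_formula(n):
    """Candidate optimisation: closed form for the same sum."""
    return (n - 1) * n * (2 * n - 1) // 6


def measure(func, n, number=5, repeat=3):
    """Best wall-clock time in seconds for `number` calls of func(n)."""
    return min(timeit.repeat(lambda: func(n), number=number, repeat=repeat))


def should_keep(baseline_seconds, candidate_seconds, results_match):
    """Keep a change only if it is still correct and measurably faster."""
    return results_match and candidate_seconds < baseline_seconds


N = 100_000
baseline_seconds = measure(sum_of_squares_loop, N)
candidate_seconds = measure(sum_of_squares_formula, N)
results_match = sum_of_squares_loop(N) == sum_of_squares_formula(N)

keep = should_keep(baseline_seconds, candidate_seconds, results_match)
print(f"baseline:  {baseline_seconds:.6f}s")
print(f"candidate: {candidate_seconds:.6f}s")
print("keep the change" if keep else "discard the change")
print("target met" if candidate_seconds <= TARGET_SECONDS else "keep optimising")
```

The point of the sketch is that every decision (keep, discard, stop) is driven by a measurement compared against a baseline or a target, never by a guess.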

Programming Language Layer Cake

There are many different programming languages in existence today (I’m going to write about the languages I am familiar with) and I like to visualise their relationships as layers of a cake. There are two main ways to categorise programming languages: as having high or low levels of abstraction, and as either compiled or interpreted languages.

The bottom half of the cake consists of the languages with low levels of abstraction, with microcode at the bottom and C++ at the top. All of these languages are compiled languages, which means their source code is translated by a compiler into a lower level language, e.g. C++ is compiled into assembler. I categorise C++ as a lower level language for the purposes of this diagram because its memory management is still a manual process, unlike the languages in the layers above it, which makes it less trivial to use.

Microcode is the lowest level of abstraction available; it is the layer inside the processor used to implement assembly language instructions. Assembly language is the human-readable form of the instructions executed by computer processors (CPUs); it is specific to each CPU family and is also the lowest level at which software can typically interact with the CPU. Next are the common low level languages C and C++, which I’ve bundled into the same layer because C++ compilers can compile most C code, so the two are not really completely distinct, although C++ is a higher level language than C as it supports object oriented programming, which is typically considered a feature of high level languages.

The top half of the cake can be roughly split into two groups of programming languages: those that are compiled (blue box) into a lower level language by a compiler before execution, and those that are interpreted (green box) at run time by a specific interpreter program. The bottom halves of the green and blue boxes represent the respective compilers and interpreters for the languages. It is worth noting that although the .NET programming languages (C#, C++/CLI and VB.NET) and the Java family of languages are all compiled, they are compiled into byte code executables, which are then executed (run) by a virtual machine (usually written in C/C++). The interpreted languages Ruby, PHP and Python are not compiled ahead of time but are processed at run time by an interpreter program (again usually written in C/C++). Interestingly, implementations do exist that turn Ruby (JRuby & IronRuby) and Python (Jython & IronPython) source code into byte code for the Java virtual machine or the .NET CLI, allowing Ruby or Python to be run on any Java or .NET CLI virtual machine.

Code written in compiled languages like Java or the .NET languages must be recompiled before it can run on a different kind of virtual machine; however, this is usually not a serious issue, as .NET and Java virtual machines exist for most operating systems in some form or another. Recompilation is not a requirement for interpreted languages, which only need a functioning interpreter on the target operating system in order to run. This makes them easier to use across multiple operating systems and leads to them being considered more portable than the compiled languages.
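The line between the compiled and interpreted groups is blurrier than the cake suggests. As a minimal sketch, the standard Python interpreter (CPython) itself translates source code into byte code internally and then runs that byte code, with no separate compile step for the programmer. The snippet below uses only the built-in `compile` and `exec` functions and the standard `dis` module; the source string and the `<example>` file name are made up for illustration.

```python
import dis

# A one-line program, held as plain source text.
source = "result = 6 * 7"

# Ask the interpreter to translate the source into a code object.
code_object = compile(source, "<example>", "exec")

# The raw byte code is just a sequence of bytes.
print(type(code_object.co_code))

# Run the code object and read the variable it assigned.
namespace = {}
exec(code_object, namespace)
print(namespace["result"])

# Show a human-readable listing of the byte code instructions.
# The exact listing differs between Python versions.
dis.dis(code_object)
```

The instruction listing is an implementation detail of the interpreter version in use, which is one reason the source code, rather than the byte code, is what gets moved between machines.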

The Game Development Cycle

The six phases in the average game development cycle are illustrated below:

The initial phase of a project is ‘Pre Production’, where the main game concepts and features are brainstormed, designed, prototyped and signed off on. The end goal of this phase is to prove (usually via a demo) the core game play experience and to produce a feature brief, technical designs and a development schedule. The project team is usually lightly staffed until ‘Production’ begins. Next comes the main development phase, ‘Production’. This is usually the longest phase in a project and is where the game’s features and content are created and put together. The ‘Production’ phase is usually broken into several parts separated by milestones, with pre-set deliverables according to the development schedule.

After production come the main bug fixing portions of the cycle, although ideally bug fixing should be a continuous activity during the whole cycle. ‘Alpha’ is usually defined as the phase when all game features have been implemented but the game still contains major bugs, such as crash bugs, and the game play still needs tweaking. Next is ‘Beta’, usually defined as feature complete with no major or crash bugs. Work during ‘Beta’ is usually restricted to minor tweaking of the game.

Post ‘Beta’, all that is left to do is ship the product. Traditionally this meant sending a disc (usually a gold-coloured CD-R) of the title to the publisher for duplication for retail, which is where the term ‘going Gold’ comes from. During the ‘Gold’ phase of development any changes are forbidden except bug fixes requested by the publisher.

The final phase of a project is ‘Post Production’. During this phase the whole development team is usually not present, as people take holidays or are assigned to other projects. The main focus during ‘Post Production’ is the clean up and refactoring of the code base, pipelines and game systems. This includes analysing the development cycle, usually in the form of ‘Post-mortem’ style reviews and brainstorming sessions, with the aim of improving future development efforts based on recent experience.

User Population Vs User Expertise

Mapping user expertise against the user population of a product tends to result in a bell shaped curve, indicating that most of the user population for a piece of software has an intermediate level of expertise. This is an important observation when developing software, as developers (being expert users themselves) will tend to develop more expert features and customer support will push for more beginner features. There is then a high probability that the largest part of the user population (the intermediate users) will be ignored, at the cost of lost product adoption or sales.
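To put rough numbers on the bell curve, here is a minimal sketch that assumes expertise follows a normal distribution. Everything specific in it is an assumption made for illustration: the 0 to 100 expertise scale, the mean of 50, the standard deviation of 15, and the cut-offs that place beginners and experts more than one standard deviation from the mean.

```python
from statistics import NormalDist

# Assumed model: expertise scores are normally distributed (hypothetical scale).
expertise = NormalDist(mu=50, sigma=15)

# Hypothetical cut-offs, one standard deviation either side of the mean.
BEGINNER_MAX = 35
EXPERT_MIN = 65

# Share of the user population that falls into each band.
beginners = expertise.cdf(BEGINNER_MAX)
experts = 1 - expertise.cdf(EXPERT_MIN)
intermediates = expertise.cdf(EXPERT_MIN) - expertise.cdf(BEGINNER_MAX)

print(f"beginners:     {beginners:.1%}")
print(f"intermediates: {intermediates:.1%}")
print(f"experts:       {experts:.1%}")
```

Under these assumptions roughly two thirds of users sit in the intermediate band, which is the group that neither expert-minded developers nor beginner-focused support staff naturally argue for.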

Cost, Quality & Performance

It is generally held that there are three primary properties of a software product: cost of development, performance (speed, memory use etc.) and software quality. It is believed that you can only ever obtain two of these properties, at the cost of the remaining one, e.g. you can have high performance and high quality but it won’t be cheap to develop.

My experience over the years tends to confirm this rule of thumb, but I can’t help wondering whether that fictional high performance, high quality and low development cost software has ever been made.