Fun with optimisation in C

Bit twiddling with C is quite a common thing but sometimes it can be hard to understand something quite simple. Here’s a line of code. Can you say what it does? Note that a,b and c are all ints.
c = b ^ ((a ^ b) & -(a < b));
Or how about this which looks quite similar
c = a ^ ((a ^ b) & -(a < b));
The first line puts the minimum of a and b into c, while the second line puts the max of a and b. Both came from a page of bit twiddling hacks by Sean Eron Anderson at Stanford university. The significant thing about these is not that they provide two clever if incomprehensible ways of calculating min and max but they do it without branches.
If you look at the source of min and max in stdlib.h they are defined like this:
#define max(a,b) (((a) > (b)) ? (a) : (b))
#define min(a,b) (((a) < (b)) ? (a) : (b))
Because modern CPUs use instruction caches (pipelines), they try to keep them full through a technique called branch prediction. If a branch happens then the CPU has to read instructions from the wherever the branch goes to and this causes a pipeline stall which slightly slows down processing. So branchless min and max should run faster than branching code, at least in theory.
I tried it on Windows using a small Visual C++ (Well C) program doing the four in a loop 10 million times. I also ran it in 32-bit and 64-bit. All runs were done in release mode which lets the compiler optimise.
Curiously the 32-bit took almost no time at all while the 64-bit took a lot lot longer. There wasn’t much difference between branchless and branch. Possibly the code is too small and the benefits of branchless are less significant because the instructions fall inside the instruction cache for 32-bit while 64-bit may be too big and so is affected by instruction cache misses.
This is the 32-bit time.
fastMin 0.0000000 fastMax 0.1000000 Min 0.1000000 Max 0.1000000
And this is the 64-bit. Quite a lot of difference!
fastMin 1757.6000000 fastMax 1677.9000000 Min 2039.4000000 Max 2134.4000000
This is the code which compiles in Visual C++. You can get the hr_time timing code (for WIndows) from the sdlttf.zip file. I increment and decrement a and b to stop the compiler possibly spotting that the values would be invariant otherwise and doing too-clever optimisation stuff.
#include <math.h>
#include <stdio.h>
#include "hr_time.h"
#define NUMLOOPS 10000000
int a = 1000;
int b = 67;
int c;
double t;
stopWatch s;
int main() {
startTimer(&s);
for (int i = 0; i < NUMLOOPS; i++) {
a++;
b++;
c = b ^ ((a ^ b) & -(a < b));
}
stopTimer(&s);
t = getElapsedTime(&s) * 1000000.0;
printf( "fastMin %8.7f\n", t);
startTimer(&s);
for (int i = 0; i < NUMLOOPS; i++) {
a--;
b--;
c = a ^ ((a ^ b) & -(a < b));
}
stopTimer(&s);
t = getElapsedTime(&s) * 1000000.0;
printf("fastMax %8.7f\n", t);
startTimer(&s);
for (int i = 0; i < NUMLOOPS; i++) {
a++;
b++;
c = min(a, b);
}
stopTimer(&s);
t = getElapsedTime(&s) * 1000000.0;
printf("Min %8.7f\n", t);
startTimer(&s);
for (int i = 0; i < NUMLOOPS; i++) {
a--;
b--;
c = max(a, b);
}
stopTimer(&s);
t = getElapsedTime(&s) * 1000000.0;
printf("Max %8.7f\n", t);
}

So I got round to writing that SDL vs TTF timing comparison. However I had to make a few assumptions. The program was written for Windows but easily converts to Linux (apart from the timing code but I did that 
Back in the day (2011) I did a bit of Flash game development. It was a curious technology compared to what I was used to. It was originally a way of scripting graphics and other media but really took off when ActionScript, a programming language very much based on JavaScript was introduced.
There’s a saying “Don’t assume- it makes an ass of U and me“, and I er fell foul of this a month ago. A couple of months ago I setup a cheap VPS. It was one of those that you pay every month. What I didn’t realise was you are explicitly meant to renew the hosting every month and they send you an email with a link. Of course what did I do?, I er forgot to renew it. Annoyingly, I’d installed Virtualmin, redirected a domain and bought a cheap SSL certificate. All lost.
The final game will use graphics but those graphics will be based on characters, so I’ve started off by drawing a room or two using the provided extended ASCII characters.
This follows on from yesterday’s post about creating QR Codes. How about creating a web game similar to the Choose Your own Adventure type games but with a difference? I did think about implementing this as a proof of concept and may yet still but ideas are worthless until executed so I’m happy to put this out there. Here are a few notes on a proposed web QR game.
Sometimes you come across a design that is sheer simplicity, could not be easier to use and it just works. That QR code you will not be surprised takes you to this very website!

Don’t expect this to be Call of Duty standard but then those games typically have a 50GB or higher footprint on disk.