Simple but effective optimisation in C

Here’s a short program. It runs the same pair of nested loops twice: the outer loop repeats a million times, and the inner loop indexes through a text string, adding up the character values. Instead of starting the total at 0, I set it to the outer loop index, to try and reduce the scope for immediate optimisation.

#include <stdio.h>
#include <string.h>
#include "hr_time.h"

stopWatch s;

const char* testString = "This is a rather long string just to prove a point";

int main(void)
{
    int total = 0;

    /* First pass: strlen() is called every time the loop condition is tested. */
    startTimer(&s);
    for (int i = 0; i < 1000000; i++) {
        total = i;
        for (int index = 0; index < (int)strlen(testString); index++) {
            total += testString[index];
        }
    }
    stopTimer(&s);
    printf("Total =%d Took %8.5f\n", total, getElapsedTime(&s));

    /* Second pass: the length is measured once, before the loops start. */
    startTimer(&s);
    int len = (int)strlen(testString);
    for (int i = 0; i < 1000000; i++) {
        total = i;
        for (int index = 0; index < len; index++) {
            total += testString[index];
        }
    }
    stopTimer(&s);
    printf("Total =%d Took %8.5f\n", total, getElapsedTime(&s));
    return 0;
}

I compiled and ran it twice, once in Debug and once in Release mode on Windows using MSVC.

Debug:

Total =1004673 Took 0.55710
Total =1004673 Took 0.11465

Release:

Total =1004673 Took  0.00762
Total =1004673 Took  0.00765

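Clearly, in the Release build the compiler is smart enough to realise that strlen(testString) never changes inside the loop and only needs evaluating once, so there’s no difference between the two times. In Debug, with no optimisation, strlen() is called every time the loop condition is tested, and each call walks the whole string to find its end. That makes the first loop nearly five times slower than the second.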

I also compiled and ran it with clang on Ubuntu 20.04 in the Hyper-V VM. The times with default optimisation were

0.18370 
0.10644

and with “-O3” added to the compile line for maximum optimisation, this changed to

0.0762
0.0745

which is almost identical to the Windows release compile times.