15th October 2014

One of the major headaches when writing code is the accuracy of floating point arithmetic. Because only a limited number of bits are available to represent fractional numbers, even something as simple as adding up the same numbers in a different order can produce a different result.

```
float total1 = (a + b) + c;
float total2 = a + (b + c);
```

So in the above example, total1 and total2 may differ slightly. You may argue that the difference is so small that it is insignificant, but let’s take a look at how this error can propagate over time.

For each of the following examples we will calculate the average of ten random numbers.


In the first instance we will compare the average calculated by adding the numbers in order with the average calculated by adding them in reverse order.


The two averages come out identical! No problem here then? Well, not quite.

A float in C++ is typically 32 bits wide, of which only 24 bits hold the significant digits. That means if you add a very large number to a very small number, you aren’t going to have enough bits to represent the entire result and you will lose accuracy. The x87 floating point unit on Intel CPUs uses 80 bits internally for calculations, then rounds the result and stores it as 32 bits again. So let’s repeat the experiment by adding these same 10 numbers 10000 times.


As an added bonus we will also calculate the average of the same set of numbers in a random order. Why a random order, you might ask? This is where parallel processing comes in. When you parallelise code, either across multiple CPUs or on a GPU, these additions could, in theory, be performed in any order.

This time the results differ, not only from each other but also from the original calculation. This makes data consistency between sequential code and parallel code a huge problem. However, while the temptation may be to avoid parallel processing of data because it makes your results ‘wrong’, the reality is that if adding the same set of numbers in a different order loses too much accuracy for you, then your results were always wrong. You were never using enough precision in your calculation to begin with.