You can use fixed point mathematics to save Flash memory and increase performance.

Typically you will need to calculate a result as a 'real' number, for
which you usually turn to the floating point library (because it's
easy). The main advantage of floating point is that it is simple to use,
but behind the scenes it consumes memory, and the processor must do
more work.

Surprisingly, you can do many calculations without loss of
accuracy, provided you accept that calculations are made to a specific
number of decimal places, which is true in engineering anyway.

Warning: Floating point is inexact. You cannot represent every
number as a float, e.g. 1/3 cannot be represented exactly by a floating
point representation since it is a recurring number (and in binary even
0.1 recurs). Rounding errors creep in, which is why you have to use
larger and larger floating point representations to get an accurate
result.
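As a rough illustration of the warning above, the sketch below adds 0.1 ten times, once as a double and once as an integer count of tenths. The function names are made up for this example, and the behaviour described assumes a typical IEEE 754 double.

```c
/* Adds 0.1 to itself ten times using a double. Mathematically the sum
   is 1, but 0.1 has no exact binary representation, so a small error
   accumulates (assuming typical IEEE 754 double arithmetic). */
double sum_of_tenths_float(void)
{
    double sum = 0.0;
    for (int i = 0; i < 10; i++) {
        sum += 0.1;
    }
    return sum;
}

/* The same sum held as an integer number of tenths is exact. */
int sum_of_tenths_fixed(void)
{
    int sum = 0;                 /* units: tenths */
    for (int i = 0; i < 10; i++) {
        sum += 1;                /* 0.1 is stored as 1 tenth */
    }
    return sum;                  /* 10 tenths, i.e. 1.0 */
}
```

On such a platform the floating point sum does not compare equal to 1.0, while the integer version always returns exactly 10 tenths.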

Instead of using
a floating point representation you use an integer variable and choose
where the fixed decimal point will be placed.

Theoretical operation of an 8 bit ADC

Let's say that you have an 8 bit ADC and you want to figure out the
voltage that the ADC is seeing as a real number. To do that you would normally work out
the floating point multiplier needed for each bit of the ADC:

bit_value = 5/pow(2,8) = 0.01953125

Where bit_value represents the voltage per bit of the ADC.

So if your ADC returned a value of 127 then the ADC voltage would be:

0.01953125 * 127 = 2.48V

For the maximum ADC value you would have:

0.01953125 * 255 = 4.98V
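The floating point version of this calculation could be sketched in C as follows. The names `ADC_VREF`, `ADC_STEPS` and `adc_to_volts_float` are invented for the example; only the 5V reference and the 8 bit resolution come from the text.

```c
#include <stdint.h>

#define ADC_VREF  5.0f     /* reference voltage in volts */
#define ADC_STEPS 256.0f   /* 2^8 steps for an 8 bit ADC */

/* Returns the voltage represented by an 8 bit ADC reading, using floats. */
float adc_to_volts_float(uint8_t reading)
{
    const float bit_value = ADC_VREF / ADC_STEPS;   /* 0.01953125 V per bit */
    return bit_value * (float)reading;
}
```

Calling `adc_to_volts_float(127)` gives 2.48046875, which is the 2.48V shown above before rounding. On a small microcontroller this one multiply pulls in the floating point library.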

Note: An ADC does not return 5V for the maximum input voltage, as it can
only return values from 0 to 255 and not 0 to 256 (256 * 0.01953125
is 5V).

Do not try to force the value by changing the scaling factor using
5/(pow(2,8)-1) as this introduces more error even though the maximum
output of the ADC will now show 5V (it is a fudge and the value shown is
not truly 5V).

The question is: how can you use an integer variable to store and manipulate this real-number calculation?

Example of using Integer Fixed point

An example problem

Here is the problem to solve:

Using the minimum memory resources of a microcontroller, read an 8 bit ADC with a 5V reference
and transmit the reading to a PC via the serial port
once every second.

The most important decision is not to use floating point, as this saves
a lot of memory. Everything else (sending data to the serial port
and timing) will not take up much memory: a software serial port (TX only)
takes about 90 memory words, and the internal hardware serial module needs even less.

For this problem you will want to calculate an ADC voltage which needs a multiplier of 0.0195.

Normal maths using a floating point variable would result in
the following:

Results using float for ADC Example

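| ADC reading | Floating point calculation | Floating point result |
|-------------|----------------------------|-----------------------|
| 0           | 0 * 0.0195                 | 0.0                   |
| 127         | 127 * 0.0195               | 2.4765                |
| 255         | 255 * 0.0195               | 4.9725                |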

With fixed point mathematics you can do this quickly and
efficiently using only integer type variables.

Note:
Floating point variables use up large amounts of resources in a microcontroller, as
most small microcontrollers only handle integer types natively.
To do floating point, complex library functions are called.
This is similar to the old 386 processor, which had no floating point
hardware of its own; doing floating point in software was far too slow,
so a separate floating point coprocessor was added.

How to set up for fixed point

So how do you do it? First of all, the step size of each ADC bit is:

5V/256 = 0.0195V (19.5mV)

Where 256 is the number of steps for the ADC's 8 bits: 2^8, or pow(2,8).

Next, work out the size of the integer variable that is big
enough to hold the maximum expected value but is also the minimum size
integer you can get away with. Factors such as the number of ADC bits
and required calculation accuracy affect the variable size.

Ignoring the leading zeros of 0.0195, choose a multiplier and check the sizes as follows:

In this example choose a multiplier of 195. Here's the maximum output value:

Maximum value : 255 * 195 = 49725

The maximum value that a 16 bit unsigned int can store is 2^16 - 1 = 65535.

Here you can use a 16 bit unsigned int since 49725 is smaller than 65535.

Note: For larger calculation
results use a larger integer type, e.g. "long" or "long long", and also
remember that integers are different lengths depending on the compiler
setup! (This is why types such as uint32_t are used to specify bit
length.)
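The size check described above can be sketched in C. `fits_in_uint16` is a hypothetical helper, not something from a library; it forms the worst-case product in 32 bits and compares it with the 16 bit limit.

```c
#include <stdint.h>
#include <stdbool.h>

/* Returns true if max_reading * scale fits in a 16 bit unsigned integer.
   The product is formed in 32 bits, which is wide enough for the
   ADC-sized inputs used here. */
bool fits_in_uint16(uint32_t max_reading, uint32_t scale)
{
    uint32_t max_product = max_reading * scale;
    return max_product <= UINT16_MAX;
}
```

For the 8 bit example, `fits_in_uint16(255, 195)` is true because 49725 is no larger than 65535, so a `uint16_t` is enough.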

Results for fixed point ADC Example

The results of the fixed point calculation are shown in the table
below:

| ADC reading | Integer calculation | Integer result |
|-------------|---------------------|----------------|
| 0           | 0 * 195             | 0              |
| 127         | 127 * 195           | 24765          |
| 255         | 255 * 195           | 49725          |

Note how every number above is an integer value, yet the digits of the
result match the floating point calculations in the previous table. All that
changes is the location of the decimal point, which is fixed four places
to the left. This decimal point is the part that you as the programmer
must remember and take into account for further calculations.

The calculation uses the compiler's internal '16 bit multiply' routine, which is
much simpler than a floating point multiply, so you save memory.
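A minimal sketch of the fixed point conversion for the 8 bit example might look like this. `SCALE_FACTOR` and `adc_to_fixed` are names chosen for the illustration.

```c
#include <stdint.h>

#define SCALE_FACTOR 195u   /* 0.0195 V per bit with the decimal point moved 4 places */

/* Returns the ADC voltage in units of 0.0001 V, i.e. the decimal point
   is 4 places to the left of the returned integer. */
uint16_t adc_to_fixed(uint8_t reading)
{
    return (uint16_t)((uint16_t)reading * SCALE_FACTOR);
}
```

For example, `adc_to_fixed(127)` returns 24765, which you read as 2.4765V. No floating point code is involved, only an integer multiply.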

Displaying Output

Simple Algorithm

To display a fixed point value you convert the
integer value to a string representation (easy to do), then test the
value for size to decide where leading zeros are needed. For the example
above, with the decimal point four places to the left (49725 means 4.9725):

If the value is 10000 or more then print out the leftmost digit, else print zero.

Print out a dot (the decimal point).

Then carry on in the same way:

if >= 1000 print the next digit else print zero.

if >= 100 print the next digit else print zero.

if >= 10 print the next digit else print zero.

if >= 1 print the last digit else print zero.
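The simple algorithm could be sketched in C as below, assuming the decimal point is four places to the left as in the 8 bit example, so that the first size test is against 10000. `format_fixed4` is a hypothetical helper that writes into a caller-supplied buffer rather than printing, and it uses the standard `snprintf` for the integer to string step.

```c
#include <stdint.h>
#include <stdio.h>

/* Formats a value whose decimal point is 4 places to the left
   (49725 means 4.9725) into out, which must hold at least 7 bytes. */
void format_fixed4(uint16_t value, char *out)
{
    static const uint16_t thresholds[5] = {10000u, 1000u, 100u, 10u, 1u};
    char digits[6];    /* up to 5 digits plus the terminator */
    int next = 0;      /* next unused character in digits */
    int pos = 0;       /* next free position in out */

    snprintf(digits, sizeof digits, "%u", (unsigned)value);

    for (int i = 0; i < 5; i++) {
        /* a digit exists at this position only if the value is large enough */
        if (value >= thresholds[i]) {
            out[pos++] = digits[next++];
        } else {
            out[pos++] = '0';
        }
        if (i == 0) {
            out[pos++] = '.';   /* decimal point after the first digit */
        }
    }
    out[pos] = '\0';
}
```

With a 7 byte buffer, 49725 is formatted as "4.9725" and 195 as "0.0195". On a very small device you would probably replace `snprintf` with a hand-written conversion.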

Alternative Generic print method

Another, more generic way is to use the integer divide (and remainder)
routines and loop until the value is zero. Example pseudo code (if the value is held in 'fixed'):

do {
store fixed % 10 as a character in the string. Move to the next string position.
fixed /= 10;
} while (fixed);
reverse the string.

Every digit must be stored, including zeros, and a value of zero must still produce one digit.
You would use a string buffer and pointers to do the above job, inserting the decimal point at its fixed position.
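One way the generic method might look in C is sketched here. `format_fixed` and its `places` parameter are assumptions made for the example; the loop stores every digit (zeros included), inserts the decimal point at the given position, and pads with zeros so there is always a digit in front of the point.

```c
#include <stdint.h>

/* Converts value to decimal text in out, with the decimal point
   'places' digits from the right. Digits are produced least significant
   first using % and /, then the string is reversed.
   Assumes places <= 9 and that out holds at least 12 bytes. */
void format_fixed(uint32_t value, uint8_t places, char *out)
{
    int len = 0;          /* characters written so far */
    uint8_t digits = 0;   /* digits written so far */

    /* do/while so that zero still produces a digit; keep going until
       there is at least one digit in front of the decimal point */
    do {
        out[len++] = (char)('0' + (value % 10u));
        value /= 10u;
        digits++;
        if (digits == places) {
            out[len++] = '.';
        }
    } while (value != 0u || digits <= places);

    /* reverse the string in place */
    for (int i = 0, j = len - 1; i < j; i++, j--) {
        char tmp = out[i];
        out[i] = out[j];
        out[j] = tmp;
    }
    out[len] = '\0';
}
```

Calling `format_fixed(49725, 4, buf)` leaves "4.9725" in `buf`, and `format_fixed(195, 4, buf)` leaves "0.0195". Because the position of the decimal point is a parameter, the same routine works for any of the scale factors in this article.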

Going Further

What about multiplying or dividing

When you start using fixed point you will probably need to multiply or
divide fixed point numbers. The key to doing this is to know where the
decimal point will be after the calculation.

In the example above the numbers were chosen to put the decimal point 4 places to the left
so we had the maximum output from the ADC:

255

and a scaling factor

195

If you normalize these numbers to scientific notation you have:

2.55E2

and

1.95E2

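This is useful as it now tells you where the decimal point will be: adding the powers when multiplying gives 2.55*10^2 * 1.95*10^2 = 4.9725*10^4, so the decimal point is 4 places to the left. (Equivalently, 195 is 0.0195 scaled up by 10^4, so the result is scaled up by 10^4 as well.)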

So for the maximum output number 49725 we have 4 [dp] 9725. The [dp] is imaginary - you have to remember where it is.

This seems like a lot of work, but once you have figured out the size of
variable required and the expected output value range you just use the
result and interpret that result in the rest of the program.
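To make the bookkeeping concrete, here is a small sketch that carries the decimal point position alongside the integer. The `fixed_t` type and the `fixed_mul` and `fixed_div` functions are hypothetical, written only to show the rule that decimal places add when multiplying and subtract when dividing; in a real program you would normally just remember the position rather than store it.

```c
#include <stdint.h>

/* A hypothetical fixed point value: 'raw' is the stored integer and
   'places' is how many digits the decimal point sits to the left. */
typedef struct {
    uint32_t raw;
    uint8_t  places;
} fixed_t;

/* Multiplying: the raw values multiply and the decimal places add.
   The caller must make sure the product fits in 32 bits. */
fixed_t fixed_mul(fixed_t a, fixed_t b)
{
    fixed_t result = { a.raw * b.raw, (uint8_t)(a.places + b.places) };
    return result;
}

/* Dividing: the raw values divide (truncating) and the places subtract.
   Assumes b.raw is non-zero and a.places >= b.places. */
fixed_t fixed_div(fixed_t a, fixed_t b)
{
    fixed_t result = { a.raw / b.raw, (uint8_t)(a.places - b.places) };
    return result;
}
```

The ADC reading 255 has 0 places and the scale factor 195 (representing 0.0195) has 4 places, so their product 49725 has 4 places, i.e. 4.9725.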

Remember, you don't lose accuracy with this method and you save a lot of
Flash memory (probably around 1k words or more). In addition the
execution speed will be vastly improved.

Example using 10 bit ADC

As a quick example, if you were to use the standard 10 bit ADC found in
PIC and ATmega chips then the following calculations would follow:

Voltage value per bit:

5V/1024 = 0.0048828125 (4.882mV)

The scale factor now becomes 488.

Note: Select the number of digits for the required precision e.g. you
could have chosen scale factor 4882 - but remember it changes the
decimal point location.

Maximum value : 1023 * 488 = 499224

Now you can see that this number is too big for a 16 bit integer so you would choose a long to hold the number.

The decimal point location comes from 1.023e3 * 4.88e2, so adding the powers gives
e5 = 5 places to the left. You can see that this is true since 4.99224
is about one bit value short of 5V. Here the accuracy is to 2 decimal places.
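For the 10 bit case the same idea can be sketched with a 32 bit result. `SCALE_10BIT` and `adc10_to_fixed` are illustrative names, and `uint32_t` is used in place of "long" so that the width does not depend on the compiler.

```c
#include <stdint.h>

#define SCALE_10BIT 488u   /* 0.00488 V per bit with the decimal point moved 5 places */

/* Returns the voltage of a 10 bit reading (0..1023) in units of
   0.00001 V, i.e. the decimal point is 5 places to the left. */
uint32_t adc10_to_fixed(uint16_t reading)
{
    /* widen before multiplying so the product cannot overflow 16 bits */
    return (uint32_t)reading * SCALE_10BIT;
}
```

`adc10_to_fixed(1023)` returns 499224, which you read as 4.99224V.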

Improving accuracy

By increasing the number of digits in the scaling value (voltage value
per bit) to 4882 you would get the following maximum output:

1023*4882 = 4994286

Adding one bit value of 4882 gives: 4994286 + 4882 = 4999168

So the above is accurate to 3 decimal places, with the decimal point now located 6 digits to the left.
Ultimately the number of digits you need in the scale factor depends on the accuracy required at the output.
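With the 4 digit scale factor, a sketch of the conversion plus a split into whole volts and fraction could look like this. All names here are invented for the example; the split uses integer divide and remainder by 10^6 because the decimal point is 6 places to the left.

```c
#include <stdint.h>

#define SCALE_10BIT_FINE 4882u      /* 0.004882 V per bit, decimal point 6 places left */
#define FIXED_ONE        1000000u   /* 10^6: the raw value that represents 1.000000 V */

/* Returns the voltage of a 10 bit reading in units of 0.000001 V. */
uint32_t adc10_to_fixed6(uint16_t reading)
{
    return (uint32_t)reading * SCALE_10BIT_FINE;
}

/* Whole volts: the digits to the left of the decimal point. */
uint32_t fixed6_whole(uint32_t fixed)
{
    return fixed / FIXED_ONE;
}

/* Fractional part in millionths of a volt: the digits to the right. */
uint32_t fixed6_fraction(uint32_t fixed)
{
    return fixed % FIXED_ONE;
}
```

For the maximum reading, `adc10_to_fixed6(1023)` is 4994286, which splits into 4 whole volts and a fraction of 994286, i.e. 4.994286V. When printing the fraction, remember to pad it with leading zeros to 6 digits.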
