A guided tour

The MicroFloatingPoints package is organized into four modules:

MicroFloatingPoints: the main module containing the definition of the parameterized type Floatmu and the associated methods;
MicroFloatingPoints.MFPUtils: a module providing miscellaneous utility functions for the Floatmu type;
MicroFloatingPoints.MFPPlot: a module offering various graphical ways to display Floatmu floating-point numbers;
MicroFloatingPoints.MFPRandom: the module overloading Random.rand to produce Floatmu random values.

Once the package is installed (see Installation), we start our tour by loading the MicroFloatingPoints module:

julia> using MicroFloatingPoints

We can now define a new floating-point type MuFP with 2 bits for the exponent (the first parameter) and 2 bits for the fractional part (the second parameter):

julia> MuFP = Floatmu{2,2}
Floatmu{2, 2}

Such a type is very limited, and a call to floatmax will give us the largest finite float representable:

julia> floatmax(MuFP)
3.5

Conversely, we can obtain the smallest positive float in the MuFP format with the μ method:

julia> μ(MuFP)
0.25

Note that this value is a subnormal number, which is smaller than the smallest positive normal float, obtained by calling floatmin:

julia> floatmin(MuFP)
1.0
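
The three values 3.5, 0.25 and 1.0 can be recovered by hand from the two type parameters. The sketch below assumes an IEEE 754-like layout (exponent bias 2^(E-1) - 1, the all-ones exponent reserved for infinities and NaNs); `format_limits` is a hypothetical helper written in plain Julia for illustration, not a function of the package.

```julia
# Characteristic values of an IEEE 754-like format with E exponent bits and
# F fraction bits. Illustrative helper only; it does not use the package.
function format_limits(E::Int, F::Int)
    bias = 2^(E - 1) - 1          # exponent bias
    emin = 1 - bias               # exponent of the smallest normal float
    emax = bias                   # exponent of the largest finite float
    return (smallest_subnormal = 2.0^(emin - F),
            smallest_normal    = 2.0^emin,
            largest_finite     = (2.0 - 2.0^(-F)) * 2.0^emax)
end

println(format_limits(2, 2))
```

For E = 2 and F = 2 this gives 0.25, 1.0 and 3.5, which agrees with μ(MuFP), floatmin(MuFP) and floatmax(MuFP) above.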

Graphics with `MicroFloatingPoints.MFPPlot`

The MicroFloatingPoints package offers several graphical functionalities that are all available in MicroFloatingPoints.MFPPlot when the plotting package PyPlot is also loaded:

julia> using MicroFloatingPoints.MFPPlot, PyPlot

Loading PyPlot triggers the loading of a package extension. Alternatively, PythonPlot can be used instead, with the alias const plt = pyplot.

To better assess what we can do with such a small type, let us display all finite representable values on the real line. The MFPPlot module has just the right method:

julia> real_line(-floatmax(MuFP),floatmax(MuFP));

[Figure: Floatmu{2,2} representable finite values]

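Any two distinct MuFP floats differ by at least μ(MuFP), which is representable only because of the subnormal numbers (in purple in the picture above), so their computed difference never rounds to zero. Without subnormals, a difference such as 1.25 - 1.0 = 0.25, which is smaller than floatmin(MuFP), would be flushed to zero. The introduction of subnormal numbers therefore ensures the property: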

\[\forall (a,b)\in\text{MuFP}^2\colon |b-a| = 0 \iff a=b\]

You may also notice in the figure that the predecessor of MuFP(2.0), which is 1.75, is displayed as "1.8". Following IEEE 754 requirements, the decimal string used to represent a Floatmu is the shortest one that ensures a correct round-trip to the same Floatmu. In our very small format Floatmu{2,2}, "1.8" and "1.75" both round to the same value; consequently, "1.8" is chosen over "1.75".
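
The round-trip argument can be checked on a small model of the format. The sketch below is plain Julia and does not use the package: the list of values is written out by hand under the assumption of an IEEE 754-like layout, and `nearest` is a hypothetical helper.

```julia
# Finite non-negative values of a format with 2 exponent bits and 2 fraction
# bits, listed by hand (assumption: IEEE 754-like layout).
vals = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.5, 3.0, 3.5]

# Nearest representable value to x. Ties are not handled specially because
# neither input below falls halfway between two values.
nearest(x) = vals[argmin(abs.(vals .- x))]

println(nearest(1.75))                  # 1.75
println(nearest(1.8))                   # 1.75
println(nearest(1.8) == nearest(1.75))  # true
```

Since both decimal strings denote the same float, the shorter one is enough to recover it. With the package loaded, comparing MuFP(1.8) and MuFP(1.75) should lead to the same conclusion.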

Exhaustive search for rounded additions

The type MuFP is so small that we can easily perform exhaustive searches with it. For example, we can display graphically whether the sum of any two finite MuFP floats needs to be rounded. The inexact() method tests whether the preceding computation needed rounding, and reset_inexact() resets the global inexact flag [1]:

plt.figure()
plt.title("Exhaustive search for rounded sums in Floatmu{2,2}")
# Enumerate every finite MuFP value, from -floatmax to floatmax
TotalIterator = FloatmuIterator(-floatmax(MuFP),floatmax(MuFP))
N = length(TotalIterator)
# Z[i,j] is 0 for an exact sum, 1 for a rounded finite sum, 2 for an overflow
Z = zeros(Int,N,N)
for (i,v1) in enumerate(TotalIterator)
    for (j,v2) in enumerate(TotalIterator)
        reset_inexact()   # clear the flag before the addition
        v = v1+v2
        if inexact()      # the sum had to be rounded
            Z[i,j] = isfinite(v) ? 1 : 2
        end
    end
end
V = collect(TotalIterator)
labels = string.(V)       # tick labels: one per enumerated float
plt.imshow(Z,origin="lower", cmap="Oranges")
plt.yticks(0:(length(V)-1),labels)
plt.xticks(0:(length(V)-1),labels,rotation=90);

Note the use of a FloatmuIterator to enumerate all floating-point numbers in a range.

We obtain the following matrix, where a salmon cell means that the sum of the values in the corresponding row and column needs no rounding, an orange cell means that the result must be rounded to be represented by a Floatmu{2,2}, and a dark red cell means that the addition overflowed.
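
The three categories can be cross-checked without the package on a hand-written model of the format. The sketch below rests on several assumptions: the list of finite values follows an IEEE 754-like layout, rounding is to nearest with ties to even, and an overflow counts as a rounded result. `classify` is a hypothetical helper, and the model holds a single zero, so its dimensions need not match the matrix above.

```julia
# Finite values of a format with 2 exponent bits and 2 fraction bits,
# written out by hand (assumption: IEEE 754-like layout).
const POSITIVE = [0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.5, 3.0, 3.5]
const FINITE = sort(vcat(-POSITIVE, 0.0, POSITIVE))

# Classify a + b; the sum itself is exact in Float64 for these small values.
function classify(a, b)
    s = a + b
    s in FINITE && return :exact
    # 3.75 is halfway between 3.5 and 4.0; ties-to-even picks 4.0, which is
    # not a finite value of this format.
    abs(s) >= 3.75 && return :overflow
    return :rounded
end

counts = Dict(:exact => 0, :rounded => 0, :overflow => 0)
for a in FINITE, b in FINITE
    counts[classify(a, b)] += 1
end
println(counts)
```

For instance, 1.0 + 1.0 is exact, 1.75 + 0.5 = 2.25 must be rounded, and 3.5 + 0.25 overflows.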

Random floats with `MicroFloatingPoints.MFPRandom`

Let us now draw some BFloat16 floats uniformly at random in $[0,1)$. We will use the MicroFloatingPoints.MFPRandom module to overload the rand method for the type Floatmu.

using DataStructures
using PyPlot
using MicroFloatingPoints
using MicroFloatingPoints.MFPRandom

BFloat16 = Floatmu{8,7}

ndraws=1000000
plt.figure()
plt.title("Drawing $ndraws values at random in BFloat16[0,1)")
T = [rand(BFloat16) for i in 1:ndraws]
Tc = counter(T)

We can now display the number of times each float was drawn:

# Draw one vertical line per distinct float, as high as its number of occurrences
for (k,v) in Tc
    plot([k,k],[0,v],marker=".",color="blue",alpha=0.5)
end
(low,high) = extrema(collect(values(Tc)))
plt.ylim(ymin=0.99*low,ymax=1.01*high)

Arithmetic with various precisions

The BFloat16 and Float16 formats both represent floating-point numbers with 16 bits. BFloat16 trades precision for a larger range: it has 8 exponent bits and 7 fraction bits, against 5 and 10 for Float16. Let us compare the results obtained when summing the values of a vector with both types:

using MicroFloatingPoints
using Random
using Distributions
Random.seed!(42)

BFloat16 = Floatmu{8,7}
MuFloat16 = Floatmu{5,10}
T64 = [rand() for i in 1:1000]
bfT16 = [BFloat16(x) for x in T64]
FT16 = [MuFloat16(x) for x in T64]
println(sum(T64))
println(sum(bfT16))
println(sum(FT16))

502.1034961498109
256.0
503.5
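
The value 256.0 printed for BFloat16 is no coincidence. With 8 significant bits (7 fraction bits plus the implicit one), consecutive floats from 256 upwards are 2 apart, so adding any value below 1 rounds back to 256. The sketch below mimics this in plain Julia; `round_to_bits` and `rounded_sum` are hypothetical helpers that ignore exponent range limits and any double rounding through Float64, and they are not the package's implementation.

```julia
# Round x to p significant bits, ties to even. Overflow and subnormals are
# deliberately ignored in this model.
function round_to_bits(x::Float64, p::Int)
    x == 0 && return x
    scale = 2.0^(exponent(x) - (p - 1))  # gap between neighbouring p-bit floats near x
    return round(x / scale) * scale      # Julia's round() breaks ties to even
end

# Left-to-right summation where every partial sum is rounded to p bits.
function rounded_sum(values, p)
    acc = 0.0
    for x in values
        acc = round_to_bits(acc + round_to_bits(x, p), p)
    end
    return acc
end

println(round_to_bits(256.0 + 0.99, 8))    # 256.0 (8 significant bits, as in BFloat16)
println(round_to_bits(256.0 + 0.99, 11))   # 257.0 (11 significant bits, as in Float16)
println(rounded_sum(fill(0.5, 1000), 8))   # 128.0
println(rounded_sum(fill(0.5, 1000), 11))  # 500.0
```

With a constant increment of 0.5 the 8-bit accumulator stalls even earlier, at 128, where the spacing between floats becomes 1 and every addition is a tie resolved towards 128.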

For small values in $[0,1)$, the effect of a smaller significand is drastic: the BFloat16 sum stalls at 256.0. On the other hand, the small range of the type Float16 makes it unsuitable for computation with medium to large numbers:

T64 = [rand(Uniform(min(floatmin(BFloat16),floatmin(MuFloat16)),
            max(floatmax(BFloat16),floatmax(MuFloat16))/100)) for i in 1:100]
bfT16 = [BFloat16(x) for x in T64]
FT16 = [MuFloat16(x) for x in T64]
println(sum(T64))
println(sum(bfT16))
println(sum(FT16))

1.7828270493465184e38
1.7944577943096364e38
Inf
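
The Inf in the last line follows directly from the range of the 16-bit format with 5 exponent bits. The sketch below uses Julia's built-in Float16, assumed here to have the same parameters as Floatmu{5,10}, and recomputes the upper bound of the sampling interval from the BFloat16 parameters instead of calling the package.

```julia
# Largest finite Float16 (5 exponent bits, 10 fraction bits).
largest16 = Float64(floatmax(Float16))
println(largest16)                      # 65504.0

# Upper bound of the sampling interval used above: floatmax(BFloat16)/100,
# computed by hand for 8 exponent bits and 7 fraction bits.
bound = (2.0 - 2.0^-7) * 2.0^127 / 100

# Anything above largest16 overflows when converted to Float16.
println(isinf(Float16(1.0e5)))          # true
# Share of the sampling interval that Float16 can represent as finite values.
println(largest16 / bound)
```

Almost every value drawn from that interval is far beyond 65504, so its conversion is already infinite before any addition takes place.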

[1] More accurately, the inexact flag is local to each spawned task; it is not shared among tasks.
