A guided tour
The MicroFloatingPoints package is organized into four modules:

- MicroFloatingPoints: the main module containing the definition of the parameterized type Floatmu and the associated methods;
- MicroFloatingPoints.MFPUtils: a module providing miscellaneous utility functions for the Floatmu type;
- MicroFloatingPoints.MFPPlot: a module offering various graphical ways to display Floatmu floating-point numbers;
- MicroFloatingPoints.MFPRandom: the module overloading Random.rand to produce Floatmu random values.
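Each of these submodules is loaded with the usual using syntax, as we will do progressively in this tour. For reference, loading everything at once would look as follows (a sketch only; the plotting functions of MFPPlot additionally require a plotting backend such as PyPlot, as explained below):

using MicroFloatingPoints
using MicroFloatingPoints.MFPUtils
using MicroFloatingPoints.MFPPlot
using MicroFloatingPoints.MFPRandom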
Once the package is correctly installed (see Installation), we start our tour by loading the MicroFloatingPoints module:
julia> using MicroFloatingPoints
We can now define a new floating-point type MuFP
with 2 bits for the exponent (the first parameter) and 2 bits for the fractional part (the second parameter):
julia> MuFP = Floatmu{2,2}
Floatmu{2, 2}
Such a type is very limited, and a call to floatmax will give us the largest representable finite float:
julia> floatmax(MuFP)
3.5
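This value can be recovered from the two format parameters, assuming the standard IEEE 754 encoding conventions (the all-ones exponent being reserved for infinities and NaNs): 2 exponent bits give a bias of $2^{2-1}-1=1$ and a largest normal exponent of $1$, while 2 fraction bits give a largest significand of $2-2^{-2}$, so that
\[\operatorname{floatmax}(\text{Floatmu}\{2,2\}) = (2 - 2^{-2}) \times 2^{1} = 3.5.\]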
Conversely, we can obtain the smallest positive float in the MuFP
format with the μ
method:
julia> μ(MuFP)
0.25
Note that this value is a subnormal number; it is smaller than the smallest positive normal float, which is obtained by calling floatmin:
julia> floatmin(MuFP)
1.0
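Both values follow from the same parameters: the smallest normal exponent is $1-\text{bias}=0$, and the smallest positive subnormal keeps only the last of the 2 fraction bits, with the implicit leading bit at zero, so that
\[\operatorname{floatmin}(\text{Floatmu}\{2,2\}) = 2^{0} = 1.0, \qquad \mu(\text{Floatmu}\{2,2\}) = 2^{-2} \times 2^{0} = 0.25.\]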
Graphics with MicroFloatingPoints.MFPPlot
The MicroFloatingPoints
package offers several graphical functionalities that are all available in MicroFloatingPoints.MFPPlot
when the plotting package PyPlot
is also loaded:
julia> using MicroFloatingPoints.MFPPlot, PyPlot
Loading PyPlot will trigger the loading of a package extension. Alternatively, PythonPlot can also be used, with the alias const plt = pyplot.
To better assess what we can do with such a small type, let us display all finite representable values on the real line. The MFPPlot
module has just the right method:
julia> real_line(-floatmax(MuFP),floatmax(MuFP));
Since the difference between any two distinct MuFP floats is always greater than or equal to μ(MuFP), it becomes apparent why the introduction of subnormal numbers (in purple in the picture above) ensures the property:
\[\forall (a,b)\in\text{MuFP}\colon |b-a| = 0 \iff a=b\]
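For such a tiny format, the property can even be checked exhaustively. The sketch below uses the FloatmuIterator type presented later in this tour and assumes that subtraction is overloaded for Floatmu in the same way as addition is; the true result is exactly what gradual underflow guarantees:

julia> R = FloatmuIterator(-floatmax(MuFP), floatmax(MuFP));

julia> all((a - b == 0) == (a == b) for a in R, b in R)
true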
You may also notice in the figure that the predecessor of MuFP(2.0), which is 1.75, is displayed as "1.8". This is because, following IEEE 754 requirements, the decimal string used to represent a Floatmu is the shortest one that ensures a correct round-trip to the same Floatmu. For our very small format Floatmu{2,2}, "1.8" and "1.75" are represented by the same value; consequently, "1.8" is chosen over "1.75".
Exhaustive search for rounded additions
The type MuFP
is so small that we can easily perform exhaustive searches with it. For example, we can display graphically whether the sum of any two finite MuFP
floats needs to be rounded or not, using the inexact()
and reset_inexact()
methods to, respectively, test whether the preceding computation needed rounding and to reset the global inexact flag[1]:
plt.figure()
plt.title("Exhaustive search for rounded sums in Floatmu{2,2}")
# Iterator over all finite Floatmu{2,2} values.
TotalIterator = FloatmuIterator(-floatmax(MuFP), floatmax(MuFP))
N = length(TotalIterator)
Z = zeros(Int, N, N)
let i = 1
    for v1 in TotalIterator
        j = 1
        for v2 in TotalIterator
            reset_inexact()
            v = v1 + v2
            # 0: exact sum, 1: rounded sum, 2: overflow.
            Z[i,j] = 0
            if inexact()
                Z[i,j] = isfinite(v) ? 1 : 2
            end
            j += 1
        end
        i += 1
    end
end
V = collect(TotalIterator)
plt.imshow(Z, origin="lower", cmap="Oranges")
plt.yticks(0:(length(V)-1), [string(V[i]) for i in 1:length(V)])
plt.xticks(0:(length(V)-1), [string(V[i]) for i in 1:length(V)], rotation=90);
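Outside such a loop, inexact() and reset_inexact() can also be used on their own. Here is a minimal sketch with one exact and one rounded addition; the outcomes shown are the ones the rounding behavior described above predicts:

julia> reset_inexact(); MuFP(1.0) + MuFP(2.0); inexact()  # 3.0 is representable: no rounding
false

julia> reset_inexact(); MuFP(2.0) + MuFP(0.25); inexact() # 2.25 is not representable: rounding occurred
true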
Note the use of a FloatmuIterator
to enumerate all floating-point numbers in a range.
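A FloatmuIterator(a, b) enumerates every float from a to b; for instance, collecting all Floatmu{2,2} values between 0 and 1 yields the five floats visible around the origin in the figure of the previous section (a small sketch; the output shown is the one expected given the format):

julia> collect(FloatmuIterator(MuFP(0.0), MuFP(1.0)))
5-element Vector{Floatmu{2, 2}}:
 0.0
 0.25
 0.5
 0.75
 1.0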
We obtain the following matrix, where a salmon cell means that the sum of the values in its row and column needs no rounding, an orange cell means that the result needs rounding to be representable as a Floatmu{2,2}, and a dark red cell indicates an overflowing addition.
Random floats with MicroFloatingPoints.MFPRandom
Let us now draw some BFloat16
floats uniformly at random in $[0,1)$. We will use the MicroFloatingPoints.MFPRandom
module to overload the rand
method for the type Floatmu
.
using DataStructures
using PyPlot
using MicroFloatingPoints
using MicroFloatingPoints.MFPRandom

BFloat16 = Floatmu{8,7}   # bfloat16-like format: 8 exponent bits, 7 fraction bits
ndraws = 1000000
plt.figure()
plt.title("Drawing $ndraws values at random in BFloat16[0,1)")
T = [rand(BFloat16) for i in 1:ndraws]
Tc = counter(T)   # counts how many times each float was drawn
We can now display the number of times each float was drawn:
for x in Tc
    (k, v) = x   # k: the float drawn, v: number of times it was drawn
    plot([k, k], [0, v], marker=".", color="blue", alpha=0.5)
end
(low, high) = extrema(collect(values(Tc)))
plt.ylim(ymin=0.99*low, ymax=1.01*high)
Arithmetic with various precisions
The BFloat16
and Float16
formats both represent floating-point numbers with 16 bits. The BFloat16 format trades precision for a larger range. Let us compare the results obtained when summing the values of a vector with both types:
using MicroFloatingPoints
using Random
using Distributions   # Uniform is used further below

Random.seed!(42)
BFloat16 = Floatmu{8,7}    # bfloat16: 8 exponent bits, 7 fraction bits
MuFloat16 = Floatmu{5,10}  # Float16-like: 5 exponent bits, 10 fraction bits
T64 = [rand() for i in 1:1000]
bfT16 = [BFloat16(x) for x in T64]
FT16 = [MuFloat16(x) for x in T64]
println(sum(T64))
println(sum(bfT16))
println(sum(FT16))
502.1034961498109
256.0
503.5
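The stagnation of the BFloat16 sum at 256 is easy to explain: with only 8 significand bits, the next BFloat16 after 256 is 258, so once the running sum reaches 256, any further summand taken from $[0,1)$ is rounded away. A quick check of this spacing argument (a sketch; the result follows from the format parameters under round-to-nearest):

julia> BFloat16(256.0) + BFloat16(0.75) == BFloat16(256.0)
true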
For small values in $[0,1)$, the effect of the smaller significand is thus drastic. On the other hand, the small range of the Float16 format, emulated here by MuFloat16, makes it useless for computations with medium to large numbers:
T64 = [rand(Uniform(min(floatmin(BFloat16), floatmin(MuFloat16)),
                    max(floatmax(BFloat16), floatmax(MuFloat16))/100)) for i in 1:100]
bfT16 = [BFloat16(x) for x in T64]
FT16 = [MuFloat16(x) for x in T64]
println(sum(T64))
println(sum(bfT16))
println(sum(FT16))
1.7828270493465184e38
1.7944577943096364e38
Inf
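The Inf on the last line is an overflow: Floatmu{5,10} has the same range as Float16, whose largest finite value is 65504, far below the magnitudes drawn above, whereas the BFloat16 range extends to about $3.4\times10^{38}$. A quick check with 1.0e38, an arbitrary value chosen between the two ranges (a sketch; the results follow from the ranges just mentioned):

julia> isfinite(MuFloat16(1.0e38))
false

julia> isfinite(BFloat16(1.0e38))
true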