On float­ing-point arith­metic and IEEE 754

Created: Sun Feb 3 09:39:23 CET 2019

Last mod­i­fied: Sun Feb 3 23:50:31 CET 2019


I’m learn­ing about float­ing-point com­pu­ta­tions.

I wrote both μ and ν with­out ac­cel­er­a­tors for the sake of sim­plic­ity.

(define (μ k a)
    (define (helper a n)
      (if (null? a)
          0
          (+ (/ (car a) (expt k n))
             (helper (cdr a) (+ n 1)))))
    (helper a 1))

In case you’ve ever won­dered how to en­code 1/10 in bi­nary.

(μ 2 '(0 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 1))
; => 0.10000000149011612

Regarding IEEE 754 stan­dard for rep­re­sent­ing float­ing point num­bers, we’d have an ex­po­nent of -4, en­coded as 123, and the man­tissa 0.6.

(define (ν k a)
    (define (helper a n)
      (if (null? a)
          0
          (+ (* (car a) (expt k n))
             (helper (cdr a) (- n 1)))))
    (helper a (- (length a) 1)))

ν does the same as μ with nat­u­rals in­stead of pos­i­tive real num­bers. For ex­am­ple, (ν 2 '(1 1 0)) => 6, (ν 10 '(1 1 0)) => 110 and (ν 16 '(15 15)) => 255.

We can use those two func­tions to eval­u­ate any en­coded fp to a nu­mer­i­cal constant.

We can test this with the num­ber 0.085, which can be en­coded as fol­lows.

Sign Exponent  Mantissa
0    123 (-4)  0.36
0    01111011  01011100001010001111011

We’ll just read the steps in re­verse or­der.

(let ((exponent '(0 1 1 1 1 0 1 1))
      (mantissa '(0 1 0 1 1 1 0 0 0 0 1 0 1 0 0 0 1 1 1 1 0 1 1)))
  (exact->inexact
    (* (expt 2 (- (ν 2 exponent) 127))
       (+ 1 (μ 2 mantissa)))))
; => 0.08500000089406967

Of course, you’d have to mul­ti­ply the whole re­sult by -1 had the sign bit been set.

The same process can be ap­plied with the dou­ble pre­ci­sion form us­ing a bias of 1023 instead of 127.

(let ((exponent '(0 1 1 1 1 1 1 1 0 1 1))
      (mantissa '(1 0 0 1 1 0 0 1 1 0 0 1 1
                  0 0 1 1 0 0 1 1 0 0 1 1 0
                  0 1 1 0 0 1 1 0 0 1 1 0 0
                  1 1 0 0 1 1 0 0 1 1 0 1 0)))
    (exact->inexact   
      (* (expt 2 (- (ν 2 exponent) 1023))
         (+ 1 (μ 2 mantissa)))))
; => 0.1

Rounding er­rors

How come 0.1 has no ex­act rep­re­sen­ta­tion ?

The num­ber we ap­prox­i­mate 0.1 with is of the form, x/y where y is a power of two.

We have 10-1 = xy-1. It fol­lows that y = 10x (cross prod­uct) ≡ 2·2·…·2 = 10x. Because this equa­tion has no solution what­so­ever, we know that we can never rep­re­sent 0.1 in a base-2 system. Had the de­nom­i­na­tor been a power of two, we would have got ourselves an ex­act rep­re­sen­ta­tion.

source code