*Summary of IEEE-754*

This is a summary of what I understand about IEEE-754 notation.

*CMIIW ?*

For the purpose of explanation, single precision is chosen.

*Normalised:*

IEEE-754 is a way of storing numbers in scientific notation.

Example of scientific notation : 1.00 x 10^5

Here 1.00 is the fraction part (the Mantissa), 5 is the exponent part, and 10 is the radix.

Note that the explanation below uses binary.

Notation :

First bit = Sign of the number, 0 as positive, 1 as negative - the 'sign bit'

The next 8 bits = the Exponent part, stored in excess-127 format (the stored value is the true exponent plus 127).

FORBIDDEN → 00000000 and 11111111 are forbidden as exponents for normalised single precision (both are explained later).

00000000 as an exponent is ONLY FOR DENORMALISED numbers (and zero).

11111111 as an exponent is ONLY FOR INFINITY (and for NaN, when the fraction part is not all 0).

The last 23 bits = the Fraction part (aka the Mantissa).

These 23 bits represent the bits to the right of the binary point. To the left of the binary point there is always a 1 (strictly for normalised numbers), so it is not stored and is simply assumed to be there.

All 0 (twenty-three 0's) represents the value of 1.0, while all 1 (twenty-three 1's) represents the value of approximately 2.0.

Note : The binary point is the dot between the digits of the fraction part, like a decimal point, but in binary :)
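To see the three fields of the notation for a real value, here is a small Python sketch. It assumes the standard `struct` module is used to get at the raw 32 bits; the helper name `float_to_fields` is made up for this illustration.

```python
import struct

def float_to_fields(x):
    """Split x, stored as single precision, into (sign, exponent, fraction)."""
    # Pack as a big-endian 32-bit float, then read the same 4 bytes back
    # as an unsigned 32-bit integer so the individual bits can be inspected.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # first bit
    exponent = (bits >> 23) & 0xFF   # next 8 bits, excess-127
    fraction = bits & 0x7FFFFF       # last 23 bits
    return sign, exponent, fraction

# -6.5 is -1.101 (binary) x 2^2
sign, exponent, fraction = float_to_fields(-6.5)
print(sign, format(exponent, "08b"), format(fraction, "023b"))
print("true exponent:", exponent - 127)
```

The stored exponent here is 129, which is the true exponent 2 plus 127, and the fraction field holds only the 101 after the binary point.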

Range :

For negative : From just above -2.0 x 2^(127) to -1.0 x 2^(-126)

For positive : From 1.0 x 2^(-126) to just below 2.0 x 2^(127)
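The ends of the normalised range can be checked by building the two extreme bit patterns directly. This is only a sketch; `bits_to_float` is a hypothetical helper written for this note, and the hex constants are the smallest and largest exponents that are not forbidden.

```python
import struct

def bits_to_float(bits):
    """Reinterpret a 32-bit unsigned integer as a single-precision float."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits))
    return value

# Sign 0, exponent 00000001, fraction all 0: 1.0 x 2^(-126)
smallest_normal = bits_to_float(0x00800000)
# Sign 0, exponent 11111110, fraction all 1: just below 2.0 x 2^(127)
largest_normal = bits_to_float(0x7F7FFFFF)

print(smallest_normal == 2.0 ** -126)
print(largest_normal == (2.0 - 2.0 ** -23) * 2.0 ** 127)
```

The largest fraction is 2.0 minus one step of 2^(-23), which is why the top of the range is only approximately 2.0 x 2^(127).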

Example :

1.125 (decimal) in IEEE-754 :

In scientific notation of binary: 1.125 = 1.001 x 2^(0).

Sign : Positive (0)

Fraction part : 1.001 (2^0 + 2^(-3))

Exponent part : 0

Solve the Exponent part first; exponent part must be in the form of 8-bit excess-127.

0 in 8-bit excess-127 → 01111111 (0 + 127 = 127)

Fraction part → In IEEE-754, ignore the 1 to the left of the binary point, COPY the digits to the right of the binary point (001 in this example), and then pad with 0s to fill all 23 bits (in this case 001 + twenty more 0s).

Therefore 1.125 in IEEE-754 notation becomes:

0 01111111 00100000000000000000000

(Spaces between the sign, exponent and fraction are only there to make things more visible)

To make it shorter, convert to hexadecimal:

1.125 in IEEE-754 notation in hexadecimal = 3F900000
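The worked example can be reproduced by assembling the three fields by hand and comparing against what Python's `struct` module produces for 1.125. This is a sketch for checking the arithmetic, with variable names chosen for this note. A true exponent of 0 is stored as 0 + 127 = 127, which is 01111111.

```python
import struct

sign = 0                   # positive
exponent = 0 + 127         # true exponent 0, stored in excess-127
fraction = 0b001 << 20     # "001" followed by twenty 0s

# Place each field at its position in the 32-bit word.
bits = (sign << 31) | (exponent << 23) | fraction
print(format(bits, "032b"))
print(format(bits, "08X"))   # 3F900000

# Cross-check against the machine's own encoding of 1.125.
(expected,) = struct.unpack(">I", struct.pack(">f", 1.125))
print(bits == expected)
```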

*Denormalised*

Basically the same as normalised. However, the exponent part is ALWAYS 00000000. Although 00000000 in excess-127 would normally mean -127, FOR denormalised form 00000000 indicates that the exponent is -126, and the digit to the left of the binary point of the fraction part is ALWAYS 0.

Note : For denormalised, all 0 in the fraction part represents the value of 0.0 and all 1 in the fraction part represents the value of approximately 1.0.

*What can be written with denormalised?*

Example : 0.001 (binary) x 2^(-126)

Fraction part is ALWAYS BELOW 1.0, exponent part is ALWAYS -126.

Range :

Negative : From 0 to -1 x 2^(-126)

Positive : From 0 to 1 x 2^(-126)

Note : Two zeros exist: +0 and -0.

Smallest possible non zero value → -2^(-23) * 2^(-126) and 2^(-23) * 2^(-126), i.e. -2^(-149) and 2^(-149)
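The denormalised rule (leading 0 instead of 1, exponent fixed at -126) can be sketched as a small decoder. The function name `decode_denormalised` is an assumption made for this illustration; the result is compared with what `struct` gives for the same bit pattern.

```python
import struct

def decode_denormalised(bits):
    """Value of a 32-bit pattern whose exponent field is 00000000."""
    if (bits >> 23) & 0xFF != 0:
        raise ValueError("exponent field is not 00000000")
    sign = -1.0 if bits >> 31 else 1.0
    fraction = bits & 0x7FFFFF
    # No implicit leading 1: the value is 0.fraction x 2^(-126).
    return sign * (fraction / 2 ** 23) * 2.0 ** -126

# Only the last fraction bit set: the smallest positive non zero value.
smallest = decode_denormalised(0x00000001)
print(smallest == 2.0 ** -149)

(check,) = struct.unpack(">f", struct.pack(">I", 0x00000001))
print(smallest == check)
```

With the fraction all 0 the same function returns 0.0 (or -0.0 when the sign bit is set), which is where the two zeros come from.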

*Infinity*

Basically the same layout again, except the exponent part is 11111111 and the fraction part is all 0.

Note : There exist positive infinity and negative infinity.
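A quick way to see both infinities is to build their bit patterns and let Python interpret them. This is a sketch with a hypothetical helper `bits_to_float`. The last lines are an addition beyond the notes above: the same exponent with a non-zero fraction is not an infinity but a NaN.

```python
import math
import struct

def bits_to_float(bits):
    """Reinterpret a 32-bit unsigned integer as a single-precision float."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits))
    return value

pos_inf = bits_to_float(0x7F800000)  # sign 0, exponent 11111111, fraction all 0
neg_inf = bits_to_float(0xFF800000)  # sign 1, exponent 11111111, fraction all 0
print(pos_inf, neg_inf)              # inf -inf

# Same exponent, but a non-zero fraction: this decodes as NaN, not infinity.
not_a_number = bits_to_float(0x7FC00000)
print(math.isnan(not_a_number))
```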

32 bit and 64 bit floating point formats:
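For comparison, the 64-bit (double precision) format uses the same three fields with different widths: 1 sign bit, 11 exponent bits in excess-1023, and 52 fraction bits. The sketch below splits 1.125 in both formats; the helper `fields` and its parameters are made up for this note.

```python
import struct

def fields(x, float_fmt, int_fmt, exp_bits, frac_bits):
    """Split x into (sign, exponent, fraction) for the given field widths."""
    (bits,) = struct.unpack(int_fmt, struct.pack(float_fmt, x))
    sign = bits >> (exp_bits + frac_bits)
    exponent = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    fraction = bits & ((1 << frac_bits) - 1)
    return sign, exponent, fraction

# 32 bit: 1 + 8 + 23 bits, excess-127
single = fields(1.125, ">f", ">I", 8, 23)
# 64 bit: 1 + 11 + 52 bits, excess-1023
double = fields(1.125, ">d", ">Q", 11, 52)

print(single, "true exponent:", single[1] - 127)
print(double, "true exponent:", double[1] - 1023)
```

In both cases the true exponent comes out as 0 and the fraction starts with 001, only padded to a different length.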

