Number Systems

Digital systems represent all information as sequences of bits. Understanding number systems and their conversions is fundamental to how computers store and process data.

Positional Number Systems

A number in base b (radix b) with digits dₙdₙ₋₁...d₁d₀.d₋₁d₋₂... has value:

Σᵢ dᵢ × bⁱ

Binary (Base 2)

Digits: {0, 1}. Each digit is a bit (binary digit).

1011₂ = 1×2³ + 0×2² + 1×2¹ + 1×2⁰ = 8 + 0 + 2 + 1 = 11₁₀

Terminology:

Nibble: 4 bits
Byte: 8 bits
Word: Architecture-dependent (16, 32, or 64 bits)

Octal (Base 8)

Digits: {0, 1, 2, 3, 4, 5, 6, 7}. Each octal digit = 3 bits.

375₈ = 3×64 + 7×8 + 5 = 253₁₀
375₈ = 011 111 101₂

Used in Unix file permissions (chmod 755 = rwxr-xr-x).

Hexadecimal (Base 16)

Digits: {0-9, A-F} (A=10, B=11, ..., F=15). Each hex digit = 4 bits.

0x1A3F = 1×4096 + 10×256 + 3×16 + 15 = 6719₁₀
0x1A3F = 0001 1010 0011 1111₂

Ubiquitous in computing: memory addresses, color codes (#FF00FF), MAC addresses, byte values.

Conversions

Decimal to Binary

Integer part: Repeated division by 2, read remainders bottom-up.

42 ÷ 2 = 21 R 0
21 ÷ 2 = 10 R 1
10 ÷ 2 = 5  R 0
5  ÷ 2 = 2  R 1
2  ÷ 2 = 1  R 0
1  ÷ 2 = 0  R 1
→ 42₁₀ = 101010₂

Fractional part: Repeated multiplication by 2, read integer parts top-down.

0.625 × 2 = 1.25  → 1
0.25  × 2 = 0.5   → 0
0.5   × 2 = 1.0   → 1
→ 0.625₁₀ = 0.101₂

Some fractions (like 0.1₁₀) produce infinite repeating binary fractions — the root cause of floating-point "errors."

Binary ↔ Hex

Group binary digits in 4s (pad with leading zeros):

1011 0110 1100₂ = B6C₁₆
0xDEAD = 1101 1110 1010 1101₂

Binary ↔ Octal

Group in 3s:

101 110 100₂ = 564₈

Binary Coded Decimal (BCD)

Each decimal digit encoded separately in 4 bits:

92₁₀ = 1001 0010_BCD   (not 1011100₂)

Advantages: Easy conversion to/from decimal display. No rounding for decimal fractions. Disadvantages: Wastes space (6 of 16 4-bit codes unused). Arithmetic is more complex.

Used in financial calculations, embedded displays, and some database decimal types.

Gray Code

Adjacent codes differ in exactly one bit. Avoids transient errors during transitions.

Decimal	Binary	Gray
0	000	000
1	001	001
2	010	011
3	011	010
4	100	110
5	101	111
6	110	101
7	111	100

Binary to Gray: Gᵢ = Bᵢ ⊕ Bᵢ₊₁ (XOR with next higher bit), G_msb = B_msb.

Gray to Binary: Bᵢ = Gᵢ ⊕ Bᵢ₊₁, B_msb = G_msb.

Applications: Rotary encoders, Karnaugh maps, error reduction in ADC.

Signed Integer Representations

Comparison of sign-magnitude, ones complement, and twos complement representations

Sign-Magnitude

MSB is the sign (0 = positive, 1 = negative), remaining bits are the magnitude.

+5 = 0 0101
-5 = 1 0101

Problems: Two zeros (+0 and -0). Arithmetic is complex (separate add/subtract logic).

Range (n bits): -(2ⁿ⁻¹ - 1) to +(2ⁿ⁻¹ - 1).

One's Complement

Negate by flipping all bits.

+5 = 0 0101
-5 = 1 1010

Problems: Two zeros (0000... and 1111...). End-around carry needed in addition.

Two's Complement

The standard representation used by virtually all modern hardware.

Negate by flipping all bits and adding 1:

+5  = 0000 0101
~5  = 1111 1010  (flip)
-5  = 1111 1011  (add 1)

Properties:

One zero (no negative zero)
Range (n bits): -2ⁿ⁻¹ to +2ⁿ⁻¹ - 1
8-bit: -128 to +127
16-bit: -32768 to +32767
32-bit: -2,147,483,648 to +2,147,483,647
64-bit: -9.22 × 10¹⁸ to +9.22 × 10¹⁸
Addition works the same as unsigned (hardware simplicity)
MSB is the sign: 0 = non-negative, 1 = negative
Asymmetric: |min| = |max| + 1 (there's no +128 in 8-bit)

Overflow: Adding two positives can produce a negative (or two negatives → positive). Detected when the carry into the sign bit differs from the carry out.

Sign extension: To widen an n-bit value to m bits (m > n), replicate the sign bit: 1011 (4-bit -5) → 1111 1011 (8-bit -5).

Arithmetic in Two's Complement

  0101 (+5)          1011 (-5)
+ 0011 (+3)        + 1101 (-3)
------              ------
  1000 (+8)         11000 → 1000 (-8) ✓ (carry discarded)

Subtraction: A - B = A + (-B) = A + (~B + 1).

Fixed-Point Representation

A fixed number of bits before and after the binary point.

Q notation: Qm.n = m integer bits + n fractional bits.

Q3.4: 011.1010 = 2 + 1 + 0.5 + 0.125 = 3.625

Advantages: Integer arithmetic hardware. Exact decimal fractions (if chosen well). Deterministic timing. Disadvantages: Limited range and precision. Must carefully track the binary point.

Used in DSP, embedded systems, financial calculations, and game engines.

IEEE 754 Floating-Point

Single Precision (32-bit, f32)

[S | EEEEEEEE | MMMMMMMMMMMMMMMMMMMMMMM]
 1     8 bits          23 bits

Value = (-1)^S × 1.M × 2^(E - 127)

Bias: 127
Exponent range: -126 to +127
Precision: ~7 decimal digits
Range: ~±3.4 × 10³⁸

Double Precision (64-bit, f64)

[S | EEEEEEEEEEE | MMMM...M (52 bits)]
 1    11 bits         52 bits

Value = (-1)^S × 1.M × 2^(E - 1023)

Bias: 1023
Exponent range: -1022 to +1023
Precision: ~15-16 decimal digits
Range: ~±1.8 × 10³⁰⁸

Special Values

Exponent	Mantissa	Value
0	0	±0
0	≠ 0	Subnormal: ±0.M × 2^(-126)
255 (f32) / 2047 (f64)	0	±∞
255 / 2047	≠ 0	NaN

Rounding Modes

Round to nearest, ties to even (default, "banker's rounding"): Minimizes statistical bias.
Round toward +∞ (ceiling)
Round toward -∞ (floor)
Round toward zero (truncation)

Floating-Point Gotchas

0.1 + 0.2 ≠ 0.3        (representation error)
(a + b) + c ≠ a + (b + c)  (not associative)
x + 1 = x               (for large enough x: x > 2⁵³ in f64)
x - x ≠ 0               (for x = NaN)

Comparison: Never use == for floats. Use |a - b| < ε or relative comparison.

Applications in CS

Memory addressing: Hexadecimal for addresses (0x7FFF_FFFF).
Networking: IP addresses (32-bit), MAC addresses (48-bit hex).
Color: RGB hex codes (#FF8800 = orange).
Character encoding: ASCII (7-bit), Unicode code points (U+0041 = 'A').
Instruction encoding: Opcodes in binary/hex.
Cryptography: Keys, hashes in hexadecimal.
Error detection: CRC, checksums operate on binary data.
Digital signal processing: Fixed-point for real-time audio/video processing.