
Instruction Set Architecture

The Instruction Set Architecture (ISA) is the interface between software and hardware — the contract defining what instructions a processor supports, how data is accessed, and how programs control execution.

ISA Design Philosophy

RISC vs CISC

| Aspect | RISC | CISC |
|---|---|---|
| Instructions | Simple, uniform | Complex, variable |
| Instruction length | Fixed (32-bit typical) | Variable (1-15 bytes) |
| Execution | 1 instruction/cycle (goal) | Multi-cycle for complex ops |
| Memory access | Load/store only | Operands can be memory |
| Registers | Many (32+) | Fewer (8-16) |
| Decoding | Simple, fast | Complex, microcode |
| Compiler responsibility | High (scheduling, optimization) | Lower (hardware handles) |
| Examples | ARM, RISC-V, MIPS | x86, VAX, IBM z |

Modern reality: The distinction has blurred. x86 processors internally decode CISC instructions into RISC-like micro-operations (μops). ARM has added complex instructions. Performance depends more on microarchitecture than ISA philosophy.

Instruction Formats

Fixed-Length (RISC)

All instructions are the same size (e.g., 32 bits). Simplifies fetch and decode.

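RISC-V R-type (register-register), most significant bits on the left:

[funct7 | rs2 | rs1 | funct3 | rd | opcode]
  7 bits   5     5     3        5    7       = 32 bits

I-type (immediate):

[imm[11:0] | rs1 | funct3 | rd | opcode]
  12         5     3        5    7       = 32 bits

S-type (store):

[imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode]
  7          5     5     3        5          7       = 32 bits

The register fields (rs1, rs2, rd) sit at the same bit positions in every format that uses them, so the decoder can read registers before it knows the instruction type. MIPS also uses fixed 32-bit instructions but with a different layout; its R-type is [opcode(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6)].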
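As a rough sketch of how fixed-width fields are packed, the following Python functions assemble R-type and I-type words using the bit positions from the RISC-V specification (funct7 in bits 31:25 down to opcode in bits 6:0). The function names are illustrative, not part of any toolchain.

```python
def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack RISC-V R-type fields into a 32-bit word (MSB field first)."""
    return (
        ((funct7 & 0x7F) << 25)
        | ((rs2 & 0x1F) << 20)
        | ((rs1 & 0x1F) << 15)
        | ((funct3 & 0x7) << 12)
        | ((rd & 0x1F) << 7)
        | (opcode & 0x7F)
    )


def encode_i_type(imm, rs1, funct3, rd, opcode):
    """Pack RISC-V I-type fields; imm is a signed 12-bit value."""
    return (
        ((imm & 0xFFF) << 20)
        | ((rs1 & 0x1F) << 15)
        | ((funct3 & 0x7) << 12)
        | ((rd & 0x1F) << 7)
        | (opcode & 0x7F)
    )


def decode_r_type(word):
    """Extract R-type fields from a 32-bit word."""
    return {
        "funct7": (word >> 25) & 0x7F,
        "rs2": (word >> 20) & 0x1F,
        "rs1": (word >> 15) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rd": (word >> 7) & 0x1F,
        "opcode": word & 0x7F,
    }


# add x3, x1, x2  (opcode 0x33, funct3 0, funct7 0)
add_word = encode_r_type(0, 2, 1, 0, 3, 0x33)
# addi x1, x0, 5  (opcode 0x13, funct3 0)
addi_word = encode_i_type(5, 0, 0, 1, 0x13)
print(f"{add_word:08x} {addi_word:08x}")
```

Because every field has a fixed position, decoding is just a handful of shifts and masks, which is the property that makes fixed-length formats cheap to decode in hardware.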

Variable-Length (CISC)

Instruction size varies (1 to 15 bytes on x86). Code is compact, but the decoder must determine each instruction's length before it can find the next one.

x86 encoding:

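[Prefixes | Opcode | ModR/M | SIB | Displacement | Immediate]
 0-4 bytes  1-3     0-1      0-1    0, 1, 2, or 4   0, 1, 2, or 4

In 64-bit mode an optional REX prefix byte precedes the opcode, and a few forms of MOV carry an 8-byte immediate or address. Total instruction length is capped at 15 bytes.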
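To make the ModR/M field concrete, here is a small Python sketch that splits one ModR/M byte into its three subfields and applies it to a hand-picked instruction. The helper name and the register table are illustrative, and the table covers only the eight registers reachable without REX extension bits.

```python
# 64-bit register names indexed by their 3-bit encoding (no REX.R/REX.B extension).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_modrm(byte):
    """Split a ModR/M byte into (mod, reg, rm): 2 + 3 + 3 bits."""
    mod = (byte >> 6) & 0b11
    reg = (byte >> 3) & 0b111
    rm = byte & 0b111
    return mod, reg, rm


# 48 01 D8: REX.W prefix, opcode 01 (ADD r/m64, r64), ModR/M D8.
prefix, opcode, modrm = bytes.fromhex("4801d8")
mod, reg, rm = decode_modrm(modrm)

# mod == 0b11 means the r/m field names a register rather than a memory operand.
if mod == 0b11:
    print(f"add {REG64[rm]}, {REG64[reg]}")  # prints "add rax, rbx"
```

With any other mod value the r/m field selects a memory addressing form, possibly followed by a SIB byte and a displacement, which is why the decoder cannot know the instruction length until it has inspected this byte.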

Addressing Modes

How the instruction specifies the location of operands.

| Mode | Syntax | Meaning | Example |
|---|---|---|---|
| Immediate | #value | Operand is in the instruction | ADD R1, #5 |
| Register | Rn | Operand is in register | ADD R1, R2 |
| Direct | [addr] | Operand at memory address | LOAD R1, [0x1000] |
| Indirect | [Rn] | Address is in register | LOAD R1, [R2] |
| Base+Offset | [Rn + off] | Base register + constant offset | LOAD R1, [R2 + 8] |
| Indexed | [Rn + Ri] | Base + index register | LOAD R1, [R2 + R3] |
| Scaled indexed | [Rn + Ri\*s + off] | Base + scaled index + offset | x86: [RBX + RCX\*4 + 16] |
| PC-relative | [PC + off] | Relative to program counter | Branch targets, data |
| Auto-increment | [Rn]+ | Use Rn, then Rn += size | Stack pop |
| Auto-decrement | -[Rn] | Rn -= size, then use Rn | Stack push |

RISC typically supports only: register, immediate, base+offset. All memory access via load/store.

CISC supports many modes, allowing memory operands in arithmetic instructions.
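Most of these modes are special cases of one formula, base + index × scale + displacement. The Python sketch below computes an effective address that way; the function name, the register dictionary, and the 64-bit wraparound are assumptions made for illustration.

```python
def effective_address(regs, base=None, index=None, scale=1, disp=0):
    """Compute base + index*scale + disp, wrapped to 64 bits.

    regs maps register names to integer values. Omitting base and index
    gives direct addressing; omitting only index gives base+offset.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    addr = disp
    if base is not None:
        addr += regs[base]
    if index is not None:
        addr += regs[index] * scale
    return addr & 0xFFFFFFFFFFFFFFFF


regs = {"RBX": 0x1000, "RCX": 3}

# Scaled indexed: [RBX + RCX*4 + 16], e.g. element 3 of an int32 array
# that starts 16 bytes into a structure at RBX.
print(hex(effective_address(regs, base="RBX", index="RCX", scale=4, disp=16)))  # 0x101c

# Base+offset: [RBX + 8]
print(hex(effective_address(regs, base="RBX", disp=8)))  # 0x1008
```

A load/store ISA that offers only base+offset has to compute the scaled index with separate shift and add instructions before the memory access.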

Instruction Encoding

Opcode

Identifies the operation (ADD, SUB, LOAD, BRANCH, etc.).

Opcode design tradeoffs:

  • Fixed-length opcodes: Simple decode, wastes bits for simple instructions
  • Variable-length opcodes (Huffman-like): Compact, complex decode
  • Opcode expansion: Use some opcodes to extend the encoding space

Register Specifiers

5 bits → 32 registers. 4 bits → 16 registers. More registers = more bits per instruction but fewer spills to memory.

Immediates

Constants embedded in the instruction. Limited by instruction width.

RISC-V approach: Different immediate formats (I, S, B, U, J) pack bits into the fixed 32-bit instruction. LUI + ADDI can load any 32-bit constant (two instructions).
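The LUI + ADDI split has one subtlety: ADDI sign-extends its 12-bit immediate, so when bit 11 of the constant is set, the upper 20 bits must be rounded up to compensate. A minimal Python sketch, assuming RV32 semantics and using illustrative function names:

```python
def split_for_lui_addi(value):
    """Split a 32-bit constant into (hi20, lo12) for LUI + ADDI on RV32.

    lo12 is returned as a signed value in [-2048, 2047] because ADDI
    sign-extends its immediate; hi20 absorbs the resulting borrow.
    """
    value &= 0xFFFFFFFF
    lo = value & 0xFFF
    if lo >= 0x800:
        lo -= 0x1000  # interpret the low 12 bits as a signed immediate
    hi = ((value - lo) >> 12) & 0xFFFFF
    return hi, lo


def materialize(hi, lo):
    """Model what LUI rd, hi followed by ADDI rd, rd, lo leaves in rd."""
    return ((hi << 12) + lo) & 0xFFFFFFFF


hi, lo = split_for_lui_addi(0x12345FFF)
print(hex(hi), lo)  # 0x12346 -1
```

Here the low 12 bits 0xFFF act as -1, so LUI loads 0x12346000 rather than 0x12345000, and the ADDI brings the result back down to 0x12345FFF.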

Major ISAs

MIPS

Designed for simplicity and pedagogical clarity. Three instruction formats (R, I, J). 32 registers. Load/store architecture. Fixed 32-bit instructions.

Used extensively in textbooks (Patterson & Hennessy). Historical importance in RISC development. Still used in embedded (routers, early consoles).

ARM

Dominant in mobile and embedded. Several ISA versions:

  • ARMv7-A (32-bit): Conditional execution, barrel shifter, Thumb mode (16-bit compressed instructions)
  • ARMv8-A / AArch64 (64-bit): Clean 64-bit design, 31 general-purpose registers, NEON SIMD
  • ARM Thumb-2: Mixed 16/32-bit for code density

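Classic 32-bit ARM features: conditional execution (most instructions can be predicated), a flexible second operand (shift or rotate folded into the same instruction), and load/store multiple. AArch64 dropped general predication in favor of a few conditional-select instructions and replaced load/store multiple with load/store pair.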

x86 / x86-64

The dominant desktop/server ISA. Evolved over 45+ years:

8086 (16-bit, 1978) → 80386 (32-bit, 1985) → x86-64/AMD64 (64-bit, 2003)

Characteristics:

  • Variable-length instructions (1-15 bytes)
  • CISC: memory operands, complex addressing modes
  • Few architectural registers (originally 8, extended to 16 in x86-64)
  • Backwards compatible since 1978 (massive software ecosystem)
  • SSE/AVX SIMD extensions (128/256/512-bit vectors)
  • Modern implementations: out-of-order, superscalar, μop cache

Register set (x86-64): RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP, R8-R15 (16 general-purpose 64-bit registers).

RISC-V

Open, royalty-free ISA standard. Designed for extensibility and simplicity.

Base ISAs: RV32I (32-bit integer), RV64I (64-bit integer).

Standard extensions:

  • M: Integer multiply/divide
  • A: Atomic operations
  • F/D: Single/double floating-point
  • C: Compressed instructions (16-bit)
  • V: Vector operations

Key design decisions:

  • 32 registers (x0 hardwired to 0)
  • No condition codes (compare-and-branch instead)
  • No delay slots
  • Relaxed memory model (FENCE instruction for ordering)
  • Privileged architecture for OS support (M/S/U modes)

Ecosystem: Growing rapidly. Linux support, GCC/LLVM toolchains, commercial cores (SiFive), academic use.

Instruction Categories

Data Processing

  • Arithmetic: ADD, SUB, MUL, DIV, ADDI
  • Logical: AND, OR, XOR, NOT, shifts (SLL, SRL, SRA)
  • Comparison: SLT (set less than), CMP
  • Move: MOV, sign/zero extension

Data Transfer

  • Load: Read from memory to register (LW, LB, LH, LD)
  • Store: Write from register to memory (SW, SB, SH, SD)
  • Load upper: LUI (load upper immediate)
  • Stack: PUSH, POP (or synthesized via SP manipulation)

Control Flow

  • Unconditional jump: J, JAL (jump and link — function call)
  • Conditional branch: BEQ, BNE, BLT, BGE, BLTU, BGEU
  • Return: JR, JALR (jump register)
  • System: ECALL (system call), EBREAK (breakpoint)

Special

  • Atomic: Load-reserved/store-conditional (LR/SC), atomic swap (AMOSWAP)
  • Memory ordering: FENCE
  • CSR access: CSRRW, CSRRS (control/status registers)
  • NOP: Often encoded as an instruction with no effect; in RISC-V it is ADDI x0, x0, 0
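To show how the data-processing and control-flow categories combine, here is a toy Python interpreter for three RISC-V-style instructions. It is a sketch under simplifying assumptions: instructions are tuples rather than encoded words, branch offsets count instructions rather than bytes, and the `run` function and its tuple format are invented for this example.

```python
MASK32 = 0xFFFFFFFF


def run(program, max_steps=10_000):
    """Execute a list of (op, a, b, c) tuples and return the register file."""
    regs = [0] * 32
    pc = 0
    steps = 0
    while 0 <= pc < len(program):
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step limit exceeded")
        op, a, b, c = program[pc]
        next_pc = pc + 1
        if op == "addi":      # regs[a] = regs[b] + immediate c
            regs[a] = (regs[b] + c) & MASK32
        elif op == "add":     # regs[a] = regs[b] + regs[c]
            regs[a] = (regs[b] + regs[c]) & MASK32
        elif op == "bne":     # branch by offset c if regs[a] != regs[b]
            if regs[a] != regs[b]:
                next_pc = pc + c
        else:
            raise ValueError(f"unknown opcode: {op}")
        regs[0] = 0           # x0 is hardwired to zero
        pc = next_pc
    return regs


# Sum the integers 5 + 4 + 3 + 2 + 1 into x2.
program = [
    ("addi", 1, 0, 5),    # x1 = 5 (loop counter)
    ("addi", 2, 0, 0),    # x2 = 0 (accumulator)
    ("add", 2, 2, 1),     # x2 += x1
    ("addi", 1, 1, -1),   # x1 -= 1
    ("bne", 1, 0, -2),    # if x1 != x0, jump back to the add
]
print(run(program)[2])  # 15
```

The loop needs no condition-code register: BNE compares two registers and branches in one instruction, and x0 supplies the constant zero.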

Function Calling Convention

Caller/Callee Saved Registers

  • Caller-saved (temporary): The called function may overwrite them. Caller must save if needed across a call.
  • Callee-saved (preserved): The called function must save and restore them if used.

RISC-V convention:

  • a0-a7: Arguments (caller-saved); a0-a1 also carry return values
  • t0-t6: Temporaries (caller-saved)
  • s0-s11: Saved registers (callee-saved)
  • ra: Return address (caller-saved)
  • sp: Stack pointer (callee-saved)

Stack Frame

High addresses
┌──────────────────┐
│ Caller's frame   │
├──────────────────┤ ← old SP
│ Return address   │
│ Saved registers  │
│ Local variables  │
│ Outgoing args    │
├──────────────────┤ ← SP (current)
│                  │
Low addresses
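The layout above can be turned into concrete offsets from the new SP. The following Python sketch does so under stated assumptions: 8-byte saved registers (as on a 64-bit target), a 16-byte stack alignment, and any padding placed between the locals and the outgoing arguments. The `frame_layout` function is hypothetical; real compilers choose their own ordering within a frame.

```python
def frame_layout(saved_regs, local_bytes, outgoing_bytes, word=8, align=16):
    """Return (frame_size, offsets), with offsets relative to the new SP.

    Saved registers are placed at the top of the frame (highest addresses),
    locals directly below them, and outgoing arguments at SP itself.
    """
    raw = len(saved_regs) * word + local_bytes + outgoing_bytes
    frame_size = (raw + align - 1) // align * align  # round up to alignment

    offsets = {}
    offset = frame_size
    for reg in saved_regs:
        offset -= word
        offsets[reg] = offset
    offsets["locals"] = offset - local_bytes
    offsets["outgoing"] = 0
    return frame_size, offsets


size, offsets = frame_layout(["ra", "s0"], local_bytes=20, outgoing_bytes=0)
print(size, offsets)
```

For this input the raw size is 36 bytes, rounded up to 48. A prologue following this layout would subtract 48 from SP and store ra at SP+40 and s0 at SP+32; the epilogue reloads both and adds 48 back.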

Applications in CS

  • Compiler design: Code generation targets a specific ISA. Register allocation, instruction selection, scheduling.
  • Operating systems: Privileged instructions, interrupt handling, virtual memory support are ISA features.
  • Emulation/Translation: QEMU emulates one ISA on another. Apple Rosetta 2 translates x86 to ARM.
  • Security: ISA features like NX bit (no-execute), SMEP/SMAP, shadow stacks.
  • Performance analysis: Understanding the ISA helps interpret profiler output and optimize code.
  • Embedded systems: ISA choice (ARM Cortex-M, RISC-V, AVR) determines power, cost, code size.