Microcontroller Architecture

MCU block diagram

ARM Cortex-M Family

The ARM Cortex-M series dominates the 32-bit MCU market. Each profile targets different performance and feature requirements.

Core Comparison

| Core | Pipeline | Architecture | FPU | DSP | TrustZone | Typical Use |
|---|---|---|---|---|---|---|
| Cortex-M0 | 3-stage | ARMv6-M | No | No | No | Ultra-low-power, cost-sensitive |
| Cortex-M0+ | 2-stage | ARMv6-M | No | No | No | Lowest power, fastest wake-up |
| Cortex-M3 | 3-stage | ARMv7-M | No | No (hardware divide and saturation only) | No | General-purpose embedded |
| Cortex-M4 | 3-stage | ARMv7E-M | Optional (SP) | Yes (SIMD) | No | DSP, motor control, audio |
| Cortex-M7 | 6-stage | ARMv7E-M | Optional (SP or SP+DP) | Yes (SIMD) | No | High-performance embedded |
| Cortex-M33 | 3-stage | ARMv8-M | Optional (SP) | Optional | Optional | Secure IoT, mixed criticality |

SP = single precision, DP = double precision.

Key Cortex-M Features

  • Thumb-2 instruction set: mix of 16-bit and 32-bit instructions for code density
  • NVIC: Nested Vectored Interrupt Controller with configurable priorities
  • SysTick: built-in 24-bit countdown timer for OS tick generation
  • Bit-banding (optional on M3/M4, not available on M7): atomic single-bit access to the first 1 MB of the SRAM and peripheral regions
  • MPU: Memory Protection Unit for access control (optional on M0+, M3, M4, M7, and M33; not available on M0)
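
Bit-banding works by mapping each bit of the bit-band region to its own 32-bit word in an alias region. The sketch below computes the alias address for the SRAM region on the host; the function and constant names are illustrative only, and the base addresses are those of the ARMv7-M memory map (0x2000_0000 for the bit-band region, 0x2200_0000 for its alias).

```rust
/// Base of the SRAM bit-band region and of its alias region (ARMv7-M memory map).
const SRAM_BITBAND_BASE: u32 = 0x2000_0000;
const SRAM_ALIAS_BASE: u32 = 0x2200_0000;
/// The bit-band region covers the first 1 MB above the base.
const BITBAND_REGION_SIZE: u32 = 0x0010_0000;

/// Returns the alias word address for `bit` of the byte at `addr`,
/// or `None` if the address or bit index is outside the SRAM bit-band region.
/// (Hypothetical helper name, not part of any crate.)
pub fn sram_bitband_alias(addr: u32, bit: u8) -> Option<u32> {
    if bit > 7 {
        return None;
    }
    let offset = addr.checked_sub(SRAM_BITBAND_BASE)?;
    if offset >= BITBAND_REGION_SIZE {
        return None;
    }
    // Each byte expands to 32 alias bytes; each bit occupies one 4-byte word.
    Some(SRAM_ALIAS_BASE + offset * 32 + u32::from(bit) * 4)
}
```

Writing 1 or 0 to the returned word address sets or clears only that bit, which is what makes the access atomic from software's point of view.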

Harvard vs Von Neumann Architecture

Harvard Architecture

Separate buses for instructions and data, allowing simultaneous fetch and data access. Used in AVR, PIC, and some DSPs. True Harvard prevents executing data as code.

Modified Harvard (Cortex-M)

ARM Cortex-M uses a modified Harvard architecture:

  • Separate instruction and data buses on M3/M4 (I-Code, D-Code, and System buses); M0/M0+ use a single AHB-Lite bus
  • Unified memory map -- code and data share the same address space
  • Instructions can be fetched from RAM (useful for bootloaders, self-modifying code)
  • Bus matrix allows concurrent access to different memory regions

Memory Map

ARM Cortex-M defines a standardized 4 GB memory map:

0x0000_0000 +-----------------------+
            |  Code Region (512 MB) |  Flash memory, boot ROM
            |  (includes vector     |
            |   table at 0x0000)    |
0x2000_0000 +-----------------------+
            |  SRAM Region (512 MB) |  Stack, heap, variables
            |                       |
0x4000_0000 +-----------------------+
            |  Peripheral (512 MB)  |  GPIO, UART, SPI, I2C, etc.
            |                       |
0x6000_0000 +-----------------------+
            |  External RAM (1 GB)  |  FSMC/FMC (external SRAM, SDRAM)
            |                       |
0xA000_0000 +-----------------------+
            |  External Device      |  External peripherals
            |  (1 GB)               |
0xE000_0000 +-----------------------+
            |  System (512 MB)      |  NVIC, SysTick, SCB, MPU, debug
            |  Private Peripheral   |
0xFFFF_FFFF +-----------------------+
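
Because the region boundaries are fixed, an address can be classified with a simple range match. The following is a minimal host-side sketch; the `Region` enum and `region_of` function are names invented for illustration.

```rust
/// The six top-level regions of the Cortex-M memory map shown above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Region {
    Code,
    Sram,
    Peripheral,
    ExternalRam,
    ExternalDevice,
    System,
}

/// Maps a 32-bit address to its architectural region.
pub fn region_of(addr: u32) -> Region {
    match addr {
        0x0000_0000..=0x1FFF_FFFF => Region::Code,
        0x2000_0000..=0x3FFF_FFFF => Region::Sram,
        0x4000_0000..=0x5FFF_FFFF => Region::Peripheral,
        0x6000_0000..=0x9FFF_FFFF => Region::ExternalRam,
        0xA000_0000..=0xDFFF_FFFF => Region::ExternalDevice,
        0xE000_0000..=0xFFFF_FFFF => Region::System,
    }
}
```

For example, an STM32 flash address such as 0x0800_0000 falls in the Code region, and the SysTick registers at 0xE000_E010 fall in the System region.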

Registers

General-Purpose Registers (R0-R12)

  • R0-R3: function arguments and return values (ARM calling convention)
  • R4-R11: callee-saved registers
  • R12: intra-procedure scratch register

Special Registers

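  • R13 (SP): Stack Pointer -- two banked copies (MSP, used in handler mode and by default in thread mode; PSP, optionally selected for thread mode via the CONTROL register)
  • R14 (LR): Link Register -- holds return address on function call
  • R15 (PC): Program Counter -- reads return the current instruction address plus 4; bit 0 of any address loaded into the PC by a branch must be 1 to stay in Thumb state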

Program Status Register (xPSR)

  • APSR: condition flags (N, Z, C, V, Q)
  • IPSR: current exception/interrupt number
  • EPSR: execution state (Thumb bit)
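
The three views share one 32-bit register, so a raw xPSR value (for example, one read from a stacked exception frame) can be split into its fields with bit masks. This is a minimal sketch with illustrative type and function names; it uses the ARMv7-M bit positions (N, Z, C, V, Q at bits 31 to 27, the Thumb bit at bit 24, the exception number in bits 8:0).

```rust
/// Decoded fields of a raw xPSR value.
#[derive(Debug, PartialEq, Eq)]
pub struct Xpsr {
    pub n: bool,
    pub z: bool,
    pub c: bool,
    pub v: bool,
    pub q: bool,
    /// EPSR Thumb bit; always 1 during normal execution on Cortex-M.
    pub thumb: bool,
    /// IPSR exception number; 0 means thread mode.
    pub exception: u16,
}

/// Splits a raw xPSR word into APSR flags, the EPSR Thumb bit, and the IPSR number.
pub fn decode_xpsr(raw: u32) -> Xpsr {
    let bit = |i: u32| (raw & (1u32 << i)) != 0;
    Xpsr {
        n: bit(31),
        z: bit(30),
        c: bit(29),
        v: bit(28),
        q: bit(27),
        thumb: bit(24),
        exception: (raw & 0x1FF) as u16,
    }
}
```

A value of 0x0100_000F, for instance, decodes to the Thumb bit set with exception number 15, which is SysTick.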

Accessing Special Registers in Rust

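// Snippet for use inside a function on the target; assumes the cortex-m crate (0.7 API)
use cortex_m::peripheral::DWT;
use cortex_m::register::{msp, primask};

// Read the main stack pointer
let sp = msp::read();

// Read PRIMASK (interrupt mask)
let mask = primask::read();

// Read cycle counter (M3/M4/M7 with the DWT cycle counter enabled)
let cycles = DWT::cycle_count();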

Interrupt Vector Table

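The vector table sits at address 0x0000_0000 by default, where vendors typically map or alias the start of flash, and contains pointers to exception and interrupt handlers. On cores with a VTOR register (M3 and later, optional on M0+) it can be relocated.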

Offset  |  Vector
--------|---------------------------
0x0000  |  Initial Stack Pointer (MSP value)
0x0004  |  Reset Handler (entry point)
0x0008  |  NMI Handler
0x000C  |  HardFault Handler
0x0010  |  MemManage Handler (M3+)
0x0014  |  BusFault Handler (M3+)
0x0018  |  UsageFault Handler (M3+)
0x001C  |  Reserved
  ...   |  ...
0x002C  |  SVCall Handler
0x0030  |  Debug Monitor (M3+)
0x0034  |  Reserved
0x0038  |  PendSV Handler
0x003C  |  SysTick Handler
0x0040  |  IRQ0 (first peripheral interrupt)
0x0044  |  IRQ1
  ...   |  (vendor-specific interrupts)

The cortex-m-rt crate in Rust sets up the vector table automatically. Custom interrupt handlers are registered with attributes:

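use cortex_m_rt::{exception, ExceptionFrame};
use stm32f4xx_hal::pac::interrupt; // assumption: an STM32F4 PAC/HAL with its "rt" feature enabled

#[exception]
unsafe fn HardFault(frame: &ExceptionFrame) -> ! {
    // Log `frame` (it implements Debug) with whatever logger the project uses
    let _ = frame;
    loop {} // halt
}

// Peripheral interrupt (STM32 example)
#[interrupt]
fn TIM2() {
    // Timer 2 interrupt handler
    // Clear the interrupt flag, process event
}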

Boot Process

  1. Power-on / Reset: CPU loads MSP from address 0x0000_0000
  2. Reset handler: CPU jumps to address stored at 0x0000_0004
  3. Runtime init (cortex-m-rt):
    • Copies .data section from flash to SRAM (initialized globals)
    • Zeros .bss section in SRAM (uninitialized globals)
    • Initializes FPU if present
    • Calls main()
  4. Application main: configures clocks, peripherals, enters main loop
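
The first two steps of the boot sequence can be reproduced on a host machine by reading the first two words of a firmware image. The sketch below assumes a little-endian raw binary whose first byte corresponds to the start of the vector table; `BootVectors` and `read_boot_vectors` are hypothetical names.

```rust
/// The two words the CPU loads at reset.
#[derive(Debug, PartialEq, Eq)]
pub struct BootVectors {
    /// Value loaded into MSP (vector table offset 0x0000).
    pub initial_sp: u32,
    /// Address of the reset handler with the Thumb bit cleared (offset 0x0004).
    pub reset_handler: u32,
}

/// Parses the initial stack pointer and reset vector from a raw image.
/// Returns `None` if the image is too short or the reset vector lacks the Thumb bit.
pub fn read_boot_vectors(image: &[u8]) -> Option<BootVectors> {
    // Reads one little-endian 32-bit word at byte offset `off`.
    let word = |off: usize| -> Option<u32> {
        let b = image.get(off..off + 4)?;
        Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
    };
    let initial_sp = word(0)?;
    let reset = word(4)?;
    // Vector entries must have bit 0 set, since Cortex-M only executes Thumb code.
    if reset & 1 == 0 {
        return None;
    }
    Some(BootVectors {
        initial_sp,
        reset_handler: reset & !1,
    })
}
```

A check like this is a cheap sanity test in a bootloader or flashing tool before jumping to, or programming, an application image.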

Linker Script

The linker script defines memory layout. cortex-m-rt expects a memory.x file:

/* memory.x for STM32F411 */
MEMORY
{
    FLASH : ORIGIN = 0x08000000, LENGTH = 512K
    RAM   : ORIGIN = 0x20000000, LENGTH = 128K
}

Clock Tree

MCUs derive their operating frequency from a clock tree that multiplies and divides a base clock source.

HSE (External Crystal)  -->+
   (8 MHz typical)         |
                           +--> PLL --> SYSCLK (up to 168 MHz on STM32F4)
HSI (Internal RC)      -->+         |
   (16 MHz, less accurate)         |
                                   +--> AHB prescaler --> HCLK
                                   |       |
                                   |       +--> APB1 prescaler --> PCLK1 (42 MHz max)
                                   |       +--> APB2 prescaler --> PCLK2 (84 MHz max)
                                   |
                                   +--> USB, I2S, RNG clocks
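
The arithmetic behind the tree can be checked on a host. The sketch below assumes the STM32F4-style main PLL, where the input is divided by M, multiplied by N, and divided by P to produce SYSCLK; the divider values in the usage note (M = 8, N = 336, P = 2) are one common choice for an 8 MHz crystal, not values taken from this article, and the type and function names are illustrative.

```rust
/// Main PLL settings (assumed STM32F4-style M/N/P dividers).
pub struct PllConfig {
    pub input_hz: u32,
    pub m: u32,
    pub n: u32,
    pub p: u32,
}

/// SYSCLK = (input / M) * N / P, or `None` on a zero divider or overflow.
pub fn sysclk_hz(cfg: &PllConfig) -> Option<u32> {
    if cfg.m == 0 || cfg.p == 0 {
        return None;
    }
    let vco_in = cfg.input_hz / cfg.m;
    let vco_out = vco_in.checked_mul(cfg.n)?;
    Some(vco_out / cfg.p)
}

/// Divides a parent clock by a bus prescaler (AHB, APB1, or APB2).
pub fn bus_clock_hz(parent_hz: u32, prescaler: u32) -> Option<u32> {
    if prescaler == 0 {
        None
    } else {
        Some(parent_hz / prescaler)
    }
}
```

With an 8 MHz HSE and M = 8, N = 336, P = 2, `sysclk_hz` yields 168 MHz; an APB1 prescaler of 4 and an APB2 prescaler of 2 then give the 42 MHz and 84 MHz limits shown in the diagram.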

Clock Configuration in Rust

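// Assumes the builder-style clock API of the stm32f4xx-hal crate; newer releases may differ
use stm32f4xx_hal::{pac, prelude::*};

let dp = pac::Peripherals::take().unwrap();

// Configure clocks: 8 MHz HSE -> PLL -> 168 MHz SYSCLK
let rcc = dp.RCC.constrain();
let clocks = rcc
    .cfgr
    .use_hse(8.MHz())
    .sysclk(168.MHz())
    .pclk1(42.MHz())
    .pclk2(84.MHz())
    .freeze(); // applies the configuration and returns the frozen clock values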

Power Management Modes

| Mode | CPU | Peripherals | RAM | Wake-up Source | Current |
|---|---|---|---|---|---|
| Run | Active | Active | Retained | N/A | mA range |
| Sleep | Stopped | Active | Retained | Any interrupt | ~50% of run |
| Stop | Stopped | Stopped | Retained | EXTI, RTC | uA range |
| Standby | Stopped | Stopped | Lost | WKUP pin, RTC | ~1-3 uA |

use cortex_m::asm;

// Enter sleep mode (WFI = Wait For Interrupt)
asm::wfi();

// Enter sleep mode (WFE = Wait For Event)
asm::wfe();

For deeper sleep modes, the SLEEPDEEP bit in the core's System Control Register and the vendor-specific power controller registers must be configured before executing WFI.

Watchdog Timers

Watchdog timers reset the MCU if software hangs or enters an unexpected state. The application must periodically "kick" the watchdog to prevent reset.

Independent Watchdog (IWDG)

  • Clocked by a separate low-speed oscillator (LSI)
  • Runs independently of main clock -- catches clock failures
  • Cannot be stopped once started (on most MCUs)

Window Watchdog (WWDG)

  • Must be refreshed within a specific time window (not too early, not too late)
  • Catches both stuck and runaway code
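  • Clocked from APB1

// Assumes the stm32f4xx-hal crate; other HALs expose similar watchdog types
use stm32f4xx_hal::{pac, prelude::*, watchdog::IndependentWatchdog};

let dp = pac::Peripherals::take().unwrap();
let mut iwdg = IndependentWatchdog::new(dp.IWDG);
iwdg.start(1000.millis()); // 1-second timeout

loop {
    // Application logic
    do_work(); // placeholder for the application's own function

    // Must be called before the timeout expires
    iwdg.feed();
}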
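
The window watchdog's refresh rule (not too early, not too late) can be modelled on a host to reason about task timing before touching hardware. This is a simplified sketch expressed in milliseconds rather than in the peripheral's counter values; the enum, function, and parameter names are hypothetical.

```rust
/// Outcome of attempting a window-watchdog refresh.
#[derive(Debug, PartialEq, Eq)]
pub enum Refresh {
    /// Refresh accepted; the countdown restarts.
    InWindow,
    /// Refresh arrived before the window opened; the watchdog resets the MCU.
    TooEarly,
    /// The timeout expired before the refresh; the watchdog resets the MCU.
    TooLate,
}

/// Classifies a refresh attempted `elapsed_ms` after the previous one.
/// The window is open from `window_open_ms` (inclusive) to `timeout_ms` (exclusive).
pub fn check_refresh(elapsed_ms: u32, window_open_ms: u32, timeout_ms: u32) -> Refresh {
    if elapsed_ms < window_open_ms {
        Refresh::TooEarly
    } else if elapsed_ms >= timeout_ms {
        Refresh::TooLate
    } else {
        Refresh::InWindow
    }
}
```

With a window of 20 ms to 50 ms, a loop that refreshes every 5 ms is flagged as runaway code, while one that stalls past 50 ms is flagged as stuck; an independent watchdog would catch only the second case.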

Key Takeaways

  • ARM Cortex-M cores span from ultra-low-power (M0+) to high-performance (M7) with a unified programming model.
  • The standardized memory map places code, SRAM, peripherals, and system registers at fixed address ranges.
  • The vector table at the start of flash drives the entire interrupt and exception system.
  • Clock trees, power modes, and watchdog timers are essential for reliable, power-efficient operation.
  • The cortex-m-rt crate handles the boot sequence, vector table, and linker script integration for Rust.