Chapter 2: What Is a Microcontroller?

You are reading this on a computer that has a CPU, gigabytes of RAM, a hard drive, a GPU, a network card, and an operating system managing it all. A microcontroller is all of that squeezed onto a single chip the size of your fingernail — minus the luxury.

A Complete Computer on a Chip

A microcontroller (MCU) is a self-contained computer on a single integrated circuit. It has:

  • A CPU to execute instructions
  • Flash memory to store your program (survives power loss)
  • RAM to store variables while running
  • Input/Output peripherals to interact with the physical world

All of this costs between 50 and 500 INR, draws milliwatts of power, and fits in a package smaller than a postage stamp.

Fun Fact: There are more microcontrollers in your house than people on your street. Your washing machine, microwave, remote control, thermostat, elevator, car key fob, and electric toothbrush all contain at least one. A modern car has 50-100 of them.

How a Microcontroller Program Works

Every microcontroller program follows the same structure:

Power on
  → Initialize hardware (configure clocks, pins, peripherals)
  → Loop forever:
      1. Read inputs (buttons, sensors, communication)
      2. Process (make decisions, calculate)
      3. Write outputs (LEDs, motors, communication)

There is no operating system. No bootloader menu. No login screen. The moment power is applied, the CPU loads its initial stack pointer and the address of the reset handler from the vector table at the start of Flash, then begins executing your program. It runs until power is removed or the universe ends, whichever comes first.

Your program lives in Flash (non-volatile — it persists without power). Your variables live in RAM (volatile — they vanish the instant power is cut). This is why your microcontroller remembers its program after unplugging but starts with fresh variable values every boot.

In Embassy, this looks like:

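#![no_std]
#![no_main]

use embassy_executor::Spawner;
use embassy_stm32::gpio::{Input, Level, Output, Pull, Speed};
use embassy_time::Timer;
// Assumed crates, as in the Embassy examples: log transport and panic handler
use {defmt_rtt as _, panic_probe as _};

#[embassy_executor::main]
async fn main(_spawner: Spawner) {
    // 1. Initialize hardware
    let p = embassy_stm32::init(Default::default());
    let mut led = Output::new(p.PC13, Level::High, Speed::Low);
    let button = Input::new(p.PA0, Pull::Up);

    // 2. Loop forever
    loop {
        // Read input: with the pull-up enabled, a pressed button reads low
        if button.is_low() {
            // Process + Write output
            led.set_low(); // LED on (active low)
        } else {
            led.set_high(); // LED off
        }
        // Yield for 10 ms before polling the button again
        Timer::after_millis(10).await;
    }
}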

What Is Inside an STM32

Let us open up the block diagram and look at the major components.

CPU Core: ARM Cortex-M

Every STM32 microcontroller uses an ARM Cortex-M processor core. ARM does not make chips; it designs the processor core and licenses it to manufacturers like STMicroelectronics. This is why STM32, nRF52, RP2040, and dozens of other chips all run the same family of Thumb instruction sets.

Clock speeds range from 48 MHz on entry-level parts to 480 MHz on the H7 series. For context, the original IBM PC ran at 4.77 MHz. Even a "slow" 48 MHz microcontroller is clocked about ten times faster.

Flash Memory (64 KB - 2 MB)

This is where your compiled program lives. Flash is non-volatile (survives power cycles) but has limited write endurance — typically 10,000 write/erase cycles. You write to it when flashing firmware, not during normal operation.
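To see why that endurance limit rules out using Flash as scratch storage, a little arithmetic helps. The sketch below is only a back-of-the-envelope estimate: the helper name and the write rates are made up for illustration, and the 10,000-cycle figure is the typical value quoted above, not a number for any specific part.

```rust
/// Typical endurance figure quoted in the text; check the datasheet for a real part.
const ENDURANCE_CYCLES: u32 = 10_000;

/// Hypothetical helper: whole days until one Flash sector reaches its rated
/// erase count if it is erased and rewritten `writes_per_day` times a day.
fn days_until_worn_out(writes_per_day: u32) -> u32 {
    // No writes means no wear, so report the largest value we can.
    if writes_per_day == 0 {
        return u32::MAX;
    }
    ENDURANCE_CYCLES / writes_per_day
}
```

Rewriting the same sector once a minute (1,440 times a day) would reach the rated limit in under a week, while flashing new firmware once a day would take about 27 years to get there.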

RAM (8 KB - 1 MB)

This is your working memory for variables, buffers, and the stack. It is fast and has unlimited write cycles, but it loses everything when power is removed. Sounds small? A well-written embedded program can do remarkable things in 20 KB of RAM.

Peripherals

This is where it gets interesting. Peripherals are hardware blocks built into the chip that handle specific tasks:

| Peripheral | What It Does |
| --- | --- |
| GPIO | General Purpose I/O: read buttons, drive LEDs, toggle any pin |
| UART | Serial communication: talk to GPS modules, Bluetooth, PC terminal |
| SPI | Fast synchronous bus: displays, SD cards, flash memory |
| I2C | Two-wire bus: sensors, EEPROMs, many breakout boards |
| Timers | Count time, generate PWM, capture pulse widths |
| ADC | Analog-to-Digital: read voltages from sensors |
| DMA | Direct Memory Access: move data without CPU involvement |
| USB | USB device/host: your chip can be a USB keyboard or mass storage |
| CAN | Controller Area Network: automotive/industrial communication |

Each peripheral operates independently of the CPU. While your code is processing sensor data, the UART can be receiving bytes, the DMA can be moving ADC samples into RAM, and a timer can be generating PWM — all simultaneously, all in hardware.

The Bus System

Peripherals connect to the CPU through buses — digital highways for data:

  • AHB (Advanced High-performance Bus) — fast bus for GPIO, DMA, memory
  • APB1 (Advanced Peripheral Bus 1) — slower bus for UART, I2C, basic timers
  • APB2 — slightly faster peripheral bus for SPI1, ADC, advanced timers

This matters because each bus has a maximum clock speed. On the STM32F411, APB1 maxes out at 50 MHz while APB2 can run at 100 MHz. Peripherals on APB2 can operate faster.

The Clock System

The clock system is the heartbeat of the microcontroller. It determines how fast everything runs. A typical STM32 clock tree:

8 MHz HSE crystal
    → PLL (Phase-Locked Loop) multiplies to 96 MHz
        → SYSCLK = 96 MHz (CPU runs at this speed)
        → AHB = 96 MHz (GPIO, DMA)
        → APB1 = 48 MHz (UART, I2C, basic timers)
        → APB2 = 96 MHz (SPI, ADC, advanced timers)

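Embassy programs the clock tree from the configuration you pass to embassy_stm32::init (the default configuration may leave the chip running from its slower internal oscillator), but understanding the tree helps when debugging timing issues.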
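The arithmetic behind the clock tree can be sketched on a PC. This is a simplified model, not Embassy's API: the struct, the function names, and the divider values in the usage note are assumptions chosen to reproduce the 96 MHz example, and the PLL formula is the F4-style one (input divided by M, multiplied by N, divided by P). The bus limits are the STM32F411 figures quoted earlier.

```rust
/// Clock frequencies in hertz for the simplified tree shown above.
#[derive(Debug, PartialEq)]
struct Clocks {
    sysclk: u32,
    ahb: u32,
    apb1: u32,
    apb2: u32,
}

/// Hypothetical helper: derive the bus clocks from the crystal frequency,
/// the PLL settings (m, n, p), and the three bus prescalers.
/// Assumes an F4-style PLL: SYSCLK = HSE / m * n / p.
fn clock_tree(
    hse: u32,
    m: u32,
    n: u32,
    p: u32,
    ahb_div: u32,
    apb1_div: u32,
    apb2_div: u32,
) -> Clocks {
    let sysclk = hse / m * n / p;
    let ahb = sysclk / ahb_div;
    Clocks {
        sysclk,
        ahb,
        apb1: ahb / apb1_div,
        apb2: ahb / apb2_div,
    }
}

/// Check the result against the STM32F411 bus limits quoted in the text:
/// 50 MHz for APB1 and 100 MHz for APB2.
fn within_f411_limits(c: &Clocks) -> bool {
    c.apb1 <= 50_000_000 && c.apb2 <= 100_000_000
}
```

Calling clock_tree(8_000_000, 4, 96, 2, 1, 2, 1) reproduces the tree above: 96 MHz for SYSCLK, AHB, and APB2, and 48 MHz for APB1. Changing the APB1 divider from 2 to 1 would put APB1 at 96 MHz, which within_f411_limits rejects.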

The Cortex-M Family

Not all ARM cores are equal. Here is what you will encounter across the STM32 lineup:

| Core | Features | STM32 Families | Typical Clock |
| --- | --- | --- | --- |
| Cortex-M0 / M0+ | Basic, low power, no FPU | F0, G0, L0 | 32-64 MHz |
| Cortex-M3 | Bit-banding, no FPU | F1 | 72 MHz |
| Cortex-M4F | DSP instructions, single-precision FPU | F3, F4, L4, G4, WB | 64-180 MHz |
| Cortex-M7 | Double-precision FPU, I/D cache, branch prediction | F7, H7 | 216-480 MHz |
| Cortex-M33 | TrustZone security, FPU, DSP | L5, U5, H5 | 110-250 MHz |

Why the FPU Matters

FPU stands for Floating Point Unit: dedicated hardware for arithmetic on fractional (floating-point) numbers. Without an FPU, calculating 3.14 * 2.0 requires dozens of integer instructions emulating floating-point arithmetic. With an FPU, it is a single instruction.

The difference is dramatic: often 10x to 50x faster for floating-point operations. If your project involves sensor math, PID control loops, audio processing, or anything else with fractional numbers, choose a chip with an FPU (Cortex-M4F or higher).
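To get a feel for what "emulating floating-point arithmetic" involves, here is a deliberately simplified software multiply built only from integer operations. It is a teaching sketch, not the routine a real compiler runtime uses: the function name is made up, it truncates instead of rounding, and it assumes both inputs and the result are ordinary normal numbers (no zero, subnormal, infinity, NaN, or overflow handling).

```rust
/// Simplified software f32 multiply using only integer operations.
/// Assumption: inputs and result are normal numbers; the result is truncated.
fn soft_mul(a: f32, b: f32) -> f32 {
    let (a, b) = (a.to_bits(), b.to_bits());

    // The sign of the product is the XOR of the two sign bits.
    let sign = (a ^ b) & 0x8000_0000;

    // Extract the biased 8-bit exponents.
    let exp_a = ((a >> 23) & 0xFF) as i32;
    let exp_b = ((b >> 23) & 0xFF) as i32;

    // Extract the 23-bit mantissas and restore the implicit leading 1.
    let man_a = ((a & 0x007F_FFFF) | 0x0080_0000) as u64;
    let man_b = ((b & 0x007F_FFFF) | 0x0080_0000) as u64;

    // Multiply the mantissas (a value in [2^46, 2^48)) and add the
    // exponents, removing one copy of the bias (127).
    let mut product = man_a * man_b;
    let mut exp = exp_a + exp_b - 127;

    // Normalize so the leading 1 sits at bit 46.
    if (product & (1u64 << 47)) != 0 {
        product >>= 1;
        exp += 1;
    }

    // Keep the top 23 fraction bits and reassemble the float.
    let man = ((product >> 23) as u32) & 0x007F_FFFF;
    f32::from_bits(sign | ((exp as u32) << 23) | man)
}
```

Even this stripped-down version needs shifts, masks, a wide multiply, and a branch. A complete implementation adds rounding and all the special cases, which is where the "dozens of instructions" come from.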

Think About It: A Cortex-M0 running at 48 MHz without an FPU will be many times slower at floating-point math than a Cortex-M4F running at the same 48 MHz. Raw clock speed is not the whole story.

Memory-Mapped I/O: THE Key Concept

This is the single most important concept in embedded programming. Read this section twice.

On your desktop computer, hardware is accessed through drivers and operating system calls. On a microcontroller, hardware is accessed by reading and writing specific memory addresses.

Every peripheral, every register, every control bit lives at a fixed address in the chip's memory map:

| Address Range | What Lives There |
| --- | --- |
| 0x0800_0000 - 0x0807_FFFF | Flash memory (your program); also visible at 0x0000_0000 when booting from Flash |
| 0x2000_0000 - 0x2001_FFFF | RAM (your variables) |
| 0x4002_0000 - 0x4002_03FF | GPIOA registers |
| 0x4000_4400 - 0x4000_47FF | USART2 registers |
| 0x4001_3000 - 0x4001_33FF | SPI1 registers |

When you write a value to address 0x4002_0014, you are not writing to memory. You are changing the physical voltage on the GPIOA output pins. That is memory-mapped I/O: the hardware pretends to be memory, and writing to those "memory locations" controls real-world electrical signals.
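One way to internalize the memory map is to write the lookup yourself. The sketch below runs on a PC and simply classifies an address. The function and constant names are invented for illustration, and the ranges follow the table above for an STM32F4-class part. The Flash arm accepts both 0x0800_0000, where STM32 parts map main Flash, and its boot-time alias at address 0.

```rust
/// Base address of the GPIOA register block (from the table above).
const GPIOA_BASE: u32 = 0x4002_0000;
/// Offset of the Output Data Register inside a GPIO block on this family.
const ODR_OFFSET: u32 = 0x14;

/// Hypothetical helper: name the region that an address falls in.
fn region(addr: u32) -> &'static str {
    match addr {
        // Main Flash and its alias at address 0 when booting from Flash
        0x0000_0000..=0x0007_FFFF | 0x0800_0000..=0x0807_FFFF => "Flash",
        0x2000_0000..=0x2001_FFFF => "RAM",
        0x4000_4400..=0x4000_47FF => "USART2",
        0x4001_3000..=0x4001_33FF => "SPI1",
        0x4002_0000..=0x4002_03FF => "GPIOA",
        _ => "not listed",
    }
}
```

With these definitions, region(GPIOA_BASE + ODR_OFFSET) resolves 0x4002_0014 to "GPIOA", the same address used in the discussion above. On the real chip this decoding is done by the bus hardware, not by software.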

Here is what driving pin PA5 high looks like with raw register access in Rust, versus Embassy:

// Raw register access (what happens underneath).
// Assumes the GPIOA clock is enabled and PA5 is already configured as an output.
// Set bit 5 of GPIOA ODR (base 0x4002_0000 + offset 0x14) to drive PA5 high.
unsafe {
    let gpioa_odr = 0x4002_0014 as *mut u32;
    let current = core::ptr::read_volatile(gpioa_odr);
    core::ptr::write_volatile(gpioa_odr, current | (1 << 5));
}

// Embassy (what you actually write); `p` comes from embassy_stm32::init
let mut pin = Output::new(p.PA5, Level::Low, Speed::Low);
pin.set_high();

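Both end with PA5 driven high. The raw version assumes the GPIOA clock is already enabled and the pin is already configured as an output; Embassy takes care of that setup and wraps the horror in a safe, readable API.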

Registers: The 32-Bit Control Panel

A register is a 32-bit value at a fixed address that controls or reports the state of a peripheral. Think of it as a row of 32 tiny switches.

For example, the GPIO Output Data Register (ODR) for GPIOA:

Bit:  31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
     [                    Reserved (read as 0)                     ]

Bit:  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
     P15 P14 P13 P12 P11 P10  P9  P8  P7  P6  P5  P4  P3  P2  P1  P0

Each bit directly controls one pin. Bit 5 = PA5. Set it to 1 and the pin goes to 3.3V. Set it to 0 and the pin goes to 0V.
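Working with individual bits comes down to four small operations. The helpers below are a plain-Rust sketch that operates on an ordinary u32 standing in for the register value, so it runs on a PC. The function names are illustrative, and `bit` is assumed to be in the range 0 to 31.

```rust
/// Return `reg` with the given bit set to 1 (pin driven high).
fn set_bit(reg: u32, bit: u32) -> u32 {
    reg | (1 << bit)
}

/// Return `reg` with the given bit cleared to 0 (pin driven low).
fn clear_bit(reg: u32, bit: u32) -> u32 {
    reg & !(1 << bit)
}

/// Return `reg` with the given bit flipped.
fn toggle_bit(reg: u32, bit: u32) -> u32 {
    reg ^ (1 << bit)
}

/// Report whether the given bit is currently 1.
fn is_set(reg: u32, bit: u32) -> bool {
    (reg & (1 << bit)) != 0
}
```

Each helper touches only the requested bit, so set_bit(0b0000_0001, 5) yields 0b0010_0001: PA5 goes high while PA0 keeps its previous state. On hardware, the input value would come from a volatile read of the register and the result would be written back with a volatile write.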

Registers come in three flavors:

  • Control registers — you write to configure behavior (e.g., set a pin as input or output)
  • Status registers — you read to check what happened (e.g., has a UART byte arrived?)
  • Data registers — you read/write to exchange data (e.g., the byte to transmit via UART)
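The interplay between a status register and a data register can be sketched with a UART receive check. This is a PC-side model that takes snapshots of the two register values as plain integers. The function name is hypothetical, and the flag position (bit 5, "receive data register not empty") assumes an STM32F4-style USART layout, so check the reference manual for your part.

```rust
/// Bit 5 of the USART status register: a received byte is waiting.
/// Assumption: STM32F4-style USART register layout.
const RXNE: u32 = 1 << 5;

/// Hypothetical helper: given snapshots of the status and data registers,
/// return the received byte if the status register says one is waiting.
fn received_byte(status: u32, data: u32) -> Option<u8> {
    if (status & RXNE) != 0 {
        // Only the low 8 bits of the data register carry the byte.
        Some((data & 0xFF) as u8)
    } else {
        None
    }
}
```

The pattern is always the same: read the status register first, and only touch the data register when the relevant flag is set. A driver such as Embassy's does this for you, typically from an interrupt handler or with DMA.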

The Toolchain

Code compiled for your PC will not run on an STM32: the two have completely different processors. You need a cross-compiler: a tool that runs on your PC but produces code for the ARM Cortex-M.

The Rust embedded toolchain consists of:

| Component | Purpose |
| --- | --- |
| rustup target add thumbv7em-none-eabihf | Install the cross-compilation target |
| probe-rs | Flash firmware and debug via ST-Link/DAPLink |
| defmt + defmt-rtt | Lightweight logging over the debug probe |
| cargo embed or cargo run (with probe-rs) | Build, flash, and run in one command |

The target triple thumbv7em-none-eabihf breaks down as:

  • thumb — ARM Thumb instruction set
  • v7em — ARMv7E-M architecture (Cortex-M4/M7)
  • none — no operating system
  • eabi — Embedded ABI (calling convention)
  • hf — hardware floating point
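The triple above fits Cortex-M4F and Cortex-M7 parts; other cores use a different triple. The lookup below is a small sketch of that mapping. The enum and function are invented for illustration, while the target names are real rustup targets. It assumes the M4F, M7, and M33 parts have an FPU and use the hard-float ABI.

```rust
/// The Cortex-M cores that appear across the STM32 lineup.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Core {
    M0,
    M0Plus,
    M3,
    M4F,
    M7,
    M33,
}

/// Hypothetical helper: pick the Rust compilation target for a core.
fn rust_target(core: Core) -> &'static str {
    match core {
        // ARMv6-M: no FPU, so no "hf" suffix
        Core::M0 | Core::M0Plus => "thumbv6m-none-eabi",
        // ARMv7-M: no FPU
        Core::M3 => "thumbv7m-none-eabi",
        // ARMv7E-M with hardware floating point
        Core::M4F | Core::M7 => "thumbv7em-none-eabihf",
        // ARMv8-M Mainline with hardware floating point
        Core::M33 => "thumbv8m.main-none-eabihf",
    }
}
```

Whatever rust_target returns is the name you would pass to rustup target add before building for that chip.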

What Makes Embedded Different

If you are coming from desktop or web development, embedded is a different world:

| Desktop | Embedded |
| --- | --- |
| Operating system manages everything | No OS: your code is the only thing running |
| Print to terminal with println! | No terminal: use defmt over debug probe or UART |
| Program exits when done | Runs forever: main must never return |
| Timing is "fast enough" | Real-time: a 1 ms deadline means 1 ms, not "roughly 1 ms" |
| Bugs crash a program | Bugs affect physical hardware: wrong output can damage circuits |
| Gigabytes of RAM | Kilobytes of RAM: every byte matters |

Fun Fact: When NASA's Voyager 1 was reprogrammed in 2023, engineers uploaded new code to a computer with about 69 KB of memory, less than the RAM on many mid-range STM32s. It has been running since 1977 and is now more than 15 billion miles from Earth. That is embedded programming at its finest.

Summary

A microcontroller is a self-contained computer on a chip. It has a CPU, Flash, RAM, and peripherals — all controlled by reading and writing to specific memory addresses. The STM32 family uses ARM Cortex-M cores ranging from the basic M0 to the powerful M7, all programmed using the same Rust toolchain.

The key insight is memory-mapped I/O: hardware registers live at fixed addresses, and writing to those addresses controls physical pins and peripherals. Embassy abstracts this into safe Rust APIs, but the registers are always there underneath.

Next, we will explore the STM32 family in detail — which chip to choose, what the part numbers mean, and how to navigate the (enormous) datasheet.