Chapter 2: What Is a Microcontroller?

You are reading this on a computer that has a CPU, gigabytes of RAM, a hard drive, a GPU, a network card, and an operating system managing it all. A microcontroller is all of that squeezed onto a single chip the size of your fingernail — minus the luxury.

A Complete Computer on a Chip

A microcontroller (MCU) is a self-contained computer on a single integrated circuit. It has:

A CPU to execute instructions
Flash memory to store your program (survives power loss)
RAM to store variables while running
Input/Output peripherals to interact with the physical world

All of this costs between 50 and 500 INR, draws milliwatts of power, and fits in a package smaller than a postage stamp.

Fun Fact: There are more microcontrollers in your house than people on your street. Your washing machine, microwave, remote control, thermostat, elevator, car key fob, and electric toothbrush all contain at least one. A modern car has 50-100 of them.

How a Microcontroller Program Works

Every microcontroller program follows the same structure:

Power on
  → Initialize hardware (configure clocks, pins, peripherals)
  → Loop forever:
      1. Read inputs (buttons, sensors, communication)
      2. Process (make decisions, calculate)
      3. Write outputs (LEDs, motors, communication)

There is no operating system. No bootloader menu. No login screen. The moment power is applied, the core reads the initial stack pointer and the address of the reset handler from the start of Flash memory, then jumps straight into your program. It runs until power is removed or the universe ends, whichever comes first.

Your program lives in Flash (non-volatile — it persists without power). Your variables live in RAM (volatile — they vanish the instant power is cut). This is why your microcontroller remembers its program after unplugging but starts with fresh variable values every boot.

In Embassy, this looks like:

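#![no_std]
#![no_main]

use embassy_executor::Spawner;
use embassy_stm32::gpio::{Input, Level, Output, Pull, Speed};
use embassy_time::Timer;
// Logging transport and panic handler, as used in the Embassy examples
use {defmt_rtt as _, panic_probe as _};

#[embassy_executor::main]
async fn main(_spawner: Spawner) {
    // 1. Initialize hardware
    let p = embassy_stm32::init(Default::default());
    let mut led = Output::new(p.PC13, Level::High, Speed::Low);
    let button = Input::new(p.PA0, Pull::Up);

    // 2. Loop forever
    loop {
        // Read input (the pull-up means a pressed button reads low)
        if button.is_low() {
            // Process + Write output
            led.set_low(); // LED on (active low)
        } else {
            led.set_high(); // LED off
        }
        // Yield for 10 ms before polling the button again
        Timer::after_millis(10).await;
    }
}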

What Is Inside an STM32

Let us open up the block diagram and look at the major components.

CPU Core: ARM Cortex-M

Every STM32 microcontroller uses an ARM Cortex-M processor core. ARM does not make chips. It designs the processor core and licenses it to manufacturers like STMicroelectronics. This is why STM32, nRF52, RP2040, and dozens of other chips all share the same Thumb instruction set family, with the smaller cores implementing a subset of it.

Clock speeds range from 48 MHz on entry-level parts to 480 MHz on the H7 series. For context, the original IBM PC ran at 4.77 MHz. Even a "slow" microcontroller is an order of magnitude faster.

Flash Memory (64 KB - 2 MB)

This is where your compiled program lives. Flash is non-volatile (survives power cycles) but has limited write endurance — typically 10,000 write/erase cycles. You write to it when flashing firmware, not during normal operation.
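To get a feel for why the endurance limit matters, here is a rough sketch that estimates how long one Flash sector would last if firmware rewrote it at a fixed interval. The 10,000-cycle figure is the typical value quoted above; the function name and the intervals are made up for illustration, and the code runs on a PC, not on the chip.

```rust
/// Typical endurance quoted above; check your chip's datasheet for the real figure.
const ENDURANCE_CYCLES: u64 = 10_000;

/// Days until one Flash sector reaches its rated endurance if it is
/// erased and rewritten once every `interval_secs` seconds.
fn days_until_worn_out(interval_secs: u64) -> u64 {
    const SECS_PER_DAY: u64 = 24 * 60 * 60;
    ENDURANCE_CYCLES * interval_secs / SECS_PER_DAY
}

fn main() {
    // Rewriting once a minute: 10_000 * 60 / 86_400, rounded down to 6 days.
    println!("every minute: {} days", days_until_worn_out(60));
    // Rewriting once a day: 10_000 days, roughly 27 years.
    println!("every day:    {} days", days_until_worn_out(86_400));
}
```

A sector rewritten every minute would reach its rated limit in under a week, which is why frequently changing data belongs in RAM.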

RAM (8 KB - 1 MB)

This is your working memory for variables, buffers, and the stack. It is fast and has unlimited write cycles, but it loses everything when power is removed. Sounds small? A well-written embedded program can do remarkable things in 20 KB of RAM.
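As a quick illustration of what "every byte matters" looks like in practice, the sketch below adds up the size of a few buffers and compares the total with a 20 KB budget. The buffer types and sizes are hypothetical examples, not taken from any particular project, and the code runs on a PC.

```rust
use core::mem::size_of;

/// RAM budget for this example: 20 KB, the figure mentioned above.
const RAM_BUDGET_BYTES: usize = 20 * 1024;

// Illustrative buffers a small firmware might keep in RAM.
type AdcBuffer = [u16; 1024]; // 1024 samples * 2 bytes = 2048 bytes
type UartRxBuffer = [u8; 256]; // 256 bytes
type FrameBuffer = [u8; 128 * 64 / 8]; // 128x64 monochrome display = 1024 bytes

/// Total bytes taken by the three buffers above.
fn static_ram_usage() -> usize {
    size_of::<AdcBuffer>() + size_of::<UartRxBuffer>() + size_of::<FrameBuffer>()
}

fn main() {
    let used = static_ram_usage();
    println!("buffers use {} of {} bytes", used, RAM_BUDGET_BYTES);
    println!("left for stack and other variables: {}", RAM_BUDGET_BYTES - used);
}
```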

Peripherals

This is where it gets interesting. Peripherals are hardware blocks built into the chip that handle specific tasks:

Peripheral	What It Does
GPIO	General Purpose I/O — read buttons, drive LEDs, toggle any pin
UART	Serial communication — talk to GPS modules, Bluetooth, PC terminal
SPI	Fast synchronous bus — displays, SD cards, flash memory
I2C	Two-wire bus — sensors, EEPROMs, many breakout boards
Timers	Count time, generate PWM, capture pulse widths
ADC	Analog-to-Digital — read voltages from sensors
DMA	Direct Memory Access — move data without CPU involvement
USB	USB device/host — your chip can be a USB keyboard or mass storage
CAN	Controller Area Network — automotive/industrial communication

Each peripheral operates independently of the CPU. While your code is processing sensor data, the UART can be receiving bytes, the DMA can be moving ADC samples into RAM, and a timer can be generating PWM — all simultaneously, all in hardware.

The Bus System

Peripherals connect to the CPU through buses — digital highways for data:

AHB (Advanced High-performance Bus) — fast bus for GPIO, DMA, memory
APB1 (Advanced Peripheral Bus 1) — slower bus for UART, I2C, basic timers
APB2 — slightly faster peripheral bus for SPI1, ADC, advanced timers

This matters because each bus has a maximum clock speed. On the STM32F411, APB1 maxes out at 50 MHz while APB2 can run at 100 MHz. Peripherals on APB2 can operate faster.

The Clock System

The clock system is the heartbeat of the microcontroller. It determines how fast everything runs. A typical STM32 clock tree:

8 MHz HSE crystal
    → PLL (Phase-Locked Loop) multiplies to 96 MHz
        → SYSCLK = 96 MHz (CPU runs at this speed)
        → AHB = 96 MHz (GPIO, DMA)
        → APB1 = 48 MHz (UART, I2C, basic timers)
        → APB2 = 96 MHz (SPI, ADC, advanced timers)

Embassy sets up the clock tree for you from the configuration you pass to `embassy_stm32::init`, but understanding it helps when debugging timing issues.
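The arithmetic behind the clock tree above can be sketched in a few lines. This is a PC-side calculation, not Embassy code: the struct and function names are hypothetical, and the PLL divider values (M = 4, N = 96, P = 2) are assumptions chosen so that an 8 MHz crystal yields 96 MHz. The divider names themselves follow the STM32F4 reference manual.

```rust
/// PLL and prescaler settings (illustrative, see the note above).
struct ClockConfig {
    hse_hz: u32,   // external crystal frequency
    pll_m: u32,    // divides HSE down to the PLL input
    pll_n: u32,    // multiplies the PLL input up to the VCO frequency
    pll_p: u32,    // divides the VCO down to SYSCLK
    ahb_div: u32,  // SYSCLK -> AHB prescaler
    apb1_div: u32, // AHB -> APB1 prescaler
    apb2_div: u32, // AHB -> APB2 prescaler
}

#[derive(Debug, PartialEq)]
struct Clocks {
    sysclk_hz: u32,
    ahb_hz: u32,
    apb1_hz: u32,
    apb2_hz: u32,
}

impl ClockConfig {
    /// Walk the tree from the crystal down to each bus.
    fn clocks(&self) -> Clocks {
        let sysclk_hz = self.hse_hz / self.pll_m * self.pll_n / self.pll_p;
        let ahb_hz = sysclk_hz / self.ahb_div;
        Clocks {
            sysclk_hz,
            ahb_hz,
            apb1_hz: ahb_hz / self.apb1_div,
            apb2_hz: ahb_hz / self.apb2_div,
        }
    }
}

/// Settings that turn an 8 MHz crystal into the 96 MHz tree shown above.
fn example_config() -> ClockConfig {
    ClockConfig {
        hse_hz: 8_000_000,
        pll_m: 4,
        pll_n: 96,
        pll_p: 2,
        ahb_div: 1,
        apb1_div: 2,
        apb2_div: 1,
    }
}

fn main() {
    let clocks = example_config().clocks();
    println!("{:?}", clocks);
    // STM32F411 bus limits quoted earlier: APB1 <= 50 MHz, APB2 <= 100 MHz.
    assert!(clocks.apb1_hz <= 50_000_000);
    assert!(clocks.apb2_hz <= 100_000_000);
}
```

Dividing APB1 by 2 is what keeps it under its 50 MHz limit while the CPU runs at 96 MHz.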

The Cortex-M Family

Not all ARM cores are equal. Here is what you will encounter across the STM32 lineup:

Core	Features	STM32 Families	Typical Clock
Cortex-M0 / M0+	Basic, low power, no FPU	F0, G0, L0	48-64 MHz
Cortex-M3	Bit-banding, no FPU	F1	72 MHz
Cortex-M4F	DSP instructions, single-precision FPU	F3, F4, L4, G4, WB	64-180 MHz
Cortex-M7	Double-precision FPU, I/D cache, branch prediction	F7, H7	216-480 MHz
Cortex-M33	TrustZone security, FPU, DSP	L5, U5, H5	110-250 MHz

Why the FPU Matters

FPU stands for Floating Point Unit: dedicated hardware for arithmetic on fractional (floating-point) numbers. Without an FPU, calculating 3.14 * 2.0 requires dozens of integer instructions emulating floating-point arithmetic. With an FPU, it is a single instruction.

The difference is dramatic: often 10x to 50x faster for floating-point operations. If your project involves sensor math, PID control loops, audio processing, or anything else with fractional numbers, choose a chip with an FPU (Cortex-M4F or higher).

Think About It: At the same 48 MHz clock, a Cortex-M0 without an FPU will be far slower at floating-point math than a Cortex-M4F. Raw clock speed is not the whole story.

Memory-Mapped I/O: THE Key Concept

This is the single most important concept in embedded programming. Read this section twice.

On your desktop computer, hardware is accessed through drivers and operating system calls. On a microcontroller, hardware is accessed by reading and writing specific memory addresses.

Every peripheral, every register, every control bit lives at a fixed address in the chip's memory map:

Address Range	What Lives There
`0x0800_0000` - `0x0807_FFFF`	Flash memory (your program; also visible at `0x0000_0000` when booting from Flash)
`0x2000_0000` - `0x2001_FFFF`	RAM (your variables)
`0x4002_0000` - `0x4002_03FF`	GPIOA registers
`0x4000_4400` - `0x4000_47FF`	USART2 registers
`0x4001_3000` - `0x4001_33FF`	SPI1 registers

When you write a value to address 0x4002_0014, you are not writing to memory. You are changing the physical voltage on the GPIOA output pins. That is memory-mapped I/O: the hardware pretends to be memory, and writing to those "memory locations" controls real-world electrical signals.
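The address 0x4002_0014 is not arbitrary: it is the GPIOA base address plus the offset of the output data register within the GPIO block. The sketch below does that arithmetic and looks an address up against the RAM and peripheral rows of the table above. The `Region` enum and `region_of` function are hypothetical helpers for illustration; the 0x14 offset is the ODR offset on STM32F4 parts. The code only computes addresses and runs on a PC.

```rust
const GPIOA_BASE: u32 = 0x4002_0000;
const ODR_OFFSET: u32 = 0x14; // Output Data Register offset within a GPIO block (STM32F4)

#[derive(Debug, PartialEq)]
enum Region {
    Ram,
    Gpioa,
    Usart2,
    Spi1,
    Unknown,
}

/// Classify an address using the ranges from the memory map table.
fn region_of(addr: u32) -> Region {
    match addr {
        0x2000_0000..=0x2001_FFFF => Region::Ram,
        0x4002_0000..=0x4002_03FF => Region::Gpioa,
        0x4000_4400..=0x4000_47FF => Region::Usart2,
        0x4001_3000..=0x4001_33FF => Region::Spi1,
        _ => Region::Unknown,
    }
}

fn main() {
    let gpioa_odr = GPIOA_BASE + ODR_OFFSET;
    println!("GPIOA ODR lives at {:#010x}", gpioa_odr);
    println!("that address belongs to {:?}", region_of(gpioa_odr));
    println!("0x2000_1000 belongs to {:?}", region_of(0x2000_1000));
}
```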

Here is what driving pin PA5 high looks like with raw register access, versus Embassy:

// Raw register access (what happens underneath).
// Assumes the GPIOA clock is enabled and PA5 is already configured as an output.
// Set bit 5 of the GPIOA ODR to drive PA5 high.
unsafe {
    let gpioa_odr = 0x4002_0014 as *mut u32;
    let current = core::ptr::read_volatile(gpioa_odr);
    core::ptr::write_volatile(gpioa_odr, current | (1 << 5));
}

// Embassy (what you actually write)
let mut pin = Output::new(p.PA5, Level::Low, Speed::Low);
pin.set_high();

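Both end up setting the same bit in the same register. Embassy also enables the GPIO clock and configures the pin as an output for you, and it wraps the horror in a safe, readable API.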

Registers: The 32-Bit Control Panel

A register is a 32-bit value at a fixed address that controls or reports the state of a peripheral. Think of it as a row of 32 tiny switches.

For example, the GPIO Output Data Register (ODR) for GPIOA:

Bit:  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16
      [            Reserved (read as 0)                ]

Bit:  15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
      P15 P14 P13 P12 P11 P10 P9 P8 P7 P6 P5 P4 P3 P2 P1 P0

Each bit directly controls one pin. Bit 5 = PA5. With the pin configured as a push-pull output on a 3.3 V supply, setting the bit to 1 drives the pin to 3.3 V and setting it to 0 drives it to 0 V.
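Flipping one of those 32 switches without disturbing the others comes down to a handful of bitwise operations. The helpers below are a minimal sketch that works on a plain `u32` so it can run on a PC. The function names are made up for illustration, and on real hardware the value would be read from and written back to the register with volatile accesses.

```rust
const PA5: u32 = 5;
const PA13: u32 = 13;

/// Return `reg` with the given bit forced to 1.
fn set_bit(reg: u32, bit: u32) -> u32 {
    reg | (1 << bit)
}

/// Return `reg` with the given bit forced to 0.
fn clear_bit(reg: u32, bit: u32) -> u32 {
    reg & !(1 << bit)
}

/// Return `reg` with the given bit inverted.
fn toggle_bit(reg: u32, bit: u32) -> u32 {
    reg ^ (1 << bit)
}

/// Report whether the given bit is 1.
fn is_set(reg: u32, bit: u32) -> bool {
    (reg & (1 << bit)) != 0
}

fn main() {
    let mut odr: u32 = 0; // pretend this is the GPIOA ODR value
    odr = set_bit(odr, PA5); // PA5 high
    odr = set_bit(odr, PA13); // PA13 high, PA5 untouched
    println!("after sets:   {:#06x}", odr);
    odr = clear_bit(odr, PA5); // PA5 low, PA13 untouched
    println!("after clear:  {:#06x}", odr);
    odr = toggle_bit(odr, PA5); // PA5 high again
    println!("PA5 set? {}", is_set(odr, PA5));
}
```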

Registers come in three flavors:

Control registers — you write to configure behavior (e.g., set a pin as input or output)
Status registers — you read to check what happened (e.g., has a UART byte arrived?)
Data registers — you read/write to exchange data (e.g., the byte to transmit via UART)
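The three flavors usually work together: you configure through a control register, poll a status register, then move bytes through a data register. The following is a PC-side simulation of that pattern, not a real driver. `FakeUart`, its methods, and the bit positions are hypothetical stand-ins, and plain struct fields take the place of volatile accesses to fixed addresses.

```rust
const ENABLE: u32 = 1 << 13; // hypothetical "UART enable" control bit
const RX_NOT_EMPTY: u32 = 1 << 5; // hypothetical "byte received" status flag

/// Stand-in for a UART's register block.
struct FakeUart {
    control: u32, // control register: written to configure behavior
    status: u32,  // status register: read to see what happened
    data: u32,    // data register: the byte received or to transmit
}

impl FakeUart {
    fn new() -> Self {
        FakeUart { control: 0, status: 0, data: 0 }
    }

    /// Control register: switch the peripheral on.
    fn enable(&mut self) {
        self.control |= ENABLE;
    }

    fn is_enabled(&self) -> bool {
        (self.control & ENABLE) != 0
    }

    /// Status + data registers: return a byte only if one has arrived.
    fn try_read(&mut self) -> Option<u8> {
        if (self.status & RX_NOT_EMPTY) == 0 {
            return None;
        }
        let byte = self.data as u8;
        // In this simulation, reading the data clears the flag.
        self.status &= !RX_NOT_EMPTY;
        Some(byte)
    }

    /// Test helper that plays the role of the hardware receiving a byte.
    fn simulate_receive(&mut self, byte: u8) {
        self.data = byte as u32;
        self.status |= RX_NOT_EMPTY;
    }
}

fn main() {
    let mut uart = FakeUart::new();
    uart.enable();
    println!("enabled: {}", uart.is_enabled());
    println!("before any byte arrives: {:?}", uart.try_read());
    uart.simulate_receive(b'A');
    println!("after a byte arrives:    {:?}", uart.try_read());
}
```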

The Toolchain

Code compiled for your PC will not run on an STM32, because the two have completely different processors. You need a cross-compiler: a tool that runs on your PC but produces code for the ARM Cortex-M.

The Rust embedded toolchain consists of:

Component	Purpose
`rustup target add thumbv7em-none-eabihf`	Install the cross-compilation target
`probe-rs`	Flash firmware and debug via ST-Link/DAPLink
`defmt` + `defmt-rtt`	Lightweight logging over the debug probe
`cargo embed` or `cargo run` (with probe-rs)	Build, flash, and run in one command

The target triple thumbv7em-none-eabihf breaks down as:

thumb — ARM Thumb instruction set
v7em — ARMv7E-M architecture (Cortex-M4/M7)
none — no operating system
eabi — Embedded ABI (calling convention)
hf — hardware floating point
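Different Cortex-M cores need different targets. The mapping below is a small sketch of how you might pick one. The target names come from the Rust platform support list, not from this chapter, and the `Core` enum and `rust_target` function are hypothetical. It assumes the M4F, M7, and M33 parts are used with their hardware FPU.

```rust
#[derive(Clone, Copy, Debug)]
enum Core {
    M0,
    M0Plus,
    M3,
    M4F,
    M7,
    M33,
}

/// Rust target triple to install with `rustup target add` for each core.
fn rust_target(core: Core) -> &'static str {
    match core {
        // ARMv6-M: no FPU, smallest instruction set
        Core::M0 | Core::M0Plus => "thumbv6m-none-eabi",
        // ARMv7-M: no FPU
        Core::M3 => "thumbv7m-none-eabi",
        // ARMv7E-M with hardware floating point
        Core::M4F | Core::M7 => "thumbv7em-none-eabihf",
        // ARMv8-M Mainline with hardware floating point
        Core::M33 => "thumbv8m.main-none-eabihf",
    }
}

fn main() {
    for core in [Core::M0, Core::M0Plus, Core::M3, Core::M4F, Core::M7, Core::M33] {
        println!("{:?}: rustup target add {}", core, rust_target(core));
    }
}
```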

What Makes Embedded Different

If you are coming from desktop or web development, embedded is a different world:

Desktop	Embedded
Operating system manages everything	No OS — your code is the only thing running
Print to terminal with `println!`	No terminal — use `defmt` over debug probe or UART
Program exits when done	Runs forever — `main` must never return
Timing is "fast enough"	Real-time — a 1 ms deadline means 1 ms, not "roughly 1 ms"
Bugs crash a program	Bugs affect physical hardware — wrong output can damage circuits
Gigabytes of RAM	Kilobytes of RAM — every byte matters

Fun Fact: When NASA's Voyager 1 was reprogrammed in 2023, engineers uploaded new code to a computer with about 69 KB of memory, less than the 128 KB of RAM on an STM32F411. It has been running since 1977 and is now 15 billion miles from Earth. That is embedded programming at its finest.

Summary

A microcontroller is a self-contained computer on a chip. It has a CPU, Flash, RAM, and peripherals — all controlled by reading and writing to specific memory addresses. The STM32 family uses ARM Cortex-M cores ranging from the basic M0 to the powerful M7, all programmed using the same Rust toolchain.

The key insight is memory-mapped I/O: hardware registers live at fixed addresses, and writing to those addresses controls physical pins and peripherals. Embassy abstracts this into safe Rust APIs, but the registers are always there underneath.

Next, we will explore the STM32 family in detail — which chip to choose, what the part numbers mean, and how to navigate the (enormous) datasheet.

Embedded Systems Programming with STM32 and Rust