The Watchdog

Your firmware has a bug. It will always have a bug. Maybe a sensor returns unexpected data that triggers an infinite loop. Maybe a pointer goes somewhere it should not and the whole thing locks up. Maybe a cosmic ray flips a bit in RAM (yes, this actually happens). When your firmware hangs, what happens to the hardware it controls?

If your firmware is blinking an LED, nothing bad happens. If your firmware is controlling a motor, a heater, or a drone's propellers, a hang means the last command keeps executing forever. The motor runs at full speed. The heater stays on. The drone flies into a wall.

This is why watchdogs exist.

What Is a Watchdog?

A watchdog timer is a hardware countdown timer. You start it, and it begins counting down. Before it reaches zero, you must "pet" (or "kick" or "feed") the watchdog — which resets the countdown. If your firmware hangs and fails to pet the watchdog in time, the countdown reaches zero and the watchdog resets the entire MCU.

  Start       Pet       Pet       Pet       CRASH (no pet)
    │          │          │          │          │
    ▼          ▼          ▼          ▼          ▼
    [████████] [████████] [████████] [████████] [████░░░░] --> RESET!
    Counting   Restarted  Restarted  Restarted  Timed out

It is a dead man's switch. As long as your firmware is healthy and running its main loop, it pets the watchdog and everything is fine. The moment it hangs, the watchdog notices and reboots the system.
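To make the countdown-and-pet mechanism concrete, here is a minimal host-side model in plain Rust. It is only a simulation: `SimWatchdog` and its methods are hypothetical names invented for this sketch, and the real peripheral does all of this in hardware.

```rust
/// Host-side model of a watchdog countdown (hypothetical, for illustration only).
pub struct SimWatchdog {
    reload: u32,  // value the counter restarts from on every pet
    counter: u32, // ticks remaining before a reset
}

impl SimWatchdog {
    pub fn new(reload: u32) -> Self {
        Self { reload, counter: reload }
    }

    /// Restart the countdown from the reload value.
    pub fn pet(&mut self) {
        self.counter = self.reload;
    }

    /// Advance one clock tick. Returns true once the countdown has reached
    /// zero, which is the point where real hardware would reset the MCU.
    pub fn tick(&mut self) -> bool {
        self.counter = self.counter.saturating_sub(1);
        self.counter == 0
    }
}
```

As long as `pet` is called more often than every `reload` ticks, `tick` never returns true. Stop calling `pet` and it returns true after at most `reload` ticks.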

💡 Fun Fact: The term "watchdog" comes from a real watchdog — a dog that barks if an intruder enters. In computing, the concept dates back to the 1960s when NASA used hardware watchdog circuits on early spacecraft. If the computer froze, the watchdog circuit would trigger a hardware reset. The Mars rovers use watchdog timers too.

Two Types of Watchdog

STM32 microcontrollers have two independent watchdog peripherals:

Independent Watchdog (IWDG)

The IWDG is the simple, reliable workhorse. It runs on its own independent clock — the LSI (Low-Speed Internal) oscillator, typically 32 kHz. This means:

  • It works even if the main system clock fails
  • It works even if the PLL crashes
  • It works even in low-power modes
  • It is available on virtually every STM32 part

The IWDG is simple: configure a timeout period, start it, and pet it before the timeout. That is it.

Feature                 | Detail
------------------------|----------------------------------------------------------
Clock source            | LSI (~32 kHz, varies by chip)
Timeout range           | ~125 microseconds to ~32 seconds
Can be stopped?         | No. Once started, it cannot be stopped (on most STM32s)
Survives clock failure? | Yes
Complexity              | Very low
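The timeout range in the table follows from how the counter is clocked. As a sketch, assuming the usual IWDG relation from the STM32 reference manuals (timeout = (reload + 1) × prescaler / f_LSI, with a 12-bit reload value and prescaler dividers from 4 to 256), the arithmetic looks like this. The function name is hypothetical, and some newer families offer a wider range.

```rust
/// Timeout in microseconds for a given LSI frequency, prescaler divider and
/// reload value. Assumes timeout = (reload + 1) * prescaler / f_LSI.
/// `lsi_hz` must be non-zero.
pub fn iwdg_timeout_us(lsi_hz: u32, prescaler: u32, reload: u32) -> u64 {
    (reload as u64 + 1) * prescaler as u64 * 1_000_000 / lsi_hz as u64
}
```

With a 32 kHz LSI, the smallest setting (`prescaler = 4`, `reload = 0`) gives 125 microseconds, and the largest (`prescaler = 256`, `reload = 4095`) gives about 32.8 seconds. Because the LSI is an imprecise RC oscillator, the real timeout can differ noticeably from the computed one.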

Warning: On most STM32 families, once you start the IWDG, you cannot stop it. It keeps counting until the MCU is reset, and software has no way to switch it off. This is by design: if malicious or buggy code could disable the watchdog, it would defeat the purpose.

Window Watchdog (WWDG)

The WWDG is the IWDG's fussier cousin. Instead of just requiring a pet before a deadline, it requires the pet to happen within a specific time window — not too early and not too late.

  ┌──────────────────────────────────────────────────┐
  │  TOO EARLY          WINDOW            TOO LATE   │
  │  (pet = reset)      (pet = OK)        (= reset)  │
  │  ████████████        ░░░░░░░░          ██████████ │
  └──────────────────────────────────────────────────┘

Why would you want this? It catches a different class of bugs. The IWDG only catches a full hang. The WWDG also catches:

  • Code running too fast (skipping important work)
  • A tight infinite loop that accidentally pets the watchdog each iteration
  • Timing violations that indicate corrupted control flow

Feature                 | IWDG            | WWDG
------------------------|-----------------|-----------------
Pet timing              | Before deadline | Within a window
Clock source            | Independent LSI | APB1 clock
Survives clock failure? | Yes             | No (needs APB1)
Can detect "too fast"?  | No              | Yes
Complexity              | Low             | Medium

For most projects, the IWDG is all you need. The WWDG is for safety-critical systems where you need to verify your control loop is running at the correct rate.
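The window rule can be modelled as a small pure function. This sketch assumes the WWDG behaviour described in the STM32 reference manuals: a 7-bit down-counter, a reset if the counter is refreshed while it is still above the window value, and a reset when the counter falls below 0x40. The names `PetResult` and `classify_pet` are hypothetical.

```rust
#[derive(Debug, PartialEq)]
pub enum PetResult {
    Ok,
    TooEarly, // refreshed while the counter was still above the window value
    TooLate,  // counter already dropped below 0x40
}

/// Classify a pet attempt from the current down-counter value and the
/// configured window value (both 7-bit quantities in this model).
pub fn classify_pet(counter: u8, window: u8) -> PetResult {
    if counter < 0x40 {
        PetResult::TooLate
    } else if counter > window {
        PetResult::TooEarly
    } else {
        PetResult::Ok
    }
}
```

On real hardware the "too late" case never reaches your code, because the reset has already fired. The model keeps it only to show both edges of the window.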

🧠 Think About It: Imagine firmware that has a bug causing it to skip sensor reads but still pet the watchdog. The IWDG would not catch this — the firmware is still running, just incorrectly. How would you design your petting strategy so the IWDG catches this too? (Hint: only pet the watchdog after successfully completing all critical tasks.)
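One possible answer to the hint, sketched in plain Rust so it can run on a host: the loop body pets only when every critical step reports success. The `Pet` trait and `run_cycle` function are hypothetical stand-ins, not part of Embassy. In real firmware the watchdog handle would take the place of `W`.

```rust
/// Anything that can be petted. A stand-in for the real watchdog handle.
pub trait Pet {
    fn pet(&mut self);
}

/// Runs one loop iteration. The watchdog is petted only if every critical
/// step returned Ok. Steps after the first failure are not run.
pub fn run_cycle<W: Pet>(
    watchdog: &mut W,
    steps: &[fn() -> Result<(), &'static str>],
) -> bool {
    let all_ok = steps.iter().all(|step| step().is_ok());
    if all_ok {
        watchdog.pet();
    }
    all_ok
}
```

If a sensor read starts failing on every iteration, the pet stops too, and the IWDG resets the system even though the loop itself is still spinning.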

IWDG in Embassy

Basic Setup

use embassy_executor::Spawner;
use embassy_stm32::wdg::IndependentWatchdog;

#[embassy_executor::main]
async fn main(_spawner: Spawner) {
    let p = embassy_stm32::init(Default::default());

    // Create a watchdog with a 2-second timeout (the argument is in microseconds)
    let mut watchdog = IndependentWatchdog::new(p.IWDG, 2_000_000);

    // Start the watchdog. There is no going back after this!
    watchdog.unleash();

    loop {
        // Do your work (do_important_stuff is a placeholder for your application code)
        do_important_stuff().await;

        // Pet the watchdog. This must happen within 2 seconds of the previous pet.
        watchdog.pet();
    }
}

Choosing the Timeout

The timeout should be long enough that your firmware can always pet in time during normal operation, but short enough that a hang is detected quickly.

Timeout  | Good For
---------|-------------------------------------------------------
100 ms   | Fast control loops (motor control, flight controllers)
500 ms   | Medium-speed systems (robotics, sensor hubs)
1 to 2 s | General purpose, forgiving
5+ s     | Systems with long processing tasks

A good starting point: set the timeout to 3 to 5 times your main loop period. If your loop runs at 100 Hz (10 ms), a 50 ms timeout gives plenty of margin.
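The rule of thumb above is simple arithmetic. A minimal sketch, with a hypothetical helper name, that turns a loop rate and a margin factor into a timeout in microseconds:

```rust
/// Suggested watchdog timeout in microseconds: `margin` full loop periods.
/// `loop_hz` must be non-zero.
pub fn suggested_timeout_us(loop_hz: u32, margin: u32) -> u32 {
    let period_us = 1_000_000 / loop_hz;
    period_us * margin
}
```

For a 100 Hz loop, a margin of 5 gives 50,000 microseconds (50 ms) and a margin of 3 gives 30 ms. Treat the result as a starting point and then measure your worst-case loop time, since flash writes or long bus transactions can stretch an iteration well beyond its nominal period.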

Practical: Watchdog with Task Architecture

In a real Embassy application, you often have multiple tasks. The question is: where do you pet the watchdog? If you pet it in one task, it does not tell you whether the other tasks are healthy.

Here is a pattern: each task sets a "heartbeat" flag, and a supervisor task only pets the watchdog if all flags are set.

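use core::sync::atomic::{AtomicBool, Ordering};

use embassy_executor::Spawner;
use embassy_stm32::wdg::IndependentWatchdog;
use embassy_time::Timer;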

static SENSOR_OK: AtomicBool = AtomicBool::new(false);
static CONTROL_OK: AtomicBool = AtomicBool::new(false);
static COMMS_OK: AtomicBool = AtomicBool::new(false);

#[embassy_executor::task]
async fn sensor_task() {
    loop {
        read_sensors().await;
        SENSOR_OK.store(true, Ordering::Relaxed);
        Timer::after_millis(10).await;
    }
}

#[embassy_executor::task]
async fn control_task() {
    loop {
        run_pid_loop().await;
        CONTROL_OK.store(true, Ordering::Relaxed);
        Timer::after_millis(10).await;
    }
}

#[embassy_executor::task]
async fn comms_task() {
    loop {
        send_telemetry().await;
        COMMS_OK.store(true, Ordering::Relaxed);
        Timer::after_millis(100).await;
    }
}

#[embassy_executor::main]
async fn main(spawner: Spawner) {
    let p = embassy_stm32::init(Default::default());

    let mut watchdog = IndependentWatchdog::new(p.IWDG, 1_000_000);

    spawner.spawn(sensor_task()).unwrap();
    spawner.spawn(control_task()).unwrap();
    spawner.spawn(comms_task()).unwrap();

    watchdog.unleash();

    loop {
        Timer::after_millis(200).await;

        let all_ok = SENSOR_OK.load(Ordering::Relaxed)
            && CONTROL_OK.load(Ordering::Relaxed)
            && COMMS_OK.load(Ordering::Relaxed);

        if all_ok {
            watchdog.pet();

            // Reset all flags — tasks must set them again
            SENSOR_OK.store(false, Ordering::Relaxed);
            CONTROL_OK.store(false, Ordering::Relaxed);
            COMMS_OK.store(false, Ordering::Relaxed);
        }
        // If any task has stalled, we do NOT pet, and the watchdog resets us
    }
}

This pattern ensures that every critical task must be running correctly for the watchdog to get fed. If the sensor task hangs, or the control task panics, or the comms task gets stuck — the watchdog catches it.
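A variation on the same idea packs the heartbeats into one atomic bitmask, so the supervisor can test and clear all of them in a single atomic step. This is a sketch rather than the chapter's own code: the names are hypothetical, and it assumes a target with atomic read-modify-write support (Cortex-M3 and above, since Cortex-M0/M0+ targets only provide atomic load and store in `core`).

```rust
use core::sync::atomic::{AtomicU32, Ordering};

pub const SENSOR: u32 = 1 << 0;
pub const CONTROL: u32 = 1 << 1;
pub const COMMS: u32 = 1 << 2;
pub const ALL_TASKS: u32 = SENSOR | CONTROL | COMMS;

static HEARTBEATS: AtomicU32 = AtomicU32::new(0);

/// Called by a task after it completes one healthy iteration.
pub fn check_in(task_bit: u32) {
    HEARTBEATS.fetch_or(task_bit, Ordering::Relaxed);
}

/// Called by the supervisor. If every task has checked in, clears all bits
/// atomically and returns true. Otherwise leaves the bits untouched and
/// returns false, so slow tasks can still complete the set later.
pub fn all_checked_in() -> bool {
    HEARTBEATS
        .compare_exchange(ALL_TASKS, 0, Ordering::Relaxed, Ordering::Relaxed)
        .is_ok()
}
```

The supervisor loop would call `all_checked_in()` and pet the watchdog only when it returns true. Because the check and the clear happen in one operation, a heartbeat that arrives between them cannot be lost.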

💡 Fun Fact: This "multi-task heartbeat" pattern is common in aerospace and automotive firmware. AUTOSAR, the software architecture used in many car ECUs, includes a Watchdog Manager that performs this kind of "alive supervision" across tasks. Safety-critical systems often have multiple watchdog layers: a software watchdog feeding a hardware watchdog, which in turn feeds an external watchdog IC.

When to Use a Watchdog

Always use a watchdog if your system controls physical actuators.

Application             | Watchdog Needed?      | Consequence of Hang
------------------------|-----------------------|---------------------------
LED blinker             | Nice to have          | LED stuck on or off
Data logger             | Recommended           | Missed data, corrupt files
Motor controller        | Essential             | Motor runs uncontrolled
Heater control          | Essential             | Fire hazard
Drone flight controller | Essential             | Crash
Medical device          | Essential + redundant | Patient safety risk

Even for non-safety-critical systems, a watchdog is good practice. A data logger that resets and resumes logging after a crash is infinitely better than one that hangs silently and logs nothing.

What Happens After a Watchdog Reset?

When the watchdog triggers, the MCU resets as if you pressed the reset button. Your firmware starts from the top of main. But you might want to know why you reset — was it a power-on or a watchdog timeout?

STM32 has reset status flags that tell you:

use embassy_stm32::pac;

// Call this early in main, after embassy_stm32::init.
// The exact register and bit names vary by STM32 family.
fn log_reset_cause() {
    // Read the RCC control/status register, which holds the reset flags
    let rcc = pac::RCC;
    let csr = rcc.csr().read();

    if csr.iwdgrstf() {
        defmt::warn!("RESET CAUSE: Independent Watchdog timeout!");
        // Log this, increment a crash counter, enter safe mode, etc.
    }

    if csr.wwdgrstf() {
        defmt::warn!("RESET CAUSE: Window Watchdog timeout!");
    }

    // Clear the reset flags so they do not persist after the next reset
    rcc.csr().modify(|w| w.set_rmvf(true));
}
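If you would rather work from the raw register value, the flags can be decoded with plain bit masks. The sketch below assumes the bit positions documented for the STM32F4 `RCC_CSR` register. Other families place these flags elsewhere (or in a differently named register), so check your reference manual. The enum and function names are hypothetical.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum ResetCause {
    IndependentWatchdog,
    WindowWatchdog,
    Software,
    PowerOn,
    Pin,
    Unknown,
}

// Bit positions as documented for STM32F4 RCC_CSR (an assumption of this sketch).
pub const PINRSTF: u32 = 1 << 26;
pub const PORRSTF: u32 = 1 << 27;
pub const SFTRSTF: u32 = 1 << 28;
pub const IWDGRSTF: u32 = 1 << 29;
pub const WWDGRSTF: u32 = 1 << 30;

/// Decode the most specific reset cause from a raw CSR value.
/// The pin flag is checked last because it is commonly set alongside
/// the other flags.
pub fn reset_cause(csr: u32) -> ResetCause {
    if csr & IWDGRSTF != 0 {
        ResetCause::IndependentWatchdog
    } else if csr & WWDGRSTF != 0 {
        ResetCause::WindowWatchdog
    } else if csr & SFTRSTF != 0 {
        ResetCause::Software
    } else if csr & PORRSTF != 0 {
        ResetCause::PowerOn
    } else if csr & PINRSTF != 0 {
        ResetCause::Pin
    } else {
        ResetCause::Unknown
    }
}
```

Several flags can be set at once, which is why the function checks the watchdog bits first rather than returning on the first flag it happens to find.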

🧠 Think About It: After a watchdog reset, should your firmware immediately resume normal operation? Or should it enter a "safe mode" first? For a drone, you might want to enter a controlled descent instead of resuming the last flight command. Think about what "safe" means for your specific application.
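One way to turn that question into code is a small boot policy that falls back to safe mode after repeated watchdog resets. Everything here is a hypothetical sketch: it assumes you keep a count of consecutive watchdog resets somewhere that survives a reset (for example a backup register), and that the count already includes the reset that just happened.

```rust
#[derive(Debug, PartialEq)]
pub enum BootMode {
    Normal,
    SafeMode,
}

/// Decide how to start up after a reset.
/// - `watchdog_reset`: true if the last reset was caused by a watchdog.
/// - `consecutive_wdg_resets`: watchdog resets in a row, including this one.
/// - `max_retries`: how many watchdog resets to tolerate before giving up on
///   normal operation. Use 0 to enter safe mode on the first one.
pub fn choose_boot_mode(
    watchdog_reset: bool,
    consecutive_wdg_resets: u32,
    max_retries: u32,
) -> BootMode {
    if watchdog_reset && consecutive_wdg_resets > max_retries {
        BootMode::SafeMode
    } else {
        BootMode::Normal
    }
}
```

A drone might use `max_retries = 0` so that any watchdog reset leads to a controlled descent, while a data logger might tolerate a few retries before stopping and signalling a fault.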

Summary

The watchdog is your firmware's safety net. It is the simplest peripheral on the chip — just a countdown timer — but it might be the most important one. A firmware hang without a watchdog is an uncontrolled system. A firmware hang with a watchdog is a system reset followed by a clean recovery.

Use the IWDG for most projects. Start it early, pet it in your main loop (or better, use the multi-task heartbeat pattern), and choose a timeout that balances responsiveness with margin. Once the watchdog is unleashed, it cannot be stopped — and that is exactly the point.

With this chapter, you have covered all the core peripherals of the STM32. You can blink LEDs, read sensors over UART, SPI, and I2C, measure analog voltages, move data efficiently with DMA, and keep your system safe with a watchdog. In the next part of the book, we will go deeper — into memory architecture, embedded Rust patterns, and building complete real-world projects.