Writing & Managing Services
Why This Matters
You have written an application -- maybe a Python web server, a Go API, or a Node.js background worker. It runs fine when you type the command in a terminal. But what happens when you log out? It dies. What happens when it crashes at 3 AM? It stays dead. What happens when the server reboots? Nobody starts it.
This is where writing your own systemd service unit comes in. A unit file is a simple text file that tells systemd how to start, stop, supervise, and restart your application. No shell script gymnastics. No screen sessions. No nohup hacks. Just a declarative configuration that systemd follows reliably, every time.
This chapter teaches you to write unit files from scratch, understand every directive that matters, configure restart policies, manage dependencies, and replace cron jobs with systemd timers.
Try This Right Now
Let us create a trivially simple service in under a minute:
# Create a tiny script
sudo tee /usr/local/bin/hello-service.sh << 'SCRIPT'
#!/bin/bash
while true; do
echo "Hello from my service at $(date)"
sleep 10
done
SCRIPT
sudo chmod +x /usr/local/bin/hello-service.sh
# Create a unit file for it
sudo tee /etc/systemd/system/hello.service << 'UNIT'
[Unit]
Description=My Hello Service
[Service]
ExecStart=/usr/local/bin/hello-service.sh
[Install]
WantedBy=multi-user.target
UNIT
# Load, start, and watch it
sudo systemctl daemon-reload
sudo systemctl start hello.service
journalctl -u hello.service -f
You should see "Hello from my service" messages appearing every 10 seconds. Press Ctrl+C to stop watching the log. The service keeps running.
# Clean up when done
sudo systemctl stop hello.service
sudo systemctl disable hello.service
sudo rm /etc/systemd/system/hello.service
sudo rm /usr/local/bin/hello-service.sh
sudo systemctl daemon-reload
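As an aside, systemd can run throwaway experiments like this without any files at all, using transient units. A sketch (the unit name hello-transient is just a label we picked here):

```shell
# Run the same loop as a transient unit -- no unit file needed
sudo systemd-run --unit=hello-transient \
    /bin/sh -c 'while true; do echo "Hello (transient)"; sleep 10; done'

# Watch it, then stop it; the transient unit vanishes when stopped
journalctl -u hello-transient -f
sudo systemctl stop hello-transient.service
```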
Unit File Anatomy
Every systemd unit file has the same basic structure: sections (denoted by square brackets) containing key-value directives.
+--------------------------------------------------+
| [Unit]      <-- Metadata & Dependencies          |
|   Description=...                                |
|   After=...                                      |
|   Requires=...                                   |
|                                                  |
| [Service]   <-- How to Run                       |
|   Type=...                                       |
|   ExecStart=...                                  |
|   Restart=...                                    |
|                                                  |
| [Install]   <-- Boot Integration                 |
|   WantedBy=...                                   |
+--------------------------------------------------+
The [Unit] Section
This section describes the unit and defines its relationships to other units.
[Unit]
Description=My Application Server
Documentation=https://example.com/docs
After=network.target postgresql.service
Requires=postgresql.service
Wants=redis.service
| Directive | Purpose |
|---|---|
| Description= | Human-readable name shown in systemctl status |
| Documentation= | URL or man page reference |
| After= | Start this unit after the listed units |
| Before= | Start this unit before the listed units |
| Requires= | Hard dependency -- if the required unit fails, this unit fails too |
| Wants= | Soft dependency -- if the wanted unit fails, this unit still starts |
| BindsTo= | Like Requires, but also stops this unit if the bound unit stops |
| Conflicts= | Cannot run at the same time as the listed units |
| ConditionPathExists= | Only start if the given path exists |
After vs Requires: A Critical Distinction
These two directives do different things, and confusing them is a very common mistake:
After= controls ordering -- "start me after X has started"
Requires= controls dependency -- "if X fails, I fail too"
You almost always want both together:
# WRONG: ordering without dependency
After=postgresql.service
# PostgreSQL starts first, but if it fails, your app starts anyway
# WRONG: dependency without ordering
Requires=postgresql.service
# They might start at the same time (parallel), causing race conditions
# RIGHT: both together
After=postgresql.service
Requires=postgresql.service
# PostgreSQL starts first, AND your app won't start if PostgreSQL fails
Think About It: When would you use Wants= instead of Requires=? Think of a case where you would prefer your application to start even if an optional dependency failed.
The [Service] Section
This is where you define how the service actually runs.
[Service]
Type=simple
User=appuser
Group=appgroup
WorkingDirectory=/opt/myapp
Environment=NODE_ENV=production
EnvironmentFile=/opt/myapp/.env
ExecStartPre=/opt/myapp/check-config.sh
ExecStart=/opt/myapp/server --port 8080
ExecReload=/bin/kill -HUP $MAINPID
ExecStop=/bin/kill -TERM $MAINPID
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
We will explore each important directive in detail below.
The [Install] Section
This section defines how the unit integrates with the boot process.
[Install]
WantedBy=multi-user.target
| Directive | Purpose |
|---|---|
| WantedBy= | When enabled, add this unit to the listed target's "wants" |
| RequiredBy= | When enabled, add this unit to the listed target's "requires" |
| Also= | When enabling this unit, also enable the listed units |
| Alias= | Additional names for this unit |
WantedBy=multi-user.target is the most common value. It means "start this service
when the system reaches multi-user mode" -- which is the normal boot target for servers.
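Under the hood, enable is nothing magical: it reads WantedBy= and creates a symlink in that target's .wants directory, and disable removes it. You can verify this with any unit you have installed (hello.service from the earlier demo works):

```shell
# 'enable' materializes WantedBy= as a symlink:
sudo systemctl enable hello.service
ls -l /etc/systemd/system/multi-user.target.wants/hello.service
# (a symlink pointing back at /etc/systemd/system/hello.service)

# 'disable' removes exactly that symlink:
sudo systemctl disable hello.service
```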
ExecStart, ExecStop, and ExecReload
ExecStart
The most important directive. This is the command that starts your service:
# Simple command
ExecStart=/usr/bin/python3 /opt/myapp/server.py
# With arguments
ExecStart=/usr/bin/node /opt/myapp/index.js --port 3000
# IMPORTANT: Must be an absolute path. This will NOT work:
# ExecStart=python3 server.py <-- WRONG
Rules for ExecStart:
- Must use an absolute path to the executable
- For Type=simple and Type=forking, there can be only one ExecStart line
- For Type=oneshot, you can have multiple ExecStart lines
ExecStartPre and ExecStartPost
Run commands before or after the main process starts:
ExecStartPre=/opt/myapp/validate-config.sh
ExecStart=/opt/myapp/server
ExecStartPost=/opt/myapp/notify-started.sh
Prefix with - to ignore failures:
# If this check fails, still start the service
ExecStartPre=-/opt/myapp/optional-check.sh
ExecStop
How to stop the service. If not specified, systemd sends SIGTERM (and then SIGKILL after a timeout):
# Custom graceful shutdown
ExecStop=/opt/myapp/graceful-shutdown.sh
ExecReload
What to do when systemctl reload is called. Typically sends SIGHUP:
ExecReload=/bin/kill -HUP $MAINPID
$MAINPID is a special variable systemd sets to the PID of the main process.
Service Types: Type=
The Type= directive tells systemd how your service starts and how to track its main
process. Getting this wrong is one of the most common sources of service management bugs.
Type=simple (Default)
systemd considers the service "started" as soon as ExecStart runs. The process
specified by ExecStart is the main process.
[Service]
Type=simple
ExecStart=/usr/bin/python3 /opt/myapp/server.py
Use when: your application runs in the foreground and does not fork.
Type=forking
For traditional daemons that fork a child process and then the parent exits. systemd considers the service started when the parent process exits.
[Service]
Type=forking
PIDFile=/run/myapp.pid
ExecStart=/opt/myapp/start.sh
Use when: your application daemonizes itself (forks into the background). You usually
need PIDFile= so systemd can track the main process.
Type=oneshot
For services that do a single task and then exit. systemd waits for the process to finish before considering the unit "started."
[Service]
Type=oneshot
ExecStart=/opt/myapp/run-migration.sh
ExecStart=/opt/myapp/seed-database.sh
RemainAfterExit=yes
Use when: you need to run a setup task at boot (like loading firewall rules). With
RemainAfterExit=yes, the unit shows as "active" even after the process exits.
Type=notify
The service sends a notification to systemd when it is ready. This is the most precise way to signal readiness.
[Service]
Type=notify
ExecStart=/opt/myapp/server
The application must call sd_notify(0, "READY=1") (using the libsystemd library) or
write the message to the socket named in $NOTIFY_SOCKET. Many modern daemons
support this (e.g., PostgreSQL and MariaDB).
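If your language has no systemd bindings, the notification protocol is simple enough to speak directly: it is just a datagram sent to the socket named in $NOTIFY_SOCKET. A minimal sketch in Python (the function name sd_notify here is our own, echoing the C API):

```python
import os
import socket

def sd_notify(message):
    """Send a notification string (e.g. "READY=1") to systemd.

    A minimal stand-in for sd_notify(3): writes the message to the
    datagram socket named by $NOTIFY_SOCKET. Returns False when the
    variable is unset (i.e. we are not running under Type=notify).
    """
    addr = os.environ.get("NOTIFY_SOCKET")
    if not addr:
        return False
    # A leading '@' denotes an abstract socket; it maps to a NUL byte.
    if addr.startswith("@"):
        addr = "\0" + addr[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), addr)
    return True

# In the server, once initialization is complete:
# sd_notify("READY=1")
```

Until that READY=1 arrives, systemctl start blocks, so any unit ordered After= this one waits for genuine readiness rather than mere process startup.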
Type=exec
Similar to simple, but systemd considers the service started only after the binary
has been successfully executed (after the exec() system call). This catches cases
where the binary does not exist or cannot be executed.
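In the unit file, exec is a drop-in replacement for simple; only the failure behavior at startup differs. A minimal sketch (paths are placeholders):

```ini
[Service]
Type=exec
ExecStart=/opt/myapp/server
# With Type=simple, a typo in the path above would still be reported as
# "started"; with Type=exec, systemctl start fails immediately instead.
```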
+------------------------------------------------+
| Type=simple   -> Started immediately           |
| Type=exec     -> Started after exec() succeeds |
| Type=forking  -> Started when parent exits     |
| Type=oneshot  -> Started when process exits    |
| Type=notify   -> Started when service signals  |
+------------------------------------------------+
Think About It: You have an application that takes 30 seconds to warm up (loading data into memory, connecting to databases) before it can serve requests. Which Type would you choose, and why?
Restart Policies
One of systemd's most valuable features: automatic restart when a service crashes.
[Service]
Restart=on-failure
RestartSec=5
Restart= Options
| Value | Restarts On |
|---|---|
| no | Never restart (default) |
| on-success | Clean exit (exit code 0) |
| on-failure | Non-zero exit code, signal, timeout, watchdog |
| on-abnormal | Signal, timeout, watchdog (but NOT non-zero exit) |
| on-abort | Unclean signal only |
| on-watchdog | Watchdog timeout only |
| always | Always restart, no matter what |
For most services, you want either on-failure or always:
# Restart only on crashes (not on intentional stops)
Restart=on-failure
RestartSec=5
# Always restart (even after clean exit -- useful for workers)
Restart=always
RestartSec=5
Preventing Restart Loops
If a service is badly broken, you do not want systemd to restart it forever:
[Unit]
StartLimitIntervalSec=300
StartLimitBurst=5
[Service]
Restart=on-failure
RestartSec=5
This means: if the service fails 5 times within 300 seconds (5 minutes), stop trying. The service enters a "failed" state.
Version Note:
Since systemd 230, StartLimitIntervalSec= and StartLimitBurst= are [Unit] section directives (older versions used StartLimitInterval= in [Service]). Modern systemd still accepts them in [Service] for backward compatibility, but [Unit] is the documented location.
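One operational detail worth knowing: once the start limit is hit, even a manual systemctl start is refused until the counter is cleared with reset-failed. A sketch, using the hypothetical myapp.service:

```shell
# After 5 failures in 5 minutes the unit sits in a "failed" state and
# further starts are refused ("start request repeated too quickly")
systemctl status myapp.service

# Clear the failure state and the rate-limit counter, then retry
sudo systemctl reset-failed myapp.service
sudo systemctl start myapp.service
```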
Hands-On: Writing a Custom Service
Let us write a proper service for a Python web application.
Step 1: Create the Application
sudo mkdir -p /opt/mywebapp
sudo tee /opt/mywebapp/app.py << 'PYTHON'
#!/usr/bin/env python3
"""A tiny HTTP server for demonstration."""
from http.server import HTTPServer, SimpleHTTPRequestHandler
import os
import signal
import sys
PORT = int(os.environ.get('PORT', 8080))
def graceful_shutdown(signum, frame):
print(f"Received signal {signum}, shutting down gracefully...", flush=True)
sys.exit(0)
signal.signal(signal.SIGTERM, graceful_shutdown)
print(f"Starting server on port {PORT}", flush=True)
server = HTTPServer(('', PORT), SimpleHTTPRequestHandler)
print(f"Server is ready and listening on port {PORT}", flush=True)
server.serve_forever()
PYTHON
sudo chmod +x /opt/mywebapp/app.py
Step 2: Create a Dedicated User
sudo useradd --system --no-create-home --shell /usr/sbin/nologin mywebapp
Step 3: Write the Unit File
sudo tee /etc/systemd/system/mywebapp.service << 'UNIT'
[Unit]
Description=My Python Web Application
Documentation=https://example.com/mywebapp
After=network.target
StartLimitIntervalSec=300
StartLimitBurst=5
[Service]
Type=simple
User=mywebapp
Group=mywebapp
WorkingDirectory=/opt/mywebapp
Environment=PORT=8080
ExecStart=/usr/bin/python3 /opt/mywebapp/app.py
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartSec=5
# Security hardening
NoNewPrivileges=yes
ProtectSystem=strict
ProtectHome=yes
PrivateTmp=yes
StandardOutput=journal
StandardError=journal
SyslogIdentifier=mywebapp
[Install]
WantedBy=multi-user.target
UNIT
Step 4: Deploy and Start
# Reload systemd to pick up the new unit file
sudo systemctl daemon-reload
# Start and enable the service
sudo systemctl enable --now mywebapp.service
# Check status
systemctl status mywebapp.service
# Test it
curl http://localhost:8080/
Step 5: Test Restart Behavior
# Find the main PID
systemctl show mywebapp.service --property=MainPID
# MainPID=12345
# Kill it rudely (simulating a crash)
sudo kill -9 $(systemctl show mywebapp.service --property=MainPID --value)
# Wait a moment, then check -- it should have restarted
sleep 6
systemctl status mywebapp.service
# Notice the PID has changed and the service is active
Step 6: Check the Logs
# View all logs for this service
journalctl -u mywebapp.service --no-pager
# Follow logs in real time
journalctl -u mywebapp.service -f
Clean Up
sudo systemctl disable --now mywebapp.service
sudo rm /etc/systemd/system/mywebapp.service
sudo rm -rf /opt/mywebapp
sudo userdel mywebapp
sudo systemctl daemon-reload
Service Security Hardening
systemd provides powerful security directives that sandbox your service. Use them wherever possible:
[Service]
# Run as non-root
User=myapp
Group=myapp
# Cannot gain new privileges (e.g., via setuid binaries)
NoNewPrivileges=yes
# Make the entire filesystem read-only except specified paths
ProtectSystem=strict
ReadWritePaths=/var/lib/myapp /var/log/myapp
# Hide /home, /root, /run/user
ProtectHome=yes
# Private /tmp (isolated from other services)
PrivateTmp=yes
# Cannot modify kernel variables
ProtectKernelTunables=yes
# Cannot load kernel modules
ProtectKernelModules=yes
# Restrict network families
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
# Restrict system calls
SystemCallFilter=@system-service
You can check the security score of any service:
systemd-analyze security mywebapp.service
This prints an exposure score for each service from 0.0 (well sandboxed) to 10.0 (fully exposed) -- lower is better -- and shows which hardening directives are missing.
Dependencies Deep Dive
Ordering: After= and Before=
These control when units start relative to each other:
# My app starts AFTER network and database are ready
After=network.target postgresql.service
# My app starts BEFORE the monitoring agent
Before=monitoring-agent.service
Dependency: Requires= and Wants=
These control whether units must succeed:
# Hard dependency: if PostgreSQL fails to start, my app fails too
Requires=postgresql.service
# Soft dependency: try to start Redis, but my app works without it
Wants=redis.service
Combining Them
A complete dependency setup:
[Unit]
Description=My Application
After=network.target postgresql.service redis.service
Requires=postgresql.service
Wants=redis.service
This means:
- Start after network, PostgreSQL, and Redis
- Fail if PostgreSQL is not running
- Continue even if Redis is not running
BindsTo=
Stronger than Requires. If the bound unit stops at any time (not just at startup), this unit also stops:
[Unit]
BindsTo=postgresql.service
After=postgresql.service
Conflicts=
Ensures two units never run simultaneously:
[Unit]
Conflicts=apache2.service
If you start this service, apache2.service is stopped automatically.
systemd Timers: The Modern Cron
systemd timers are a powerful replacement for cron jobs. They offer better logging, dependency management, and resource control.
Timer Anatomy
A timer requires two files:
- A .timer unit (the schedule)
- A .service unit (the actual work)
Example: Run a Backup Every Day at 2 AM
The service file (/etc/systemd/system/backup.service):
[Unit]
Description=Daily Backup Job
[Service]
Type=oneshot
ExecStart=/opt/scripts/backup.sh
User=backup
StandardOutput=journal
The timer file (/etc/systemd/system/backup.timer):
[Unit]
Description=Run Backup Daily at 2 AM
[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true
RandomizedDelaySec=300
[Install]
WantedBy=timers.target
# Enable the timer (not the service)
sudo systemctl daemon-reload
sudo systemctl enable --now backup.timer
# Check when it will fire next
systemctl list-timers backup.timer --no-pager
Timer Directives
| Directive | Purpose |
|---|---|
| OnCalendar= | Calendar-based schedule (like cron) |
| OnBootSec= | Run X time after boot |
| OnUnitActiveSec= | Run X time after the service last ran |
| OnStartupSec= | Run X time after systemd started |
| Persistent=true | If the system was off when the timer should have fired, run it at next boot |
| RandomizedDelaySec= | Add random delay to prevent thundering herd |
| AccuracySec= | How precise the timer needs to be |
OnCalendar Syntax
The OnCalendar format is DayOfWeek Year-Month-Day Hour:Minute:Second:
OnCalendar=*-*-* 02:00:00 # Every day at 2:00 AM
OnCalendar=Mon *-*-* 09:00:00 # Every Monday at 9:00 AM
OnCalendar=*-*-01 00:00:00 # First day of every month
OnCalendar=*-01-01 00:00:00 # January 1st every year
OnCalendar=hourly # Every hour
OnCalendar=daily # Every day at midnight
OnCalendar=weekly # Every Monday at midnight
OnCalendar=*-*-* *:00:00 # Every hour on the hour
OnCalendar=*-*-* *:*:00 # Every minute
OnCalendar=*-*-* 08..17:00:00 # Every hour from 8 AM to 5 PM
Validate your schedule with systemd-analyze:
# When will this fire next?
systemd-analyze calendar "Mon *-*-* 09:00:00"
# Next elapse: Mon 2025-03-17 09:00:00 UTC
# How about every 15 minutes?
systemd-analyze calendar "*-*-* *:00/15:00"
Relative Timers
Instead of calendar-based, run relative to events:
[Timer]
# 15 minutes after boot
OnBootSec=15min
# Every 30 minutes after the service last ran
OnUnitActiveSec=30min
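For quick experiments, systemd-run can create a transient timer straight from the command line, with no files to write (the /tmp path here is arbitrary):

```shell
# Fire once, 30 seconds from now
sudo systemd-run --on-active=30s /usr/bin/touch /tmp/timer-fired

# The transient timer shows up alongside the regular ones
systemctl list-timers --no-pager
```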
Socket Activation Basics
Socket activation lets systemd listen on a port and start the service only when a connection arrives. This means:
- Services start on-demand, not at boot (faster boot)
- If no one connects, the service never runs (saves resources)
- systemd can restart a crashed service without losing queued connections
How Socket Activation Works
+----------+       +---------+       +------------+
|  Client  | ----> | systemd | ----> |  Service   |
| connects |       | (holds  |       | (started   |
|          |       | socket) |       |  on demand)|
+----------+       +---------+       +------------+
1. systemd creates the socket and listens
2. Client connects to the socket
3. systemd starts the service
4. systemd passes the socket file descriptor to the service
5. Service handles the connection
Example: Socket-Activated Service
Socket file (/etc/systemd/system/myapp.socket):
[Unit]
Description=My App Socket
[Socket]
ListenStream=8080
Accept=no
[Install]
WantedBy=sockets.target
Service file (/etc/systemd/system/myapp.service):
[Unit]
Description=My App Service
Requires=myapp.socket
[Service]
Type=simple
ExecStart=/opt/myapp/server
# Enable the socket (not the service directly)
sudo systemctl enable --now myapp.socket
# The service is not running yet
systemctl is-active myapp.service
# inactive
# Connect to the socket
curl http://localhost:8080/
# Now the service starts automatically
systemctl is-active myapp.service
# active
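On the service side, the inherited listener arrives as file descriptor 3, announced through the $LISTEN_FDS and $LISTEN_PID environment variables (see sd_listen_fds(3)). A minimal hand-rolled version in Python -- get_activated_socket is our own name, and the fd parameter exists only to make the sketch easy to exercise outside systemd:

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited fd, per sd_listen_fds(3)

def get_activated_socket(fd=SD_LISTEN_FDS_START):
    """Return the socket passed in by systemd, or None if not activated.

    systemd sets LISTEN_PID to the service's PID and LISTEN_FDS to the
    number of descriptors it passed, numbered consecutively from fd 3.
    """
    if os.environ.get("LISTEN_PID") != str(os.getpid()):
        return None  # not socket-activated (or fds meant for another process)
    if int(os.environ.get("LISTEN_FDS", "0")) < 1:
        return None
    # socket.socket(fileno=...) probes the fd for its family/type itself
    return socket.socket(fileno=fd)

# Typical use in the server:
# sock = get_activated_socket()
# if sock is None:                       # started directly, not via .socket
#     sock = socket.create_server(("", 8080))
```

Checking LISTEN_PID matters: the variables could have leaked into a child process that the descriptors were never meant for.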
Debug This: Service Keeps Crashing
Your custom service starts but immediately dies, and systemd keeps restarting it:
● myapp.service - My Application
Active: activating (auto-restart) (Result: exit-code)
The journal shows:
myapp.service: Main process exited, code=exited, status=1/FAILURE
myapp.service: Scheduled restart job, restart counter is at 4.
Here is your debugging checklist:
1. Check the full journal output:
   journalctl -u myapp.service -n 100 --no-pager
2. Run ExecStart manually to see errors directly:
   systemctl cat myapp.service | grep ExecStart
   # Then run that command as the same user:
   sudo -u appuser /opt/myapp/server
3. Check that the binary exists and is executable:
   ls -la /opt/myapp/server
   file /opt/myapp/server
4. Check that the user has correct permissions:
   sudo -u appuser ls -la /opt/myapp/
   sudo -u appuser cat /opt/myapp/config.yaml
5. Check environment variables:
   systemctl show myapp.service --property=Environment
6. Temporarily stop the restart loop to debug:
   sudo systemctl stop myapp.service
   # Now you can investigate without it restarting
Where Unit Files Live
+----------------------------------------------------------------+
| /usr/lib/systemd/system/   <- Vendor/package-provided units    |
|                               (do NOT edit these directly)     |
|                                                                |
| /etc/systemd/system/       <- Admin-created units (your stuff) |
|                               (this is where you create them)  |
|                                                                |
| /run/systemd/system/       <- Runtime-only units               |
|                               (disappear on reboot)            |
+----------------------------------------------------------------+
Priority: /etc > /run > /usr/lib
To override a vendor-provided unit without editing it directly:
# Create an override directory
sudo systemctl edit nginx.service
# This opens an editor for /etc/systemd/system/nginx.service.d/override.conf
Or manually:
sudo mkdir -p /etc/systemd/system/nginx.service.d/
sudo tee /etc/systemd/system/nginx.service.d/override.conf << 'OVERRIDE'
[Service]
LimitNOFILE=65535
OVERRIDE
sudo systemctl daemon-reload
sudo systemctl restart nginx
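After adding an override, it is worth confirming what systemd actually merged. Two read-only commands help here:

```shell
# Show the unit file plus every drop-in, in merge order
systemctl cat nginx.service

# List every unit on the system that has drop-in extensions
systemd-delta --type=extended
```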
What Just Happened?
+------------------------------------------------------------------+
|                         CHAPTER 16 RECAP                         |
+------------------------------------------------------------------+
|                                                                  |
| - Unit files have three sections: [Unit], [Service], [Install]   |
| - ExecStart= must use absolute paths                             |
| - Type= controls how systemd tracks your process:                |
|     simple, forking, oneshot, notify                             |
| - Restart=on-failure with RestartSec= for automatic recovery     |
| - After= controls order; Requires= controls dependency           |
| - Use both together for proper dependency management             |
| - systemd timers replace cron with OnCalendar= schedules         |
| - Socket activation starts services on-demand                    |
| - Put custom units in /etc/systemd/system/                       |
| - Always run daemon-reload after editing unit files              |
| - Security hardening: User=, ProtectSystem=, PrivateTmp=         |
|                                                                  |
+------------------------------------------------------------------+
Try This
Exercise 1: Write a One-Shot Service
Write a Type=oneshot service that creates a /tmp/system-booted file containing the
current timestamp. Enable it so it runs at boot.
Exercise 2: Build a Timer
Create a systemd timer that runs a script every 15 minutes. The script should append
the current date and system load average (uptime) to a log file. Verify the timer
with systemctl list-timers.
Exercise 3: Dependency Chain
Create three services: service-a, service-b, and service-c. Configure them so
that service-c requires service-b, which requires service-a. Verify the ordering:
sudo systemctl start service-c.service
# Should automatically start a and b first
Exercise 4: Crash Recovery
Create a service that intentionally exits with an error after 5 seconds. Set up
Restart=on-failure with RestartSec=3. Watch it restart using journalctl -f.
Then add StartLimitBurst=3 and StartLimitIntervalSec=60 and observe what happens
after the third failure.
Bonus Challenge
Convert one of your existing cron jobs to a systemd timer. Compare the two approaches. Which gives you better logging? Which is easier to debug when something goes wrong?