HAProxy & Advanced Load Balancing
Why This Matters
Nginx and Apache can both load balance, but they are web servers that happen to do load balancing. HAProxy is a dedicated load balancer and reverse proxy -- that is all it does, and it does it extraordinarily well.
HAProxy powers some of the highest-traffic sites on the internet: GitHub, Reddit, Stack Overflow, Tumblr, and Twitter have all relied on it. It handles millions of concurrent connections, provides fine-grained health checking, supports advanced traffic routing with ACLs, offers real-time stats, and is battle-tested in ways that few other pieces of software can claim.
If your infrastructure grows beyond a single Nginx instance handling everything, or if you need Layer 4 (TCP) load balancing, advanced health checks, or sophisticated traffic routing, HAProxy is the tool you reach for.
Try This Right Now
# Install HAProxy
$ sudo apt update && sudo apt install -y haproxy
# Check the version
$ haproxy -v
HAProxy version 2.8.5 ...
# Check if it is running
$ sudo systemctl status haproxy
Distro Note: On RHEL/CentOS/Fedora:
$ sudo dnf install -y haproxy
$ sudo systemctl enable --now haproxy
On Arch:
$ sudo pacman -S haproxy
What Makes HAProxy Different
┌──────────────────────────────────────────────────────────────┐
│ Web Server vs Dedicated Load Balancer │
│ │
│ Nginx / Apache: │
│ ┌─────────────────────────────────────────┐ │
│ │ Static files │ Reverse proxy │ LB │ │
│ │ CGI/FastCGI │ URL rewriting │ │ │
│ │ Compression │ TLS termination│ │ │
│ └─────────────────────────────────────────┘ │
│ Load balancing is ONE feature among many. │
│ │
│ HAProxy: │
│ ┌─────────────────────────────────────────┐ │
│ │ LOAD BALANCING │ │
│ │ Layer 4 (TCP) │ Layer 7 (HTTP) │ │
│ │ Health checks │ ACL routing │ │
│ │ Stick tables │ Rate limiting │ │
│ │ Stats dashboard│ Connection queuing │ │
│ │ TLS termination│ Header manipulation │ │
│ └─────────────────────────────────────────┘ │
│ Load balancing is THE ONLY job. Depth over breadth. │
└──────────────────────────────────────────────────────────────┘
Key advantages of HAProxy:
- Active health checks -- probes backends independently (not just on user request failures)
- Layer 4 and Layer 7 -- can load balance any TCP protocol, not just HTTP
- Connection queuing -- when all backends are at capacity, requests are queued rather than rejected
- Stick tables -- track per-client state (request rates, connections) for advanced traffic management
- Stats dashboard -- real-time visibility into backend health, request rates, and response times
- Zero-downtime reloads -- seamless configuration changes with no dropped connections
HAProxy Configuration Structure
HAProxy's configuration file is typically at /etc/haproxy/haproxy.cfg. It is divided into four main section types (a fifth, listen, combines a frontend and backend in a single block):
┌──────────────────────────────────────────────────────────────┐
│ HAProxy Configuration Sections │
│ │
│ ┌─────────────────────┐ │
│ │ global │ Process-wide settings │
│ │ (security, tuning) │ (user, chroot, logging) │
│ └─────────────────────┘ │
│ │ │
│ ┌─────────────────────┐ │
│ │ defaults │ Default settings inherited by all │
│ │ (timeouts, modes) │ frontends and backends │
│ └─────────────────────┘ │
│ │ │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ frontend │──>│ backend │ │
│ │ (listening side) │ │ (server pool) │ │
│ │ (binds to ports) │ │ (health checks) │ │
│ │ (ACL routing) │ │ (load balancing) │ │
│ └─────────────────────┘ └─────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
A Complete Configuration Walkthrough
$ sudo cat /etc/haproxy/haproxy.cfg
Here is a well-annotated production configuration:
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
log /dev/log local0 # Log to syslog
log /dev/log local1 notice
chroot /var/lib/haproxy # Chroot for security
stats socket /run/haproxy/admin.sock mode 660 level admin
stats timeout 30s
user haproxy # Run as unprivileged user
group haproxy
daemon # Run in background
# TLS tuning
ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256
ssl-default-bind-options ssl-min-ver TLSv1.2
#---------------------------------------------------------------------
# Default settings (inherited by all frontends/backends)
#---------------------------------------------------------------------
defaults
log global
mode http # Layer 7 mode (use "tcp" for Layer 4)
option httplog # Log full HTTP requests
option dontlognull # Don't log health check probes
timeout connect 5s # Time to connect to backend
timeout client 30s # Time to wait for client data
timeout server 30s # Time to wait for backend response
retries 3 # Retry on connection failure
errorfile 400 /etc/haproxy/errors/400.http
errorfile 403 /etc/haproxy/errors/403.http
errorfile 500 /etc/haproxy/errors/500.http
errorfile 502 /etc/haproxy/errors/502.http
errorfile 503 /etc/haproxy/errors/503.http
errorfile 504 /etc/haproxy/errors/504.http
#---------------------------------------------------------------------
# Frontend: accepts incoming connections
#---------------------------------------------------------------------
frontend http_front
bind *:80 # Listen on port 80 on all interfaces
default_backend web_servers # Where to send traffic by default
#---------------------------------------------------------------------
# Backend: pool of servers
#---------------------------------------------------------------------
backend web_servers
balance roundrobin # Load balancing algorithm
server web1 10.0.1.10:8080 check # "check" enables health checking
server web2 10.0.1.11:8080 check
server web3 10.0.1.12:8080 check
Hands-On: Your First HAProxy Setup
# Start three backends
$ for port in 8001 8002 8003; do
dir="/tmp/habackend${port}"
mkdir -p "$dir"
echo "Response from backend ${port}" > "$dir/index.html"
cd "$dir" && python3 -m http.server "$port" &
done
# Configure HAProxy
$ sudo tee /etc/haproxy/haproxy.cfg > /dev/null << 'EOF'
global
log /dev/log local0
chroot /var/lib/haproxy
user haproxy
group haproxy
daemon
defaults
log global
mode http
option httplog
timeout connect 5s
timeout client 30s
timeout server 30s
frontend web_frontend
bind *:80
default_backend web_backend
backend web_backend
balance roundrobin
server backend1 127.0.0.1:8001 check
server backend2 127.0.0.1:8002 check
server backend3 127.0.0.1:8003 check
EOF
# Validate the configuration
$ sudo haproxy -c -f /etc/haproxy/haproxy.cfg
Configuration file is valid
# Start HAProxy
$ sudo systemctl restart haproxy
# Test round-robin
$ for i in $(seq 1 6); do curl -s http://localhost; done
Response from backend 8001
Response from backend 8002
Response from backend 8003
Response from backend 8001
Response from backend 8002
Response from backend 8003
Layer 4 vs Layer 7 Load Balancing
This is one of HAProxy's major differentiators.
Layer 4 (TCP Mode)
HAProxy forwards raw TCP connections. It does not inspect HTTP content -- it just passes bytes between client and backend. Use this for databases, mail servers, or any non-HTTP protocol.
defaults
mode tcp # Layer 4 mode
frontend mysql_front
bind *:3306
default_backend mysql_servers
backend mysql_servers
balance roundrobin
server db1 10.0.1.20:3306 check
server db2 10.0.1.21:3306 check
Layer 7 (HTTP Mode)
HAProxy understands HTTP. It can inspect headers, URLs, cookies, and make routing decisions based on content.
defaults
mode http # Layer 7 mode
frontend http_front
bind *:80
# Route based on URL path
acl is_api path_beg /api
acl is_static path_end .css .js .png .jpg
use_backend api_servers if is_api
use_backend static_servers if is_static
default_backend web_servers
Comparison
┌──────────────────────────────────────────────────────────────┐
│ Layer 4 vs Layer 7 │
├──────────────────┬───────────────────────────────────────────┤
│ │ Layer 4 (TCP) │ Layer 7 (HTTP) │
├──────────────────┼─────────────────────┼─────────────────────┤
│ Inspects content │ No │ Yes (headers, URLs)│
│ Protocols │ Any TCP │ HTTP/HTTPS only │
│ Routing options │ IP, port │ URL, header, cookie│
│ Performance │ Faster (no parsing)│ Slightly slower │
│ Use cases │ DB, mail, SSH │ Web apps, APIs │
│ SSL termination │ Passthrough or term│ Full termination │
└──────────────────┴─────────────────────┴─────────────────────┘
Think About It: You need to load-balance PostgreSQL connections across three database replicas. Should you use Layer 4 or Layer 7? Why? (Answer: Layer 4, because PostgreSQL does not speak HTTP. HAProxy only needs to forward the raw TCP connection.)
ACLs: Intelligent Traffic Routing
Access Control Lists (ACLs) let you define conditions and route traffic based on them. This is one of HAProxy's most powerful features.
frontend http_front
bind *:80
# Define ACLs
acl is_api path_beg /api/
acl is_admin path_beg /admin/
acl is_websocket hdr(Upgrade) -i websocket
acl is_post method POST
acl from_internal src 10.0.0.0/8 192.168.0.0/16
acl is_mobile hdr_sub(User-Agent) -i mobile android iphone
# Route based on ACLs
use_backend api_servers if is_api
use_backend admin_servers if is_admin from_internal # AND logic
use_backend ws_servers if is_websocket
use_backend upload_servers if is_api is_post # API POST -> upload servers
default_backend web_servers
ACL Matching Functions
| Function | What It Matches | Example |
|---|---|---|
| path_beg | URL path starts with | path_beg /api/ |
| path_end | URL path ends with | path_end .php |
| path | Exact URL path match | path /health |
| hdr | HTTP header exact match | hdr(Host) -i api.example.com |
| hdr_beg | HTTP header starts with | hdr_beg(Host) -i api. |
| hdr_sub | HTTP header contains substring | hdr_sub(User-Agent) -i curl |
| src | Source IP address/range | src 10.0.0.0/8 |
| method | HTTP method | method POST |
| ssl_fc | Connection is over SSL | ssl_fc |
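To build intuition for the two path matchers: path_beg and path_end are plain prefix/suffix comparisons on the request path. Here is a small shell mimic (an illustration only -- HAProxy's matchers also support case-insensitive matching with -i, which this sketch skips):

```shell
# Shell stand-ins for HAProxy's path_beg / path_end matchers
path_beg() { case "$2" in "$1"*) return 0 ;; *) return 1 ;; esac; }
path_end() { case "$2" in *"$1") return 0 ;; *) return 1 ;; esac; }

path_beg /api/ /api/users       && echo "is_api matches /api/users"
path_end .css  /assets/site.css && echo "is_static matches /assets/site.css"
path_beg /api/ /admin/users     || echo "is_api does NOT match /admin/users"
```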
Hands-On: Content-Based Routing
frontend http_front
bind *:80
# Different backends for different domains
acl host_blog hdr(Host) -i blog.example.com
acl host_api hdr(Host) -i api.example.com
acl host_shop hdr(Host) -i shop.example.com
use_backend blog_backend if host_blog
use_backend api_backend if host_api
use_backend shop_backend if host_shop
default_backend web_backend # catch-all pool (definition not shown)
backend blog_backend
balance roundrobin
server blog1 10.0.1.10:8080 check
server blog2 10.0.1.11:8080 check
backend api_backend
balance leastconn
server api1 10.0.2.10:3000 check
server api2 10.0.2.11:3000 check
server api3 10.0.2.12:3000 check
backend shop_backend
balance source
server shop1 10.0.3.10:8080 check
server shop2 10.0.3.11:8080 check
Health Checks
HAProxy's health checking is far more sophisticated than open-source Nginx's passive checks (active checks require Nginx Plus).
TCP Health Check (Default)
When you add check to a server line, HAProxy opens a TCP connection to verify the backend is alive:
server web1 10.0.1.10:8080 check inter 5s fall 3 rise 2
- inter 5s -- check every 5 seconds
- fall 3 -- mark as down after 3 consecutive failures
- rise 2 -- mark as up after 2 consecutive successes
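You can reproduce what a bare TCP check does by hand: try to open a TCP connection and see if it succeeds. A sketch using bash's /dev/tcp, self-contained with a throwaway local listener so it runs anywhere (port 8095 is arbitrary):

```shell
# Start a throwaway listener, then probe it the way a TCP check would.
python3 -m http.server 8095 >/dev/null 2>&1 &
srv=$!
sleep 1
if (exec 3<>/dev/tcp/127.0.0.1/8095) 2>/dev/null; then
  echo "port 8095: UP"      # HAProxy would count this as a passing check
else
  echo "port 8095: DOWN"
fi
kill $srv
```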
HTTP Health Check
Verify the backend returns a proper HTTP response:
backend web_servers
option httpchk GET /health HTTP/1.1\r\nHost:\ localhost
http-check expect status 200
server web1 10.0.1.10:8080 check
server web2 10.0.1.11:8080 check
This sends an actual HTTP GET to /health and expects a 200 status code. If the backend returns 500 (because its database is down, for example), HAProxy removes it from the pool.
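Manually sending the same request httpchk sends is the quickest way to debug a flapping server. A self-contained sketch (throwaway backend with a /health file; swap in the real backend's address and port):

```shell
# Stand-in backend with a /health endpoint (a file served by http.server)
mkdir -p /tmp/hc-demo
echo OK > /tmp/hc-demo/health
python3 -m http.server 8096 --directory /tmp/hc-demo >/dev/null 2>&1 &
srv=$!
sleep 1
# The probe itself: same GET /health, check only the status code
code=$(curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:8096/health)
echo "GET /health -> $code"   # 200 keeps the server in the pool
kill $srv
```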
Advanced Health Check
backend api_servers
option httpchk
http-check send meth GET uri /health ver HTTP/1.1 hdr Host localhost
http-check expect status 200
http-check expect string "OK" # Also verify response body
server api1 10.0.2.10:3000 check inter 3s fall 2 rise 3
server api2 10.0.2.11:3000 check inter 3s fall 2 rise 3
Server States
┌──────────────────────────────────────────────────────────────┐
│ HAProxy Server States │
│ │
│ UP ──(health check fails)──> check fail count increases │
│ ──(fall threshold)──> DOWN │
│ │
│ DOWN ──(health check succeeds)──> check pass count increases│
│ ──(rise threshold)──> UP │
│ │
│ Additional states: │
│ - MAINT: manually put into maintenance │
│ - DRAIN: accepting no new connections, finishing existing │
│ - NOLB: not in load balancing pool but still checked │
└──────────────────────────────────────────────────────────────┘
Stick Tables
Stick tables let HAProxy track per-client state. This is powerful for rate limiting, session persistence, and abuse detection -- all without external tools.
Session Persistence (Sticky Sessions)
backend app_servers
balance roundrobin
stick-table type ip size 200k expire 30m
stick on src
server app1 10.0.1.10:8080 check
server app2 10.0.1.11:8080 check
This creates a table of client IPs. When a client first connects, they are assigned a backend server. Subsequent connections from the same IP go to the same server for 30 minutes.
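The core idea behind stick on src is that a deterministic function of the client key always lands on the same server. A toy illustration (NOT HAProxy's real algorithm -- it uses a proper hash plus the stick table itself -- this just shows the determinism):

```shell
# Toy "hash" of a source IP onto a 3-server pool (illustration only)
pick_server() {
  local ip="$1" sum=0 octet
  for octet in ${ip//./ }; do        # split the IP on dots
    sum=$((sum + octet))
  done
  echo "server$(( sum % 3 + 1 ))"    # 3 servers in the pool
}
pick_server 203.0.113.7              # same IP ...
pick_server 203.0.113.7              # ... same server, every time
pick_server 198.51.100.9             # a different IP may land elsewhere
```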
Rate Limiting with Stick Tables
frontend http_front
bind *:80
# Track request rate per client IP
stick-table type ip size 100k expire 30s store http_req_rate(10s)
# Count requests
http-request track-sc0 src
# Deny if more than 100 requests in 10 seconds
http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }
default_backend web_servers
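A quick harness for watching the limiter trip: hammer a URL and tally the status codes. Pointed at the frontend above you would see 200s give way to 429s once the rate exceeds the threshold. (Shown self-contained against a throwaway backend that always answers 200, so the tally mechanics can be run anywhere; the URL and request count are arbitrary.)

```shell
# Throwaway backend standing in for the HAProxy frontend
python3 -m http.server 8097 >/dev/null 2>&1 &
srv=$!
sleep 1
# Fire rapid requests and count each status code seen
for i in $(seq 1 20); do
  curl -s -o /dev/null -w '%{http_code}\n' http://127.0.0.1:8097/
done | sort | uniq -c
kill $srv
```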
Connection Limiting
frontend http_front
bind *:80
stick-table type ip size 100k expire 30s store conn_cur
http-request track-sc0 src
http-request deny deny_status 429 if { sc_conn_cur(0) gt 50 }
default_backend web_servers
Think About It: How are HAProxy's stick tables different from Nginx's limit_req for rate limiting? (Answer: Stick tables are far more flexible. They can track any counter -- connection rate, request rate, bytes transferred, error rates -- and combine multiple conditions. They can also be synchronized between HAProxy nodes in a cluster.)
SSL/TLS Termination
HAProxy can terminate TLS and forward plain HTTP to backends:
frontend https_front
bind *:443 ssl crt /etc/haproxy/certs/example.com.pem
bind *:80
# Redirect HTTP to HTTPS
http-request redirect scheme https unless { ssl_fc }
# Forward the protocol info to backends
http-request set-header X-Forwarded-Proto https if { ssl_fc }
http-request set-header X-Forwarded-Proto http unless { ssl_fc }
default_backend web_servers
backend web_servers
balance roundrobin
server web1 10.0.1.10:8080 check
server web2 10.0.1.11:8080 check
The certificate file must contain the certificate and private key concatenated:
# Combine cert and key into one file (HAProxy's required format)
$ sudo cat /etc/letsencrypt/live/example.com/fullchain.pem \
/etc/letsencrypt/live/example.com/privkey.pem \
| sudo tee /etc/haproxy/certs/example.com.pem
$ sudo chmod 600 /etc/haproxy/certs/example.com.pem
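Since certificates get renewed, the combine step is worth wrapping in a small script (for example as a certbot deploy hook). A sketch -- the function is generic, and it is demonstrated against stand-in files so it can be run anywhere; the real paths are the ones shown above:

```shell
# Concatenate cert + key into HAProxy's combined PEM and lock down perms
combine_pem() {
  cat "$1" "$2" > "$3" && chmod 600 "$3"
}

# Stand-in cert/key files for demonstration
printf -- '-----CERT-----\n' > /tmp/demo-fullchain.pem
printf -- '-----KEY-----\n'  > /tmp/demo-privkey.pem

combine_pem /tmp/demo-fullchain.pem /tmp/demo-privkey.pem /tmp/demo-combined.pem
cat /tmp/demo-combined.pem    # cert first, then key, as HAProxy expects
```

In a real deploy hook you would follow this with a config check and `systemctl reload haproxy`.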
SNI-Based Routing (Multiple Domains on One IP)
frontend https_front
bind *:443 ssl crt /etc/haproxy/certs/
# Route based on the TLS SNI hostname
use_backend blog_servers if { ssl_fc_sni blog.example.com }
use_backend api_servers if { ssl_fc_sni api.example.com }
default_backend web_servers
When crt points to a directory, HAProxy loads all .pem files in it and selects the right certificate based on the SNI hostname.
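To verify which certificate the server actually presents for a given SNI name, `openssl s_client` with `-servername` is the standard tool. The live command is shown as a comment (hostname is a placeholder); the x509 inspection half is demonstrated against a freshly generated self-signed certificate so it runs anywhere:

```shell
# Against a live server:
#   openssl s_client -connect your-server:443 -servername blog.example.com \
#     </dev/null 2>/dev/null | openssl x509 -noout -subject

# Self-contained demo of the inspection step:
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=blog.example.com" \
  -keyout /tmp/sni-demo.key -out /tmp/sni-demo.crt 2>/dev/null
openssl x509 -noout -subject -in /tmp/sni-demo.crt
```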
The Stats Dashboard
HAProxy comes with a built-in statistics dashboard that provides real-time visibility into your entire load balancing setup:
listen stats
bind *:8404
stats enable
stats uri /stats
stats refresh 10s
stats auth admin:secretpassword
stats admin if LOCALHOST # Allow admin actions from localhost
$ sudo systemctl reload haproxy
# View stats in terminal
$ curl -u admin:secretpassword http://localhost:8404/stats
# Or open in a browser: http://your-server:8404/stats
The dashboard shows:
- Frontend -- incoming connection rates, bytes in/out
- Backend -- each server's status (UP/DOWN), current connections, request rate, response times, error counts
- Server details -- health check status, weight, last status change
- Session rates -- current, maximum, and limit
Safety Warning: The stats page exposes sensitive information about your infrastructure. Always protect it with authentication and restrict access by IP. Never expose it to the public internet.
Hands-On: Enable and Explore Stats
# Add stats to your configuration
$ sudo tee -a /etc/haproxy/haproxy.cfg > /dev/null << 'EOF'
listen stats
bind *:8404
stats enable
stats uri /stats
stats refresh 5s
stats auth admin:haproxy123
EOF
$ sudo haproxy -c -f /etc/haproxy/haproxy.cfg
$ sudo systemctl reload haproxy
# Access the stats
$ curl -u admin:haproxy123 http://localhost:8404/stats
# (Better viewed in a web browser for the HTML dashboard)
# You can also get stats in CSV format
$ curl -s -u admin:haproxy123 "http://localhost:8404/stats;csv"
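The CSV output is easy to script against, but field positions shift between HAProxy versions, so it is safer to locate columns by name from the header row. A sketch shown against a trimmed sample so it runs without a live HAProxy; in real use, replace the heredoc with the curl command above:

```shell
# Print each server's status, finding the "status" column from the header
cat << 'EOF' | awk -F, '
NR==1 { for (i = 1; i <= NF; i++) if ($i == "status") s = i; next }
      { print $1 "/" $2 ": " $s }'
# pxname,svname,qcur,scur,status,weight
web_backend,backend1,0,2,UP,100
web_backend,backend2,0,0,DOWN,100
web_backend,backend3,0,1,UP,100
EOF
```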
HAProxy vs Nginx for Load Balancing
┌──────────────────────────────────────────────────────────────┐
│ HAProxy vs Nginx for Load Balancing │
├──────────────────┬───────────────────┬───────────────────────┤
│ Feature │ HAProxy │ Nginx (open source) │
├──────────────────┼───────────────────┼───────────────────────┤
│ Active health │ Yes (built-in) │ No (passive only; │
│ checks │ │ active in Nginx Plus) │
│ Layer 4 (TCP) │ Full support │ Supported │
│ Stats dashboard │ Built-in, rich │ Basic stub_status │
│ Stick tables │ Yes │ No │
│ Connection queue │ Yes (with limits) │ No (rejects excess) │
│ Zero-downtime │ Yes (seamless) │ Yes (reload) │
│ reload │ │ │
│ Serve static │ No │ Yes (excellent) │
│ files │ │ │
│ ACL routing │ Very powerful │ Via location/map │
│ Rate limiting │ Via stick tables │ Via limit_req module │
│ Config │ Single flat file │ Hierarchical includes │
│ Ecosystem │ LB only │ LB + web server + more│
└──────────────────┴───────────────────┴───────────────────────┘
In practice, many production architectures use both:
Internet → HAProxy (Layer 4/7 LB, global routing, stats)
→ Nginx (TLS termination, static files, caching)
→ Application backends
High Availability Concepts
A single HAProxy instance is a single point of failure. Production environments need redundancy.
Active-Passive with Keepalived
The most common HA pattern uses Keepalived with a Virtual IP (VIP):
┌──────────────────────────────────────────────────────────────┐
│ High Availability with Keepalived │
│ │
│ ┌─────────────────┐ │
│ │ Virtual IP │ │
│ │ 192.168.1.100 │ │
│ └────────┬────────┘ │
│ │ │
│ ┌────────────┼────────────┐ │
│ │ │ │ │
│ ┌─────┴─────┐ ┌───┴───────┐ │
│ │ HAProxy 1 │ │ HAProxy 2 │ │
│ │ (MASTER) │ │ (BACKUP) │ │
│ │ .1.101 │ │ .1.102 │ │
│ └────────────┘ └────────────┘ │
│ │
│ - The VIP floats to whichever node is MASTER │
│ - If the master fails, keepalived moves the VIP to backup │
│ - Clients always connect to the VIP, never directly │
│ - Failover happens in seconds │
└──────────────────────────────────────────────────────────────┘
# Install keepalived
$ sudo apt install -y keepalived
# /etc/keepalived/keepalived.conf on HAProxy node 1 (MASTER)
$ sudo tee /etc/keepalived/keepalived.conf > /dev/null << 'EOF'
vrrp_script check_haproxy {
script "/usr/bin/killall -0 haproxy"
interval 2
weight 2
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 101
advert_int 1
authentication {
auth_type PASS
auth_pass mysecret
}
virtual_ipaddress {
192.168.1.100/24
}
track_script {
check_haproxy
}
}
EOF
On the backup node, change state MASTER to state BACKUP and priority 101 to priority 100.
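During a failover drill it helps to see which node currently holds the VIP: it appears as an extra (secondary) address on the MASTER's interface. The live command is shown as a comment (eth0 and 192.168.1.100 as configured above); the parsing step is demonstrated against canned output so it runs anywhere:

```shell
# On each node, live:
#   ip -4 addr show dev eth0
# Extract the addresses from (canned) output:
cat << 'EOF' | awk '/inet / {print $2}'
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    inet 192.168.1.101/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.100/24 scope global secondary eth0
EOF
# If 192.168.1.100/24 appears, this node currently holds the VIP.
```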
Practical: Complete Production Configuration
Here is a comprehensive HAProxy configuration that ties all the concepts together:
global
log /dev/log local0
log /dev/log local1 notice
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin
user haproxy
group haproxy
daemon
maxconn 50000
ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256
ssl-default-bind-options ssl-min-ver TLSv1.2
defaults
log global
mode http
option httplog
option dontlognull
option forwardfor # Add X-Forwarded-For header
timeout connect 5s
timeout client 30s
timeout server 30s
timeout http-keep-alive 10s
timeout check 5s
retries 3
default-server inter 3s fall 3 rise 2
#---------------------------------------------------------------------
# Stats dashboard
#---------------------------------------------------------------------
listen stats
bind *:8404
stats enable
stats uri /stats
stats refresh 10s
stats auth admin:secretpassword
#---------------------------------------------------------------------
# HTTP frontend (redirect to HTTPS)
#---------------------------------------------------------------------
frontend http_front
bind *:80
http-request redirect scheme https code 301 unless { ssl_fc }
#---------------------------------------------------------------------
# HTTPS frontend (main entry point)
#---------------------------------------------------------------------
frontend https_front
bind *:443 ssl crt /etc/haproxy/certs/
# Security headers
http-response set-header X-Frame-Options SAMEORIGIN
http-response set-header X-Content-Type-Options nosniff
http-response set-header Strict-Transport-Security max-age=31536000
# Rate limiting
stick-table type ip size 100k expire 30s store http_req_rate(10s)
http-request track-sc0 src
http-request deny deny_status 429 if { sc_http_req_rate(0) gt 200 }
# ACL-based routing
acl is_api path_beg /api/
acl is_websocket hdr(Upgrade) -i websocket
acl is_admin path_beg /admin/
acl from_office src 203.0.113.0/24
# Routing rules
use_backend api_backend if is_api
use_backend ws_backend if is_websocket
use_backend admin_backend if is_admin from_office
default_backend web_backend
#---------------------------------------------------------------------
# Backends
#---------------------------------------------------------------------
backend web_backend
balance roundrobin
option httpchk GET /health
http-check expect status 200
server web1 10.0.1.10:8080 check weight 100
server web2 10.0.1.11:8080 check weight 100
server web3 10.0.1.12:8080 check weight 50 # Less powerful server
backend api_backend
balance leastconn
option httpchk GET /api/health
http-check expect status 200
timeout server 60s # APIs may need longer timeout
server api1 10.0.2.10:3000 check
server api2 10.0.2.11:3000 check
server api3 10.0.2.12:3000 check
backend ws_backend
balance source # Sticky for WebSocket
timeout tunnel 3600s # Long timeout for WebSocket
server ws1 10.0.3.10:3001 check
server ws2 10.0.3.11:3001 check
backend admin_backend
balance roundrobin
server admin1 10.0.4.10:8080 check
Debug This
Users report that some requests are taking 30 seconds before returning a 504 Gateway Timeout. The stats dashboard shows all backends as "UP."
# Step 1: Check the stats page for response time data
# Look at the "Avg" and "Max" time columns in the backend section
$ curl -s -u admin:pass "http://localhost:8404/stats;csv" | \
awk -F, '/web_backend/ {print $2, "rtime_avg="$62, "rtime_max="$63}'
# (field positions vary by HAProxy version -- verify against the CSV header row)
# Step 2: Check HAProxy logs for slow requests
$ sudo journalctl -u haproxy --since "10 minutes ago" | grep "30000"
# (30000ms = 30 seconds = timeout)
# Step 3: Is it a specific backend server?
# Check per-server stats in the dashboard
# Step 4: Check the backend directly (bypass HAProxy)
$ curl -o /dev/null -s -w "Total: %{time_total}s\n" http://10.0.1.10:8080/slow-endpoint
# Step 5: Check if the timeout is configured correctly
$ grep "timeout server" /etc/haproxy/haproxy.cfg
timeout server 30s
# 30s timeout matches the 30-second delay!
Common causes:
- Backend application is slow on certain endpoints (increase timeout server for that backend)
- Backend server is overloaded (check CPU/memory on the backend)
- Connection pool exhaustion (check maxconn on server lines)
- DNS resolution issues (HAProxy resolves at startup; backends changed IP?)
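Step 4 above (timing the backend directly) generalizes into a small loop. A sketch, self-contained with a throwaway local backend so it can be run as-is; point base_url and the path list at the real backend and its suspect endpoints instead:

```shell
# Time several paths against a backend, client-side
python3 -m http.server 8098 >/dev/null 2>&1 &
srv=$!
sleep 1
base_url="http://127.0.0.1:8098"   # swap in e.g. http://10.0.1.10:8080
for path in / /index.html; do      # swap in the suspect endpoints
  t=$(curl -o /dev/null -s -w '%{time_total}' "$base_url$path")
  echo "$path -> ${t}s"
done
kill $srv
```

Anything approaching 30s here points at the application, not HAProxy.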
Configuration Validation and Reload
# Always validate before reloading
$ sudo haproxy -c -f /etc/haproxy/haproxy.cfg
Configuration file is valid
# Reload (zero downtime -- new process takes over from old)
$ sudo systemctl reload haproxy
# If reload fails, check the journal
$ sudo journalctl -u haproxy --since "1 minute ago"
HAProxy's reload is seamless: it starts a new process, transfers listening sockets to it, and the old process finishes existing connections before exiting. No connections are dropped.
What Just Happened?
┌──────────────────────────────────────────────────────────────┐
│ Chapter 47 Recap │
├──────────────────────────────────────────────────────────────┤
│ │
│ HAProxy is a dedicated load balancer and reverse proxy. │
│ │
│ Configuration sections: │
│ - global: process-wide settings │
│ - defaults: inherited defaults │
│ - frontend: where connections come in (binds to ports) │
│ - backend: where connections go (server pools) │
│ │
│ Key capabilities: │
│ - Layer 4 (TCP) and Layer 7 (HTTP) load balancing │
│ - ACLs for content-based routing │
│ - Active health checks (probes backends independently) │
│ - Stick tables for rate limiting and session persistence │
│ - Built-in stats dashboard │
│ - Zero-downtime reloads │
│ │
│ For high availability: pair with Keepalived + Virtual IP │
│ │
│ HAProxy excels at load balancing. Nginx excels at web │
│ serving. They complement each other beautifully. │
│ │
└──────────────────────────────────────────────────────────────┘
Try This
Exercise 1: Multi-Backend Routing
Configure HAProxy to route traffic based on URL path: /api/ goes to one backend pool, /static/ goes to another, and everything else goes to a default backend. Verify with curl.
Exercise 2: Health Checks in Action
Set up HAProxy with three backends and HTTP health checks. Kill one backend and observe:
- The stats dashboard shows it going DOWN
- Requests are no longer sent to it
- Start it back up and watch it return to UP
# Kill backend 2
$ kill $(lsof -ti:8002)
# Check stats: backend2 should show DOWN after a few seconds
# Restart it
$ cd /tmp/habackend8002 && python3 -m http.server 8002 &
# Check stats: backend2 should return to UP
Exercise 3: Rate Limiting
Implement rate limiting with stick tables (max 10 requests per second). Use a bash loop to send rapid requests and observe the 429 responses.
Exercise 4: Stats Dashboard
Enable the stats dashboard and explore it. Identify: current session rate, backend health status, response time averages, and error counts.
Bonus Challenge
Set up a complete HA architecture:
- Two HAProxy instances with Keepalived (MASTER/BACKUP) sharing a VIP
- Three web backends behind them
- HTTP health checks
- Stats dashboard on each HAProxy node
- Test failover by stopping HAProxy on the master node
This is the exact architecture used in many production environments.
What Comes Next
With web servers and load balancing mastered, you have the skills to build and manage production web infrastructure. The next section of the book moves into storage, backup, and disaster recovery -- because all that web traffic needs data, and data needs protection.