Disk Management in Production
Why This Matters
It is 2 AM and a monitoring alert wakes you: the database server's /var/lib/mysql partition is 95% full. The database will stop accepting writes within hours. If this were a traditional fixed partition, you would be looking at downtime -- adding a new disk, copying data, resizing partitions, and praying nothing goes wrong.
But this server uses LVM. You attach a new disk, extend the volume group, grow the logical volume, and resize the filesystem -- all without unmounting, all without downtime. The database never notices.
On another server, one of three disks in a RAID array has failed. The array is still serving data because RAID provides redundancy. You hot-swap the failed disk, add the replacement, and the array rebuilds itself while the application keeps running.
This is what production disk management looks like. LVM gives you flexibility. RAID gives you resilience. Together, they are the foundation of reliable storage in any serious Linux environment. This chapter teaches you both.
Try This Right Now
Check your current disk and partition layout:
$ lsblk
$ df -hT
$ cat /proc/mdstat # shows RAID status (empty if no RAID)
$ sudo lvs 2>/dev/null # shows logical volumes (empty if no LVM)
$ sudo pvs 2>/dev/null # shows physical volumes
$ sudo vgs 2>/dev/null # shows volume groups
If these LVM commands return nothing, you may not have LVM set up yet -- which is exactly what we are about to learn.
LVM: Logical Volume Management
The Problem LVM Solves
Traditional partitioning is rigid. When you create a 50 GB partition for /home, that is all you get. If you need more space, you face unpleasant choices: resize the partition in place (risky), migrate everything to a bigger disk, or mount a new disk at a separate path and split your data across mount points.
LVM adds a layer of abstraction between your physical disks and your filesystems. This abstraction gives you the ability to:
- Resize volumes while they are mounted and in use
- Span a single volume across multiple physical disks
- Take snapshots of volumes for backups
- Move data between physical disks without downtime
The Three Layers of LVM
LVM has three layers, and understanding them is essential:
┌───────────────────────────────────────────────────────┐
│ FILESYSTEMS │
│ /home /var /data │
├───────────────────────────────────────────────────────┤
│ LOGICAL VOLUMES (LV) │
│ lv_home lv_var lv_data │
│ (These are what you format and mount) │
├───────────────────────────────────────────────────────┤
│ VOLUME GROUP (VG) │
│ vg_main │
│ (A pool of storage from one or more PVs) │
├───────────────────────────────────────────────────────┤
│ PHYSICAL VOLUMES (PV) │
│ /dev/sdb1 /dev/sdc1 │
│ (Actual disk partitions or whole disks) │
├───────────────────────────────────────────────────────┤
│ PHYSICAL DISKS │
│ /dev/sdb /dev/sdc │
└───────────────────────────────────────────────────────┘
Physical Volume (PV): A disk or partition that has been initialized for use by LVM. Think of it as raw material entering a factory.
Volume Group (VG): A pool of storage formed by combining one or more PVs. Think of it as a warehouse where all the raw material is combined into one big pile.
Logical Volume (LV): A slice of storage carved out from a VG. This is what you actually format with a filesystem and mount. Think of it as the finished product cut from the pile.
Hands-On: Creating an LVM Setup
We will simulate this using loop devices (virtual block devices backed by files). This is safe to do on any system.
Step 1: Create virtual disks
# Create two 500 MB files to act as disks
$ sudo dd if=/dev/zero of=/tmp/disk1.img bs=1M count=500
$ sudo dd if=/dev/zero of=/tmp/disk2.img bs=1M count=500
# Attach them as loop devices
$ sudo losetup /dev/loop10 /tmp/disk1.img
$ sudo losetup /dev/loop11 /tmp/disk2.img
# Verify
$ losetup -a | grep loop1
/dev/loop10: []: (/tmp/disk1.img)
/dev/loop11: []: (/tmp/disk2.img)
Step 2: Create Physical Volumes
$ sudo pvcreate /dev/loop10 /dev/loop11
Physical volume "/dev/loop10" successfully created.
Physical volume "/dev/loop11" successfully created.
# Inspect them
$ sudo pvs
PV VG Fmt Attr PSize PFree
/dev/loop10 lvm2 --- 500.00m 500.00m
/dev/loop11 lvm2 --- 500.00m 500.00m
$ sudo pvdisplay /dev/loop10
"/dev/loop10" is a new physical volume of "500.00 MiB"
--- NEW Physical volume ---
PV Name /dev/loop10
VG Name
PV Size 500.00 MiB
...
Step 3: Create a Volume Group
$ sudo vgcreate vg_lab /dev/loop10 /dev/loop11
Volume group "vg_lab" successfully created
$ sudo vgs
VG #PV #LV #SN Attr VSize VFree
vg_lab 2 0 0 wz--n- 992.00m 992.00m
$ sudo vgdisplay vg_lab
--- Volume group ---
VG Name vg_lab
VG Size 992.00 MiB
PE Size 4.00 MiB
Total PE 248
Free PE / Size 248 / 992.00 MiB
...
Notice that the VG size (992 MiB) is slightly less than the raw total (1000 MiB): LVM reserves a small amount of metadata on each PV and rounds the remaining capacity down to whole 4 MiB physical extents.
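You can check the arithmetic yourself: the usable VG size is simply the extent count times the extent size, using the values from the vgdisplay output above.

```shell
# Values taken from the vgdisplay output above
pe_size_mib=4     # PE Size
total_pe=248      # Total PE
echo "$(( total_pe * pe_size_mib )) MiB usable"
```

Every allocation LVM makes (lvcreate, lvextend) happens in whole extents, which is why sizes you request get rounded to multiples of the PE size.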
Step 4: Create Logical Volumes
# Create a 400 MB logical volume
$ sudo lvcreate -n lv_data -L 400M vg_lab
Logical volume "lv_data" created.
# Create another using 200 MB
$ sudo lvcreate -n lv_logs -L 200M vg_lab
Logical volume "lv_logs" created.
$ sudo lvs
LV VG Attr LSize Pool
lv_data vg_lab -wi-a----- 400.00m
lv_logs vg_lab -wi-a----- 200.00m
Step 5: Create filesystems and mount
# Format with ext4
$ sudo mkfs.ext4 /dev/vg_lab/lv_data
$ sudo mkfs.ext4 /dev/vg_lab/lv_logs
# Create mount points and mount
$ sudo mkdir -p /mnt/data /mnt/logs
$ sudo mount /dev/vg_lab/lv_data /mnt/data
$ sudo mount /dev/vg_lab/lv_logs /mnt/logs
# Verify
$ df -h /mnt/data /mnt/logs
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_lab-lv_data 388M 2.3M 362M 1% /mnt/data
/dev/mapper/vg_lab-lv_logs 190M 1.6M 175M 1% /mnt/logs
Think About It: We have 992 MB in the volume group, and we have allocated 600 MB to logical volumes. What happens to the remaining 392 MB? Can we use it later?
Extending and Reducing LVM Volumes
This is where LVM truly shines -- resizing storage on the fly.
Extending a Logical Volume
The /mnt/data volume is getting full. Let us add 200 MB from the free space in the volume group:
# Check free space in the VG
$ sudo vgs
VG #PV #LV #SN Attr VSize VFree
vg_lab 2 2 0 wz--n- 992.00m 392.00m
# Extend the logical volume by 200 MB
$ sudo lvextend -L +200M /dev/vg_lab/lv_data
Size of logical volume vg_lab/lv_data changed from 400.00 MiB to 600.00 MiB.
Logical volume vg_lab/lv_data successfully resized.
# IMPORTANT: The LV is bigger, but the filesystem still sees the old size
$ df -h /mnt/data
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_lab-lv_data 388M 2.3M 362M 1% /mnt/data
# Resize the filesystem to fill the new space
$ sudo resize2fs /dev/vg_lab/lv_data
resize2fs 1.47.0 (5-Feb-2023)
Filesystem at /dev/vg_lab/lv_data is mounted on /mnt/data; on-line resizing required
Performing an on-line resize of /dev/vg_lab/lv_data to 614400 (1k) blocks.
# Now the filesystem sees the new size
$ df -h /mnt/data
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_lab-lv_data 580M 2.3M 545M 1% /mnt/data
You can combine both steps with a single command:
# The -r flag resizes the filesystem automatically
$ sudo lvextend -L +100M -r /dev/vg_lab/lv_data
Distro Note: For XFS filesystems (the default on RHEL/CentOS/Fedora), use
xfs_growfs /mnt/data instead of resize2fs. XFS can only grow, never shrink.
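Because the grow command differs by filesystem type, admin scripts often branch on it. A minimal sketch under that assumption -- grow_cmd is a hypothetical helper, not a standard tool, and in practice you would detect the type with findmnt -n -o FSTYPE <mountpoint>:

```shell
# Map a filesystem type to the command that grows it online.
# grow_cmd is a hypothetical helper for illustration only.
grow_cmd() {
  case "$1" in
    ext2|ext3|ext4) echo "resize2fs" ;;   # takes the device path
    xfs)            echo "xfs_growfs" ;;  # takes the mount point
    *)              echo "unknown" ;;
  esac
}
grow_cmd ext4   # -> resize2fs
grow_cmd xfs    # -> xfs_growfs
```

Note the asymmetry the note describes: resize2fs operates on the device node, while xfs_growfs operates on the mount point.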
Adding a New Disk to a Volume Group
When the entire volume group is full, you can add another physical disk:
# Create a third virtual disk
$ sudo dd if=/dev/zero of=/tmp/disk3.img bs=1M count=500
$ sudo losetup /dev/loop12 /tmp/disk3.img
# Initialize it as a PV and add to the VG
$ sudo pvcreate /dev/loop12
$ sudo vgextend vg_lab /dev/loop12
$ sudo vgs
VG #PV #LV #SN Attr VSize VFree
vg_lab 3 2 0 wz--n- <1.46g 692.00m
You just expanded your storage pool without touching existing data. No unmounting, no reformatting, no data copying.
Reducing a Logical Volume
WARNING: Reducing a volume can destroy data if done incorrectly. Always back up first. XFS filesystems cannot be shrunk at all.
# Unmount first (required for shrinking)
$ sudo umount /mnt/logs
# Check the filesystem
$ sudo e2fsck -f /dev/vg_lab/lv_logs
# Shrink filesystem first, then LV
$ sudo resize2fs /dev/vg_lab/lv_logs 100M
$ sudo lvreduce -L 100M /dev/vg_lab/lv_logs
WARNING: Reducing active logical volume to 100.00 MiB.
THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce vg_lab/lv_logs? [y/n]: y
# Remount
$ sudo mount /dev/vg_lab/lv_logs /mnt/logs
Or use the safe combined approach:
$ sudo umount /mnt/logs
$ sudo lvreduce -L 100M -r /dev/vg_lab/lv_logs
$ sudo mount /dev/vg_lab/lv_logs /mnt/logs
LVM Snapshots
LVM snapshots create a point-in-time copy of a logical volume. They are invaluable for backups and for testing changes safely.
# Create some test data
$ sudo sh -c 'echo "Important data - version 1" > /mnt/data/config.txt'
# Create a snapshot (100M for storing changes)
$ sudo lvcreate -s -n snap_data -L 100M /dev/vg_lab/lv_data
Logical volume "snap_data" created.
# Now modify the original
$ sudo sh -c 'echo "Important data - version 2 (BROKEN)" > /mnt/data/config.txt'
# Mount the snapshot (read-only) to recover
$ sudo mkdir -p /mnt/snap
$ sudo mount -o ro /dev/vg_lab/snap_data /mnt/snap
# The snapshot still has the original data
$ cat /mnt/snap/config.txt
Important data - version 1
# Recover the file
$ sudo cp /mnt/snap/config.txt /mnt/data/config.txt
# Cleanup
$ sudo umount /mnt/snap
$ sudo lvremove /dev/vg_lab/snap_data
Snapshots use copy-on-write: they only store blocks that change in the original volume after the snapshot is taken. The snapshot volume needs to be large enough to hold all the changes that occur while it exists.
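Because a snapshot that runs out of space becomes unusable, it is worth watching its fill level. A sketch of such a check, fed a sample line here for illustration -- in practice you would pipe in the output of `sudo lvs --noheadings -o lv_name,snap_percent` (assuming your lvm2 version supports that reporting field):

```shell
# Warn when a snapshot's copy-on-write area is more than 80% used.
check_snap() {
  awk -v limit=80 '{ if ($2 + 0 > limit) print "WARN " $1 " at " $2 "%";
                     else                print "ok " $1 " at " $2 "%" }'
}
printf '%s\n' "  snap_data   63.40" | check_snap
```

A cron job built on this idea can lvextend the snapshot (or alert you) before it fills.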
Think About It: If you take a snapshot and then write 200 MB of new data to the original volume, but the snapshot only has 100 MB of space, what happens?
RAID: Redundant Array of Independent Disks
RAID combines multiple disks to provide redundancy, performance, or both. Linux supports software RAID through the kernel's md driver, managed with the mdadm tool.
RAID Levels Explained
RAID 0 (Striping) - Performance, NO redundancy
┌─────────┐ ┌─────────┐
│ Disk 1 │ │ Disk 2 │
│ Block 1 │ │ Block 2 │
│ Block 3 │ │ Block 4 │
│ Block 5 │ │ Block 6 │
└─────────┘ └─────────┘
Min disks: 2 | Usable: 100% | Fault tolerance: NONE
If ANY disk fails, ALL data is lost.
RAID 1 (Mirroring) - Redundancy, reduced capacity
┌─────────┐ ┌─────────┐
│ Disk 1 │ │ Disk 2 │
│ Block 1 │ │ Block 1 │ (identical copy)
│ Block 2 │ │ Block 2 │ (identical copy)
│ Block 3 │ │ Block 3 │ (identical copy)
└─────────┘ └─────────┘
Min disks: 2 | Usable: 50% | Can lose 1 disk
RAID 5 (Striping + Distributed Parity)
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Disk 1 │ │ Disk 2 │ │ Disk 3 │
│ Data A1 │ │ Data A2 │ │ Parity A │
│ Data B1 │ │ Parity B │ │ Data B2 │
│ Parity C │ │ Data C1 │ │ Data C2 │
└─────────┘ └─────────┘ └─────────┘
Min disks: 3 | Usable: (N-1)/N | Can lose 1 disk
RAID 6 (Striping + Double Distributed Parity)
Same as RAID 5 but with two parity blocks per stripe.
Min disks: 4 | Usable: (N-2)/N | Can lose 2 disks
RAID 10 (Mirror + Stripe)
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│ Disk 1 │ │ Disk 2 │ │ Disk 3 │ │ Disk 4 │
│ Block 1 │ │ Block 1 │ │ Block 2 │ │ Block 2 │
│ Block 3 │ │ Block 3 │ │ Block 4 │ │ Block 4 │
└─────────┘ └─────────┘ └─────────┘ └─────────┘
Mirror 1 Mirror 2
─────────── Striped ──────────────
Min disks: 4 | Usable: 50% | Can lose 1 disk per mirror
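The usable-capacity rules above reduce to simple arithmetic. A sketch for N equal-sized disks in a healthy array (returns how many disks' worth of capacity you can actually use):

```shell
# Usable disks out of n for each RAID level (n equal-sized disks).
raid_usable() {  # usage: raid_usable <level> <n_disks>
  case "$1" in
    0)  echo "$2" ;;            # stripe: all capacity, no safety
    1)  echo 1 ;;               # mirror: one disk's worth
    5)  echo $(( $2 - 1 )) ;;   # one disk's worth of parity
    6)  echo $(( $2 - 2 )) ;;   # two disks' worth of parity
    10) echo $(( $2 / 2 )) ;;   # half the disks hold mirrors
  esac
}
# Six 4 TB disks in RAID 6:
echo "$(( $(raid_usable 6 6) * 4 )) TB usable"
```

This is the trade-off in one line: RAID 0 keeps everything and protects nothing, while every redundant level pays for fault tolerance in capacity.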
Hands-On: Creating a RAID 1 Array with mdadm
# Install mdadm
$ sudo apt install mdadm # Debian/Ubuntu
$ sudo dnf install mdadm # Fedora/RHEL
# Create two virtual disks for RAID
$ sudo dd if=/dev/zero of=/tmp/raid1.img bs=1M count=200
$ sudo dd if=/dev/zero of=/tmp/raid2.img bs=1M count=200
$ sudo losetup /dev/loop20 /tmp/raid1.img
$ sudo losetup /dev/loop21 /tmp/raid2.img
# Create a RAID 1 array
$ sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 \
/dev/loop20 /dev/loop21
mdadm: array /dev/md0 started.
# Check the status
$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 loop21[1] loop20[0]
200576 blocks super 1.2 [2/2] [UU]
# The [UU] means both disks are Up. [U_] would mean one is missing.
# Detailed information
$ sudo mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Sat Jan 18 10:30:00 2025
Raid Level : raid1
Array Size : 200576 (195.89 MiB)
Used Dev Size : 200576 (195.89 MiB)
Raid Devices : 2
Total Devices : 2
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
State : clean
# Format and mount
$ sudo mkfs.ext4 /dev/md0
$ sudo mkdir -p /mnt/raid
$ sudo mount /dev/md0 /mnt/raid
Simulating a Disk Failure and Recovery
# Write some data
$ sudo sh -c 'echo "Critical data on RAID" > /mnt/raid/important.txt'
# Simulate a disk failure
$ sudo mdadm --manage /dev/md0 --fail /dev/loop20
mdadm: set /dev/loop20 faulty in /dev/md0
$ cat /proc/mdstat
md0 : active raid1 loop21[1] loop20[0](F)
200576 blocks super 1.2 [2/1] [_U]
# [_U] -- first disk is down, second is up
# But data is still accessible!
$ cat /mnt/raid/important.txt
Critical data on RAID
# Remove the failed disk
$ sudo mdadm --manage /dev/md0 --remove /dev/loop20
# Add a replacement disk
$ sudo dd if=/dev/zero of=/tmp/raid3.img bs=1M count=200
$ sudo losetup /dev/loop22 /tmp/raid3.img
$ sudo mdadm --manage /dev/md0 --add /dev/loop22
# Watch the rebuild
$ cat /proc/mdstat
md0 : active raid1 loop22[2] loop21[1]
200576 blocks super 1.2 [2/1] [_U]
[========>............] recovery = 42.5% ...
# Wait for it to finish, then:
$ cat /proc/mdstat
md0 : active raid1 loop22[2] loop21[1]
200576 blocks super 1.2 [2/2] [UU]
Monitoring RAID Health
# Check array status
$ sudo mdadm --detail /dev/md0
# Scan all arrays
$ sudo mdadm --examine --scan
# Set up email alerts for failures
$ sudo mdadm --monitor --mail=admin@example.com --delay=300 /dev/md0 &
# Or configure monitoring in mdadm.conf
$ cat /etc/mdadm/mdadm.conf
MAILADDR admin@example.com
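For cron-based checks, the [UU]/[_U] status field in /proc/mdstat is easy to parse. A sketch fed a sample degraded line for illustration -- in production you would read /proc/mdstat itself, and this assumes RAID levels that report a [UU]-style field (RAID 0 does not):

```shell
# Print DEGRADED for any md array whose status field contains "_".
check_mdstat() {
  awk '/blocks/ { if ($NF ~ /_/) print "DEGRADED " $NF; else print "OK " $NF }'
}
printf '%s\n' "      200576 blocks super 1.2 [2/1] [_U]" | check_mdstat
```

Paired with mdadm --monitor, this gives you both push alerts and a polling safety net.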
Disk Health Monitoring with smartctl
Disks warn you before they die -- if you are listening. SMART (Self-Monitoring, Analysis, and Reporting Technology) tracks disk health indicators.
# Install smartmontools
$ sudo apt install smartmontools # Debian/Ubuntu
$ sudo dnf install smartmontools # Fedora/RHEL
# Check if a disk supports SMART
$ sudo smartctl -i /dev/sda
# View overall health
$ sudo smartctl -H /dev/sda
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
# View detailed attributes
$ sudo smartctl -A /dev/sda
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail 0
9 Power_On_Hours 0x0032 097 097 000 Old_age 14523
197 Current_Pending_Sector 0x0012 100 100 000 Old_age 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age 0
Key attributes to watch:
| Attribute | What It Means | Worry When |
|---|---|---|
| Reallocated_Sector_Ct | Bad sectors replaced by spares | Any value > 0 |
| Current_Pending_Sector | Sectors waiting to be remapped | Any value > 0 |
| Offline_Uncorrectable | Sectors that could not be read | Any value > 0 |
| Power_On_Hours | Total hours of operation | Approaching rated life |
| Temperature_Celsius | Current temperature | Above 50C for HDDs |
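The three "any value > 0" attributes in the table are easy to screen for automatically. A sketch fed sample lines shaped like the smartctl -A output above -- using the last field for the raw value, since column counts vary between drives:

```shell
# Flag critical SMART attributes whose raw value is nonzero.
check_smart() {
  awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ {
         if ($NF + 0 > 0) print "WARN " $2 " raw=" $NF
       }'
}
printf '%s\n' \
  "  5 Reallocated_Sector_Ct   0x0033 100 100 010 Pre-fail 12" \
  "197 Current_Pending_Sector  0x0012 100 100 000 Old_age  0" | check_smart
```

In practice you would pipe `sudo smartctl -A /dev/sda` into check_smart; the smartd daemon below does the same job continuously.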
# Run a short self-test
$ sudo smartctl -t short /dev/sda
# Run a long self-test (can take hours)
$ sudo smartctl -t long /dev/sda
# Check test results
$ sudo smartctl -l selftest /dev/sda
# Enable automatic monitoring daemon
$ sudo systemctl enable --now smartd
Distro Note: On RHEL/CentOS, the smartd configuration is at
/etc/smartmontools/smartd.conf. On Debian/Ubuntu, it is at /etc/smartd.conf.
Debug This
A junior admin reports: "I extended the logical volume but the filesystem still shows the old size."
$ sudo lvs
LV VG Attr LSize
lv_app vg_prod -wi-ao---- 100.00g
$ df -h /app
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_prod-lv_app 50G 45G 2.5G 95% /app
The LV is 100 GB but the filesystem only sees 50 GB. What did they forget?
Answer: They forgot to resize the filesystem after extending the LV. The fix depends on the filesystem type:
# For ext4:
$ sudo resize2fs /dev/vg_prod/lv_app
# For XFS:
$ sudo xfs_growfs /app
This is one of the most common LVM mistakes. The -r flag on lvextend would have handled this automatically:
$ sudo lvextend -L 100G -r /dev/vg_prod/lv_app
Cleanup
If you followed along with the lab, clean up the loop devices:
$ sudo umount /mnt/data /mnt/logs /mnt/raid /mnt/snap 2>/dev/null
$ sudo lvremove -f vg_lab/lv_data vg_lab/lv_logs 2>/dev/null
$ sudo vgremove vg_lab 2>/dev/null
$ sudo pvremove /dev/loop10 /dev/loop11 /dev/loop12 2>/dev/null
$ sudo mdadm --stop /dev/md0 2>/dev/null
$ sudo losetup -d /dev/loop10 /dev/loop11 /dev/loop12 /dev/loop20 /dev/loop21 /dev/loop22 2>/dev/null
$ sudo rm -f /tmp/disk1.img /tmp/disk2.img /tmp/disk3.img /tmp/raid1.img /tmp/raid2.img /tmp/raid3.img
┌──────────────────────────────────────────────────────────┐
│ What Just Happened? │
├──────────────────────────────────────────────────────────┤
│ │
│ LVM provides flexible storage management: │
│ - PV (Physical Volume) → raw disk/partition │
│ - VG (Volume Group) → pool of PVs │
│ - LV (Logical Volume) → usable slice from a VG │
│ │
│ Key LVM operations: │
│ - pvcreate/vgcreate/lvcreate → build the stack │
│ - lvextend -r → grow a volume + filesystem │
│ - lvreduce -r → shrink (backup first!) │
│ - lvcreate -s → snapshot for backup/testing │
│ │
│ RAID provides disk redundancy: │
│ - RAID 0 = speed, no safety │
│ - RAID 1 = mirror, can lose one disk │
│ - RAID 5 = parity across 3+ disks │
│ - RAID 10 = mirror + stripe (production favorite) │
│ │
│ smartctl monitors disk health before failure. │
│ │
└──────────────────────────────────────────────────────────┘
Try This
- LVM basics: Create three loop devices, combine them into a volume group, create two logical volumes, format them with ext4, and mount them.
- Online resize: Write a 50 MB file to one of your logical volumes, then extend the volume by 200 MB using lvextend -r. Verify the file is still intact.
- Snapshot backup: Create a snapshot of a logical volume, write new files to the original, then mount the snapshot read-only and verify it still has the old data.
- RAID simulation: Create a RAID 5 array with three loop devices. Write data, mark one device as failed, verify data is still readable, then add a replacement and watch the rebuild.
- Bonus challenge: Combine LVM and RAID -- create a RAID 1 array with mdadm, then use the RAID device as a physical volume for LVM. This is how many production servers are configured.