Symmetric and Asymmetric Cryptography
"Cryptography is typically bypassed, not penetrated." — Adi Shamir, co-inventor of RSA
The Locked Box Problem
Imagine you need to send a secret message to a colleague in another office across the city. You have a lockbox with a padlock. You put the message in the box, lock it, and send it via courier. Simple enough -- but how does your colleague open the box? You need to send the key somehow, and the courier could copy it.
You have just described the fundamental problem of symmetric cryptography: both parties need the same key, and you need a secure way to share that key. This problem has driven cryptographic innovation for decades, and the solutions are what make the internet possible.
There are actually two complementary solutions, each with different properties. In practice, you will almost never use just one. Every real-world cryptographic system uses both, composed together. Let's start with symmetric encryption, because it is simpler, faster, and what actually encrypts your data.
Symmetric Encryption: One Key to Rule Them All
Symmetric encryption uses the same key for encryption and decryption. The sender and receiver must both possess the secret key.
flowchart LR
P["Plaintext<br/>'Hello, World!'"] --> ENC["Encrypt<br/>AES-256-GCM"]
K1["Key K<br/>(shared secret)"] --> ENC
ENC --> C["Ciphertext<br/>7f3a2b91c4d8<br/>e6f0129834ab"]
C --> DEC["Decrypt<br/>AES-256-GCM"]
K2["Same Key K<br/>(shared secret)"] --> DEC
DEC --> P2["Plaintext<br/>'Hello, World!'"]
style K1 fill:#e53e3e,color:#fff
style K2 fill:#e53e3e,color:#fff
style C fill:#3182ce,color:#fff
AES: The Standard
AES (Advanced Encryption Standard) is the symmetric cipher you should use. Period.
Why AES specifically? Because it won a public, multi-year international competition run by NIST from 1997 to 2000. Fifteen candidate algorithms were submitted. After three years of analysis by the world's best cryptanalysts, Rijndael (designed by Joan Daemen and Vincent Rijmen) was selected, and it was standardized as AES in 2001. It has withstood over two decades of cryptanalysis, is implemented in hardware on virtually every modern CPU, and is approved for protecting classified information up to Top Secret. When a cipher has that resume, you do not look for alternatives.
Key properties of AES:
- Block cipher: Operates on fixed-size blocks of 128 bits (16 bytes)
- Key sizes: 128, 192, or 256 bits (10, 12, or 14 rounds respectively)
- Performance: Extremely fast with hardware acceleration (AES-NI instructions), typically 5-10 GB/s per core
- Standardized: NIST FIPS 197
- Ubiquitous: Supported by every programming language, every operating system, every hardware platform
# Check if your CPU supports AES hardware acceleration
# macOS:
$ sysctl -a | grep -i aes
hw.optional.aes: 1
# Linux:
$ grep -o aes /proc/cpuinfo | head -1
aes
# Benchmark AES with and without hardware acceleration
$ openssl speed -evp aes-256-gcm
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-256-gcm 710042.13k 2357312.00k 5765292.80k 7876132.86k 8489013.25k 8547123.09k
# That's ~8.5 GB/s on a single core with AES-NI
# Without AES-NI, you'd see ~200 MB/s — a 40x difference
Block Cipher Modes: How AES Actually Works
AES encrypts 16 bytes at a time. Real-world data is longer than 16 bytes. Block cipher modes define how to use AES to encrypt arbitrarily long data. The mode you choose matters enormously for security.
This is where developers make their first cryptographic mistake. They pick AES (correct) and then use ECB mode (catastrophically wrong).
ECB (Electronic Codebook) -- Never Use This
ECB encrypts each 16-byte block independently with the same key. Identical plaintext blocks produce identical ciphertext blocks. This means patterns in the plaintext are preserved in the ciphertext.
flowchart TD
subgraph ECB["ECB Mode — Pattern Leakage"]
P1["Block 1: 'AAAA'"] -->|"AES(K)"| C1["Cipher: X"]
P2["Block 2: 'BBBB'"] -->|"AES(K)"| C2["Cipher: Y"]
P3["Block 3: 'AAAA'"] -->|"AES(K)"| C3["Cipher: X"]
P4["Block 4: 'CCCC'"] -->|"AES(K)"| C4["Cipher: Z"]
P5["Block 5: 'AAAA'"] -->|"AES(K)"| C5["Cipher: X"]
end
NOTE["Same plaintext block → Same ciphertext block<br/>Patterns are visible!<br/>The famous 'ECB Penguin' shows this:<br/>encrypting an image in ECB preserves the outline"]
style ECB fill:#e53e3e,color:#fff
style NOTE fill:#fff3cd,color:#1a202c
The "ECB Penguin" is the canonical demonstration of this flaw: when you encrypt a bitmap image of the Linux penguin (Tux) with AES in ECB mode, the encrypted image still shows the outline of the penguin because regions of the same color produce the same ciphertext. The image is "encrypted" but the structure is completely visible.
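The determinism is easy to see in a few lines of Python. This toy sketch substitutes a keyed hash for the AES block operation -- not a real cipher, but the "same block in, same block out" behavior it demonstrates is exactly ECB's flaw:

```python
import hashlib

def toy_ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """ECB sketch: every 16-byte block is encrypted independently.
    A keyed hash stands in for the AES block operation; the leak we are
    demonstrating only needs 'same block in, same block out'."""
    blocks = [plaintext[i:i + 16] for i in range(0, len(plaintext), 16)]
    return b"".join(hashlib.sha256(key + b).digest()[:16] for b in blocks)

key = b"0" * 16
# Two identical 16-byte blocks surrounding a different one
msg = b"A" * 16 + b"B" * 16 + b"A" * 16
ct = toy_ecb_encrypt(key, msg)
# Blocks 1 and 3 of the ciphertext are byte-for-byte identical:
print(ct[0:16] == ct[32:48])   # True -- the pattern leaks
print(ct[0:16] == ct[16:32])   # False
```

This is the ECB Penguin in miniature: repeated plaintext structure survives "encryption" intact.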
ECB mode is never acceptable for encrypting data longer than one block. If you see ECB mode in production code, it's a critical vulnerability. Unfortunately, library defaults and APIs make ECB easy to stumble into:
- Java: `Cipher.getInstance("AES")` defaults to `AES/ECB/PKCS5Padding`
- Python PyCryptodome: no insecure default -- a mode is required -- but `AES.new(key, AES.MODE_ECB)` is only one argument away
- .NET: `Aes.Create()` defaults to CBC (better than ECB, but still not an AEAD mode)
Always explicitly specify the mode. Never rely on defaults.
CBC (Cipher Block Chaining) -- Legacy, Use with Care
CBC chains blocks together: each plaintext block is XORed with the previous ciphertext block before encryption. An Initialization Vector (IV) is used for the first block. Identical plaintexts produce different ciphertexts (assuming different, random IVs).
The chaining means that a change in any plaintext block affects all subsequent ciphertext blocks -- eliminating ECB's pattern leakage. However, CBC has critical weaknesses:
- Padding oracle attacks: CBC requires padding (PKCS7) when the plaintext is not a multiple of the block size. If the server reveals whether padding is valid or invalid (through different error messages or timing differences), an attacker can decrypt the entire ciphertext one byte at a time without knowing the key. The POODLE attack (2014) and Lucky Thirteen attack exploited this.
- Bit-flipping attacks: Because XOR is its own inverse, an attacker can modify specific bytes of the plaintext by flipping corresponding bits in the previous ciphertext block. Without a separate integrity check (MAC), this goes undetected.
- Not parallelizable for encryption: Each block depends on the previous one, so encryption is sequential. (Decryption can be parallelized.)
CBC was the standard for 20 years. It is in TLS 1.0-1.2, SSH, IPsec, and countless applications. If you inherit legacy code using CBC, add an HMAC (Encrypt-then-MAC) and ensure constant-time padding validation. For new code, use GCM.
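Here is a minimal Python sketch of the Encrypt-then-MAC retrofit, using the standard library's `hmac` module. The CBC encryption itself is assumed to happen elsewhere, so the ciphertext below is just placeholder bytes; the point is the tag computed over IV plus ciphertext, and the constant-time check before any decryption:

```python
import hmac
import hashlib
import os

def mac_then_store(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    # Encrypt-then-MAC: the tag covers IV + ciphertext, computed AFTER encryption
    tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    return iv + ciphertext + tag

def verify_before_decrypt(mac_key: bytes, blob: bytes) -> bytes:
    iv, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):   # constant-time comparison
        raise ValueError("MAC check failed: do not decrypt")
    return ciphertext   # only now is it safe to hand this to CBC decryption

mac_key = os.urandom(32)   # must be a SEPARATE key from the encryption key
blob = mac_then_store(mac_key, os.urandom(16), b"\x7f" * 48)
verify_before_decrypt(mac_key, blob)   # passes
tampered = blob[:20] + bytes([blob[20] ^ 1]) + blob[21:]
# verify_before_decrypt(mac_key, tampered) -> raises ValueError
```

Any bit-flip anywhere in the IV or ciphertext now fails verification before the padding oracle or bit-flipping machinery can even engage.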
GCM (Galois/Counter Mode) -- The Modern Standard
GCM provides both encryption and authentication (integrity checking) in a single operation. It is an AEAD (Authenticated Encryption with Associated Data) mode -- the gold standard for modern cryptography.
flowchart TD
PT["Plaintext"] --> GCM
KEY["AES Key<br/>(128 or 256 bits)"] --> GCM
IV["IV / Nonce<br/>(96 bits, MUST be unique)"] --> GCM
AAD["Associated Data<br/>(authenticated but<br/>NOT encrypted)<br/>e.g., TLS record header"] --> GCM
GCM["AES-GCM<br/>Encrypt + Authenticate"] --> CT["Ciphertext<br/>(same size as plaintext)"]
GCM --> TAG["Authentication Tag<br/>(128 bits / 16 bytes)"]
CT --> VERIFY["Decrypt + Verify"]
TAG --> VERIFY
KEY2["Same AES Key"] --> VERIFY
IV2["Same IV"] --> VERIFY
AAD2["Same AAD"] --> VERIFY
VERIFY -->|"Tag matches"| VALID["Decrypted plaintext"]
VERIFY -->|"Tag mismatch"| REJECT["REJECT:<br/>Tampering detected!"]
style TAG fill:#38a169,color:#fff
style REJECT fill:#e53e3e,color:#fff
style VALID fill:#38a169,color:#fff
Why GCM is superior:
- Confidentiality + Integrity in one operation: No need for a separate HMAC. The authentication tag guarantees both that the data has not been modified and that the associated data has not been modified.
- Parallelizable: Both encryption and decryption can be parallelized, making it fast on multi-core systems.
- Associated data: You can authenticate additional data (like packet headers) without encrypting it. In TLS, the record header is associated data -- it must be readable by network equipment but tampering must be detected.
- No padding needed: GCM uses CTR (counter) mode internally, which turns AES into a stream cipher. No padding, no padding oracle attacks.
What is the "associated data" part for, exactly? Imagine you are sending an encrypted packet. The packet header contains routing information -- source, destination, packet type. Routers need to read this header to route the packet. You cannot encrypt it. But you also cannot let an attacker modify it undetected -- changing the destination or packet type could cause serious problems. GCM authenticates both the encrypted payload and the plaintext header. If anyone modifies either, decryption fails. This is exactly how TLS uses GCM: the record header (content type, version, length) is associated data.
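The idea can be sketched in standard-library Python. This is not GCM -- real GCM runs AES in CTR mode and computes the tag with GHASH -- but the construction below (a keystream cipher plus an HMAC tag covering associated data, nonce, and ciphertext) shows how tampering with either the payload or the header breaks verification. The single shared key is a simplification to keep the sketch short:

```python
import hmac
import hashlib
import os

def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Counter-mode keystream; SHA-256 stands in for the AES block cipher
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def seal(key, nonce, plaintext, aad):
    ct = bytes(p ^ k for p, k in zip(plaintext, keystream(key, nonce, len(plaintext))))
    # The tag binds the AAD even though the AAD is never encrypted
    tag = hmac.new(key, aad + nonce + ct, hashlib.sha256).digest()[:16]
    return ct, tag

def open_(key, nonce, ct, aad, tag):
    expected = hmac.new(key, aad + nonce + ct, hashlib.sha256).digest()[:16]
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return bytes(c ^ k for c, k in zip(ct, keystream(key, nonce, len(ct))))

key, nonce = os.urandom(32), os.urandom(12)
header = b"type=data;dst=10.0.0.7"   # readable by routers, never encrypted
ct, tag = seal(key, nonce, b"secret payload", header)
print(open_(key, nonce, ct, header, tag))   # b'secret payload'
# Modify the header and the tag check fails, even though the header
# was never encrypted:
# open_(key, nonce, ct, b"type=data;dst=10.0.0.9", tag) -> ValueError
```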
# Generate a random key and IV for use in application code
$ KEY=$(openssl rand -hex 32)   # 256-bit key
$ IV=$(openssl rand -hex 12)    # 96-bit IV for GCM
# Note: the openssl enc CLI does not support AEAD modes --
# "openssl enc -aes-256-gcm" fails with "AEAD ciphers not supported"
# For file encryption at the command line, use CBC with PBKDF2:
$ openssl enc -aes-256-cbc -salt -pbkdf2 -iter 100000 \
-in secret.txt -out secret.enc -k "my_passphrase"
# In application code, always use GCM via your language's crypto library
# Generate a random 256-bit key
$ openssl rand -hex 32
a4f3b2c1d0e9f8a7b6c5d4e3f2a1b0c9d8e7f6a5b4c3d2e1f0a9b8c7d6e5f4
# That's 64 hex characters = 32 bytes = 256 bits
# Benchmark GCM vs CBC
$ openssl speed -evp aes-256-gcm
$ openssl speed -evp aes-256-cbc
# On AES-NI hardware, GCM encryption is usually much faster than CBC:
# CBC encryption is inherently serial (each block depends on the last),
# while GCM parallelizes and needs no padding
ChaCha20-Poly1305: The Alternative AEAD
GCM relies on AES hardware acceleration (AES-NI) for performance. On devices without AES-NI (older mobile phones, some ARM processors), AES-GCM is significantly slower. ChaCha20-Poly1305, designed by Daniel Bernstein, is a software-optimized AEAD that performs well without hardware acceleration.
- ChaCha20: Stream cipher (encryption)
- Poly1305: Message authentication code (integrity)
- Combined: AEAD with similar security properties to AES-GCM
TLS 1.3 supports both TLS_AES_256_GCM_SHA384 and TLS_CHACHA20_POLY1305_SHA256. Servers typically negotiate ChaCha20-Poly1305 when the client is a mobile device.
Key Sizes and What They Mean
Does it matter if you use AES-128 or AES-256? Is 256 "more secure"? AES-128 provides 128 bits of security, meaning a brute-force attack would require 2^128 operations. To put that number in perspective:
| Key Size | Operations to Brute Force | Context |
|---|---|---|
| 56-bit (DES) | 2^56 = ~72 quadrillion | Cracked in 56 hours by EFF's $250,000 Deep Crack machine in 1998, and in 22 hours in 1999 with distributed.net's help |
| 128-bit (AES-128) | 2^128 = 3.4 x 10^38 | A billion machines each testing a billion keys per second would need ~10^13 years, hundreds of times the age of the universe |
| 256-bit (AES-256) | 2^256 = 1.16 x 10^77 | Within a few orders of magnitude of the number of atoms in the observable universe (~10^80) |
AES-128 is unbreakable by brute force with any conceivable classical technology. AES-256 exists for three reasons: regulatory compliance (some standards require 256-bit keys), defense against quantum computers (Grover's algorithm halves the effective key length, so AES-256 becomes 128-bit security against quantum attacks, while AES-128 becomes 64-bit -- potentially breakable), and defense-in-depth philosophy.
For practical purposes, AES-128 is fine against classical computers. But given the quantum threat, use AES-256 for anything with a long secrecy requirement (medical records, financial data, government secrets). The performance overhead is minimal -- AES-256 uses 14 rounds vs AES-128's 10 rounds, but with hardware acceleration, you are talking about 7.5 GB/s vs 8.5 GB/s. The extra margin costs you almost nothing.
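The brute-force arithmetic is easy to check:

```python
# Back-of-envelope: time to brute-force AES-128 with classical hardware.
ops = 2 ** 128                    # keys to try in the worst case
rate = 10 ** 9 * 10 ** 9          # a billion machines x a billion keys/second
seconds = ops / rate
years = seconds / (365.25 * 24 * 3600)
age_of_universe_years = 1.38e10
print(f"{years:.2e} years")       # ~1.08e13 years
print(years / age_of_universe_years)   # roughly 780x the age of the universe
```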
Asymmetric Encryption: Two Keys Are Better Than One
Back to the locked box problem. You need to share a secret key, but how do you share it securely if you do not already have a secure channel? That is a chicken-and-egg problem.
In the 1970s, Whitfield Diffie, Martin Hellman, and Ralph Merkle had a revolutionary insight: what if you used two mathematically linked keys -- one public, one private -- where data encrypted with one can only be decrypted with the other?
flowchart TD
subgraph KEYGEN["Key Generation"]
GEN["Generate Key Pair"] --> PUB["Public Key<br/>(shared with everyone)"]
GEN --> PRIV["Private Key<br/>(kept secret, never shared)"]
end
subgraph ENCRYPT["Encryption (Anyone can do this)"]
MSG["Secret Message"] --> ENC["Encrypt with<br/>recipient's PUBLIC key"]
PUB2["Recipient's Public Key"] --> ENC
ENC --> CIPHER["Ciphertext<br/>a8f2c91b3d4e7f0e"]
end
subgraph DECRYPT["Decryption (Only recipient can do this)"]
CIPHER2["Ciphertext"] --> DEC["Decrypt with<br/>recipient's PRIVATE key"]
PRIV2["Recipient's Private Key"] --> DEC
DEC --> PLAIN["Secret Message"]
end
KEYGEN --> ENCRYPT
ENCRYPT --> DECRYPT
style PUB fill:#38a169,color:#fff
style PRIV fill:#e53e3e,color:#fff
style PUB2 fill:#38a169,color:#fff
style PRIV2 fill:#e53e3e,color:#fff
Think of it as a special mailbox. Anyone can drop a letter through the slot (encrypt with the public key), but only the owner with the unique key can open the mailbox and read the letters (decrypt with the private key). Even the person who dropped the letter cannot get it back out.
This is an elegant solution. But why not use asymmetric encryption for everything? Performance. And that is the critical constraint that drives real-world cryptographic architecture.
RSA: The Original
RSA (Rivest-Shamir-Adleman, 1977) was the first practical public-key cryptosystem. Its security is based on the difficulty of factoring large numbers -- specifically, the product of two large primes.
The mathematical intuition: multiplying two 1024-bit primes takes microseconds. Factoring the resulting 2048-bit product back into its prime factors takes centuries (with classical computers). This asymmetry between the easy direction (multiplication) and the hard direction (factoring) is what makes RSA work.
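The whole scheme fits in a few lines of Python with toy primes -- the classic textbook example, hopelessly insecure at this size, but every step is the real RSA math:

```python
# Toy RSA with tiny primes. Never use numbers this small in practice;
# the point is to see the easy/hard asymmetry in miniature.
p, q = 61, 53
n = p * q                  # 3233 -- public modulus (trivial to compute)
phi = (p - 1) * (q - 1)    # 3120 -- requires knowing p and q
e = 17                     # public exponent
d = pow(e, -1, phi)        # 2753 -- private exponent (modular inverse, Python 3.8+)

m = 65                     # "message" (any number < n)
c = pow(m, e, n)           # encrypt with the PUBLIC key: 65^17 mod 3233
print(c)                   # 2790
print(pow(c, d, n))        # decrypt with the PRIVATE key -> 65
```

An attacker who can factor 3233 back into 61 and 53 recovers `phi` and therefore `d` instantly; with 2048-bit moduli, that factoring step is the part that takes centuries.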
# Generate an RSA key pair (2048-bit minimum, 4096-bit recommended)
$ openssl genrsa -out private_key.pem 4096
Generating RSA private key, 4096 bit long modulus
..............................................++
.....++
# Extract the public key
$ openssl rsa -in private_key.pem -pubout -out public_key.pem
# Look at the key components
$ openssl rsa -in private_key.pem -text -noout | head -20
RSA Private-Key: (4096 bit, 2 primes)
modulus:
00:c5:3a:... (512 bytes — this is p*q)
publicExponent: 65537 (0x10001)
privateExponent:
5b:2e:...
prime1: # This is p
...
prime2: # This is q
...
# Encrypt a small message with someone's public key
$ echo "This is a secret message" | \
openssl pkeyutl -encrypt -pubin -inkey public_key.pem -out message.enc
# Decrypt with the private key
$ openssl pkeyutl -decrypt -inkey private_key.pem -in message.enc
This is a secret message
# Note: RSA can only encrypt data smaller than the key size minus padding overhead
# 4096-bit RSA with OAEP (SHA-256) padding can encrypt at most 446 bytes
# This is why RSA is used for key transport, not bulk data
RSA has a critical limitation you need to understand: the maximum message size. With OAEP padding (which you should always use -- never use PKCS#1 v1.5 padding for new code), a 2048-bit RSA key can encrypt at most 190 bytes, and a 4096-bit key at most 446 bytes (figures for OAEP with SHA-256: the key size minus twice the hash length minus two bytes). This is not a practical limitation because RSA is never used for bulk data -- it is used to encrypt a symmetric key, which is only 32 bytes.
ECC: The Modern Alternative
Elliptic Curve Cryptography (ECC) provides the same security level as RSA with dramatically smaller key sizes. The security is based on the Elliptic Curve Discrete Logarithm Problem (ECDLP), which is harder to solve than RSA's factoring problem for equivalent key sizes.
| Security Level | RSA Key Size | ECC Key Size | Ratio |
|---|---|---|---|
| 80 bits | 1024 bits | 160 bits | 6.4x |
| 112 bits | 2048 bits | 224 bits | 9.1x |
| 128 bits | 3072 bits | 256 bits | 12x |
| 192 bits | 7680 bits | 384 bits | 20x |
| 256 bits | 15360 bits | 521 bits | 29.5x |
For most purposes, ECC is simply better than RSA. ECC is faster for key generation and signing, uses less bandwidth (smaller keys and signatures), and provides equivalent security. The main reasons RSA is still around are legacy compatibility, wider library support in older systems, and inertia. For new systems, use ECC.
# Generate an ECC key pair (P-256 curve)
$ openssl ecparam -genkey -name prime256v1 -noout -out ec_private.pem
# Extract the public key
$ openssl ec -in ec_private.pem -pubout -out ec_public.pem
# Compare key file sizes
$ wc -c private_key.pem ec_private.pem
3272 private_key.pem # RSA 4096-bit
227 ec_private.pem # ECC P-256 — 14x smaller!
# Generate an Ed25519 key pair (Curve25519 family; preferred for new implementations)
$ openssl genpkey -algorithm Ed25519 -out ed25519_private.pem
$ openssl pkey -in ed25519_private.pem -pubout -out ed25519_public.pem
$ wc -c ed25519_private.pem
119 ed25519_private.pem # Even smaller
The choice of elliptic curve matters more than most developers realize:
- **P-256 (secp256r1 / prime256v1)**: NIST standard curve, published in 1999. Widely supported, used in most TLS deployments. Some cryptographers have expressed concern about NIST's curve generation process — the seed values were not fully explained, leading to speculation about potential backdoors. No evidence of actual weakness has been found, but the opacity of the generation process is unsatisfying.
- **Curve25519 (X25519 for key exchange, Ed25519 for signatures)**: Designed by Daniel Bernstein in 2005. Mathematically elegant, faster than P-256 in software, and designed to be resistant to implementation errors. The curve parameters are rigid (derived from obvious mathematical constants, not arbitrary seeds), addressing the NIST transparency concern. Used in TLS 1.3, Signal Protocol, SSH, WireGuard, and Tor. **This is the recommended choice for new implementations.**
- **P-384 (secp384r1)**: Higher security level (192-bit). Used when regulations require it (NSA's CNSA suite for government systems). Slower than P-256 but provides greater security margin.
- **secp256k1**: Used exclusively by Bitcoin and Ethereum. Not commonly used outside cryptocurrency. Chosen for performance characteristics specific to digital signatures in blockchain contexts.
When you see these names in cipher suite negotiations or key exchange specifications, now you know what they mean and why the choice matters.
The Performance Problem: Why We Need Both
Here is the critical insight that ties symmetric and asymmetric encryption together.
# Benchmark AES-256-GCM (symmetric)
$ openssl speed -evp aes-256-gcm 2>/dev/null | tail -2
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-256-gcm 710042.13k 2357312.00k 5765292.80k 7876132.86k 8489013.25k
# Benchmark RSA-2048 (asymmetric)
$ openssl speed rsa2048 2>/dev/null | tail -2
sign verify sign/s verify/s
rsa 2048 bits 0.000784s 0.000023s 1275.6 43478.3
# Benchmark ECDSA P-256 (asymmetric)
$ openssl speed ecdsap256 2>/dev/null | tail -2
sign verify sign/s verify/s
256 bits ecdsa (nistp256) 0.0000s 0.0001s 30000.0 12000.0
# Summary:
# AES-256-GCM: ~8.5 GB/s throughput (bulk data)
# RSA-2048 sign: ~1,275 operations/second
# ECDSA P-256: ~30,000 signs/second (23x faster than RSA)
# RSA is ~1000x slower than AES for bulk data encryption
Asymmetric encryption is far too slow for encrypting actual data. Encrypting a 1 GB file with RSA-2048 would mean splitting it into approximately 5.3 million chunks (each no more than 190 bytes); decrypting those chunks at ~1,275 private-key operations per second would take over an hour. With AES-GCM, the same file takes about 0.12 seconds. That is why we use hybrid encryption: asymmetric crypto to exchange a symmetric key, then symmetric crypto to encrypt the actual data. This is the architecture of every real-world cryptographic protocol.
sequenceDiagram
participant A as Alice
participant B as Bob
Note over A,B: Step 1: Key Exchange (Asymmetric — slow, small data)
A->>A: Generate random 256-bit session key K
A->>B: Encrypt K with Bob's PUBLIC key<br/>(RSA or ECIES — only 32 bytes to encrypt)
B->>B: Decrypt K with PRIVATE key<br/>Now both have session key K
Note over A,B: Step 2: Data Exchange (Symmetric — fast, bulk data)
A->>B: AES-256-GCM(K, plaintext_1) at full speed
B->>A: AES-256-GCM(K, plaintext_2) at full speed
A->>B: AES-256-GCM(K, plaintext_3) at full speed
Note over A,B: Best of both worlds:<br/>Asymmetric solves key distribution<br/>Symmetric provides speed for bulk data
This is the fundamental architecture of TLS, which is covered in depth in Chapter 6. The asymmetric crypto is used only for the handshake -- exchanging or agreeing on a session key. All actual data encryption uses symmetric crypto (AES-GCM or ChaCha20-Poly1305 in modern TLS).
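The two-step shape of the handshake can be sketched with nothing but the Python standard library. Classic finite-field Diffie-Hellman plays the asymmetric role (over a Mersenne prime that is far too small for real use), and a SHA-256 counter-mode keystream stands in for AES:

```python
import hashlib
import secrets

P = 2 ** 127 - 1   # a Mersenne prime -- fine for a demo, far too small for real use
G = 5

def keystream(key: bytes, n: int) -> bytes:
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def sym_encrypt(key: bytes, data: bytes) -> bytes:
    # XOR with the keystream; applying it twice decrypts.
    # (A real system adds a per-message nonce and an authentication tag.)
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

# Step 1 -- asymmetric: Diffie-Hellman key agreement (slow, tiny data)
a = secrets.randbelow(P)            # Alice's private value
b = secrets.randbelow(P)            # Bob's private value
A, B = pow(G, a, P), pow(G, b, P)   # public values, sent in the clear
shared_alice = pow(B, a, P)
shared_bob = pow(A, b, P)
assert shared_alice == shared_bob   # both sides derive the same secret

session_key = hashlib.sha256(shared_alice.to_bytes(16, "big")).digest()

# Step 2 -- symmetric: bulk data at full speed with the session key
ct = sym_encrypt(session_key, b"gigabytes of data go here...")
pt = sym_encrypt(session_key, ct)
print(pt)   # b'gigabytes of data go here...'
```

An eavesdropper sees `A` and `B` but cannot feasibly compute the shared secret (at real key sizes); all the bulk bytes flow through the cheap symmetric path.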
Common Mistakes That Break Everything
The algorithms are solid. They have been analyzed by thousands of cryptographers over decades. What breaks in practice is how developers use them. Here are the mistakes that appear in production code year after year.
Mistake 1: Using ECB Mode
Already covered above, but worth repeating because it appears every year in production audits:
# Java developers: DON'T do this
# Cipher.getInstance("AES") ← defaults to AES/ECB/PKCS5Padding!
# DO this instead:
# Cipher.getInstance("AES/GCM/NoPadding")
# Python developers: DON'T do this
# from Crypto.Cipher import AES
# cipher = AES.new(key, AES.MODE_ECB) ← explicitly wrong
# DO this instead:
# cipher = AES.new(key, AES.MODE_GCM, nonce=nonce)
Mistake 2: Reusing IVs/Nonces
AES-GCM requires a unique nonce (number used once) for every encryption with the same key. Reusing a nonce with the same key is catastrophic -- it completely breaks GCM's security:
- The authentication tag becomes forgeable
- The keystream can be recovered via XOR of the two ciphertexts
- With two ciphertext/plaintext pairs encrypted under the same nonce, the attacker can recover the authentication key H and forge arbitrary messages
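The keystream-recovery attack is just XOR arithmetic, as this standard-library sketch shows:

```python
import os

# Two-time pad: encrypting two messages with the same key AND nonce means
# the same keystream, and XOR cancels it out.
keystream = os.urandom(32)   # identical for both messages -- the bug
p1 = b"amount=100 to account 12345678"
p2 = b"amount=999 to account 87654321"
c1 = bytes(a ^ b for a, b in zip(p1, keystream))
c2 = bytes(a ^ b for a, b in zip(p2, keystream))

# The attacker never sees the keystream, only the two ciphertexts:
x = bytes(a ^ b for a, b in zip(c1, c2))   # = p1 XOR p2 -- keystream is gone
# If the attacker knows (or guesses) p1, p2 falls out immediately:
recovered = bytes(a ^ b for a, b in zip(x, p1))
print(recovered)   # b'amount=999 to account 87654321'
```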
How do you ensure uniqueness? Two approaches:
Random nonces (96 bits): Generate a random 12-byte nonce for each encryption. The birthday paradox means collision probability reaches 50% after 2^48 (~2.8 x 10^14) messages. For most applications, this is safe. But at scale (billions of messages per day with the same key), random nonces become dangerous.
Counter-based nonces: Use a monotonically increasing counter as the nonce. Never repeats, deterministic, no birthday paradox concern. But requires state management -- you must reliably persist the counter. If state is lost (application crash, server restart), you might reuse a counter value. For distributed systems, partition the nonce space: use the first 4 bytes as a server ID and the last 8 bytes as a per-server counter.
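A partitioned nonce is a one-liner with `struct`; the 4-byte/8-byte split below is one reasonable layout, not a standard:

```python
import struct

def make_nonce(server_id: int, counter: int) -> bytes:
    """96-bit GCM nonce for a distributed system:
    4-byte server ID + 8-byte per-server counter.
    Distinct servers and monotonic counters can never collide."""
    return struct.pack(">IQ", server_id, counter)

n1 = make_nonce(7, 0)
n2 = make_nonce(7, 1)
n3 = make_nonce(8, 0)
print(len(n1))            # 12 bytes = 96 bits, exactly what GCM expects
print(len({n1, n2, n3}))  # 3 -- all distinct
```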
The safest approach: rotate keys frequently enough that any single key never encrypts more than 2^32 messages. With key rotation every few hours, random nonces are completely safe.
Mistake 3: Rolling Your Own Crypto
Never implement your own encryption algorithm, your own key derivation function, or your own random number generator. Use established libraries: libsodium (NaCl), OpenSSL, BoringSSL, or the standard library crypto packages in Go/Rust/Python.
Cryptographic code that looks correct can have devastating timing side-channel vulnerabilities that only experts would notice. For example, comparing two HMAC values with `==` leaks timing information that allows byte-by-byte reconstruction of the correct HMAC. You must use constant-time comparison functions (`hmac.compare_digest()` in Python, `crypto/subtle.ConstantTimeCompare()` in Go).
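In Python, the fix is one function call:

```python
import hmac
import hashlib
import os

key = os.urandom(32)
message = b"transfer 100 to alice"
tag = hmac.new(key, message, hashlib.sha256).digest()

def verify_bad(received_tag: bytes) -> bool:
    # DON'T: == short-circuits at the first differing byte,
    # leaking how many leading bytes were correct via timing
    return received_tag == tag

def verify_good(received_tag: bytes) -> bool:
    # DO: compare_digest examines every byte regardless of mismatches
    return hmac.compare_digest(received_tag, tag)

print(verify_good(tag))              # True
print(verify_good(os.urandom(32)))   # almost certainly False -- in constant time
```

Both functions return the same answers; the difference is invisible in the output and only shows up on a stopwatch, which is exactly why this bug survives code review.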
The history of "roll your own" crypto failures is long: WEP's RC4 implementation used 24-bit IVs (too short), PlayStation 3's ECDSA implementation reused the random nonce k (allowing private key extraction), and dozens of JWT libraries accepted `"alg": "none"` to skip signature verification entirely.
Mistake 4: Weak Key Derivation
When you derive encryption keys from passwords, you must use a proper key derivation function (KDF). Raw hashing (SHA-256 of the password) is not a KDF -- it is too fast, making brute-force attacks feasible.
# BAD: Raw SHA-256 of password (billions of guesses/second on GPU)
# key = SHA256("my_password")
# GOOD: PBKDF2 with high iteration count
$ openssl enc -aes-256-cbc -salt -pbkdf2 -iter 600000 \
-in secret.txt -out secret.enc -k "my_password"
# -iter 600000 means 600,000 rounds of PBKDF2
# OWASP recommends minimum 600,000 iterations for PBKDF2-SHA256
# BETTER: Use Argon2id for key derivation (if available)
# Argon2id is memory-hard, making GPU attacks much harder
# argon2 parameters: -t 3 -m 65536 -p 4 (3 iterations, 64MB RAM, 4 threads)
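The same derivation in application code needs only the Python standard library:

```python
import hashlib
import os

password = b"my_password"
salt = os.urandom(16)   # random per-password salt, stored alongside the output

# 600,000 iterations of PBKDF2-HMAC-SHA256 (the OWASP-recommended minimum)
key = hashlib.pbkdf2_hmac("sha256", password, salt, 600_000, dklen=32)
print(len(key))   # 32 bytes -- ready to use as an AES-256 key

# Same password + same salt is deterministic; a different salt yields an
# unrelated key, which is what defeats precomputed (rainbow-table) attacks
again = hashlib.pbkdf2_hmac("sha256", password, salt, 600_000, dklen=32)
other = hashlib.pbkdf2_hmac("sha256", password, os.urandom(16), 600_000, dklen=32)
print(key == again, key == other)   # True False
```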
Mistake 5: Not Authenticating Ciphertext
Encryption without authentication (plain AES-CBC) means an attacker can modify the ciphertext, and you will decrypt it to altered plaintext without knowing it was tampered with. This is worse than it sounds because of the bit-flipping attack in CBC mode:
If the attacker knows (or can guess) the plaintext at a specific position, they can XOR the corresponding byte in the previous ciphertext block to change the decrypted plaintext to any value they choose. For example, changing amount=100 to amount=900 by flipping specific bits. Without authentication, this modification is undetectable.
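The attack is mechanical. The sketch below uses a toy XOR "block cipher" so it runs with the standard library alone -- CBC's malleability is a property of the mode, not of the cipher inside it:

```python
import os

BLOCK = 16
key = os.urandom(BLOCK)

def E(block: bytes) -> bytes:
    # Toy "block cipher": XOR with the key. Real AES is a complex
    # permutation, but the CBC chaining math below is unchanged.
    return bytes(a ^ b for a, b in zip(block, key))

D = E   # XOR is its own inverse

def cbc_encrypt(iv, blocks):
    out, prev = [], iv
    for p in blocks:
        c = E(bytes(a ^ b for a, b in zip(p, prev)))
        out.append(c)
        prev = c
    return out

def cbc_decrypt(iv, blocks):
    out, prev = [], iv
    for c in blocks:
        out.append(bytes(a ^ b for a, b in zip(D(c), prev)))
        prev = c
    return out

iv = os.urandom(BLOCK)
p0, p1 = b"user=alice;;;;;;", b"amount=100;;;;;;"
c0, c1 = cbc_encrypt(iv, [p0, p1])

# Attacker: '1' sits at index 7 of block 1. Flip it to '9' by XORing the
# SAME position in the PREVIOUS ciphertext block with ('1' XOR '9'):
delta = ord("1") ^ ord("9")
c0_evil = c0[:7] + bytes([c0[7] ^ delta]) + c0[8:]

d0, d1 = cbc_decrypt(iv, [c0_evil, c1])
print(d1)   # b'amount=900;;;;;;' -- with real AES, block 0 would decrypt to garbage
```

No key, no cryptanalysis: just one XOR against the ciphertext. Only an authentication tag stops this.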
This is why AEAD (Authenticated Encryption with Associated Data) modes like GCM exist. They give you both confidentiality and integrity in one operation. If anyone modifies the ciphertext or the associated data, decryption fails with an authentication error.
Mistake 6: Hardcoding Keys in Source Code
An audit of a fintech application revealed its AES encryption key hardcoded as a string constant in the source code: `private static final String KEY = "a1b2c3d4e5f6a7b8..."`. The developers argued it was "obfuscated because the code was compiled." Decompiling the Java JAR with `jad` took about thirty seconds and revealed the key on line 42 of the decompiled source. Every piece of encrypted data in their database -- customer financial information, SSNs, account numbers -- was immediately decryptable. The key had been the same since the application was written four years earlier.
It gets worse. Running `trufflehog` against their git repository revealed that the key had been committed in plaintext in the initial commit, before someone moved it to a constants file. Even if they had rotated the key in the application, the old key was in git history forever, and could decrypt all historical data.
Use a secrets management system (HashiCorp Vault, AWS KMS, GCP KMS, Azure Key Vault). Keys should never exist in source code, configuration files, or environment variables on disk. The key management system should handle key rotation, access control, and audit logging. If you're encrypting data with AES, the AES key itself should be encrypted by a KMS-managed key (envelope encryption).
When to Use Which
| Use Case | Algorithm | Why |
|---|---|---|
| Encrypting data at rest (files, database fields) | AES-256-GCM with KMS-managed keys | Fast, authenticated, hardware-accelerated |
| Encrypting data in transit (TLS) | AES-128-GCM or AES-256-GCM (or ChaCha20-Poly1305) | Symmetric key negotiated via asymmetric handshake |
| Key exchange | ECDHE (X25519 or P-256) | Asymmetric, provides forward secrecy |
| Digital signatures | Ed25519 or ECDSA (P-256) | Asymmetric, small signatures, fast verification |
| Encrypting a message for a specific recipient | Hybrid: ECIES or RSA to encrypt a random AES key, AES-GCM for the data | Asymmetric for key transport, symmetric for bulk data |
| Password storage | NOT encryption -- use Argon2id/bcrypt/scrypt | Passwords should be hashed, not encrypted |
| API key / token encryption | AES-256-GCM with envelope encryption (KMS) | Keys rotated regularly, never in code |
| Disk encryption (full-disk) | AES-256-XTS | XTS mode is designed for sector-level storage encryption |
One critical distinction: password storage is not encryption at all. It is hashing, which is covered in the next chapter. You should never be able to decrypt a stored password -- only verify it by hashing the attempt and comparing. If someone asks you to "decrypt" a user's password, the system is designed wrong.
The Quantum Threat
You keep reading about quantum computers breaking encryption. Here is the nuanced answer instead of the clickbait version.
What quantum computers threaten:
- RSA: Shor's algorithm can factor large numbers in polynomial time, breaking RSA completely. A sufficiently large quantum computer could break RSA-2048 in hours.
- ECC: Shor's algorithm also solves the elliptic curve discrete logarithm problem, breaking all ECC (including Curve25519 and P-256) completely.
- AES: Grover's algorithm effectively halves the key size. AES-256 becomes 128-bit security (still safe). AES-128 becomes 64-bit security (potentially breakable).
What quantum computers DON'T threaten:
- Symmetric encryption with large enough keys (AES-256 is quantum-safe)
- Hash functions with sufficient output length (SHA-256 provides 128-bit quantum security)
- Properly sized MACs
Timeline (realistic estimates, as of 2025):
| Milestone | Where Things Stand | Status |
|---|---|---|
| Current largest quantum computer | ~1,000+ qubits (noisy) | Not cryptographically relevant |
| Error-corrected logical qubits needed to break RSA-2048 | ~4,000 logical qubits (~20 million physical qubits) | Decades away |
| NIST post-quantum standards published | 2024 | ML-KEM (Kyber), ML-DSA (Dilithium), SLH-DSA (SPHINCS+) |
| "Harvest now, decrypt later" threat | NOW | Active adversaries are recording encrypted traffic |
| Hybrid key exchange deployment | 2023-present | Chrome, Cloudflare, AWS already deploying |
The "harvest now, decrypt later" threat is why some organizations are already migrating to post-quantum cryptography. If an adversary captures TLS-encrypted traffic today and stores it, they might be able to decrypt it in 15-20 years when quantum computers are available. For data with a long secrecy requirement (state secrets: 50+ years, medical records: lifetime, financial data: 7+ years), this is a real concern.
NIST finalized three post-quantum cryptographic standards in 2024:
- **ML-KEM (Module-Lattice-Based Key-Encapsulation Mechanism)**: Formerly CRYSTALS-Kyber. For key exchange. Replaces ECDHE.
- **ML-DSA (Module-Lattice-Based Digital Signature Algorithm)**: Formerly CRYSTALS-Dilithium. For digital signatures. Replaces ECDSA/Ed25519.
- **SLH-DSA (Stateless Hash-Based Digital Signature Algorithm)**: Formerly SPHINCS+. Backup signature scheme based on hash functions rather than lattices (different mathematical assumption).
Chrome and Cloudflare are already deploying hybrid key exchange (X25519 + ML-KEM-768) in production TLS connections. This means the key exchange uses both classical ECDHE and post-quantum ML-KEM — if either algorithm is secure, the combined key exchange is secure. This is the recommended migration path: hybrid mode that doesn't sacrifice security even if one algorithm is later broken.
The practical advice: start planning for post-quantum migration, but do not panic. Use AES-256 (quantum-resistant) for symmetric encryption. Deploy hybrid key exchange where available. And design your systems for cryptographic agility -- the ability to swap algorithms without rewriting your architecture.
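Cryptographic agility can be as simple as a version byte in front of every ciphertext. The sketch below is hypothetical (the algorithm registry and the XOR-keystream "ciphers" are stand-ins for real AEAD implementations), but the dispatch pattern is the point: old data stays readable while new writes use the new algorithm.

```python
import hashlib

def _xor_stream(key: bytes, data: bytes, label: bytes) -> bytes:
    # Stand-in cipher: counter-mode keystream from SHA-256, domain-separated
    # by label so "v1" and "v2" behave like two different algorithms.
    out, ctr = b"", 0
    while len(out) < len(data):
        out += hashlib.sha256(label + key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return bytes(a ^ b for a, b in zip(data, out))

# Registry of (encrypt, decrypt) pairs, keyed by a version byte
ALGORITHMS = {
    1: (lambda k, d: _xor_stream(k, d, b"v1"), lambda k, d: _xor_stream(k, d, b"v1")),
    2: (lambda k, d: _xor_stream(k, d, b"v2"), lambda k, d: _xor_stream(k, d, b"v2")),
}
CURRENT = 2   # bump this to migrate; previously written data remains readable

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    enc, _ = ALGORITHMS[CURRENT]
    return bytes([CURRENT]) + enc(key, plaintext)   # version byte first

def decrypt(key: bytes, blob: bytes) -> bytes:
    _, dec = ALGORITHMS[blob[0]]   # dispatch on the stored version
    return dec(key, blob[1:])

key = b"k" * 32
old_blob = bytes([1]) + ALGORITHMS[1][0](key, b"written last year")
print(decrypt(key, old_blob))                    # b'written last year'
print(decrypt(key, encrypt(key, b"new data")))   # b'new data'
```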
Envelope Encryption: How the Real World Manages Keys
Key management is where cryptographic theory meets operational reality. You cannot just have one AES key for everything. The pattern used by every major cloud provider and security-conscious organization is called envelope encryption.
```mermaid
flowchart TD
subgraph CLOUD["Cloud KMS (AWS KMS, GCP KMS, Azure Key Vault)"]
CMK["Customer Master Key (CMK)<br/>Never leaves the KMS HSM<br/>Used only to encrypt/decrypt DEKs"]
end
subgraph APP["Your Application"]
CMK -->|"Encrypt DEK"| EDEK["Encrypted DEK<br/>(stored alongside data)"]
CMK -->|"Decrypt DEK"| DEK["Data Encryption Key (DEK)<br/>(plaintext, in memory only)"]
DEK -->|"AES-256-GCM"| DATA["Encrypted Data<br/>(stored in database/S3/disk)"]
end
subgraph STORAGE["Storage"]
EDEK2["Encrypted DEK + Encrypted Data<br/>stored together"]
end
NOTE["Why envelope encryption?<br/>• CMK never leaves HSM — hardware protection<br/>• Each data item can have its own DEK<br/>• Re-keying = re-encrypt DEKs, not all data<br/>• Key rotation: generate new CMK, re-wrap DEKs<br/>• Audit log: every CMK use is logged by KMS"]
style CMK fill:#e53e3e,color:#fff
style DEK fill:#dd6b20,color:#fff
style NOTE fill:#fff3cd,color:#1a202c
```
The process works like this:
- **Encrypting data:** Your application asks KMS to generate a new Data Encryption Key (DEK). KMS returns both the plaintext DEK and a copy encrypted with the Customer Master Key (CMK). Your application encrypts the data with the plaintext DEK using AES-256-GCM, then stores the encrypted data alongside the encrypted DEK. The plaintext DEK is immediately deleted from memory.
- **Decrypting data:** Your application reads the encrypted DEK and sends it to KMS for decryption. KMS returns the plaintext DEK. Your application decrypts the data with the DEK, then deletes the plaintext DEK from memory.
- **Key rotation:** Generate a new CMK. Re-encrypt all DEKs with the new CMK. The actual data does not need to be re-encrypted -- only the small DEKs change. This makes rotation fast and cheap, even for petabytes of encrypted data.
```bash
# AWS KMS envelope encryption example

# Step 1: Generate a data key
$ aws kms generate-data-key \
    --key-id alias/my-app-key \
    --key-spec AES_256 \
    --output json

# Returns:
# {
#   "CiphertextBlob": "AQIDAHh...",  ← Encrypted DEK (store this)
#   "Plaintext": "a4f3b2c1...",      ← Plaintext DEK (use, then delete)
#   "KeyId": "arn:aws:kms:..."
# }

# Step 2: Encrypt data with the plaintext DEK (in your application code)
# Step 3: Store encrypted data + CiphertextBlob together
# Step 4: Delete plaintext DEK from memory

# To decrypt: send CiphertextBlob to KMS decrypt API
$ aws kms decrypt \
    --ciphertext-blob fileb://encrypted_dek.bin \
    --output json
```
The beauty of envelope encryption is that the CMK -- the most sensitive key -- never leaves the Hardware Security Module (HSM) inside the KMS. Your application never sees it. Even if your application server is completely compromised, the attacker gets encrypted data and encrypted DEKs. To decrypt anything, they need to call the KMS API, which requires IAM authentication and is fully audited. You can detect and revoke compromised credentials before significant data is exfiltrated.
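The whole pattern fits in a few dozen lines. The sketch below is illustrative only: a SHA-256 counter keystream stands in for AES-256-GCM (it is unauthenticated, so never use this construction on real data), and a hypothetical `ToyKMS` class stands in for the cloud KMS and its HSM:

```python
import hashlib
import secrets

def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR with a SHA-256 counter keystream. A stdlib-only stand-in for
    AES-256-GCM: it shows the data flow, NOT a secure cipher."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        keystream = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        out += bytes(a ^ b for a, b in zip(data[offset:offset + 32], keystream))
    return bytes(out)

class ToyKMS:
    """Stands in for a cloud KMS: the CMK never leaves this object."""
    def __init__(self) -> None:
        self._cmk = secrets.token_bytes(32)

    def _wrap(self, dek: bytes) -> bytes:
        nonce = secrets.token_bytes(12)
        return nonce + toy_cipher(self._cmk, nonce, dek)

    def generate_data_key(self) -> tuple[bytes, bytes]:
        """Return (plaintext DEK, wrapped DEK) — mirrors generate-data-key."""
        dek = secrets.token_bytes(32)
        return dek, self._wrap(dek)

    def decrypt_data_key(self, wrapped: bytes) -> bytes:
        nonce, ct = wrapped[:12], wrapped[12:]
        return toy_cipher(self._cmk, nonce, ct)

    def rotate_cmk(self, wrapped_deks: list[bytes]) -> list[bytes]:
        """Rotation re-wraps the small DEKs; the bulk data is untouched."""
        deks = [self.decrypt_data_key(w) for w in wrapped_deks]
        self._cmk = secrets.token_bytes(32)
        return [self._wrap(d) for d in deks]

# Encrypt: per-item DEK, data encrypted locally, only the wrapped DEK stored.
kms = ToyKMS()
dek, wrapped_dek = kms.generate_data_key()
nonce = secrets.token_bytes(12)
record = toy_cipher(dek, nonce, b"customer record")
del dek  # the plaintext DEK lives in memory only

# Decrypt: unwrap the DEK via the KMS, then decrypt the data locally.
plaintext = toy_cipher(kms.decrypt_data_key(wrapped_dek), nonce, record)
print(plaintext)  # b'customer record'

# Rotate: only the tiny wrapped DEK changes; `record` is never re-encrypted.
[wrapped_dek] = kms.rotate_cmk([wrapped_dek])
assert toy_cipher(kms.decrypt_data_key(wrapped_dek), nonce, record) == b"customer record"
```

Note what the application never holds: the CMK. Everything it persists — `record` and `wrapped_dek` — is useless without a call into the KMS.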
## Cryptographic Agility
Cryptographic agility means designing your systems so that you can change cryptographic algorithms without rewriting your application. When MD5 was broken, when SHA-1 was broken, when DES was retired -- organizations with cryptographic agility migrated quickly. Those without it spent years on painful migrations.
Practical guidelines for cryptographic agility:
- **Store the algorithm identifier alongside encrypted data.** Instead of storing just the ciphertext, store `{"algorithm": "AES-256-GCM", "iv": "...", "ciphertext": "...", "tag": "..."}`. When you need to migrate to a new algorithm, new data uses the new algorithm, and old data can still be decrypted using the stored algorithm identifier.
- **Abstract cryptographic operations behind an interface.** Your application code should call `encrypt(data)` and `decrypt(data)`, not `AES.new(key, AES.MODE_GCM, nonce=nonce).encrypt(data)`. The implementation is hidden behind the interface and can be swapped.
- **Use versioned key identifiers.** Key `v1` uses AES-256-GCM. Key `v2` might use something else. The key version is stored with the ciphertext.
- **Plan for re-encryption.** Design data pipelines that can re-encrypt data in the background when algorithms change, without downtime.
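The first three guidelines can be sketched in a few lines of Python. A toy XOR transform stands in for the real ciphers, and the names `CIPHERS`, `KEYS`, and `toy-xor-v1` are invented for illustration — the point is the envelope format and the dispatch, not the cipher:

```python
import base64
import json
import secrets

# Toy reversible transform standing in for a real cipher (AES-256-GCM etc.).
def _xor(key: bytes, data: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Algorithm registry: name -> (encrypt_fn, decrypt_fn). Adding an algorithm
# means adding an entry here, not touching any call site.
CIPHERS = {
    "toy-xor-v1": (_xor, _xor),
    # "AES-256-GCM": (aes_gcm_encrypt, aes_gcm_decrypt),  # a real entry
}

# Versioned keys: key_id -> (algorithm name, key material).
KEYS = {"v1": ("toy-xor-v1", secrets.token_bytes(32))}
CURRENT_KEY = "v1"

def encrypt(data: bytes) -> str:
    algorithm, key = KEYS[CURRENT_KEY]
    enc_fn, _ = CIPHERS[algorithm]
    return json.dumps({
        "algorithm": algorithm,   # stored alongside the ciphertext
        "key_id": CURRENT_KEY,    # versioned key identifier
        "ciphertext": base64.b64encode(enc_fn(key, data)).decode(),
    })

def decrypt(blob: str) -> bytes:
    envelope = json.loads(blob)
    _, key = KEYS[envelope["key_id"]]
    _, dec_fn = CIPHERS[envelope["algorithm"]]
    return dec_fn(key, base64.b64decode(envelope["ciphertext"]))

token = encrypt(b"hello")
print(decrypt(token))  # b'hello'
```

Migrating to a new algorithm means adding a `CIPHERS` entry and a `v2` key: new writes pick it up via `CURRENT_KEY`, while old envelopes keep decrypting through their stored identifiers.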
## Putting It Into Practice
Let's do a hands-on exercise that ties together symmetric and asymmetric encryption.
**Scenario:** You need to send an encrypted file to a colleague using hybrid encryption.
Step 1: Your colleague generates a key pair and sends you their public key:
```bash
# Colleague generates RSA key pair (using RSA for compatibility)
openssl genrsa -out colleague_private.pem 4096
openssl rsa -in colleague_private.pem -pubout -out colleague_public.pem

# They send you colleague_public.pem (safe to share publicly!)
```

Step 2: You generate a random AES session key, encrypt your file, then encrypt the AES key with their public key:

```bash
# Generate random 256-bit AES session key
openssl rand -out session_key.bin 32

# Encrypt the file with AES-256-CBC (GCM not well-supported in CLI)
openssl enc -aes-256-cbc -salt -pbkdf2 -in secret_report.pdf \
    -out secret_report.enc -pass file:session_key.bin

# Encrypt the session key with colleague's RSA public key
openssl pkeyutl -encrypt -pubin -inkey colleague_public.pem \
    -in session_key.bin -out session_key.enc

# Send both secret_report.enc and session_key.enc
# Securely delete the plaintext session key!
shred -u session_key.bin   # Linux
# rm -P session_key.bin    # macOS
```

Step 3: Your colleague decrypts:

```bash
# Decrypt the session key with their private key
openssl pkeyutl -decrypt -inkey colleague_private.pem \
    -in session_key.enc -out session_key.bin

# Decrypt the file with the recovered AES key
openssl enc -aes-256-cbc -d -pbkdf2 -in secret_report.enc \
    -out secret_report.pdf -pass file:session_key.bin

# Clean up
shred -u session_key.bin
```

This is exactly what PGP/GPG does internally. This is exactly what TLS does internally. You just implemented hybrid encryption by hand. In production, use GPG (`gpg --encrypt --recipient colleague@company.com file.pdf`), which handles all of this automatically with better key management.
---
## What You've Learned
This chapter covered the two fundamental types of encryption and how they work together:
- **Symmetric encryption** (AES) uses the same key for encryption and decryption. It is fast (8+ GB/s with hardware acceleration) and suitable for bulk data encryption. AES-256-GCM is the recommended choice, providing both confidentiality and integrity in a single operation. ChaCha20-Poly1305 is the alternative for platforms without AES hardware support.
- **Block cipher modes** are critical: ECB is broken and leaks patterns, CBC has padding oracle vulnerabilities, GCM is the modern AEAD standard that provides authenticated encryption.
- **Asymmetric encryption** (RSA, ECC) uses a key pair -- public for encryption, private for decryption. It solves the key distribution problem but is approximately 1000x slower than symmetric encryption. ECC (Curve25519, P-256) is preferred over RSA for new systems due to smaller keys and better performance.
- **Hybrid encryption** combines both: asymmetric crypto exchanges a symmetric key, then symmetric crypto encrypts the data. This is the foundation of TLS, PGP, and most real-world encryption systems.
- **Common mistakes** include ECB mode, nonce reuse (catastrophic for GCM), hardcoded keys, missing authentication (encrypt-only without MAC), weak key derivation from passwords, and rolling your own crypto.
- **Quantum computing** threatens asymmetric algorithms (RSA, ECC) but not AES-256. NIST has published post-quantum standards (ML-KEM, ML-DSA). Hybrid key exchange (classical + post-quantum) is being deployed now.
Now you understand how to keep data confidential. But encryption alone does not tell you if data has been modified. For that, you need hashing, MACs, and digital signatures -- which is where the next chapter goes. That is the "I" in CIA: integrity.