Encryption Flow

Envelope encryption and decryption paths for S3 objects using chunked AES-256-GCM with pluggable key providers.

How it works

When encryption is enabled, every object stored through the S3 Orchestrator is encrypted before it leaves the server. The system uses envelope encryption — a two-layer key scheme where each object gets its own throwaway key, and that key is itself encrypted by a master key. Encryption is optional — see the configuration reference for how to enable it.

Key concepts

DEK (Data Encryption Key): A random 32-byte (256-bit) key generated fresh for every object. This is the key that actually encrypts the data. No two objects share a DEK, so compromising one object’s key reveals nothing about any other object.
Master key: A long-lived key managed by a key provider (Vault Transit, a config value, or a key file). The master key never touches the data directly; it only encrypts and decrypts DEKs. Because object data is encrypted only under DEKs, the master key can be rotated without re-encrypting any object data.
Nonce: A 12-byte random value that ensures the same plaintext encrypted with the same key produces different ciphertext each time. The system generates one base nonce per object, then derives a unique nonce for each chunk by XORing the chunk’s index into the last 8 bytes of the base nonce.
AES-256-GCM: The encryption algorithm. GCM mode provides both confidentiality (data is unreadable) and authenticity (any tampering is detected). Each chunk produces a 16-byte authentication tag that acts as a tamper seal.
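
The per-chunk nonce derivation can be sketched in a few lines of Python. This is a minimal illustration, not the orchestrator's actual code: the function name `derive_chunk_nonce` is hypothetical, and encoding the chunk index as a big-endian unsigned 64-bit integer is an assumption, since the byte order is not stated here.

```python
import secrets

NONCE_SIZE = 12  # bytes


def derive_chunk_nonce(base_nonce: bytes, chunk_index: int) -> bytes:
    """XOR the chunk index into the last 8 bytes of the base nonce.

    Assumption: the index is encoded as a big-endian unsigned 64-bit integer.
    """
    if len(base_nonce) != NONCE_SIZE:
        raise ValueError("base nonce must be 12 bytes")
    index_bytes = chunk_index.to_bytes(8, "big")
    # The first 4 bytes are left untouched; only the 8-byte tail is XORed.
    tail = bytes(a ^ b for a, b in zip(base_nonce[4:], index_bytes))
    return base_nonce[:4] + tail


base_nonce = secrets.token_bytes(NONCE_SIZE)  # one random base nonce per object
nonces = [derive_chunk_nonce(base_nonce, i) for i in range(4)]
```

Because XOR with a fixed base is a one-to-one mapping, distinct chunk indexes always yield distinct nonces under the same DEK, and chunk 0 uses the base nonce unchanged.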

Encrypting an object (write path)

1. Generate a DEK: 32 random bytes from the OS cryptographic random source.
2. Wrap the DEK: The master key encrypts the DEK via the configured key provider (e.g., Vault Transit API call). This produces a wrapped DEK, an opaque blob that can only be unwrapped by the same master key.
3. Generate a base nonce: 12 random bytes.
4. Write the header: A 32-byte header is prepended to the ciphertext stream, containing the "SENC" magic bytes, the format version, the chunk size, and the base nonce.
5. Encrypt chunk by chunk (default 1 MB per chunk):
   - Read up to 1 MB of plaintext.
   - Derive this chunk’s nonce: take the base nonce and XOR the chunk index into its last 8 bytes.
   - Encrypt the chunk with AES-256-GCM using the DEK and the derived nonce. This produces ciphertext + a 16-byte auth tag.
   - Write to the output: nonce (12 bytes) | ciphertext | auth tag (16 bytes).
   - Repeat until all plaintext is consumed.
6. Upload the ciphertext stream (header + chunks) to the S3 backend.
7. Store metadata in PostgreSQL: the master key ID, the original plaintext size, and the base nonce together with the wrapped DEK (packed as baseNonce || wrappedDEK in the encryption_key column).

The plaintext DEK is never persisted anywhere; it exists only in memory for the duration of an encryption or decryption operation.
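
The framing of the ciphertext stream on the write path can be sketched as follows. Several details are assumptions made for illustration only: the exact field layout inside the 32-byte header (a 1-byte version, a 4-byte big-endian chunk size, and zero padding), the big-endian chunk index, and the names `encrypt_stream` and `seal`. Python's standard library has no AES-GCM, so the AEAD call is passed in as a callable that returns the ciphertext with the 16-byte tag appended.

```python
import secrets
import struct
from typing import BinaryIO, Callable, Iterator

# Hypothetical layout: magic, version, chunk size, base nonce, zero padding.
HEADER_FORMAT = ">4sBI12s11x"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 32 bytes
CHUNK_SIZE = 1024 * 1024  # assumes "1 MB" means 1 MiB
TAG_SIZE = 16

# seal(dek, nonce, plaintext) -> ciphertext || 16-byte auth tag
Seal = Callable[[bytes, bytes, bytes], bytes]


def derive_chunk_nonce(base_nonce: bytes, chunk_index: int) -> bytes:
    """XOR the chunk index (assumed big-endian) into the last 8 nonce bytes."""
    index_bytes = chunk_index.to_bytes(8, "big")
    tail = bytes(a ^ b for a, b in zip(base_nonce[4:], index_bytes))
    return base_nonce[:4] + tail


def encrypt_stream(plaintext: BinaryIO, dek: bytes, base_nonce: bytes,
                   seal: Seal, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the header, then one nonce | ciphertext | tag frame per chunk."""
    yield struct.pack(HEADER_FORMAT, b"SENC", 1, chunk_size, base_nonce)
    index = 0
    while True:
        # Assumes read() returns a full chunk unless the stream is at EOF.
        chunk = plaintext.read(chunk_size)
        if not chunk:
            break
        nonce = derive_chunk_nonce(base_nonce, index)
        yield nonce + seal(dek, nonce, chunk)
        index += 1


dek = secrets.token_bytes(32)         # fresh 256-bit DEK for this object
base_nonce = secrets.token_bytes(12)  # fresh base nonce for this object
```

With the third-party `cryptography` package, for example, `lambda dek, nonce, data: AESGCM(dek).encrypt(nonce, data, None)` has the shape `seal` expects, since `AESGCM.encrypt` returns the ciphertext with the tag appended. Wrapping the DEK and storing the metadata are left out of the sketch.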

Decrypting an object (read path)

1. Retrieve metadata from PostgreSQL: the wrapped DEK, key ID, and base nonce.
2. Unwrap the DEK: The key provider decrypts the wrapped DEK using the master key identified by the stored key ID, recovering the original 32-byte DEK.
3. Parse the header: Read the 32-byte header from the ciphertext stream to get the chunk size and base nonce.
4. Decrypt chunk by chunk:
   - Read one chunk: nonce (12 bytes) | ciphertext | auth tag (16 bytes).
   - Verify the nonce matches the expected value (base nonce XOR chunk index). This detects reordering or insertion attacks.
   - Decrypt with AES-256-GCM, which also verifies the auth tag. If the data was tampered with, decryption fails.
   - Stream the plaintext to the client.
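
The chunk loop on the read path, including the nonce check, might look like the sketch below. It reuses the same assumptions as any illustration of this format must: the field layout inside the 32-byte header and the big-endian chunk index are guesses, and `decrypt_stream` and `open_chunk` are hypothetical names. The AES-256-GCM call is injected as a callable that returns the plaintext or raises if the auth tag does not verify.

```python
import struct
from typing import BinaryIO, Callable, Iterator

# Hypothetical layout: magic, version, chunk size, base nonce, zero padding.
HEADER_FORMAT = ">4sBI12s11x"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 32 bytes
NONCE_SIZE = 12
TAG_SIZE = 16

# open_chunk(dek, nonce, ciphertext_and_tag) -> plaintext; raises on a bad tag
Open = Callable[[bytes, bytes, bytes], bytes]


def derive_chunk_nonce(base_nonce: bytes, chunk_index: int) -> bytes:
    """XOR the chunk index (assumed big-endian) into the last 8 nonce bytes."""
    index_bytes = chunk_index.to_bytes(8, "big")
    tail = bytes(a ^ b for a, b in zip(base_nonce[4:], index_bytes))
    return base_nonce[:4] + tail


def decrypt_stream(ciphertext: BinaryIO, dek: bytes,
                   open_chunk: Open) -> Iterator[bytes]:
    """Parse the header, then verify and decrypt one frame at a time."""
    header = ciphertext.read(HEADER_SIZE)
    if len(header) != HEADER_SIZE:
        raise ValueError("truncated header")
    magic, _version, chunk_size, base_nonce = struct.unpack(HEADER_FORMAT, header)
    if magic != b"SENC":
        raise ValueError("bad magic bytes")
    frame_size = NONCE_SIZE + chunk_size + TAG_SIZE
    index = 0
    while True:
        frame = ciphertext.read(frame_size)  # the final frame may be shorter
        if not frame:
            break
        if len(frame) < NONCE_SIZE + TAG_SIZE:
            raise ValueError("truncated chunk")
        nonce, sealed = frame[:NONCE_SIZE], frame[NONCE_SIZE:]
        # A reordered or inserted chunk carries the wrong nonce for its position.
        if nonce != derive_chunk_nonce(base_nonce, index):
            raise ValueError(f"unexpected nonce at chunk {index}")
        yield open_chunk(dek, nonce, sealed)
        index += 1
```

With the third-party `cryptography` package, `lambda dek, nonce, sealed: AESGCM(dek).decrypt(nonce, sealed, None)` fits the `open_chunk` shape; it raises `InvalidTag` when the tag check fails.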

Range requests (partial reads)

When a client requests a byte range (e.g., Range: bytes=1000-2000), the system doesn’t need to download and decrypt the entire object:

1. Translate the plaintext byte range to a ciphertext byte range: figure out which chunks contain the requested bytes, accounting for the 32-byte header and the 28 bytes of overhead per chunk (12-byte nonce + 16-byte auth tag).
2. Fetch only those chunks from the S3 backend using a range request.
3. Decrypt just the fetched chunks (typically 1–2 chunks for a small range).
4. Slice the plaintext to the exact requested bytes and stream them to the client.

The base nonce is stored in the database specifically so range decryption can derive per-chunk nonces without fetching the ciphertext header from the backend.
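
The range translation is plain arithmetic over the sizes given above. The sketch below is illustrative: `plan_range` and `RangePlan` are hypothetical names, "1 MB" is assumed to mean 1 MiB (1,048,576 bytes), and byte ranges are treated as inclusive, as in HTTP Range headers.

```python
from typing import NamedTuple

HEADER_SIZE = 32
CHUNK_OVERHEAD = 12 + 16  # nonce + auth tag, per chunk
CHUNK_SIZE = 1024 * 1024  # assumes "1 MB" means 1 MiB


class RangePlan(NamedTuple):
    first_chunk: int   # index of the first chunk to fetch (needed for nonces)
    last_chunk: int    # index of the last chunk to fetch
    cipher_start: int  # first ciphertext byte to request (inclusive)
    cipher_end: int    # last ciphertext byte to request (inclusive)
    skip: int          # plaintext bytes to drop from the first decrypted chunk
    length: int        # plaintext bytes to return to the client


def plan_range(start: int, end: int, plaintext_size: int,
               chunk_size: int = CHUNK_SIZE) -> RangePlan:
    """Map an inclusive plaintext range onto whole ciphertext chunks."""
    if not 0 <= start <= end < plaintext_size:
        raise ValueError("range is outside the object")
    first_chunk = start // chunk_size
    last_chunk = end // chunk_size
    frame_size = chunk_size + CHUNK_OVERHEAD
    cipher_start = HEADER_SIZE + first_chunk * frame_size
    # The final chunk of an object may hold fewer than chunk_size bytes.
    last_plain_len = min(chunk_size, plaintext_size - last_chunk * chunk_size)
    cipher_end = (HEADER_SIZE + last_chunk * frame_size
                  + CHUNK_OVERHEAD + last_plain_len - 1)
    skip = start - first_chunk * chunk_size
    return RangePlan(first_chunk, last_chunk, cipher_start, cipher_end,
                     skip, end - start + 1)


plan = plan_range(1000, 2000, plaintext_size=5 * CHUNK_SIZE)
```

For `Range: bytes=1000-2000` on a 5 MiB object, this plan covers only chunk 0: ciphertext bytes 32 through 1048635, of which 1001 plaintext bytes are kept after skipping the first 1000. The `first_chunk` index is what the decryptor feeds into the nonce derivation, which is why the base nonce has to be available without reading the header.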

Legend

Color	Meaning
Forest green	Entry point
Amber	Integrity / filtering
Green border	Decision / branch
Teal	Processing step
Teal	Storage / KMS / DB
Green	Output