vinterliste/SECURITY.md
Ole-Morten Duesund 47963c9225 Scaffold Vinterliste — end-to-end encrypted winter activity list
Foundation for an E2E-encrypted activity list per
winter-list-claude-code-prompt.md.

Server (Bun + Hono):
- bun:sqlite with WAL and the spec's schema (idempotent migration)
- opaque server-stored sessions, httpOnly cookie
- signup / challenge / login / logout / me / password / recovery-challenge /
  recovery-complete
- activity CRUD with strict visibility rules: private uses ciphertext+nonce,
  semi never serializes owner_id, public attributes the owner
- tag store with normalisation + autocomplete (semi/public only)

Frontend (Svelte 5 + Vite):
- libsodium-wrappers-sumo for client crypto (Argon2id + XChaCha20-Poly1305).
  SUMO is required because the standard build omits crypto_pwhash.
- IndexedDB-backed private tag index (never leaves the browser)
- in-memory DEK (no localStorage); page reload re-prompts for password
- signup shows the recovery code once; tag input merges server + private
  sources with clear labelling
- Bokmål UI

Crypto module (shared/crypto.ts):
- pure, runs in both Bun and the browser via a runtime-conditional loader
  that papers over libsodium-wrappers-sumo's broken ESM entry (createRequire
  on server, Vite alias in the browser)
- DEK wrap/unwrap, AEAD payload encryption, recovery code generation with
  a visually-unambiguous alphabet

Verification:
- 22 crypto round-trip tests (wrap/unwrap, AEAD tamper rejection, password
  change preserves ciphertexts, recovery still works after rotation)
- typecheck passes for server and frontend
- Vite production build succeeds; libsodium SUMO chunk is ~315 KB gzipped

Single-image Containerfile for podman: builds frontend in a builder stage,
runs Bun in a slim runtime; one volume for the SQLite file; BUILD_DATE /
GIT_REVISION baked into OCI labels and /etc/build-info.

Known limitation deferred for this commit: the recovery endpoint has no
server-side proof of the recovery code (anyone who knows an email can lock
out the legitimate user, though they can't read any data). Closed in the
next commit.
2026-05-25 12:27:14 +02:00

SECURITY.md — Vinterliste key & trust model

This document is the authoritative description of what the server can and cannot see, and how keys are derived, wrapped, and rotated. The crypto code in shared/crypto.ts is written against this document; if behaviour diverges from what's described here, the document is the source of truth and the code must be fixed.

Threat model

We assume:

  • The server operator is honest-but-curious. They may inspect the database file, the request logs, and memory of the server process at any point.
  • The TLS terminator is trusted not to MITM. (For local podman deployment behind a reverse proxy, this is operator-controlled.)
  • The user's browser session is trusted while the user is logged in (the DEK lives in memory there).
  • An attacker may know the user's email address.

We protect against:

  • Server-side disclosure of private activity contents (title, tags, location, scheduled time).
  • Server-side disclosure of the user's password.
  • Account takeover by an attacker who has the database but not the password or the recovery code.

We do not protect against:

  • Compromise of the user's browser (XSS, malicious extensions). A logged-in client holds the DEK in JS memory — anything in the same origin can read it. Mitigated but not eliminated by a strict CSP.
  • Targeted denial-of-service via password resets. See "Recovery flow" below.
  • Side-channel attacks against libsodium's WASM build (we ship libsodium-wrappers-sumo because the standard libsodium-wrappers build omits crypto_pwhash).
  • Traffic analysis (number/timing/size of private activities is observable).

Primitives (locked — do not substitute)

Purpose                                   Primitive                                libsodium API
Password / recovery-code key derivation   Argon2id, 32-byte raw output             crypto_pwhash
Authenticated encryption (AEAD)           XChaCha20-Poly1305-IETF, 24-byte nonce   crypto_aead_xchacha20poly1305_ietf_*
Random bytes (DEK, salts, nonces, code)   CSPRNG                                   randombytes_buf
Server-side verifier hash                 Argon2id (Bun's default tuning)          Bun.password.hash / .verify

Argon2id parameters for crypto_pwhash use the libsodium MODERATE profile (crypto_pwhash_OPSLIMIT_MODERATE, crypto_pwhash_MEMLIMIT_MODERATE). They are recorded as constants in shared/crypto.ts and must be kept consistent between signup and any future unlock — if parameters are tuned upward in the future, the old parameters must be stored per-user so unlock still works.

Per-user state (server-stored)

For each user the server stores exactly:

auth_salt            (16 bytes, public)   -- distinct from kek_salt
auth_verifier_hash   (text)               -- Bun.password.hash of the auth verifier
kek_salt             (16 bytes, public)   -- for password-derived KEK
wrapped_dek_pw       (48 bytes)           -- DEK encrypted under KEK_pw
dek_pw_nonce         (24 bytes)
rec_salt             (16 bytes, public)   -- for recovery-code-derived KEK
wrapped_dek_rec      (48 bytes)           -- DEK encrypted under KEK_rec
dek_rec_nonce        (24 bytes)
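
The fixed sizes above lend themselves to a cheap shape check before a row is written. The following is a minimal sketch, not the project's actual code: the `UserKeyRecord` type and `validateUserKeyRecord` helper are hypothetical names, and only the field names and byte lengths come from the listing above. The 48-byte wraps correspond to a 32-byte DEK plus the 16-byte Poly1305 tag.

```typescript
// Expected byte lengths, copied from the per-user state listing.
const EXPECTED_LENGTHS = {
  auth_salt: 16,
  kek_salt: 16,
  wrapped_dek_pw: 48, // 32-byte DEK + 16-byte Poly1305 tag
  dek_pw_nonce: 24,
  rec_salt: 16,
  wrapped_dek_rec: 48,
  dek_rec_nonce: 24,
} as const;

type BinaryField = keyof typeof EXPECTED_LENGTHS;

// Hypothetical row type: the binary fields plus the text verifier hash.
type UserKeyRecord = Record<BinaryField, Uint8Array> & { auth_verifier_hash: string };

// Returns a list of problems; an empty list means the record has the documented shape.
function validateUserKeyRecord(rec: UserKeyRecord): string[] {
  const problems: string[] = [];
  for (const field of Object.keys(EXPECTED_LENGTHS) as BinaryField[]) {
    const actual = rec[field].length;
    if (actual !== EXPECTED_LENGTHS[field]) {
      problems.push(`${field}: expected ${EXPECTED_LENGTHS[field]} bytes, got ${actual}`);
    }
  }
  if (rec.auth_verifier_hash.length === 0) {
    problems.push("auth_verifier_hash: empty");
  }
  return problems;
}
```

A length check like this says nothing about whether the bytes are a valid wrap; the server cannot verify that, by design.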

The server never sees, derives, or stores:

  • The user's raw password.
  • The recovery code.
  • The DEK itself.
  • Plaintext title / tags / location / scheduled time for any private activity.

Why three salts?

Three independently random salts ensure that knowing one derivation tells you nothing about another:

  • auth_salt — input to the auth verifier the server holds.
  • kek_salt — input to KEK_pw, which unwraps the DEK.
  • rec_salt — input to KEK_rec, the recovery-code-derived unwrap key.

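In particular, auth_salt ≠ kek_salt guarantees that the auth_verifier the server receives at login is not the KEK and cannot be turned into it, so neither the verifier nor its stored hash is ever sufficient to unwrap the DEK. An attacker who brute-forces a leaked verifier hash back to the password itself still has to redo Argon2id against kek_salt to derive the KEK; salt separation protects against misuse of the verifier, not against a guessable password.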

Signup flow (client-driven)

  1. Client generates dek (32 bytes), kek_salt, rec_salt, auth_salt (16 bytes each).
  2. Client generates a high-entropy recovery_code (≥120 bits), shows it to the user once, and never sends it to the server.
  3. Client derives:
    • kek_pw = pwhash(password, kek_salt)
    • kek_rec = pwhash(recovery_code, rec_salt)
    • auth_verifier = pwhash(password, auth_salt) (≠ kek_pw because salts differ)
  4. Client wraps:
    • wrapped_dek_pw = AEAD(kek_pw, dek, dek_pw_nonce)
    • wrapped_dek_rec = AEAD(kek_rec, dek, dek_rec_nonce)
  5. Client posts the salts, wraps, nonces, and auth_verifier to the server.
  6. Server hashes auth_verifier with Bun.password.hash and stores the row.
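
The client side of steps 1 to 5 can be sketched as follows. This is an illustration under stated assumptions, not the contents of shared/crypto.ts: the `Primitives` interface is a hypothetical stand-in for the libsodium calls (crypto_pwhash, the XChaCha20-Poly1305 AEAD, randombytes_buf), and the 24-symbol Crockford-style alphabet is an assumed way of reaching 120 bits, since this document does not specify the actual alphabet or code length.

```typescript
// ASSUMPTION: Crockford-style base32 alphabet (no I, L, O, U), 32 symbols.
const ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Hypothetical primitive interface, to be backed by libsodium in real code.
interface Primitives {
  randomBytes(n: number): Uint8Array; // CSPRNG
  pwhash(secret: string, salt: Uint8Array): Uint8Array; // Argon2id, 32-byte output
  aeadEncrypt(key: Uint8Array, plaintext: Uint8Array, nonce: Uint8Array): Uint8Array;
}

// 24 symbols x 5 bits = 120 bits. 256 is a multiple of 32, so masking a
// random byte down to 5 bits introduces no modulo bias.
function generateRecoveryCode(p: Primitives, symbols = 24): string {
  const bytes = p.randomBytes(symbols);
  let code = "";
  for (let i = 0; i < bytes.length; i++) {
    code += ALPHABET.charAt((bytes[i] ?? 0) & 31);
  }
  return code;
}

interface SignupBody {
  auth_salt: Uint8Array;
  kek_salt: Uint8Array;
  rec_salt: Uint8Array;
  auth_verifier: Uint8Array;
  wrapped_dek_pw: Uint8Array;
  dek_pw_nonce: Uint8Array;
  wrapped_dek_rec: Uint8Array;
  dek_rec_nonce: Uint8Array;
}

function buildSignup(p: Primitives, password: string) {
  // Step 1: DEK and three independent salts.
  const dek = p.randomBytes(32);
  const kek_salt = p.randomBytes(16);
  const rec_salt = p.randomBytes(16);
  const auth_salt = p.randomBytes(16);
  // Step 2: recovery code, shown to the user and kept off the wire.
  const recoveryCode = generateRecoveryCode(p);
  // Step 3: three derivations, each against its own salt.
  const kekPw = p.pwhash(password, kek_salt);
  const kekRec = p.pwhash(recoveryCode, rec_salt);
  const auth_verifier = p.pwhash(password, auth_salt);
  // Step 4: wrap the same DEK twice under fresh nonces.
  const dek_pw_nonce = p.randomBytes(24);
  const dek_rec_nonce = p.randomBytes(24);
  const body: SignupBody = {
    auth_salt, kek_salt, rec_salt, auth_verifier,
    wrapped_dek_pw: p.aeadEncrypt(kekPw, dek, dek_pw_nonce), dek_pw_nonce,
    wrapped_dek_rec: p.aeadEncrypt(kekRec, dek, dek_rec_nonce), dek_rec_nonce,
  };
  // Step 5: only `body` is posted; `dek` and `recoveryCode` stay on the client.
  return { body, dek, recoveryCode };
}
```

Returning `dek` and `recoveryCode` separately from `body` keeps the "never sent" values out of the object that is serialized for the request.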

Unlock / login flow

  1. Client posts { email } to /api/auth/challenge; server returns the public parameters: { auth_salt, kek_salt, wrapped_dek_pw, dek_pw_nonce }.
  2. Client derives kek_pw and auth_verifier locally (two pwhash calls).
  3. Client posts { email, auth_verifier } to /api/auth/login; server verifies via Bun.password.verify and, on success, issues an httpOnly session cookie.
  4. Client unwraps the DEK locally. DEK lives in JS memory for the session.
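
Steps 2 and 4 can be sketched as two small client-side functions. The `UnlockPrimitives` interface and both function names are hypothetical; in particular, this sketch assumes an `aeadDecrypt` that returns null on a failed tag check, which is a convention chosen for the example rather than a description of libsodium's own error behaviour.

```typescript
// Hypothetical primitive interface, to be backed by libsodium in real code.
interface UnlockPrimitives {
  pwhash(secret: string, salt: Uint8Array): Uint8Array; // Argon2id, 32-byte output
  // ASSUMPTION: returns null when the authentication tag does not verify.
  aeadDecrypt(key: Uint8Array, ciphertext: Uint8Array, nonce: Uint8Array): Uint8Array | null;
}

// Public parameters returned by /api/auth/challenge.
interface ChallengeResponse {
  auth_salt: Uint8Array;
  kek_salt: Uint8Array;
  wrapped_dek_pw: Uint8Array;
  dek_pw_nonce: Uint8Array;
}

// Step 2: two independent derivations from the same password.
function deriveLoginKeys(p: UnlockPrimitives, password: string, ch: ChallengeResponse) {
  return {
    authVerifier: p.pwhash(password, ch.auth_salt), // posted to /api/auth/login
    kekPw: p.pwhash(password, ch.kek_salt), // never leaves the client
  };
}

// Step 4: unwrap the DEK locally once the server has accepted the verifier.
function unwrapDek(p: UnlockPrimitives, kekPw: Uint8Array, ch: ChallengeResponse): Uint8Array {
  const dek = p.aeadDecrypt(kekPw, ch.wrapped_dek_pw, ch.dek_pw_nonce);
  if (dek === null) {
    throw new Error("wrapped DEK failed authentication (wrong password or corrupted wrap)");
  }
  return dek;
}
```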

Password change

  1. Client unwraps DEK with the old kek_pw.
  2. Client generates kek_salt_new, dek_pw_nonce_new, auth_salt_new, derives kek_pw_new and auth_verifier_new, and produces wrapped_dek_pw_new.
  3. Client posts the new material to /api/auth/password. The server updates the password wrap and verifier in a single transaction.
  4. The recovery wrap (wrapped_dek_rec, rec_salt) is not touched — the recovery code still works.
  5. Activity ciphertexts are not re-encrypted — they're still under the same DEK.
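
A sketch of steps 1 and 2, under the same caveats as any illustration here: `RewrapPrimitives` is a hypothetical stand-in for the libsodium calls, and `buildPasswordChange` is an invented name. The point it tries to make concrete is that the request body contains only password-side material, so the recovery wrap cannot be touched by this path.

```typescript
// Hypothetical primitive interface, to be backed by libsodium in real code.
interface RewrapPrimitives {
  randomBytes(n: number): Uint8Array;
  pwhash(secret: string, salt: Uint8Array): Uint8Array;
  aeadEncrypt(key: Uint8Array, plaintext: Uint8Array, nonce: Uint8Array): Uint8Array;
  // ASSUMPTION: returns null when the authentication tag does not verify.
  aeadDecrypt(key: Uint8Array, ciphertext: Uint8Array, nonce: Uint8Array): Uint8Array | null;
}

// The password-side fields the client already holds from the challenge.
interface CurrentPasswordWrap {
  kek_salt: Uint8Array;
  wrapped_dek_pw: Uint8Array;
  dek_pw_nonce: Uint8Array;
}

// What gets posted to /api/auth/password. Note: no rec_salt, no wrapped_dek_rec.
interface PasswordChangeBody {
  auth_salt: Uint8Array;
  auth_verifier: Uint8Array;
  kek_salt: Uint8Array;
  wrapped_dek_pw: Uint8Array;
  dek_pw_nonce: Uint8Array;
}

function buildPasswordChange(
  p: RewrapPrimitives,
  current: CurrentPasswordWrap,
  oldPassword: string,
  newPassword: string,
): PasswordChangeBody {
  // Step 1: recover the DEK with the old password-derived KEK.
  const oldKek = p.pwhash(oldPassword, current.kek_salt);
  const dek = p.aeadDecrypt(oldKek, current.wrapped_dek_pw, current.dek_pw_nonce);
  if (dek === null) throw new Error("old password does not unwrap the DEK");

  // Step 2: fresh salts and nonce, then re-wrap the same DEK.
  const kek_salt = p.randomBytes(16);
  const auth_salt = p.randomBytes(16);
  const dek_pw_nonce = p.randomBytes(24);
  return {
    auth_salt,
    auth_verifier: p.pwhash(newPassword, auth_salt),
    kek_salt,
    wrapped_dek_pw: p.aeadEncrypt(p.pwhash(newPassword, kek_salt), dek, dek_pw_nonce),
    dek_pw_nonce,
  };
}
```

Because the DEK itself is unchanged, every existing activity ciphertext remains decryptable after the server applies this body.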

Recovery flow

The recovery code path is intentionally symmetric to the password path. The server cannot tell whether the submitted new wrap is "of the same DEK" — it just stores what the client sends. Trust is anchored entirely in the recovery-code holder.

  1. Client posts { email } to /api/auth/recovery-challenge; server returns { rec_salt, wrapped_dek_rec, dek_rec_nonce }.
  2. Client derives kek_rec = pwhash(recovery_code, rec_salt) and unwraps the DEK.
  3. Client chooses a new password, derives new salts/verifier/wrap as in signup, and posts to /api/auth/recovery-complete.
  4. Server replaces the password wrap, auth salt, and verifier in a single transaction. The recovery wrap is unchanged (the same recovery code keeps working).

Known limitation: lockout DoS

Anyone who knows a user's email can trigger /api/auth/recovery-complete and, without the recovery code, submit a "junk" new password wrap. The data is not disclosed (the attacker can't decrypt anything), but the legitimate user is locked out unless they still hold a logged-in session or the recovery code.

Mitigations (out of scope for the scaffold but intended):

  • Per-IP and per-email rate limiting on recovery endpoints.
  • Email confirmation before activating the new password wrap.
  • A "recovery verifier" stored server-side so the server can reject submissions that don't prove knowledge of the recovery code. This is a deviation from the spec's stated storage model and is therefore deferred.
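
As an illustration of the first mitigation only, a per-IP and per-email limiter could look roughly like the fixed-window counter below. Everything here is hypothetical (class name, key prefixes, limits); it is an in-memory sketch that never evicts old entries, so it is not production-ready, and it does not address the underlying lack of a recovery-code proof.

```typescript
// Fixed-window counter keyed by an arbitrary string.
class FixedWindowLimiter {
  private readonly hits = new Map<string, { windowStart: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the hit if `key` is still under its limit.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    if (entry === undefined || now - entry.windowStart >= this.windowMs) {
      // First hit, or the previous window has expired: start a new window.
      this.hits.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

// A recovery attempt must pass both the per-IP and the per-email budget.
// Both counters are always consulted so that neither budget can be bypassed
// by exhausting the other first.
function allowRecoveryAttempt(
  limiter: FixedWindowLimiter,
  ip: string,
  email: string,
  now: number = Date.now(),
): boolean {
  const ipOk = limiter.allow(`ip:${ip}`, now);
  const emailOk = limiter.allow(`email:${email.trim().toLowerCase()}`, now);
  return ipOk && emailOk;
}
```

Note that a per-email limit is itself a small lockout lever: an attacker can burn the legitimate user's recovery budget, which is one reason rate limiting alone does not close the issue.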

Private activity encryption

A private activity has a JSON payload:

{ title: string; tags: string[]; loc_label?: string; loc_lat?: number; loc_lng?: number; scheduled_at?: number }

The payload is JSON-serialized, encoded as UTF-8, and encrypted with crypto_aead_xchacha20poly1305_ietf_encrypt(payload, /* additional data */ null, /* secret nonce, unused */ null, nonce, dek). A fresh nonce is generated for every write (including updates).

The row stores only ciphertext and nonce. The title, loc_*, and scheduled_at columns are NULL and not used. No row in tags or activity_tags is created for a private activity — those tables hold only public/semi tag data.
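
The plaintext side of this (everything except the AEAD call itself) can be sketched as below. The helper names and the `visibility` column name are assumptions made for the example; the payload fields and the NULL plaintext columns follow the description above.

```typescript
interface PrivatePayload {
  title: string;
  tags: string[];
  loc_label?: string;
  loc_lat?: number;
  loc_lng?: number;
  scheduled_at?: number;
}

// JSON-serialize and encode as UTF-8; the result is the AEAD plaintext.
function encodePayload(payload: PrivatePayload): Uint8Array {
  return new TextEncoder().encode(JSON.stringify(payload));
}

// Inverse of encodePayload, applied to the bytes returned by AEAD decryption.
// Only the two required fields are checked; optional fields are passed through.
function decodePayload(plaintext: Uint8Array): PrivatePayload {
  const text = new TextDecoder("utf-8", { fatal: true }).decode(plaintext);
  const value: unknown = JSON.parse(text);
  if (typeof value !== "object" || value === null) {
    throw new Error("payload is not an object");
  }
  const fields = value as Record<string, unknown>;
  const tags = fields["tags"];
  if (typeof fields["title"] !== "string") throw new Error("payload.title missing");
  if (!Array.isArray(tags) || !tags.every((t) => typeof t === "string")) {
    throw new Error("payload.tags must be an array of strings");
  }
  return value as PrivatePayload;
}

// Row shape for a private activity: ciphertext and nonce only, plaintext columns NULL.
// ASSUMPTION: the `visibility` column name is illustrative.
function privateRow(ciphertext: Uint8Array, nonce: Uint8Array) {
  return {
    visibility: "private" as const,
    ciphertext,
    nonce,
    title: null,
    loc_label: null,
    loc_lat: null,
    loc_lng: null,
    scheduled_at: null,
  };
}
```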

Visibility transitions

Visibility transitions are explicit, client-driven operations rather than server flags:

  • private → semi/public: client decrypts locally, then issues a normal update that sets plaintext columns and clears ciphertext/nonce.
  • semi/public → private: client reads the plaintext, encrypts locally, then issues an update that sets ciphertext/nonce and clears plaintext columns. Server also deletes any rows in activity_tags for that activity.
  • semi ↔ public: server-side toggle. owner_id is unchanged; the API simply starts or stops including the owner in serialized responses.

Serialization rules

  • For semi activities, API responses must not include owner_id or any field that identifies the creator. The server still has owner_id for authorization (only the owner can edit/delete), but it is stripped from responses.
  • For public activities, owner_id (or a derived public handle) is serialized.
  • For private activities, responses are only returned to the owner and contain ciphertext + nonce; the client decrypts.
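
One way to make these rules hard to violate by accident is to build each response from an allow-list of fields instead of deleting fields from the row. The sketch below assumes a simplified, hypothetical `ActivityRow` shape and a numeric `viewerId`; the real server's types and serializer are not shown in this document.

```typescript
type Visibility = "private" | "semi" | "public";

// Hypothetical row shape; only the fields needed for the example.
interface ActivityRow {
  id: number;
  owner_id: number;
  visibility: Visibility;
  title: string | null;
  ciphertext: Uint8Array | null;
  nonce: Uint8Array | null;
}

// Returns the response object for `viewerId`, or null if the viewer may not see the row.
function serializeActivity(row: ActivityRow, viewerId: number): Record<string, unknown> | null {
  switch (row.visibility) {
    case "private":
      // Only the owner ever receives a private activity, and only as ciphertext.
      if (row.owner_id !== viewerId) return null;
      return { id: row.id, visibility: row.visibility, ciphertext: row.ciphertext, nonce: row.nonce };
    case "semi":
      // owner_id is deliberately absent: it is used for authorization, never serialized.
      return { id: row.id, visibility: row.visibility, title: row.title };
    case "public":
      return { id: row.id, visibility: row.visibility, title: row.title, owner_id: row.owner_id };
  }
}
```

With an allow-list, adding a new column to the table cannot leak it for semi activities unless someone explicitly adds it to that branch.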

Tags

  • Server-side tags and activity_tags tables hold only tags for semi and public activities. They are normalized to lowercase trimmed strings, joined to activities via activity_tags.
  • Private tags are stored only inside the encrypted activity payload, and indexed client-side in IndexedDB for autocomplete. They never reach the server.
  • The autocomplete endpoint returns matches from the server-side tags table only. The frontend may merge those results with the IndexedDB index, clearly labelled.
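
A frontend-side sketch of the normalization and the labelled merge might look like this. The function names, the `TagSuggestion` shape, and the choice to list private matches first are all assumptions for illustration; the only rules taken from the text are lowercase-and-trim normalization and that each suggestion must be labelled by source.

```typescript
type TagSource = "server" | "private";

interface TagSuggestion {
  tag: string;
  source: TagSource;
}

// Same normalization the server applies to semi/public tags: trim, then lowercase.
function normalizeTag(raw: string): string {
  return raw.trim().toLowerCase();
}

// Merges server results with the local IndexedDB index. A tag present in both
// sources is listed once per source so that its label stays unambiguous.
function mergeSuggestions(
  prefix: string,
  serverTags: string[],
  privateTags: string[],
): TagSuggestion[] {
  const needle = normalizeTag(prefix);
  const seen = new Set<string>();
  const out: TagSuggestion[] = [];

  const add = (tags: string[], source: TagSource): void => {
    for (const raw of tags) {
      const tag = normalizeTag(raw);
      const key = `${source}:${tag}`;
      if (tag.length === 0 || tag.indexOf(needle) !== 0 || seen.has(key)) continue;
      seen.add(key);
      out.push({ tag, source });
    }
  };

  add(privateTags, "private"); // ASSUMPTION: private matches are shown first
  add(serverTags, "server");
  return out;
}
```

The merge runs entirely in the browser, so the private tag list is never part of any request.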

Things to flag, not silently change

The spec invites flagging anything cryptographically unsound. The above design follows the spec exactly. The one place where I would push back before going to production is the recovery lockout DoS: without a server-side proof of the recovery code, the recovery endpoint is a soft-DoS vector. It is documented above and deferred for this scaffold only.