# SECURITY.md — Vinterliste key & trust model
This document is the authoritative description of what the server can and cannot see,
and how keys are derived, wrapped, and rotated. The crypto code in `shared/crypto.ts`
is written against this document; if behaviour diverges from what's described here,
the document is the source of truth and the code must be fixed.
## Threat model
We assume:
- The server operator is honest-but-curious. They may inspect the database file,
the request logs, and memory of the server process at any point.
- The TLS terminator is trusted not to MITM. (For local podman deployment behind a
reverse proxy, this is operator-controlled.)
- The user's browser session is trusted while the user is logged in (the DEK lives
in memory there).
- An attacker may know the user's email address.
We protect against:
- Server-side disclosure of **private** activity contents (title, tags, location,
scheduled time).
- Server-side disclosure of the user's password.
- Account takeover by an attacker who has the database but not the password or the
recovery code.
We do **not** protect against:
- Compromise of the user's browser (XSS, malicious extensions). A logged-in client
holds the DEK in JS memory — anything in the same origin can read it. Mitigated
but not eliminated by a strict CSP.
- Targeted denial-of-service via password resets. See "Recovery flow" below.
- Side-channel attacks against `libsodium`'s WASM build (we use the SUMO build,
`libsodium-wrappers-sumo`, because the standard build omits `crypto_pwhash`).
- Traffic analysis (number/timing/size of private activities is observable).
## Primitives (locked — do not substitute)
| Purpose | Primitive | libsodium API |
|----------------------------------------|-----------------------------------------------|--------------------------------------------|
| Password / recovery-code key derivation | Argon2id, 32-byte raw output | `crypto_pwhash` |
| Authenticated encryption (AEAD) | XChaCha20-Poly1305-IETF, 24-byte nonce | `crypto_aead_xchacha20poly1305_ietf_*` |
| Random bytes (DEK, salts, nonces, code) | CSPRNG | `randombytes_buf` |
| Server-side verifier hash | Argon2id (Bun's default tuning) | `Bun.password.hash` / `.verify` |

Argon2id parameters for `crypto_pwhash` use the libsodium `MODERATE` profile
(`crypto_pwhash_OPSLIMIT_MODERATE`, `crypto_pwhash_MEMLIMIT_MODERATE`). They are
recorded as constants in `shared/crypto.ts` and must be identical at signup and
at every later unlock. If the parameters are ever tuned upward, the parameters
each existing user signed up with must be stored per user so unlock still works.
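
A minimal sketch of what "stored per-user" could look like if the parameters are ever retuned. The `kdf_params` column, the `UserKdfRow` type, and `paramsForUnlock` are hypothetical names, not part of the current schema. The numeric values are the ones libsodium defines for the Argon2id `MODERATE` profile.

```typescript
// Hypothetical shape: the current schema does not store KDF parameters.
interface KdfParams {
  opslimit: number; // Argon2id passes
  memlimit: number; // memory in bytes
}

// Values of crypto_pwhash_OPSLIMIT_MODERATE / crypto_pwhash_MEMLIMIT_MODERATE
// for Argon2id: 3 passes, 256 MiB.
const LEGACY_PARAMS: KdfParams = { opslimit: 3, memlimit: 256 * 1024 * 1024 };

// Assumed per-user column. null means the account predates any retuning and
// was therefore created with LEGACY_PARAMS.
interface UserKdfRow {
  kdf_params: KdfParams | null;
}

// Unlock must use the parameters the wrap was created with, never "current".
function paramsForUnlock(row: UserKdfRow): KdfParams {
  return row.kdf_params ?? LEGACY_PARAMS;
}
```

The point of the fallback is that a missing value maps to the original profile rather than to whatever the newest defaults happen to be.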
## Per-user state (server-stored)
For each user the server stores **exactly**:
```
auth_salt (16 bytes, public) -- distinct from kek_salt
auth_verifier_hash (text) -- Bun.password.hash of the auth verifier
kek_salt (16 bytes, public) -- for password-derived KEK
wrapped_dek_pw (48 bytes) -- DEK encrypted under KEK_pw
dek_pw_nonce (24 bytes)
rec_salt (16 bytes, public) -- for recovery-code-derived KEK
wrapped_dek_rec (48 bytes) -- DEK encrypted under KEK_rec
dek_rec_nonce (24 bytes)
```
The server never sees, derives, or stores:
- The user's raw password.
- The recovery code.
- The DEK itself.
- Plaintext title / tags / location / scheduled time for any **private** activity.
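
The stored fields above can be sketched as a type plus a length check that a server could run before inserting a row. The names `UserCryptoRow` and `lengthErrors` are illustrative assumptions. The 48-byte wrap size follows from a 32-byte DEK plus the 16-byte Poly1305 tag.

```typescript
// Field names mirror the list above; the type itself is a hypothetical sketch.
interface UserCryptoRow {
  auth_salt: Uint8Array;
  auth_verifier_hash: string;
  kek_salt: Uint8Array;
  wrapped_dek_pw: Uint8Array;
  dek_pw_nonce: Uint8Array;
  rec_salt: Uint8Array;
  wrapped_dek_rec: Uint8Array;
  dek_rec_nonce: Uint8Array;
}

// Expected byte lengths: 16-byte salts, 24-byte XChaCha20 nonces,
// 48-byte wraps (32-byte DEK + 16-byte authentication tag).
const EXPECTED_LENGTHS = {
  auth_salt: 16,
  kek_salt: 16,
  rec_salt: 16,
  wrapped_dek_pw: 48,
  wrapped_dek_rec: 48,
  dek_pw_nonce: 24,
  dek_rec_nonce: 24,
} as const;

// Returns one message per field whose length is wrong; empty means valid.
function lengthErrors(row: UserCryptoRow): string[] {
  const errors: string[] = [];
  for (const [field, expected] of Object.entries(EXPECTED_LENGTHS)) {
    const actual = row[field as keyof typeof EXPECTED_LENGTHS].length;
    if (actual !== expected) {
      errors.push(`${field}: expected ${expected} bytes, got ${actual}`);
    }
  }
  return errors;
}
```

A length check like this cannot tell the server anything about the keys; it only rejects malformed submissions early.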
## Why three salts?
Three independently random salts ensure that knowing one derivation tells you
nothing about another:
- `auth_salt` — input to the **auth verifier** the server holds.
- `kek_salt` — input to **KEK_pw**, which unwraps the DEK.
- `rec_salt` — input to **KEK_rec**, the recovery-code-derived unwrap key.
In particular, `auth_salt ≠ kek_salt` guarantees that even if a server-side
breach leaks the verifier hash *and* an attacker brute-forces it, they still
need to redo Argon2id against `kek_salt` to derive the KEK. The verifier hash
is never sufficient on its own.
## Signup flow (client-driven)
1. Client generates `dek` (32 bytes), `kek_salt`, `rec_salt`, `auth_salt` (16 bytes each).
2. Client generates a high-entropy `recovery_code` (≥120 bits), shows it to the
user, and never sends it.
3. Client derives:
- `kek_pw = pwhash(password, kek_salt)`
- `kek_rec = pwhash(recovery_code, rec_salt)`
- `auth_verifier = pwhash(password, auth_salt)` (≠ kek_pw because salts differ)
4. Client wraps:
- `wrapped_dek_pw = AEAD(kek_pw, dek, dek_pw_nonce)`
- `wrapped_dek_rec = AEAD(kek_rec, dek, dek_rec_nonce)`
5. Client posts the salts, wraps, nonces, and `auth_verifier` to the server.
6. Server hashes `auth_verifier` with `Bun.password.hash` and stores the row.
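
Step 2 can be illustrated with a short generator. This is a sketch under stated assumptions: the 32-symbol alphabet and the grouping into blocks of four are hypothetical choices, and the sketch uses the Web Crypto `crypto.getRandomValues` call to stay dependency-free, whereas the primitives table above specifies `randombytes_buf`. With 5 bits per symbol, 24 symbols give exactly 120 bits.

```typescript
// Assumed alphabet: 32 symbols (5 bits each), omitting 0, 1, I and O.
const RECOVERY_ALPHABET = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ";

// Generates `symbols` random characters, grouped as XXXX-XXXX-... for reading.
function generateRecoveryCode(symbols: number = 24): string {
  const bytes = new Uint8Array(symbols);
  crypto.getRandomValues(bytes); // CSPRNG; stand-in for randombytes_buf
  let code = "";
  for (let i = 0; i < bytes.length; i++) {
    // 256 is a multiple of 32, so masking to 5 bits adds no modulo bias.
    code += RECOVERY_ALPHABET[bytes[i] & 31];
    if (i % 4 === 3 && i !== bytes.length - 1) code += "-";
  }
  return code;
}
```

Before deriving `kek_rec`, the client would presumably strip the separators and uppercase the input so that formatting differences do not change the derived key.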
## Unlock / login flow
1. Client posts `{ email }` to `/api/auth/challenge`; server returns the public
parameters: `{ auth_salt, kek_salt, wrapped_dek_pw, dek_pw_nonce }`.
2. Client derives `kek_pw` and `auth_verifier` locally (two `pwhash` calls).
3. Client posts `{ email, auth_verifier }` to `/api/auth/login`; server verifies
via `Bun.password.verify` and, on success, issues an httpOnly session cookie.
4. Client unwraps the DEK locally. DEK lives in JS memory for the session.
## Password change
1. Client unwraps DEK with the old `kek_pw`.
2. Client generates `kek_salt_new`, `dek_pw_nonce_new`, `auth_salt_new`, derives
`kek_pw_new` and `auth_verifier_new`, and produces `wrapped_dek_pw_new`.
3. Client posts the new material to `/api/auth/password`. The server updates the
password wrap, its nonce, both password salts, and the verifier in a single
transaction.
4. The recovery wrap (`wrapped_dek_rec`, `rec_salt`) is **not touched**, so
the recovery code still works.
5. Activity ciphertexts are **not re-encrypted** — they're still under the
same DEK.
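
The client side of steps 2 and 3 can be sketched as a pure function. The `Primitives` interface is a hypothetical seam standing in for the libsodium calls in the primitives table (it is not the real API of `shared/crypto.ts`), and `rewrapForNewPassword` is an illustrative name.

```typescript
// Hypothetical seam for the locked primitives; a real caller would back
// these with randombytes_buf, crypto_pwhash and the XChaCha20-Poly1305 AEAD.
interface Primitives {
  randomBytes(n: number): Uint8Array;
  pwhash(secret: string, salt: Uint8Array): Uint8Array; // 32-byte output
  aeadEncrypt(key: Uint8Array, plaintext: Uint8Array, nonce: Uint8Array): Uint8Array;
}

// Everything the client posts on a password change. Note the absence of any
// recovery field: the recovery wrap is left as it is on the server.
interface PasswordMaterial {
  auth_salt: Uint8Array;
  auth_verifier: Uint8Array;
  kek_salt: Uint8Array;
  wrapped_dek_pw: Uint8Array;
  dek_pw_nonce: Uint8Array;
}

function rewrapForNewPassword(
  dek: Uint8Array,
  newPassword: string,
  p: Primitives,
): PasswordMaterial {
  // Fresh, independent salts and nonce for the new password.
  const kek_salt = p.randomBytes(16);
  const auth_salt = p.randomBytes(16);
  const dek_pw_nonce = p.randomBytes(24);
  const kek_pw = p.pwhash(newPassword, kek_salt);
  return {
    auth_salt,
    auth_verifier: p.pwhash(newPassword, auth_salt),
    kek_salt,
    wrapped_dek_pw: p.aeadEncrypt(kek_pw, dek, dek_pw_nonce),
    dek_pw_nonce,
  };
}
```

Because the DEK is passed in unchanged, no activity ciphertext needs to be touched; only the wrap around the DEK is replaced.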
## Recovery flow
The recovery code path is intentionally symmetric to the password path. The
server cannot tell whether the submitted new wrap is "of the same DEK" — it
just stores what the client sends. Trust is anchored entirely in the
recovery-code holder.
1. Client posts `{ email }` to `/api/auth/recovery-challenge`; server returns
`{ rec_salt, wrapped_dek_rec, dek_rec_nonce }`.
2. Client derives `kek_rec = pwhash(recovery_code, rec_salt)` and unwraps the DEK.
3. Client chooses a new password, derives new salts/verifier/wrap as in signup,
and posts to `/api/auth/recovery-complete`.
4. Server replaces the password wrap (with its nonce and `kek_salt`), the auth
salt, and the verifier in a single transaction. The recovery wrap is unchanged
(the same recovery code keeps working).
### Known limitation: lockout DoS
Anyone who knows a user's email can trigger `/api/auth/recovery-complete` and,
without the recovery code, submit a "junk" new password wrap. The data is **not
disclosed** (the attacker can't decrypt anything), but the legitimate user is
locked out unless they still hold a logged-in session or the recovery code.
Mitigations (out of scope for the scaffold but intended):
- Per-IP and per-email rate limiting on recovery endpoints.
- Email confirmation before activating the new password wrap.
- A "recovery verifier" stored server-side so the server can reject submissions
that don't prove knowledge of the recovery code. This is a deviation from the
spec's stated storage model and is therefore deferred.
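
As an illustration of the first mitigation, a per-key sliding-window limiter might look like the sketch below. The class name, the in-memory `Map`, and the limits in the usage note are assumptions; a real deployment would likely persist counters and choose its own thresholds.

```typescript
// In-memory sliding-window limiter. Keys are arbitrary strings, so the same
// class can serve per-IP and per-email limits.
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly maxHits: number;
  private readonly windowMs: number;

  constructor(maxHits: number, windowMs: number) {
    this.maxHits = maxHits;
    this.windowMs = windowMs;
  }

  // Returns true and records the attempt if the key is under its limit.
  allow(key: string, now: number = Date.now()): boolean {
    const recent = (this.hits.get(key) ?? []).filter((t) => now - t < this.windowMs);
    if (recent.length >= this.maxHits) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}
```

A recovery handler could check both `limiter.allow("ip:" + ip)` and `limiter.allow("email:" + email)` before doing any work, for example with a hypothetical limit of 5 attempts per hour. Rate limiting only slows the lockout attack; it does not replace a server-side proof of the recovery code.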
## Private activity encryption
A `private` activity has a JSON payload:
```ts
{ title: string; tags: string[]; loc_label?: string; loc_lat?: number; loc_lng?: number; scheduled_at?: number }
```
The payload is JSON-serialized, encoded as UTF-8, and encrypted with
`crypto_aead_xchacha20poly1305_ietf_encrypt(payload, /* additional data */ null,
/* secret nonce, unused */ null, nonce, dek)`. A fresh nonce is generated for
every write (including updates).
The row stores only `ciphertext` and `nonce`. The `title`, `loc_*`, and
`scheduled_at` columns are `NULL` and not used. No row in `tags` or
`activity_tags` is created for a private activity — those tables hold only
public/semi tag data.
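
The serialization half of this step can be sketched without any crypto dependency. `encodePayload` and `decodePayload` are illustrative names; the bytes returned by `encodePayload` are what would be handed to the AEAD call together with a fresh 24-byte nonce, and `decodePayload` would run on the output of a successful decrypt.

```typescript
interface PrivatePayload {
  title: string;
  tags: string[];
  loc_label?: string;
  loc_lat?: number;
  loc_lng?: number;
  scheduled_at?: number;
}

// JSON-serialize, then encode as UTF-8: the plaintext input to the AEAD.
function encodePayload(payload: PrivatePayload): Uint8Array {
  return new TextEncoder().encode(JSON.stringify(payload));
}

// Inverse of encodePayload, with a minimal shape check on the required fields.
function decodePayload(bytes: Uint8Array): PrivatePayload {
  const text = new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  const parsed = JSON.parse(text) as Partial<PrivatePayload> | null;
  if (
    parsed === null ||
    typeof parsed !== "object" ||
    typeof parsed.title !== "string" ||
    !Array.isArray(parsed.tags) ||
    !parsed.tags.every((t) => typeof t === "string")
  ) {
    throw new Error("decrypted payload has an unexpected shape");
  }
  return parsed as PrivatePayload;
}
```

The shape check is an extra assumption rather than something the design requires: AEAD already guarantees integrity, so the check mainly guards against payloads written by an older or buggy client.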
## Visibility transitions
Visibility transitions are explicit, client-driven operations rather than
server flags:
- `private → semi/public`: client decrypts locally, then issues a normal update
that sets plaintext columns and clears `ciphertext`/`nonce`.
- `semi/public → private`: client reads the plaintext, encrypts locally, then
issues an update that sets `ciphertext`/`nonce` and clears plaintext columns.
Server also deletes any rows in `activity_tags` for that activity.
- `semi ↔ public`: server-side toggle. `owner_id` is unchanged; the API simply
starts or stops including the owner in serialized responses.
## Serialization rules
- For `semi` activities, API responses must **not** include `owner_id` or any
field that identifies the creator. The server still has `owner_id` for
authorization (only the owner can edit/delete), but it is stripped from
responses.
- For `public` activities, `owner_id` (or a derived public handle) **is**
serialized.
- For `private` activities, responses are only returned to the owner and contain
`ciphertext` + `nonce`; the client decrypts.
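
One way to make these rules hard to break is to build each response from an allowlist instead of deleting fields from the row. The sketch below assumes a simplified, hypothetical `ActivityRow` (the real table has more columns) and an illustrative `serializeActivity` helper.

```typescript
type Visibility = "private" | "semi" | "public";

// Simplified, assumed row shape for illustration only.
interface ActivityRow {
  id: number;
  owner_id: number;
  visibility: Visibility;
  title: string | null;
  ciphertext: string | null;
  nonce: string | null;
}

// Returns the JSON-safe response body, or null if the viewer may not see it.
function serializeActivity(
  row: ActivityRow,
  viewerId: number | null,
): Record<string, unknown> | null {
  switch (row.visibility) {
    case "private":
      // Only the owner ever receives the ciphertext.
      if (viewerId !== row.owner_id) return null;
      return { id: row.id, visibility: "private", ciphertext: row.ciphertext, nonce: row.nonce };
    case "semi":
      // No owner_id and no other creator-identifying field.
      return { id: row.id, visibility: "semi", title: row.title };
    case "public":
      return { id: row.id, visibility: "public", title: row.title, owner_id: row.owner_id };
  }
}
```

With an allowlist, a column added to the table later cannot leak into `semi` responses by accident.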
## Tags
- Server-side `tags` and `activity_tags` tables hold only tags for `semi` and
`public` activities. Tags are trimmed and lowercased before storage and are
linked to activities through `activity_tags`.
- Private tags are stored only inside the encrypted activity payload, and
indexed client-side in IndexedDB for autocomplete. They never reach the
server.
- The autocomplete endpoint returns matches from the server-side `tags` table
only. The frontend may merge those results with the IndexedDB index, clearly
labelled.
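
The normalization and the labelled merge can be sketched as follows. `normalizeTag`, `mergeSuggestions`, and the `source` label values are hypothetical names. Applying the same normalization to private tags is an assumption made here so that both sources compare consistently.

```typescript
// Trim, then lowercase, matching the server-side rule described above.
function normalizeTag(raw: string): string {
  return raw.trim().toLowerCase();
}

type TagSource = "server" | "private";

interface TagSuggestion {
  tag: string;
  source: TagSource;
}

// Merges server results with the local IndexedDB index. A tag present in both
// sources appears twice, once per label, so the user can see where it lives.
function mergeSuggestions(serverTags: string[], privateTags: string[]): TagSuggestion[] {
  const seen = new Set<string>();
  const out: TagSuggestion[] = [];
  const add = (raw: string, source: TagSource): void => {
    const tag = normalizeTag(raw);
    const key = `${source}:${tag}`;
    if (tag === "" || seen.has(key)) return; // skip blanks and duplicates
    seen.add(key);
    out.push({ tag, source });
  };
  serverTags.forEach((t) => add(t, "server"));
  privateTags.forEach((t) => add(t, "private"));
  // Sort by tag, then by source, for a stable display order.
  return out.sort((a, b) =>
    a.tag === b.tag ? (a.source < b.source ? -1 : 1) : a.tag < b.tag ? -1 : 1,
  );
}
```

The merge runs entirely in the browser, so private tags are never sent to the autocomplete endpoint.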
## Things to flag, not silently change
The spec invites flagging anything cryptographically unsound. The design above
follows the spec exactly. The one place where I would push back before going
to production is the recovery lockout DoS: without a server-side proof of the
recovery code, the recovery endpoint is a soft-DoS vector. It is documented
above and deferred for the scaffold.