Scaffold Vinterliste — end-to-end encrypted winter activity list

Foundation for an E2E-encrypted activity list per
winter-list-claude-code-prompt.md.

Server (Bun + Hono):
- bun:sqlite with WAL and the spec's schema (idempotent migration)
- opaque server-stored sessions, httpOnly cookie
- signup / challenge / login / logout / me / password / recovery-challenge /
  recovery-complete
- activity CRUD with strict visibility rules: private uses ciphertext+nonce,
  semi never serializes owner_id, public attributes the owner
- tag store with normalisation + autocomplete (semi/public only)

Frontend (Svelte 5 + Vite):
- libsodium-wrappers-sumo for client crypto (Argon2id + XChaCha20-Poly1305).
  SUMO is required because the standard build omits crypto_pwhash.
- IndexedDB-backed private tag index (never leaves the browser)
- in-memory DEK (no localStorage); page reload re-prompts for password
- signup shows the recovery code once; tag input merges server + private
  sources with clear labelling
- Bokmål UI

Crypto module (shared/crypto.ts):
- pure, runs in both Bun and the browser via a runtime-conditional loader
  that papers over libsodium-wrappers-sumo's broken ESM entry (createRequire
  on server, Vite alias in the browser)
- DEK wrap/unwrap, AEAD payload encryption, recovery code generation with
  a visually-unambiguous alphabet

Verification:
- 22 crypto round-trip tests (wrap/unwrap, AEAD tamper rejection, password
  change preserves ciphertexts, recovery still works after rotation)
- typecheck passes for server and frontend
- Vite production build succeeds; libsodium SUMO chunk is ~315 KB gzipped

Single-image Containerfile for podman: builds frontend in a builder stage,
runs Bun in a slim runtime; one volume for the SQLite file; BUILD_DATE /
GIT_REVISION baked into OCI labels and /etc/build-info.

Known limitation deferred for this commit: the recovery endpoint has no
server-side proof of the recovery code (anyone who knows an email can lock
out the legitimate user, though they can't read any data). Closed in the
next commit.
Ole-Morten Duesund 2026-05-25 12:27:14 +02:00
commit 47963c9225
39 changed files with 4007 additions and 0 deletions

SECURITY.md Normal file

@ -0,0 +1,215 @@
# SECURITY.md — Vinterliste key & trust model
This document is the authoritative description of what the server can and cannot see,
and how keys are derived, wrapped, and rotated. The crypto code in `shared/crypto.ts`
is written against this document; if behaviour diverges from what's described here,
the document is the source of truth and the code must be fixed.
## Threat model
We assume:
- The server operator is honest-but-curious. They may inspect the database file,
the request logs, and memory of the server process at any point.
- The TLS terminator is trusted not to MITM. (For local podman deployment behind a
reverse proxy, this is operator-controlled.)
- The user's browser session is trusted while the user is logged in (the DEK lives
in memory there).
- An attacker may know the user's email address.
We protect against:
- Server-side disclosure of **private** activity contents (title, tags, location,
scheduled time).
- Server-side disclosure of the user's password.
- Account takeover by an attacker who has the database but not the password or the
recovery code.
We do **not** protect against:
- Compromise of the user's browser (XSS, malicious extensions). A logged-in client
holds the DEK in JS memory — anything in the same origin can read it. Mitigated
but not eliminated by a strict CSP.
- Targeted denial-of-service via password resets. See "Recovery flow" below.
- Side-channel attacks against `libsodium`'s WASM build (we use
`libsodium-wrappers-sumo` because the standard build omits `crypto_pwhash`).
- Traffic analysis (number/timing/size of private activities is observable).
## Primitives (locked — do not substitute)
| Purpose | Primitive | libsodium API |
|----------------------------------------|-----------------------------------------------|--------------------------------------------|
| Password / recovery-code key derivation | Argon2id, 32-byte raw output | `crypto_pwhash` |
| Authenticated encryption (AEAD) | XChaCha20-Poly1305-IETF, 24-byte nonce | `crypto_aead_xchacha20poly1305_ietf_*` |
| Random bytes (DEK, salts, nonces, code) | CSPRNG | `randombytes_buf` |
| Server-side verifier hash | Argon2id (Bun's default tuning) | `Bun.password.hash` / `.verify` |
Argon2id parameters for `crypto_pwhash` use the libsodium `MODERATE` profile
(`crypto_pwhash_OPSLIMIT_MODERATE`, `crypto_pwhash_MEMLIMIT_MODERATE`). They are
recorded as constants in `shared/crypto.ts` and must be kept consistent between
signup and any future unlock — if parameters are tuned upward in the future, the
old parameters must be stored per-user so unlock still works.
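
One way to keep unlock working after a future retune is to store a small version number per user and look the parameters up from it. The sketch below assumes a hypothetical per-user version number and a hypothetical `PWHASH_PARAMS` table; neither is part of the spec's storage model. The numeric values are libsodium's MODERATE limits for Argon2id (3 passes, 256 MiB).

```typescript
// Hypothetical per-user versioning of Argon2id parameters (not in the spec).
interface PwhashParams {
  opslimit: number; // passes over memory
  memlimit: number; // bytes
}

// Version 1 mirrors libsodium's MODERATE profile for Argon2id:
// crypto_pwhash_OPSLIMIT_MODERATE = 3, crypto_pwhash_MEMLIMIT_MODERATE = 256 MiB.
const PWHASH_PARAMS: Record<number, PwhashParams> = {
  1: { opslimit: 3, memlimit: 256 * 1024 * 1024 },
};

// Assumed policy: new signups use the newest version, existing users keep
// the version recorded on their row.
const CURRENT_PWHASH_VERSION = 1;

function paramsForVersion(version: number): PwhashParams {
  const params = PWHASH_PARAMS[version];
  if (params === undefined) {
    throw new Error("unknown pwhash parameter version: " + version);
  }
  return params;
}
```

Tuning upward would then mean adding a version 2 entry and bumping `CURRENT_PWHASH_VERSION`, while rows that still record version 1 continue to unlock with the old limits.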
## Per-user state (server-stored)
For each user the server stores **exactly**:
```
auth_salt (16 bytes, public) -- distinct from kek_salt
auth_verifier_hash (text) -- Bun.password.hash of the auth verifier
kek_salt (16 bytes, public) -- for password-derived KEK
wrapped_dek_pw (48 bytes) -- DEK encrypted under KEK_pw
dek_pw_nonce (24 bytes)
rec_salt (16 bytes, public) -- for recovery-code-derived KEK
wrapped_dek_rec (48 bytes) -- DEK encrypted under KEK_rec
dek_rec_nonce (24 bytes)
```
The server never sees, derives, or stores:
- The user's raw password.
- The recovery code.
- The DEK itself.
- Plaintext title / tags / location / scheduled time for any **private** activity.
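
As a rough illustration of the stored shape, the row can be typed and its binary fields length-checked before insertion. The interface and helper names below are hypothetical; the field names and byte lengths come from the listing above, where 48 bytes is the 32-byte DEK plus the 16-byte Poly1305 tag.

```typescript
// Hypothetical typing of the per-user row; field names follow the listing above.
interface UserKeyRow {
  auth_salt: Uint8Array;
  auth_verifier_hash: string;
  kek_salt: Uint8Array;
  wrapped_dek_pw: Uint8Array;
  dek_pw_nonce: Uint8Array;
  rec_salt: Uint8Array;
  wrapped_dek_rec: Uint8Array;
  dek_rec_nonce: Uint8Array;
}

// Expected byte lengths for the binary fields.
// 48 = 32-byte DEK + 16-byte Poly1305 authentication tag.
const EXPECTED_LENGTHS = {
  auth_salt: 16,
  kek_salt: 16,
  wrapped_dek_pw: 48,
  dek_pw_nonce: 24,
  rec_salt: 16,
  wrapped_dek_rec: 48,
  dek_rec_nonce: 24,
} as const;

type BinaryField = keyof typeof EXPECTED_LENGTHS;

// Returns the names of fields with the wrong length; an empty array means
// the row is well-formed.
function findLengthErrors(row: UserKeyRow): BinaryField[] {
  const fields = Object.keys(EXPECTED_LENGTHS) as BinaryField[];
  return fields.filter((f) => row[f].length !== EXPECTED_LENGTHS[f]);
}
```

A length check like this cannot tell the server anything about the DEK; it only rejects malformed submissions early.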
## Why three salts?
Three independently random salts ensure that knowing one derivation tells you
nothing about another:
- `auth_salt` — input to the **auth verifier** the server holds.
- `kek_salt` — input to **KEK_pw**, which unwraps the DEK.
- `rec_salt` — input to **KEK_rec**, the recovery-code-derived unwrap key.
In particular, `auth_salt ≠ kek_salt` means the auth verifier and `kek_pw` are
different Argon2id outputs. Even if a server-side breach leaks the verifier hash
*and* an attacker brute-forces it back to the password, they still need to redo
Argon2id against `kek_salt` to derive the KEK. Neither the verifier nor its hash
is ever sufficient on its own to unwrap the DEK.
## Signup flow (client-driven)
1. Client generates `dek` (32 bytes), `kek_salt`, `rec_salt`, `auth_salt` (16 bytes each).
2. Client generates a high-entropy `recovery_code` (≥120 bits), shows it to the
user, and never sends it.
3. Client derives:
- `kek_pw = pwhash(password, kek_salt)`
- `kek_rec = pwhash(recovery_code, rec_salt)`
- `auth_verifier = pwhash(password, auth_salt)` (≠ kek_pw because salts differ)
4. Client wraps the DEK twice, each time under a fresh random 24-byte nonce:
- `wrapped_dek_pw = AEAD(kek_pw, dek, dek_pw_nonce)`
- `wrapped_dek_rec = AEAD(kek_rec, dek, dek_rec_nonce)`
5. Client posts the salts, wraps, nonces, and `auth_verifier` to the server.
6. Server hashes `auth_verifier` with `Bun.password.hash` and stores the row.
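
The client side of the steps above can be sketched roughly as follows. This is a minimal sketch, not the code in `shared/crypto.ts`: `SignupPrimitives` is a hypothetical seam over the libsodium calls, and the recovery-code alphabet (Crockford base32) and the 24-symbol length are assumptions chosen to reach 120 bits.

```typescript
// Hypothetical seam over the primitives so the flow can be read on its own.
// In practice these would be backed by randombytes_buf, crypto_pwhash and
// the XChaCha20-Poly1305 AEAD.
interface SignupPrimitives {
  randomBytes(n: number): Uint8Array;
  pwhash(secret: string, salt: Uint8Array): Uint8Array; // 32-byte output
  aeadEncrypt(key: Uint8Array, plaintext: Uint8Array, nonce: Uint8Array): Uint8Array;
}

// Assumed alphabet (Crockford base32: no I, L, O, U). 32 symbols carry
// 5 bits each, so 24 symbols give 120 bits.
const RECOVERY_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

function generateRecoveryCode(p: SignupPrimitives, symbols: number = 24): string {
  const bytes = p.randomBytes(symbols);
  let code = "";
  for (let i = 0; i < bytes.length; i++) {
    // 256 is a multiple of 32, so masking to 5 bits introduces no bias.
    code += RECOVERY_ALPHABET.charAt((bytes[i] ?? 0) & 31);
  }
  return code;
}

function buildSignup(p: SignupPrimitives, password: string, recoveryCode: string) {
  // Step 1: DEK, three independent salts, and one nonce per wrap.
  const dek = p.randomBytes(32);
  const kek_salt = p.randomBytes(16);
  const rec_salt = p.randomBytes(16);
  const auth_salt = p.randomBytes(16);
  const dek_pw_nonce = p.randomBytes(24);
  const dek_rec_nonce = p.randomBytes(24);

  // Step 3: three derivations, each with its own salt.
  const kek_pw = p.pwhash(password, kek_salt);
  const kek_rec = p.pwhash(recoveryCode, rec_salt);
  const auth_verifier = p.pwhash(password, auth_salt);

  // Steps 4 and 5: only salts, wraps, nonces and the verifier are posted.
  const request = {
    auth_salt,
    auth_verifier,
    kek_salt,
    wrapped_dek_pw: p.aeadEncrypt(kek_pw, dek, dek_pw_nonce),
    dek_pw_nonce,
    rec_salt,
    wrapped_dek_rec: p.aeadEncrypt(kek_rec, dek, dek_rec_nonce),
    dek_rec_nonce,
  };
  // The DEK stays in client memory; it is returned separately so it can
  // never end up in the request body by accident.
  return { dek, request };
}
```

Keeping `dek` out of the `request` object, rather than stripping it later, makes it harder to leak the key through a careless `JSON.stringify`.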
## Unlock / login flow
1. Client posts `{ email }` to `/api/auth/challenge`; server returns the public
parameters: `{ auth_salt, kek_salt, wrapped_dek_pw, dek_pw_nonce }`.
2. Client derives `kek_pw` and `auth_verifier` locally (two `pwhash` calls).
3. Client posts `{ email, auth_verifier }` to `/api/auth/login`; server verifies
via `Bun.password.verify` and, on success, issues an httpOnly session cookie.
4. Client unwraps the DEK locally. DEK lives in JS memory for the session.
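
Steps 2 and 4 can be sketched as two small client-side functions. The `UnlockPrimitives` interface is a hypothetical adapter, and returning `null` on a failed decrypt is an assumption of this sketch (a real libsodium binding may throw instead).

```typescript
// Hypothetical adapter over the primitives used during unlock.
interface UnlockPrimitives {
  pwhash(secret: string, salt: Uint8Array): Uint8Array;
  // Assumed to return null when the key is wrong or the wrap was tampered with.
  aeadDecrypt(key: Uint8Array, ciphertext: Uint8Array, nonce: Uint8Array): Uint8Array | null;
}

// Public parameters returned by /api/auth/challenge.
interface LoginChallenge {
  auth_salt: Uint8Array;
  kek_salt: Uint8Array;
  wrapped_dek_pw: Uint8Array;
  dek_pw_nonce: Uint8Array;
}

// Step 2: two Argon2id runs over the same password with different salts.
function deriveLoginMaterial(p: UnlockPrimitives, password: string, c: LoginChallenge) {
  return {
    kek_pw: p.pwhash(password, c.kek_salt), // stays in the browser
    auth_verifier: p.pwhash(password, c.auth_salt), // posted to /api/auth/login
  };
}

// Step 4: unwrap locally once the server has accepted the verifier.
function unwrapDek(p: UnlockPrimitives, kekPw: Uint8Array, c: LoginChallenge): Uint8Array {
  const dek = p.aeadDecrypt(kekPw, c.wrapped_dek_pw, c.dek_pw_nonce);
  if (dek === null) {
    throw new Error("could not unwrap DEK: wrong password or corrupted wrap");
  }
  return dek;
}
```

Only `auth_verifier` crosses the network; `kek_pw` and the unwrapped DEK never leave the function's caller.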
## Password change
1. Client unwraps DEK with the old `kek_pw`.
2. Client generates `kek_salt_new`, `dek_pw_nonce_new`, `auth_salt_new`, derives
`kek_pw_new` and `auth_verifier_new`, and produces `wrapped_dek_pw_new`.
3. Client posts the new material to `/api/auth/password`. The server updates the
password wrap (with its salt and nonce), the auth salt, and the verifier in a
single transaction.
4. The recovery wrap (`wrapped_dek_rec`, `rec_salt`) is **not touched**, so
the recovery code still works.
5. Activity ciphertexts are **not re-encrypted** — they're still under the
same DEK.
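
On the server, the update amounts to replacing the password-side fields and copying the recovery-side fields through untouched. The sketch below models that as a pure function; the `PasswordSide` / `RecoverySide` split and the function name are hypothetical, and the real handler would do this inside a database transaction.

```typescript
// Hypothetical split of the per-user row into the two halves that rotate
// independently.
interface PasswordSide {
  auth_salt: Uint8Array;
  auth_verifier_hash: string;
  kek_salt: Uint8Array;
  wrapped_dek_pw: Uint8Array;
  dek_pw_nonce: Uint8Array;
}

interface RecoverySide {
  rec_salt: Uint8Array;
  wrapped_dek_rec: Uint8Array;
  dek_rec_nonce: Uint8Array;
}

type UserKeyState = PasswordSide & RecoverySide;

function replacePasswordSide(row: UserKeyState, next: PasswordSide): UserKeyState {
  // Fields are copied one by one instead of spreading `next`, so an
  // unexpected extra property in the request cannot overwrite the
  // recovery side.
  return {
    auth_salt: next.auth_salt,
    auth_verifier_hash: next.auth_verifier_hash,
    kek_salt: next.kek_salt,
    wrapped_dek_pw: next.wrapped_dek_pw,
    dek_pw_nonce: next.dek_pw_nonce,
    rec_salt: row.rec_salt,
    wrapped_dek_rec: row.wrapped_dek_rec,
    dek_rec_nonce: row.dek_rec_nonce,
  };
}
```

The final server step of the recovery flow replaces the same password-side fields, so a helper of this shape could plausibly serve both endpoints.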
## Recovery flow
The recovery code path is intentionally symmetric to the password path. The
server cannot tell whether the submitted new wrap is "of the same DEK" — it
just stores what the client sends. Trust is anchored entirely in the
recovery-code holder.
1. Client posts `{ email }` to `/api/auth/recovery-challenge`; server returns
`{ rec_salt, wrapped_dek_rec, dek_rec_nonce }`.
2. Client derives `kek_rec = pwhash(recovery_code, rec_salt)` and unwraps the DEK.
3. Client chooses a new password, derives new salts/verifier/wrap as in signup,
and posts to `/api/auth/recovery-complete`.
4. Server replaces the password wrap (with its `kek_salt` and nonce), the auth
salt, and the verifier in a single transaction. The recovery wrap is unchanged
(the same recovery code keeps working).
### Known limitation: lockout DoS
Anyone who knows a user's email can trigger `/api/auth/recovery-complete` and,
without the recovery code, submit a "junk" new password wrap. The data is **not
disclosed** (the attacker can't decrypt anything), but the legitimate user is
locked out unless they still hold a logged-in session or the recovery code.
Mitigations (out of scope for the scaffold but intended):
- Per-IP and per-email rate limiting on recovery endpoints.
- Email confirmation before activating the new password wrap.
- A "recovery verifier" stored server-side so the server can reject submissions
that don't prove knowledge of the recovery code. This is a deviation from the
spec's stated storage model and is therefore deferred.
## Private activity encryption
A `private` activity has a JSON payload:
```ts
{ title: string; tags: string[]; loc_label?: string; loc_lat?: number; loc_lng?: number; scheduled_at?: number }
```
The payload is JSON-serialized, encoded as UTF-8, and encrypted with
`crypto_aead_xchacha20poly1305_ietf_encrypt(payload, /* additional data */ null,
/* secret nonce, unused */ null, nonce, dek)`. A fresh nonce is generated for
every write (including updates).
The row stores only `ciphertext` and `nonce`. The `title`, `loc_*`, and
`scheduled_at` columns are `NULL` and not used. No row in `tags` or
`activity_tags` is created for a private activity — those tables hold only
public/semi tag data.
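
The encode-then-encrypt path can be sketched as a pair of helpers. `PayloadAead` is a hypothetical wrapper around the AEAD and nonce generation, and `sealPayload` / `openPayload` are illustrative names, not functions known to exist in `shared/crypto.ts`.

```typescript
interface PrivatePayload {
  title: string;
  tags: string[];
  loc_label?: string;
  loc_lat?: number;
  loc_lng?: number;
  scheduled_at?: number;
}

// Hypothetical wrapper over XChaCha20-Poly1305 and the CSPRNG.
interface PayloadAead {
  randomNonce(): Uint8Array; // 24 bytes
  encrypt(key: Uint8Array, plaintext: Uint8Array, nonce: Uint8Array): Uint8Array;
  decrypt(key: Uint8Array, ciphertext: Uint8Array, nonce: Uint8Array): Uint8Array;
}

interface SealedPayload {
  ciphertext: Uint8Array;
  nonce: Uint8Array;
}

// JSON -> UTF-8 -> AEAD, with a fresh nonce on every call (creates and updates alike).
function sealPayload(aead: PayloadAead, dek: Uint8Array, payload: PrivatePayload): SealedPayload {
  const plaintext = new TextEncoder().encode(JSON.stringify(payload));
  const nonce = aead.randomNonce();
  return { ciphertext: aead.encrypt(dek, plaintext, nonce), nonce };
}

// AEAD -> UTF-8 -> JSON. The cast trusts the payload shape; a stricter client
// might validate the parsed object before using it.
function openPayload(aead: PayloadAead, dek: Uint8Array, row: SealedPayload): PrivatePayload {
  const plaintext = aead.decrypt(dek, row.ciphertext, row.nonce);
  return JSON.parse(new TextDecoder().decode(plaintext)) as PrivatePayload;
}
```

Because `sealPayload` never accepts a caller-supplied nonce, an update cannot accidentally reuse the nonce of the row it replaces.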
## Visibility transitions
Visibility transitions are explicit, client-driven operations rather than
server flags:
- `private → semi/public`: client decrypts locally, then issues a normal update
that sets plaintext columns and clears `ciphertext`/`nonce`.
- `semi/public → private`: client reads the plaintext, encrypts locally, then
issues an update that sets `ciphertext`/`nonce` and clears plaintext columns.
Server also deletes any rows in `activity_tags` for that activity.
- `semi ↔ public`: server-side toggle. `owner_id` is unchanged; the API simply
starts or stops including the owner in serialized responses.
## Serialization rules
- For `semi` activities, API responses must **not** include `owner_id` or any
field that identifies the creator. The server still has `owner_id` for
authorization (only the owner can edit/delete), but it is stripped from
responses.
- For `public` activities, `owner_id` (or a derived public handle) **is**
serialized.
- For `private` activities, responses are only returned to the owner and contain
`ciphertext` + `nonce`; the client decrypts.
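
A serializer that follows these rules is easiest to get right as an allow-list: each visibility builds its response from scratch instead of deleting fields from the row. The row and response shapes below are hypothetical and include only columns named in this document.

```typescript
type Visibility = "private" | "semi" | "public";

// Hypothetical row shape; the real schema has more columns.
interface ActivityRow {
  id: number;
  owner_id: number;
  visibility: Visibility;
  title: string | null;
  ciphertext: Uint8Array | null;
  nonce: Uint8Array | null;
}

type ActivityResponse =
  | { id: number; visibility: "private"; ciphertext: Uint8Array | null; nonce: Uint8Array | null }
  | { id: number; visibility: "semi"; title: string | null }
  | { id: number; visibility: "public"; title: string | null; owner_id: number };

// Returns null when the viewer is not allowed to see the activity at all.
function serializeActivity(row: ActivityRow, viewerId: number): ActivityResponse | null {
  if (row.visibility === "private") {
    // Private rows go to the owner only, as ciphertext + nonce.
    if (row.owner_id !== viewerId) return null;
    return { id: row.id, visibility: "private", ciphertext: row.ciphertext, nonce: row.nonce };
  }
  if (row.visibility === "semi") {
    // No owner_id and nothing derived from it.
    return { id: row.id, visibility: "semi", title: row.title };
  }
  // Public rows attribute the owner.
  return { id: row.id, visibility: "public", title: row.title, owner_id: row.owner_id };
}
```

With this shape, adding a new column to the table cannot leak into `semi` responses unless someone edits the `semi` branch explicitly.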
## Tags
- Server-side `tags` and `activity_tags` tables hold only tags for `semi` and
`public` activities. They are normalized to lowercase trimmed strings, joined
to activities via `activity_tags`.
- Private tags are stored only inside the encrypted activity payload, and
indexed client-side in IndexedDB for autocomplete. They never reach the
server.
- The autocomplete endpoint returns matches from the server-side `tags` table
only. The frontend may merge those results with the IndexedDB index, clearly
labelled.
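
A rough sketch of the normalisation and the labelled merge follows. Applying the same normalisation to private tags, prefix matching, and keeping a tag that exists in both sources as two separately labelled entries are assumptions of this sketch rather than rules stated in the spec.

```typescript
// Lowercase + trim, as described for server-side tags.
function normalizeTag(raw: string): string {
  return raw.trim().toLowerCase();
}

type TagSource = "server" | "private";

interface TagSuggestion {
  tag: string;
  source: TagSource; // drives the label shown in the UI
}

// serverTags: results from the autocomplete endpoint.
// privateTags: results from the local IndexedDB index (never sent anywhere).
function mergeSuggestions(serverTags: string[], privateTags: string[], prefix: string): TagSuggestion[] {
  const wanted = normalizeTag(prefix);
  const out: TagSuggestion[] = [];
  const add = (tags: string[], source: TagSource): void => {
    for (const raw of tags) {
      const tag = normalizeTag(raw);
      const isMatch = tag !== "" && tag.indexOf(wanted) === 0;
      const isDuplicate = out.some((s) => s.tag === tag && s.source === source);
      if (isMatch && !isDuplicate) out.push({ tag, source });
    }
  };
  add(serverTags, "server");
  add(privateTags, "private");
  return out;
}
```

The merge runs entirely in the browser, so the private list is only ever combined with server results after the response has arrived.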
## Things to flag, not silently change
The spec invites flagging anything cryptographically unsound. The design above
follows the spec exactly. The one place where I would push back before going
to production is the recovery lockout DoS: without a server-side proof of the
recovery code, the recovery endpoint is a soft-DoS vector. It is documented
above and deferred for this scaffold.