vinterliste/SECURITY.md

# SECURITY.md — Vinterliste key & trust model

This document is the authoritative description of what the server can and cannot see,
and how keys are derived, wrapped, and rotated. The crypto code in `shared/crypto.ts`
is written against this document; if behaviour diverges from what's described here,
the document is the source of truth and the code must be fixed.

## Threat model

We assume:

- The server operator is honest-but-curious. They may inspect the database file,
  the request logs, and memory of the server process at any point.
- The TLS terminator is trusted not to MITM. (For local podman deployment behind a
  reverse proxy, this is operator-controlled.)
- The user's browser session is trusted while the user is logged in (the DEK lives
  in memory there).
- An attacker may know the user's email address.

We protect against:

- Server-side disclosure of **private** activity contents (title, tags, location,
  scheduled time).
- Server-side disclosure of the user's password.
- Account takeover by an attacker who has the database but not the password or the
  recovery code.

We do **not** protect against:

- Compromise of the user's browser (XSS, malicious extensions). A logged-in client
  holds the DEK in JS memory — anything in the same origin can read it. Mitigated
  but not eliminated by a strict CSP.
- Targeted denial-of-service via password resets. See "Recovery flow" below.
- Side-channel attacks against `libsodium`'s WASM build (we use
  `libsodium-wrappers-sumo`, because the standard build omits `crypto_pwhash`).
- Traffic analysis (number/timing/size of private activities is observable).

## Primitives (locked — do not substitute)

| Purpose                                | Primitive                                     | libsodium API                              |
|----------------------------------------|-----------------------------------------------|--------------------------------------------|
| Password / recovery-code key derivation | Argon2id, 32-byte raw output                  | `crypto_pwhash`                            |
| Authenticated encryption (AEAD)         | XChaCha20-Poly1305-IETF, 24-byte nonce        | `crypto_aead_xchacha20poly1305_ietf_*`     |
| Random bytes (DEK, salts, nonces, code) | CSPRNG                                        | `randombytes_buf`                          |
| Server-side verifier hash               | Argon2id (Bun's default tuning)               | `Bun.password.hash` / `.verify`            |

Argon2id parameters for `crypto_pwhash` use the libsodium `MODERATE` profile
(`crypto_pwhash_OPSLIMIT_MODERATE`, `crypto_pwhash_MEMLIMIT_MODERATE`). They are
recorded as constants in `shared/crypto.ts` and must be kept consistent between
signup and any future unlock — if parameters are tuned upward in the future, the
old parameters must be stored per-user so unlock still works.
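
A minimal sketch of what "stored per-user" could look like. The `KdfParams` shape and the `paramsForUnlock` helper are hypothetical and are not among the columns listed in this document. The numeric values are the ones libsodium defines for its Argon2id `MODERATE` profile; real code would read them from the library constants rather than hard-coding them.

```typescript
// Hypothetical per-user record of the Argon2id cost parameters used at signup.
interface KdfParams {
  opslimit: number; // number of Argon2id passes
  memlimit: number; // memory in bytes
}

// Argon2id MODERATE profile: 3 passes, 256 MiB.
// In real code: sodium.crypto_pwhash_OPSLIMIT_MODERATE / _MEMLIMIT_MODERATE.
const MODERATE: KdfParams = { opslimit: 3, memlimit: 256 * 1024 * 1024 };

// Unlock must reuse the parameters that were in force when the wrap was made.
// Users created before per-user storage existed fall back to the original profile.
function paramsForUnlock(stored: KdfParams | null): KdfParams {
  return stored ?? MODERATE;
}
```

With this shape, raising the default later only affects new signups and password changes; existing wraps keep unlocking with the parameters recorded next to them.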

## Per-user state (server-stored)

For each user the server stores **exactly**:

```
auth_salt            (16 bytes, public)   -- distinct from kek_salt
auth_verifier_hash   (text)               -- Bun.password.hash of the auth verifier
kek_salt             (16 bytes, public)   -- for password-derived KEK
wrapped_dek_pw       (48 bytes)           -- DEK encrypted under KEK_pw
dek_pw_nonce         (24 bytes)
rec_salt             (16 bytes, public)   -- for recovery-code-derived KEK
wrapped_dek_rec      (48 bytes)           -- DEK encrypted under KEK_rec
dek_rec_nonce        (24 bytes)
```

The server never sees, derives, or stores:

- The user's raw password.
- The recovery code.
- The DEK itself.
- Plaintext title / tags / location / scheduled time for any **private** activity.
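
The byte lengths above are easy to enforce at the API boundary. The following is a sketch only: the `UserKeyRow` type and `keyRowLengthErrors` helper are hypothetical names, and the real schema may encode these columns differently (for example as base64 text).

```typescript
// Hypothetical in-memory shape of the per-user row described above.
interface UserKeyRow {
  auth_salt: Uint8Array;
  auth_verifier_hash: string;
  kek_salt: Uint8Array;
  wrapped_dek_pw: Uint8Array;
  dek_pw_nonce: Uint8Array;
  rec_salt: Uint8Array;
  wrapped_dek_rec: Uint8Array;
  dek_rec_nonce: Uint8Array;
}

const SALT_BYTES = 16;
const NONCE_BYTES = 24;
const WRAPPED_DEK_BYTES = 32 + 16; // 32-byte DEK plus 16-byte Poly1305 tag

// Returns one message per binary field whose length is wrong; empty means OK.
function keyRowLengthErrors(row: UserKeyRow): string[] {
  const expected: Array<[keyof UserKeyRow, number]> = [
    ["auth_salt", SALT_BYTES],
    ["kek_salt", SALT_BYTES],
    ["rec_salt", SALT_BYTES],
    ["wrapped_dek_pw", WRAPPED_DEK_BYTES],
    ["wrapped_dek_rec", WRAPPED_DEK_BYTES],
    ["dek_pw_nonce", NONCE_BYTES],
    ["dek_rec_nonce", NONCE_BYTES],
  ];
  const errors: string[] = [];
  for (const [field, length] of expected) {
    const value = row[field];
    if (!(value instanceof Uint8Array) || value.length !== length) {
      errors.push(`${field} must be ${length} bytes`);
    }
  }
  return errors;
}
```

A length check says nothing about whether a wrap is genuine (the server cannot know that), but it rejects malformed submissions before they reach the database.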

## Why three salts?

Three independently random salts ensure that knowing one derivation tells you
nothing about another:

- `auth_salt` — input to the **auth verifier** the server holds.
- `kek_salt` — input to **KEK_pw**, which unwraps the DEK.
- `rec_salt` — input to **KEK_rec**, the recovery-code-derived unwrap key.

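In particular, `auth_salt ≠ kek_salt` guarantees that `auth_verifier ≠ kek_pw`:
the server receives `auth_verifier` at every login, but that value cannot unwrap
the DEK. Likewise, a leaked verifier hash does not yield the KEK directly. An
attacker would first have to brute-force the password itself and then run
Argon2id again against `kek_salt`. The verifier hash is never sufficient on its
own, although salt separation does not protect a weak password from offline
guessing.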

## Signup flow (client-driven)

1. Client generates `dek` (32 bytes), `kek_salt`, `rec_salt`, `auth_salt` (16 bytes each).
2. Client generates a high-entropy `recovery_code` (≥120 bits), shows it to the
   user, and never sends it.
3. Client derives:
   - `kek_pw  = pwhash(password, kek_salt)`
   - `kek_rec = pwhash(recovery_code, rec_salt)`
   - `auth_verifier = pwhash(password, auth_salt)`  (≠ kek_pw because salts differ)
4. Client wraps:
   - `wrapped_dek_pw  = AEAD(kek_pw,  dek, dek_pw_nonce)`
   - `wrapped_dek_rec = AEAD(kek_rec, dek, dek_rec_nonce)`
5. Client posts the salts, wraps, nonces, and `auth_verifier` to the server.
6. Server hashes `auth_verifier` with `Bun.password.hash` and stores the row.
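
Step 2 can be sketched as follows. This is an illustration under stated assumptions, not the code in `shared/crypto.ts`: the alphabet (Crockford-style base32, which leaves out I, L, O and U), the 24-symbol length, the dash grouping, and the name `generateRecoveryCode` are all hypothetical, and the sketch uses the Web Crypto `crypto.getRandomValues` global in place of libsodium's `randombytes_buf` so that it runs without dependencies.

```typescript
// Hypothetical alphabet: 32 symbols, so each symbol carries exactly 5 bits.
const RECOVERY_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";
const RECOVERY_SYMBOLS = 24; // 24 symbols * 5 bits = 120 bits of entropy

function generateRecoveryCode(): string {
  // One random byte per symbol, from the platform CSPRNG.
  const bytes = crypto.getRandomValues(new Uint8Array(RECOVERY_SYMBOLS));
  let code = "";
  bytes.forEach((byte, i) => {
    // 256 is a multiple of 32, so keeping the low 5 bits adds no modulo bias.
    code += RECOVERY_ALPHABET.charAt(byte & 31);
    // Group into blocks of four symbols for readability.
    if (i % 4 === 3 && i !== RECOVERY_SYMBOLS - 1) code += "-";
  });
  return code;
}
```

If the displayed code contains separators, the client would need to strip them (and normalize case) the same way at signup and at recovery, since the exact string fed to `pwhash` determines `kek_rec`.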

## Unlock / login flow

1. Client posts `{ email }` to `/api/auth/challenge`; server returns the public
   parameters: `{ auth_salt, kek_salt, wrapped_dek_pw, dek_pw_nonce }`.
2. Client derives `kek_pw` and `auth_verifier` locally (two `pwhash` calls).
3. Client posts `{ email, auth_verifier }` to `/api/auth/login`; server verifies
   via `Bun.password.verify` and, on success, issues an httpOnly session cookie.
4. Client unwraps the DEK locally. DEK lives in JS memory for the session.

## Password change

1. Client unwraps DEK with the old `kek_pw`.
2. Client generates `kek_salt_new`, `dek_pw_nonce_new`, `auth_salt_new`, derives
   `kek_pw_new` and `auth_verifier_new`, and produces `wrapped_dek_pw_new`.
3. Client posts the new material to `/api/auth/password`. The server updates
   `auth_salt`, `auth_verifier_hash`, `kek_salt`, `wrapped_dek_pw`, and
   `dek_pw_nonce` in a single transaction.
4. The recovery wrap (`wrapped_dek_rec`, `rec_salt`) is **not touched** —
   the recovery code still works.
5. Activity ciphertexts are **not re-encrypted** — they're still under the
   same DEK.
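
The server-side effect of a password change can be sketched as a pure function. The type and function names here are hypothetical; in the real server this would be a single `UPDATE` inside a transaction rather than an in-memory copy.

```typescript
type Bytes = Uint8Array;

// Columns that depend on the password.
interface PasswordSide {
  auth_salt: Bytes;
  auth_verifier_hash: string;
  kek_salt: Bytes;
  wrapped_dek_pw: Bytes;
  dek_pw_nonce: Bytes;
}

// Columns that depend on the recovery code.
interface RecoverySide {
  rec_salt: Bytes;
  wrapped_dek_rec: Bytes;
  dek_rec_nonce: Bytes;
}

type UserKeyRow = PasswordSide & RecoverySide;

// Replace the password side as a unit; copy the recovery side through untouched.
function applyPasswordChange(row: UserKeyRow, next: PasswordSide): UserKeyRow {
  return {
    auth_salt: next.auth_salt,
    auth_verifier_hash: next.auth_verifier_hash,
    kek_salt: next.kek_salt,
    wrapped_dek_pw: next.wrapped_dek_pw,
    dek_pw_nonce: next.dek_pw_nonce,
    rec_salt: row.rec_salt,
    wrapped_dek_rec: row.wrapped_dek_rec,
    dek_rec_nonce: row.dek_rec_nonce,
  };
}
```

Listing the fields explicitly, instead of spreading the request body into the row, means a client cannot overwrite the recovery columns through this endpoint even if it sends extra properties.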

## Recovery flow

The recovery code path is intentionally symmetric to the password path. The
server cannot tell whether the submitted new wrap is "of the same DEK" — it
just stores what the client sends. Trust is anchored entirely in the
recovery-code holder.

1. Client posts `{ email }` to `/api/auth/recovery-challenge`; server returns
   `{ rec_salt, wrapped_dek_rec, dek_rec_nonce }`.
2. Client derives `kek_rec = pwhash(recovery_code, rec_salt)` and unwraps the DEK.
3. Client chooses a new password, derives new salts/verifier/wrap as in signup,
   and posts to `/api/auth/recovery-complete`.
4. Server replaces the password-side columns (`auth_salt`, `auth_verifier_hash`,
   `kek_salt`, `wrapped_dek_pw`, `dek_pw_nonce`) in a single transaction. The
   recovery wrap is unchanged (the same recovery code keeps working).

### Known limitation: lockout DoS

Anyone who knows a user's email can trigger `/api/auth/recovery-complete` and,
without the recovery code, submit a "junk" new password wrap. The data is **not
disclosed** (the attacker can't decrypt anything), but the legitimate user is
locked out unless they still hold a logged-in session or the recovery code.

Mitigations (out of scope for the scaffold but intended):

- Per-IP and per-email rate limiting on recovery endpoints.
- Email confirmation before activating the new password wrap.
- A "recovery verifier" stored server-side so the server can reject submissions
  that don't prove knowledge of the recovery code. This is a deviation from the
  spec's stated storage model and is therefore deferred.
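
As an illustration of the first mitigation, here is a minimal fixed-window limiter keyed by an arbitrary string (an email or an IP address). The class name, the limits, and the in-memory `Map` are assumptions made for this sketch; a real deployment would also need eviction of stale entries and possibly shared storage.

```typescript
// Hypothetical fixed-window rate limiter: at most `limit` hits per key per window.
class FixedWindowLimiter {
  private readonly hits = new Map<string, { windowStart: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request may proceed. `now` is injectable for testing.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    if (entry === undefined || now - entry.windowStart >= this.windowMs) {
      // First hit for this key, or the previous window has expired.
      this.hits.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}
```

Rate limiting only slows the lockout attack down; it does not remove it, which is why the recovery verifier remains the stronger fix.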

## Private activity encryption

A `private` activity has a JSON payload:

```ts
{ title: string; tags: string[]; loc_label?: string; loc_lat?: number; loc_lng?: number; scheduled_at?: number }
```

The payload is JSON-serialized, encoded as UTF-8, and encrypted with
`crypto_aead_xchacha20poly1305_ietf_encrypt(payload, /* additional data */ null,
/* nsec, unused */ null, nonce, dek)`. A fresh nonce is generated for every
write (including updates).

The row stores only `ciphertext` and `nonce`. The `title`, `loc_*`, and
`scheduled_at` columns are `NULL` and not used. No row in `tags` or
`activity_tags` is created for a private activity — those tables hold only
public/semi tag data.
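
The encode-then-seal step can be sketched like this. To keep the example free of dependencies, the AEAD itself is passed in as a `Seal` function; that type, and the names `encryptPayload` and `decodePayload`, are hypothetical. In the real module the seal would be the libsodium XChaCha20-Poly1305 call, and the nonce would come from `randombytes_buf` rather than the Web Crypto global used here.

```typescript
interface PrivatePayload {
  title: string;
  tags: string[];
  loc_label?: string;
  loc_lat?: number;
  loc_lng?: number;
  scheduled_at?: number;
}

// Stand-in for the AEAD primitive (assumption: supplied by the crypto module).
type Seal = (plaintext: Uint8Array, nonce: Uint8Array, key: Uint8Array) => Uint8Array;

const NONCE_BYTES = 24; // XChaCha20-Poly1305 nonce size

function encryptPayload(payload: PrivatePayload, dek: Uint8Array, seal: Seal) {
  // JSON, then UTF-8 bytes.
  const plaintext = new TextEncoder().encode(JSON.stringify(payload));
  // A fresh random nonce for every write, including updates of the same row.
  const nonce = crypto.getRandomValues(new Uint8Array(NONCE_BYTES));
  return { ciphertext: seal(plaintext, nonce, dek), nonce };
}

// Inverse of the encoding step, applied to bytes that were already decrypted.
function decodePayload(plaintext: Uint8Array): PrivatePayload {
  return JSON.parse(new TextDecoder().decode(plaintext)) as PrivatePayload;
}
```

Because the nonce is regenerated inside `encryptPayload`, callers cannot accidentally reuse the nonce of the row they are updating.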

## Visibility transitions

Visibility transitions are explicit, client-driven operations rather than
server flags:

- `private → semi/public`: client decrypts locally, then issues a normal update
  that sets plaintext columns and clears `ciphertext`/`nonce`.
- `semi/public → private`: client reads the plaintext, encrypts locally, then
  issues an update that sets `ciphertext`/`nonce` and clears plaintext columns.
  Server also deletes any rows in `activity_tags` for that activity.
- `semi ↔ public`: server-side toggle. `owner_id` is unchanged; the API simply
  starts or stops including the owner in serialized responses.
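
The transitions above imply an invariant the server can check on every write without holding any key: private rows carry only `ciphertext`/`nonce`, and semi/public rows carry only plaintext. A sketch of that check follows; the `ActivityRow` shape covers only a subset of columns, and the assumption that semi/public rows always have a title is mine, not the spec's.

```typescript
type Visibility = "private" | "semi" | "public";

// Hypothetical row shape: a subset of the columns named in this document.
interface ActivityRow {
  visibility: Visibility;
  title: string | null;
  loc_label: string | null;
  scheduled_at: number | null;
  ciphertext: Uint8Array | null;
  nonce: Uint8Array | null;
}

// True if the populated columns are consistent with the row's visibility.
function rowMatchesVisibility(row: ActivityRow): boolean {
  const hasCiphertext = row.ciphertext !== null && row.nonce !== null;
  const noCiphertext = row.ciphertext === null && row.nonce === null;
  const noPlaintext =
    row.title === null && row.loc_label === null && row.scheduled_at === null;

  if (row.visibility === "private") return hasCiphertext && noPlaintext;
  // Assumption: semi/public rows always carry a plaintext title.
  return noCiphertext && row.title !== null;
}
```

Rejecting updates that fail this check would catch a client that switches a row to `private` but forgets to clear the plaintext columns.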

## Serialization rules

- For `semi` activities, API responses must **not** include `owner_id` or any
  field that identifies the creator. The server still has `owner_id` for
  authorization (only the owner can edit/delete), but it is stripped from
  responses.
- For `public` activities, `owner_id` (or a derived public handle) **is**
  serialized.
- For `private` activities, responses are only returned to the owner and contain
  `ciphertext` + `nonce`; the client decrypts.
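
These rules can be sketched as one serializer with a branch per visibility. The `Activity` shape, the string-encoded `ciphertext`/`nonce`, and the choice to return `null` for a private activity requested by a non-owner are assumptions of this sketch.

```typescript
type Visibility = "private" | "semi" | "public";

// Hypothetical server-side activity record.
interface Activity {
  id: number;
  owner_id: number;
  visibility: Visibility;
  title: string | null;
  ciphertext: string | null;
  nonce: string | null;
}

// Builds the API response body for one activity, or null if the viewer
// must not see it at all.
function serializeActivity(a: Activity, viewerId: number | null): Record<string, unknown> | null {
  switch (a.visibility) {
    case "private":
      // Only the owner ever receives a private activity, and only as ciphertext.
      if (viewerId !== a.owner_id) return null;
      return { id: a.id, visibility: a.visibility, ciphertext: a.ciphertext, nonce: a.nonce };
    case "semi":
      // owner_id is deliberately left out of the response.
      return { id: a.id, visibility: a.visibility, title: a.title };
    case "public":
      return { id: a.id, visibility: a.visibility, title: a.title, owner_id: a.owner_id };
  }
}
```

Building each response from an explicit list of fields, rather than deleting `owner_id` from a copy of the row, means a column added later stays out of `semi` responses unless someone opts it in.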

## Tags

- Server-side `tags` and `activity_tags` tables hold only tags for `semi` and
  `public` activities. They are normalized to lowercase trimmed strings, joined
  to activities via `activity_tags`.
- Private tags are stored only inside the encrypted activity payload, and
  indexed client-side in IndexedDB for autocomplete. They never reach the
  server.
- The autocomplete endpoint returns matches from the server-side `tags` table
  only. The frontend may merge those results with the IndexedDB index, clearly
  labelled.
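
A sketch of the normalization and of the client-side merge. The function names and the `Suggestion` shape are hypothetical, and listing a tag that exists in both sources once under the server label is a choice made for this example, not something the spec states.

```typescript
// Trim, lowercase, drop empties, and de-duplicate while keeping first-seen order.
function normalizeTags(raw: string[]): string[] {
  const seen = new Set<string>();
  for (const tag of raw) {
    const normalized = tag.trim().toLowerCase();
    if (normalized !== "") seen.add(normalized);
  }
  return Array.from(seen);
}

interface Suggestion {
  tag: string;
  source: "server" | "private"; // shown to the user as a label
}

// Server matches first, then tags known only to the local IndexedDB index.
function mergeSuggestions(serverTags: string[], privateTags: string[]): Suggestion[] {
  const server = normalizeTags(serverTags);
  const onServer = new Set(server);
  const merged: Suggestion[] = server.map((tag) => ({ tag, source: "server" as const }));
  for (const tag of normalizeTags(privateTags)) {
    if (!onServer.has(tag)) merged.push({ tag, source: "private" });
  }
  return merged;
}
```

The merge runs entirely in the browser, so the private index is only ever compared against server results locally and is never sent anywhere.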

## Things to flag, not silently change

The spec invites flagging anything cryptographically unsound. The above design
follows the spec exactly. The one place where I would push back if asked to go
to production is the recovery lockout DoS: without a server-side proof of the
recovery code, the recovery endpoint is a soft-DoS vector. It is documented
above and deferred for the scaffold only.

Scaffold Vinterliste: end-to-end encrypted winter activity list

Foundation for an E2E-encrypted activity list per winter-list-claude-code-prompt.md.

Server (Bun + Hono):

- bun:sqlite with WAL and the spec's schema (idempotent migration)
- opaque server-stored sessions, httpOnly cookie
- signup / challenge / login / logout / me / password / recovery-challenge / recovery-complete
- activity CRUD with strict visibility rules: private uses ciphertext+nonce, semi never serializes owner_id, public attributes the owner
- tag store with normalisation + autocomplete (semi/public only)

Frontend (Svelte 5 + Vite):

- libsodium-wrappers-sumo for client crypto (Argon2id + XChaCha20-Poly1305). SUMO is required because the standard build omits crypto_pwhash.
- IndexedDB-backed private tag index (never leaves the browser)
- in-memory DEK (no localStorage); page reload re-prompts for password
- signup shows the recovery code once; tag input merges server + private sources with clear labelling
- Bokmål UI

Crypto module (shared/crypto.ts):

- pure, runs in both Bun and the browser via a runtime-conditional loader that papers over libsodium-wrappers-sumo's broken ESM entry (createRequire on server, Vite alias in the browser)
- DEK wrap/unwrap, AEAD payload encryption, recovery code generation with a visually-unambiguous alphabet

Verification:

- 22 crypto round-trip tests (wrap/unwrap, AEAD tamper rejection, password change preserves ciphertexts, recovery still works after rotation)
- typecheck passes for server and frontend
- Vite production build succeeds; libsodium SUMO chunk is ~315 KB gzipped

Single-image Containerfile for podman: builds frontend in a builder stage, runs Bun in a slim runtime; one volume for the SQLite file; BUILD_DATE / GIT_REVISION baked into OCI labels and /etc/build-info.

Known limitation deferred for this commit: the recovery endpoint has no server-side proof of the recovery code (anyone who knows an email can lock out the legitimate user, though they can't read any data). Closed in the next commit.

2026-05-25 12:27:14 +02:00
