Close the recovery lockout-DoS hole on /auth/recovery-complete

The original spec stored only `kek_salt`, `wrapped_dek_pw`+nonce,
`rec_salt`, and `wrapped_dek_rec`+nonce. Under that model, anyone who
knew a user's email could POST to /auth/recovery-complete with junk
material and overwrite the password-side wrap, locking the legitimate
user out. The data stayed safe (the attacker couldn't decrypt anything)
but the account was effectively DoS'd until the user dug up their
recovery code.

Fix: add a recovery-side verifier mirroring the password-side one.

Storage: two new columns on `users`:
  - rec_auth_salt           BLOB NOT NULL  (independent of rec_salt)
  - rec_auth_verifier_hash  TEXT NOT NULL  (Bun.password.hash output)

The migration adds them via ensureColumn() for forward-compat with
scaffold DBs that pre-date this commit; new tables get them via the
CREATE TABLE statement.

Wire protocol:
  - SignupRequest gains rec_auth_salt + rec_auth_verifier
  - RecoveryChallengeResponse gains rec_auth_salt
  - RecoveryCompleteRequest gains rec_auth_verifier

Server (server/auth.ts):
  - signup hashes the recovery verifier alongside the auth verifier and
    stores both
  - recovery-challenge returns rec_auth_salt so the client can derive
    the verifier; refuses with 409 for pre-fix accounts that have a
    NULL rec_auth_salt
  - recovery-complete calls Bun.password.verify against the stored hash
    BEFORE touching any state. It always runs verify, even for unknown
    emails (against a dummy hash), so timing doesn't leak existence.
    This is the same pattern we already used for /auth/login.

Client (frontend/src/lib/auth.ts):
  - signup() generates a fourth salt and derives the recovery verifier
    from the recovery code
  - recover() fetches the new rec_auth_salt and submits the derived
    verifier as part of recovery-complete

Recovery.svelte distinguishes the new 401 ("Feil gjenopprettingskode",
i.e. "Wrong recovery code") and 409 ("Denne kontoen mangler
gjenopprettingsverifikator", i.e. "This account lacks a recovery
verifier") cases.

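
A minimal sketch of the client-side derivation described above, for
illustration only; it is not the code in frontend/src/lib/auth.ts. It
assumes Node's built-in scrypt as a stand-in for the real pwhash
(SECURITY.md names Argon2id). The names `pwhash`,
`deriveRecoveryMaterial`, and `RecoveryMaterial` are hypothetical.

```typescript
import { Buffer } from "node:buffer";
import { randomBytes, scryptSync } from "node:crypto";

// ASSUMPTION: scrypt stands in for the client's Argon2id-based pwhash,
// only because it ships with Node. Parameters here are library defaults.
function pwhash(secret: string, salt: Buffer): Buffer {
  return scryptSync(secret, salt, 32);
}

// Hypothetical shape for the recovery-side material produced at signup.
interface RecoveryMaterial {
  recSalt: Buffer;         // salt for KEK_rec
  recAuthSalt: Buffer;     // the new fourth salt, independent of recSalt
  kekRec: Buffer;          // unwrap key, stays on the client
  recAuthVerifier: Buffer; // proof-of-knowledge sent to the server
}

// Derives both recovery-side values from the same recovery code but
// different salts, so the verifier is not the unwrap key.
function deriveRecoveryMaterial(
  recoveryCode: string,
  recSalt: Buffer = randomBytes(16),
  recAuthSalt: Buffer = randomBytes(16),
): RecoveryMaterial {
  return {
    recSalt,
    recAuthSalt,
    kekRec: pwhash(recoveryCode, recSalt),
    recAuthVerifier: pwhash(recoveryCode, recAuthSalt),
  };
}
```

During recovery the client would call the same function with the
`rec_salt` and `rec_auth_salt` returned by recovery-challenge, which
reproduces the verifier the server hashed at signup.
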
Regression test (tests/auth.test.ts) asserts the gate is real:
  - junk recovery verifier → 401, no state changes
  - unknown email → 401 (constant-time)
  - challenge response includes rec_auth_salt
  - correctly-derived verifier passes the gate

SECURITY.md is updated to describe four salts instead of three, the new
key-model storage, and the closed lockout DoS.

CLAUDE.md flags the rec_auth_* columns as load-bearing: removing them
re-opens the hole.

This is the only deviation from the spec's stated storage model;
documented as such in both SECURITY.md and CLAUDE.md.
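
The gate those tests exercise can be sketched roughly as follows. This
is an illustration under stated assumptions, not the handler in
server/auth.ts: salted scrypt from node:crypto stands in for
Bun.password.hash / Bun.password.verify, a Map stands in for the
`users` table, and `UserRow`, `hashVerifier`, `verifyVerifier`, and
`recoveryComplete` are hypothetical names.

```typescript
import { Buffer } from "node:buffer";
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// Hypothetical in-memory stand-in for a row in the `users` table.
interface UserRow {
  recAuthVerifierHash: string; // "saltHex:hashHex"
  wrappedDekPw: string;        // password-side wrap (simplified to a string)
}

// ASSUMPTION: stand-in for Bun.password.hash, encoded as "salt:hash".
function hashVerifier(verifier: string): string {
  const salt = randomBytes(16);
  const hash = scryptSync(verifier, salt, 32);
  return `${salt.toString("hex")}:${hash.toString("hex")}`;
}

// ASSUMPTION: stand-in for Bun.password.verify.
function verifyVerifier(verifier: string, stored: string): boolean {
  const [saltHex, hashHex] = stored.split(":");
  const expected = Buffer.from(hashHex, "hex");
  const actual = scryptSync(verifier, Buffer.from(saltHex, "hex"), expected.length);
  return timingSafeEqual(actual, expected);
}

// Computed once so unknown emails cost the same hashing work as known ones.
const DUMMY_HASH = hashVerifier(randomBytes(32).toString("hex"));

// Returns an HTTP-style status code; mutates state only after the gate passes.
function recoveryComplete(
  users: Map<string, UserRow>,
  email: string,
  recAuthVerifier: string,
  newWrappedDekPw: string,
): number {
  const user = users.get(email);
  // Always run the verify, even when the email is unknown.
  const ok = verifyVerifier(recAuthVerifier, user?.recAuthVerifierHash ?? DUMMY_HASH);
  if (!user || !ok) return 401; // reject before touching any state
  // The real handler replaces all password-side material in one transaction.
  user.wrappedDekPw = newWrappedDekPw;
  return 200;
}
```

Running the verify before the `!user` check is the point of the design:
both failure paths do the same hashing work and return the same 401.
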
2026-05-25 12:28:26 +02:00
commit add76be486
parent 47963c9225
9 changed files with 414 additions and 72 deletions
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -55,16 +55,26 @@ old parameters must be stored per-user so unlock still works.
 For each user the server stores **exactly**:

 ```
-auth_salt            (16 bytes, public)   -- distinct from kek_salt
-auth_verifier_hash   (text)               -- Bun.password.hash of the auth verifier
-kek_salt             (16 bytes, public)   -- for password-derived KEK
-wrapped_dek_pw       (48 bytes)           -- DEK encrypted under KEK_pw
-dek_pw_nonce         (24 bytes)
-rec_salt             (16 bytes, public)   -- for recovery-code-derived KEK
-wrapped_dek_rec      (48 bytes)           -- DEK encrypted under KEK_rec
-dek_rec_nonce        (24 bytes)
+auth_salt                 (16 bytes, public)  -- distinct from kek_salt
+auth_verifier_hash        (text)              -- Bun.password.hash of the auth verifier
+kek_salt                  (16 bytes, public)  -- for password-derived KEK
+wrapped_dek_pw            (48 bytes)          -- DEK encrypted under KEK_pw
+dek_pw_nonce              (24 bytes)
+rec_salt                  (16 bytes, public)  -- for recovery-code-derived KEK
+wrapped_dek_rec           (48 bytes)          -- DEK encrypted under KEK_rec
+dek_rec_nonce             (24 bytes)
+rec_auth_salt             (16 bytes, public)  -- distinct from rec_salt
+rec_auth_verifier_hash    (text)              -- Bun.password.hash of the recovery verifier
 ```

+The recovery verifier is the proof-of-recovery-code that
+`/api/auth/recovery-complete` requires before it changes any state. Without
+it (the original spec's storage model), an attacker who knows only the email
+could submit a junk new password wrap and lock the legitimate user out — the
+data would still be safe but the account would be DoS'd. With it, recovery
+requires knowledge of the recovery code; an attacker can no longer cause a
+lockout.
+
 The server never sees, derives, or stores:

 - The user's raw password.
@@ -72,34 +82,40 @@ The server never sees, derives, or stores:
 - The DEK itself.
 - Plaintext title / tags / location / scheduled time for any **private** activity.

-## Why three salts?
+## Why four salts?

-Three independently random salts ensure that knowing one derivation tells you
+Four independently random salts ensure that knowing one derivation tells you
 nothing about another:

 - `auth_salt` — input to the **auth verifier** the server holds.
 - `kek_salt` — input to **KEK_pw**, which unwraps the DEK.
 - `rec_salt` — input to **KEK_rec**, the recovery-code-derived unwrap key.
+- `rec_auth_salt` — input to the **recovery verifier**, the proof-of-knowledge
+  the server checks before completing a recovery.

 In particular, `auth_salt ≠ kek_salt` guarantees that even if a server-side
 breach leaks the verifier hash *and* an attacker brute-forces it, they still
-need to redo Argon2id against `kek_salt` to derive the KEK. The verifier hash
-is never sufficient on its own.
+need to redo Argon2id against `kek_salt` to derive the KEK. The same property
+holds for `rec_auth_salt ≠ rec_salt`: brute-forcing the recovery verifier
+hash doesn't directly hand the attacker KEK_rec.

 ## Signup flow (client-driven)

-1. Client generates `dek` (32 bytes), `kek_salt`, `rec_salt`, `auth_salt` (16 bytes each).
+1. Client generates `dek` (32 bytes), `kek_salt`, `rec_salt`, `auth_salt`,
+   `rec_auth_salt` (16 bytes each).
 2. Client generates a high-entropy `recovery_code` (≥120 bits), shows it to the
   user, and never sends it.
 3. Client derives:
   - `kek_pw  = pwhash(password, kek_salt)`
   - `kek_rec = pwhash(recovery_code, rec_salt)`
-   - `auth_verifier = pwhash(password, auth_salt)`  (≠ kek_pw because salts differ)
+   - `auth_verifier     = pwhash(password,      auth_salt)`     (≠ kek_pw  because salts differ)
+   - `rec_auth_verifier = pwhash(recovery_code, rec_auth_salt)` (≠ kek_rec because salts differ)
 4. Client wraps:
   - `wrapped_dek_pw  = AEAD(kek_pw,  dek, dek_pw_nonce)`
   - `wrapped_dek_rec = AEAD(kek_rec, dek, dek_rec_nonce)`
-5. Client posts the salts, wraps, nonces, and `auth_verifier` to the server.
-6. Server hashes `auth_verifier` with `Bun.password.hash` and stores the row.
+5. Client posts the salts, wraps, nonces, `auth_verifier`, and
+   `rec_auth_verifier` to the server.
+6. Server hashes both verifiers with `Bun.password.hash` and stores the row.

 ## Unlock / login flow

@@ -124,34 +140,52 @@ is never sufficient on its own.

 ## Recovery flow

-The recovery code path is intentionally symmetric to the password path. The
-server cannot tell whether the submitted new wrap is "of the same DEK" — it
-just stores what the client sends. Trust is anchored entirely in the
-recovery-code holder.
+The recovery code path is symmetric to the password path. The server cannot
+tell whether the submitted new wrap is "of the same DEK" — it just stores what
+the client sends. Trust is anchored in the recovery-code holder, with the
+server using a recovery verifier as a proof-of-knowledge gate before any state
+change.

 1. Client posts `{ email }` to `/api/auth/recovery-challenge`; server returns
-   `{ rec_salt, wrapped_dek_rec, dek_rec_nonce }`.
+   `{ rec_salt, wrapped_dek_rec, dek_rec_nonce, rec_auth_salt }`.
 2. Client derives `kek_rec = pwhash(recovery_code, rec_salt)` and unwraps the DEK.
-3. Client chooses a new password, derives new salts/verifier/wrap as in signup,
-   and posts to `/api/auth/recovery-complete`.
-4. Server replaces the password wrap, auth salt, and verifier in a single
-   transaction. The recovery wrap is unchanged (the same recovery code keeps
-   working).
+3. Client also derives `rec_auth_verifier = pwhash(recovery_code, rec_auth_salt)` —
+   the proof the server will check.
+4. Client chooses a new password, derives new password-side
+   salts/verifier/wrap as in signup.
+5. Client posts to `/api/auth/recovery-complete` with `rec_auth_verifier` plus
+   the new password-side material.
+6. **Server first calls `Bun.password.verify(rec_auth_verifier, rec_auth_verifier_hash)`.**
+   If it fails, the request is rejected with `401` and no state changes. To
+   keep the timing of "no such user" indistinguishable from "wrong code", the
+   server runs `Bun.password.verify` against a dummy hash even when the email
+   isn't found.
+7. On success, the server replaces password-side material in a single
+   transaction. The recovery wrap, `rec_salt`, `rec_auth_salt`, and
+   `rec_auth_verifier_hash` are unchanged — the same recovery code keeps
+   working.
+8. The server deletes all existing sessions for the user so any hijacked
+   session is invalidated.

-### Known limitation: lockout DoS
+### Why the recovery verifier closes the DoS

-Anyone who knows a user's email can trigger `/api/auth/recovery-complete` and,
-without the recovery code, submit a "junk" new password wrap. The data is **not
-disclosed** (the attacker can't decrypt anything), but the legitimate user is
-locked out unless they still hold a logged-in session or the recovery code.
+The original spec stored only the recovery wrap. Anyone who knew the email
+could submit a new password wrap and lock the legitimate user out (the data
+itself remained safe — only the recovery-code holder could read it).

-Mitigations (out of scope for the scaffold but intended):
+With the recovery verifier in place, the server has a cryptographic proof that
+the submitter knows the recovery code before any write happens. An attacker
+without the code can no longer cause a lockout. The `auth_salt ≠ kek_salt`
+property carries over: `rec_auth_salt ≠ rec_salt`, so brute-forcing the
+verifier hash doesn't give the attacker the unwrap key.

-- Per-IP and per-email rate limiting on recovery endpoints.
-- Email confirmation before activating the new password wrap.
-- A "recovery verifier" stored server-side so the server can reject submissions
-  that don't prove knowledge of the recovery code. This is a deviation from the
-  spec's stated storage model and is therefore deferred.
+### Remaining recovery-flow caveats
+
+- Per-IP and per-email rate limiting is still desirable on the recovery
+  endpoints (and on login) to slow online brute-force; out of scope for the
+  scaffold.
+- Email-confirmation flows would add a second factor for additional defense
+  in depth; also out of scope.

 ## Private activity encryption

@@ -208,8 +242,9 @@ server flags:

 ## Things to flag, not silently change

-The spec invites flagging anything cryptographically unsound. The above design
-follows the spec exactly. The one place where I would push back if asked to go
-to production (not deferred for this scaffold) is the recovery lockout DoS —
-without a server-side proof of the recovery code, the recovery endpoint is a
-soft-DoS vector. Documented above, deferred for the scaffold.
+The spec invites flagging anything cryptographically unsound. The original
+spec stored only `kek_salt`, `wrapped_dek_pw`+nonce, `rec_salt`,
+`wrapped_dek_rec`+nonce. We deviate by additionally storing `rec_auth_salt`
+and `rec_auth_verifier_hash` to close the lockout DoS on `/auth/recovery-complete`.
+The deviation is documented above. Salts and verifier hashes are not secret;
+the storage shape is otherwise unchanged.