docs: initial planning artifacts for fjmcp-broker
Establish project scope, architecture, and phased implementation plan for an OAuth 2.1 broker that fronts forgejo-mcp, delegating user authentication to Forgejo and spawning a per-session stdio forgejo-mcp subprocess scoped to each authenticated user's token. No code yet — planning only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
commit 2c7b50012c
4 changed files with 539 additions and 0 deletions
175
docs/plan.md
Normal file
@@ -0,0 +1,175 @@
# Implementation plan

Seven phases, each independently reviewable. Don't skip phases — the later ones depend on foundations from earlier ones, and each phase has a natural integration test that keeps the next phase honest.

See [`design.md`](design.md) for the architecture this plan implements.

## Phase 1 — Skeleton

**Goal.** An empty binary that starts, logs, serves a health endpoint, opens its SQLite store, and shuts down cleanly.

**Scope.**
- `cmd/broker/main.go` with flag + env parsing using `flag` + `os.Getenv` (keep deps small; no cobra/viper yet).
- Package layout: `internal/config`, `internal/log`, `internal/store`, `internal/httpserver`.
- Config validation at startup: required fields present, public URL parseable, SQLite path writable.
- SQLite open + schema migration (embed SQL with `embed.FS`, apply in a transaction).
- `GET /healthz` returns `200 OK` with build info.
- Structured JSON logging to stderr.
- `SIGTERM` / `SIGINT` triggers orderly shutdown.
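The startup validation bullet could look roughly like the sketch below. It is illustrative only: the `Config` field names, the `exampleConfig` helper, and the temp-file writability probe are assumptions, not decisions taken in this plan.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"os"
	"path/filepath"
)

// Config holds the startup settings. Field names are hypothetical.
type Config struct {
	PublicURL  string // externally reachable base URL of the broker
	SQLitePath string // path to the SQLite database file
}

// Validate checks that required fields are present, that the public URL is
// an absolute http(s) URL, and that the SQLite directory accepts new files.
func (c Config) Validate() error {
	if c.PublicURL == "" || c.SQLitePath == "" {
		return errors.New("config: public URL and SQLite path are required")
	}
	u, err := url.Parse(c.PublicURL)
	if err != nil {
		return fmt.Errorf("config: public URL: %w", err)
	}
	if (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
		return fmt.Errorf("config: public URL %q must be an absolute http(s) URL", c.PublicURL)
	}
	// Probe writability by creating and removing a temp file next to the DB.
	f, err := os.CreateTemp(filepath.Dir(c.SQLitePath), ".probe-*")
	if err != nil {
		return fmt.Errorf("config: SQLite directory not writable: %w", err)
	}
	f.Close()
	return os.Remove(f.Name())
}

// exampleConfig returns a config that should validate on most machines.
func exampleConfig() Config {
	return Config{
		PublicURL:  "https://broker.example.com",
		SQLitePath: filepath.Join(os.TempDir(), "broker.db"),
	}
}

func main() {
	if err := exampleConfig().Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("config ok")
}
```

Returning a wrapped error rather than exiting inside `Validate` keeps the check unit-testable; `main` stays the only place that decides to fail fast.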

**Out of scope.** OAuth, MCP, subprocesses.

**Acceptance.**
- `go test ./...` passes (config parsing, migration applies cleanly).
- Binary starts when the required env vars are set; fails fast with a clear error when a required env var is missing.
- `curl localhost:8080/healthz` returns `{"status":"ok","version":"...","git":"..."}`.
- Sending `SIGTERM` closes the listener and exits within 2 seconds.
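One way the `/healthz` response shape from the acceptance list could be produced is sketched below. The `version` and `gitRev` variables and the idea of injecting them with `-ldflags -X` are assumptions; only the JSON field names come from the plan.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Build metadata. Assumed to be injected at link time, for example with
// go build -ldflags "-X main.version=... -X main.gitRev=...".
var (
	version = "dev"
	gitRev  = "unknown"
)

// health mirrors the JSON shape named in the acceptance criteria.
type health struct {
	Status  string `json:"status"`
	Version string `json:"version"`
	Git     string `json:"git"`
}

// healthBody renders the health document for the given build info.
func healthBody(version, git string) ([]byte, error) {
	return json.Marshal(health{Status: "ok", Version: version, Git: git})
}

// healthz is the HTTP handler; it would be mounted at GET /healthz.
func healthz(w http.ResponseWriter, r *http.Request) {
	body, err := healthBody(version, gitRev)
	if err != nil {
		http.Error(w, "internal error", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	w.Write(body)
}

func main() {
	// Exercise the handler in-process instead of opening a listener.
	rec := httptest.NewRecorder()
	healthz(rec, httptest.NewRequest(http.MethodGet, "/healthz", nil))
	fmt.Println(rec.Code, rec.Body.String())
}
```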

## Phase 2 — OAuth authorization-server facade

**Goal.** A fully functional OAuth 2.1 AS that delegates user auth to Forgejo. Testable end-to-end with curl and a real Forgejo instance, without any MCP code in the picture.

**Scope.**
- Discovery endpoints (`/.well-known/oauth-protected-resource`, `/.well-known/oauth-authorization-server`).
- DCR (`POST /oauth/register`).
- Authorize flow (`GET /oauth/authorize` → Forgejo → `/oauth/callback`).
- Token endpoint (`POST /oauth/token`) — authorization code grant and refresh token grant.
- Revoke endpoint (`POST /oauth/revoke`).
- PKCE enforcement (S256 only; reject flows without it).
- Token store: `clients`, `auth_codes`, `access_tokens`, `refresh_tokens` tables.
- Forgejo upstream client: authorize URL builder, token exchange, refresh, userinfo.
- **Decision point**: hand-rolled vs. fosite vs. zitadel/oidc — prototype the hand-rolled path first; swap if it balloons past ~1000 lines.
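For the PKCE bullet, the S256 check itself is small. The sketch below follows RFC 7636 (challenge = unpadded base64url of the SHA-256 of the verifier, verifier length 43 to 128 characters); the function names are hypothetical and the hand-rolled path is only one of the options the decision point leaves open.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
)

// challengeS256 derives the code_challenge for a verifier:
// base64url without padding over SHA-256(verifier).
func challengeS256(verifier string) string {
	sum := sha256.Sum256([]byte(verifier))
	return base64.RawURLEncoding.EncodeToString(sum[:])
}

// verifyPKCE reports whether verifier matches the challenge stored with the
// authorization code. Only the S256 method is accepted.
func verifyPKCE(method, challenge, verifier string) bool {
	if method != "S256" || challenge == "" {
		return false // missing PKCE or a non-S256 method is rejected
	}
	if len(verifier) < 43 || len(verifier) > 128 {
		return false // outside the RFC 7636 length bounds
	}
	got := challengeS256(verifier)
	// Constant-time comparison avoids leaking how many bytes matched.
	return subtle.ConstantTimeCompare([]byte(got), []byte(challenge)) == 1
}

func main() {
	verifier := "dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk" // example value
	challenge := challengeS256(verifier)
	fmt.Println("S256 accepted: ", verifyPKCE("S256", challenge, verifier))
	fmt.Println("plain accepted:", verifyPKCE("plain", verifier, verifier))
}
```

A token handler would map a `false` result to the 400 responses listed in the acceptance criteria.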

**Out of scope.** MCP endpoint, subprocess management.

**Acceptance.**
- Walk through the full flow with curl against a real Forgejo test instance:
  1. `POST /oauth/register` → get `client_id`.
  2. Browser hits `/oauth/authorize` with PKCE → bounces to Forgejo → consent → back to `/oauth/callback` → redirects to `redirect_uri` with code.
  3. `POST /oauth/token` with the code + verifier → receive broker access+refresh tokens.
  4. `POST /oauth/token` with `grant_type=refresh_token` → new access token.
  5. `POST /oauth/revoke` → subsequent uses of the token fail.
- Discovery documents conform to RFC 8414 (authorization server metadata) and RFC 9728 (protected resource metadata).
- PKCE missing → 400. Non-S256 → 400. Wrong verifier → 400.
- Expired codes and tokens rejected.
- Tokens stored as SHA-256 hashes; cleartext never persisted.
- Test coverage on the AS handlers ≥ 80%.
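The hashed-storage criterion can be illustrated as follows. This is a sketch under assumptions: the 32-byte token size, the hex encoding of the digest, and the function names are not specified by the plan.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// newToken returns a random opaque token to hand to the client.
// 32 random bytes is an assumed size, not a requirement from the plan.
func newToken() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(b), nil
}

// hashToken returns the value that would be persisted and later used as the
// lookup key: the hex-encoded SHA-256 digest of the token.
func hashToken(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

func main() {
	tok, err := newToken()
	if err != nil {
		fmt.Println("token generation failed:", err)
		return
	}
	// Only the hash would reach the database; the cleartext goes to the client once.
	fmt.Println("store:", hashToken(tok))
}
```

A plain unsalted digest is adequate here because the tokens are high-entropy random values rather than user-chosen passwords, and it keeps lookup a single indexed equality match.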

## Phase 3 — Subprocess supervisor

**Goal.** A reusable component that spawns, babysits, and reaps `forgejo-mcp` child processes. Zero knowledge of OAuth or MCP yet — it's a generic "managed stdio subprocess" abstraction.

**Scope.**
- `internal/supervisor` package: `type Child` with `Start`, `Stop(ctx)`, `Stdin() io.Writer`, `StdoutReader() *bufio.Reader`.
- Correct `Wait()` in a goroutine on every start — no zombies.
- Graceful stop: `SIGTERM` → wait up to N seconds → `SIGKILL`.
- Stderr drainer: reads stderr line-by-line and logs with a prefix supplied at spawn time.
- Process death detection: closes `Done` channel; exposes `ExitErr()`.
- Optional startup health probe: wait for first newline on stdout with timeout — catches "child exited immediately" early.
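A rough shape for the `Child` type described above is sketched here, for Unix-like systems only. It covers the wait goroutine, the `Done` channel, and the `SIGTERM` then `SIGKILL` stop; the stderr drainer and startup probe are left out. The `StopWithin` convenience and the use of `os.Pipe` (so that `Wait` never races with stdout reads) are choices made for this sketch, not requirements from the plan.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"io"
	"os"
	"os/exec"
	"syscall"
	"time"
)

// Child is a supervised stdio subprocess.
type Child struct {
	cmd    *exec.Cmd
	stdin  *os.File
	stdout *os.File
	rd     *bufio.Reader
	done   chan struct{}
	err    error // written once, before done is closed
}

// Start spawns the command with pipes on stdin/stdout and begins waiting on it.
func Start(name string, args ...string) (*Child, error) {
	inR, inW, err := os.Pipe()
	if err != nil {
		return nil, err
	}
	outR, outW, err := os.Pipe()
	if err != nil {
		inR.Close()
		inW.Close()
		return nil, err
	}
	cmd := exec.Command(name, args...)
	cmd.Stdin, cmd.Stdout = inR, outW
	err = cmd.Start()
	inR.Close() // the child holds its own copies of these ends
	outW.Close()
	if err != nil {
		inW.Close()
		outR.Close()
		return nil, err
	}
	c := &Child{cmd: cmd, stdin: inW, stdout: outR, rd: bufio.NewReader(outR), done: make(chan struct{})}
	go func() { // always reap the process so it cannot become a zombie
		c.err = cmd.Wait()
		close(c.done)
	}()
	return c, nil
}

func (c *Child) Stdin() io.Writer            { return c.stdin }
func (c *Child) StdoutReader() *bufio.Reader { return c.rd }
func (c *Child) Done() <-chan struct{}       { return c.done }

// ExitErr returns the Wait error once the child has exited, nil before that.
func (c *Child) ExitErr() error {
	select {
	case <-c.done:
		return c.err
	default:
		return nil
	}
}

// Stop sends SIGTERM, waits until ctx expires, then falls back to SIGKILL.
func (c *Child) Stop(ctx context.Context) error {
	c.stdin.Close()
	_ = c.cmd.Process.Signal(syscall.SIGTERM)
	select {
	case <-c.done:
	case <-ctx.Done():
		_ = c.cmd.Process.Kill()
		<-c.done
	}
	c.stdout.Close()
	return c.err
}

// StopWithin is a hypothetical convenience wrapper around Stop.
func (c *Child) StopWithin(grace time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), grace)
	defer cancel()
	return c.Stop(ctx)
}

func main() {
	c, err := Start("cat") // cat stands in for the echo-loop helper binary
	if err != nil {
		fmt.Println("start:", err)
		return
	}
	fmt.Fprintln(c.Stdin(), "ping")
	line, _ := c.StdoutReader().ReadString('\n')
	fmt.Printf("echoed %q\n", line)
	fmt.Println("exit error:", c.StopWithin(2*time.Second))
}
```

In this sketch the caller is expected to call `Stop` even after the child exits on its own, since that is where the stdout pipe is closed.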

**Out of scope.** The registry. Per-session state. MCP-specific framing.

**Acceptance.**
- Unit tests with a tiny `echo`-loop helper binary:
  - Spawn → write line → read line → stop gracefully.
  - Kill-after-grace when child ignores SIGTERM.
  - `Done` closes when child exits on its own.
- Manual test: spawn a real `forgejo-mcp --transport stdio` with a test token; confirm clean startup and shutdown.
- No goroutine leaks (check with `goleak`).
- No FD leaks across 1000 spawn/stop cycles.

## Phase 4 — Stdio-to-SSE bridge

**Goal.** A handler that takes an HTTP request with JSON-RPC body, pipes it to a supervised child's stdin, and streams the child's stdout back as an SSE-framed HTTP response.
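To make the SSE framing concrete, here is a minimal sketch of rendering one child message as a Server-Sent Events frame. The `message` event name and the function names are assumptions; the framing itself (optional `event:` line, one `data:` line per payload line, blank line terminator) is the standard SSE wire format.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// formatSSE renders one SSE frame. A payload containing newlines is split
// across several data: lines, and a blank line terminates the frame.
func formatSSE(event, data string) string {
	var b strings.Builder
	if event != "" {
		b.WriteString("event: " + event + "\n")
	}
	for _, line := range strings.Split(data, "\n") {
		b.WriteString("data: " + line + "\n")
	}
	b.WriteString("\n")
	return b.String()
}

// writeSSE sends one frame and flushes it so the client sees it immediately.
// A write error usually means the client has disconnected.
func writeSSE(w http.ResponseWriter, event, data string) error {
	if _, err := w.Write([]byte(formatSSE(event, data))); err != nil {
		return err
	}
	if f, ok := w.(http.Flusher); ok {
		f.Flush()
	}
	return nil
}

func main() {
	fmt.Print(formatSSE("message", `{"jsonrpc":"2.0","id":1,"result":{}}`))
}
```

A handler using `writeSSE` would first set `Content-Type: text/event-stream` and stop forwarding when the request context is cancelled.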

**Scope.**
- `internal/bridge` package.
- Per-child reader goroutine that reads full JSON-RPC messages (newline-delimited) and dispatches them to registered response writers keyed by request id.
- SSE writer: writes `event:` + `data:` frames, flushes after each, handles client disconnect.
- Send timeout and backpressure: if the HTTP client is slow, don't OOM the broker.
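The per-child reader and id routing could be sketched like this. The `Router` type, its method names, the 4 MiB message cap, and the choice to key waiters by the raw JSON text of `id` are all assumptions for illustration; server-initiated notifications are simply skipped here.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
	"sync"
)

// Router hands each JSON-RPC response to the waiter registered for its id.
type Router struct {
	mu      sync.Mutex
	waiters map[string]chan json.RawMessage
}

func NewRouter() *Router {
	return &Router{waiters: make(map[string]chan json.RawMessage)}
}

// Register returns a channel that receives the response for id. The key is
// the raw JSON text of the id, so 1 and "1" stay distinct.
func (r *Router) Register(id string) <-chan json.RawMessage {
	ch := make(chan json.RawMessage, 1) // buffer of 1: the reader never blocks
	r.mu.Lock()
	r.waiters[id] = ch
	r.mu.Unlock()
	return ch
}

// Dispatch routes one message and reports whether a waiter received it.
func (r *Router) Dispatch(line []byte) bool {
	var env struct {
		ID json.RawMessage `json:"id"`
	}
	if err := json.Unmarshal(line, &env); err != nil || len(env.ID) == 0 {
		return false // malformed, or a notification without an id
	}
	r.mu.Lock()
	ch, ok := r.waiters[string(env.ID)]
	delete(r.waiters, string(env.ID))
	r.mu.Unlock()
	if !ok {
		return false
	}
	ch <- append(json.RawMessage(nil), line...) // copy: the scanner reuses its buffer
	return true
}

// Run reads newline-delimited messages from the child's stdout until EOF.
func (r *Router) Run(stdout io.Reader) error {
	sc := bufio.NewScanner(stdout)
	sc.Buffer(make([]byte, 64*1024), 4<<20) // cap a single message at 4 MiB (assumed limit)
	for sc.Scan() {
		r.Dispatch(sc.Bytes())
	}
	return sc.Err()
}

func main() {
	r := NewRouter()
	first, second := r.Register("1"), r.Register("2")
	// Responses arrive out of order; routing is by id, not by position.
	out := `{"jsonrpc":"2.0","id":2,"result":"second"}` + "\n" +
		`{"jsonrpc":"2.0","id":1,"result":"first"}` + "\n"
	if err := r.Run(strings.NewReader(out)); err != nil {
		fmt.Println("read error:", err)
	}
	fmt.Println(string(<-first))
	fmt.Println(string(<-second))
}
```

Because each waiter channel holds exactly one response, a slow HTTP client can delay only its own stream; the reader goroutine itself never blocks on it. The scanner's buffer cap bounds memory per message.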

**Out of scope.** Session identity. OAuth. Registry.

**Acceptance.**
- Unit tests against a mock `Child` that echoes input:
  - Request → response round trip.
  - Multiple concurrent requests on one child, correct id routing.
  - Client disconnect mid-stream cleanly stops forwarding.
- Integration test against a real `forgejo-mcp` child:
  - `initialize` handshake completes.
  - `tools/list` returns the known tool set.
  - `tools/call` against `get_forgejo_mcp_server_version` succeeds.

## Phase 5 — Glue: gated `/mcp` endpoint

**Goal.** Everything wired. An authenticated Claude.ai-style client can connect, initialize a session, and call tools.

**Scope.**
- Session registry keyed by `Mcp-Session-Id`.
- Bearer-token middleware on `/mcp`: resolves to Forgejo access token via the store; rejects missing/expired.
- On `initialize` with no session: generate `sid`, spawn `forgejo-mcp` via supervisor with the user's Forgejo token, attach via bridge.
- On subsequent requests: look up session, dispatch via bridge.
- Reaper goroutine: idle timeout enforcement.
- Forgejo token rotation (Forgejo refresh + child respawn) per `design.md` §6.
- Token-revocation signal: kill any sessions backed by the revoked broker token.
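The registry and idle reaper might be structured along these lines. Everything named here (`Registry`, `Create`, `Touch`, `Reap`, the 16-byte session id, the 10-minute timeout in `demo`) is an assumption for illustration; the clock is passed in explicitly so the reaping rule can be tested without sleeping.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"sync"
	"time"
)

type session struct {
	lastUsed time.Time
	close    func() // stops the child; assumed to be supplied by the supervisor
}

// Registry tracks live sessions by their Mcp-Session-Id value.
type Registry struct {
	mu       sync.Mutex
	sessions map[string]*session
	idle     time.Duration
}

func NewRegistry(idle time.Duration) *Registry {
	return &Registry{sessions: make(map[string]*session), idle: idle}
}

// Create registers a new session under a random id and returns that id.
func (r *Registry) Create(now time.Time, closeFn func()) (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	id := hex.EncodeToString(b)
	r.mu.Lock()
	r.sessions[id] = &session{lastUsed: now, close: closeFn}
	r.mu.Unlock()
	return id, nil
}

// Touch records activity on a session; false means the id is unknown.
func (r *Registry) Touch(id string, now time.Time) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	s, ok := r.sessions[id]
	if ok {
		s.lastUsed = now
	}
	return ok
}

// Reap removes sessions idle for at least the timeout and closes them
// outside the lock. A reaper goroutine would call it from a ticker.
func (r *Registry) Reap(now time.Time) int {
	var victims []*session
	r.mu.Lock()
	for id, s := range r.sessions {
		if now.Sub(s.lastUsed) >= r.idle {
			victims = append(victims, s)
			delete(r.sessions, id)
		}
	}
	r.mu.Unlock()
	for _, s := range victims {
		s.close()
	}
	return len(victims)
}

// demo drives the registry with a synthetic clock and reports how many
// sessions each reap pass closed.
func demo() (early, late int, closed bool) {
	reg := NewRegistry(10 * time.Minute)
	t0 := time.Unix(0, 0)
	id, err := reg.Create(t0, func() { closed = true })
	if err != nil {
		return 0, 0, false
	}
	reg.Touch(id, t0.Add(5*time.Minute))
	early = reg.Reap(t0.Add(12 * time.Minute)) // 7 minutes idle: kept
	late = reg.Reap(t0.Add(16 * time.Minute))  // 11 minutes idle: reaped
	return early, late, closed
}

func main() {
	fmt.Println(demo())
}
```

The same `close` callback could serve the revocation path: tearing down every session backed by a revoked broker token is a filtered variant of `Reap`.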

**Out of scope.** Pretty logs, metrics, packaging.

**Acceptance.**
- End-to-end with curl, simulating a full MCP client:
  1. OAuth dance → broker access token.
  2. `POST /mcp` with `initialize` → session created, spawn visible in logs.
  3. `POST /mcp` with `tools/list` using `Mcp-Session-Id` → response from forgejo-mcp.
  4. Idle → child reaped after timeout.
  5. Revoke token → sessions torn down.
- Load test: 20 concurrent sessions stable for 10 minutes. No FD leaks, no zombies, no goroutine leaks.

## Phase 6 — Packaging and deployment artifacts

**Goal.** One-command deploy.

**Scope.**
- `Containerfile` with multi-stage build, nonroot user, OCI labels (`org.opencontainers.image.created`, `.revision`), `/etc/build-info`.
- `Makefile` targets: `build`, `test`, `lint`, `image`, `image-push`.
- Example `Caddyfile` fragment in `deploy/caddy/`.
- Example `compose.yaml` in `deploy/compose/` that stands up broker + Caddy together.
- Example systemd unit (optional) for non-container deploys.
- `README.md` updated with concrete quick-start: "clone, set five env vars, `docker compose up`".

**Out of scope.** Helm chart, nixpkg, AUR (can follow later if there's demand).

**Acceptance.**
- `make image` produces an image under 50 MB.
- `docker compose up` → broker healthy, Caddy serving valid TLS on a test hostname.
- A fresh developer can go from clone to working Claude.ai connection in under 15 minutes following the README.

## Phase 7 — Claude.ai end-to-end

**Goal.** Prove the whole thing works against the actual target client.

**Scope.**
- Deploy a reachable instance (staging Forgejo + public DNS + TLS).
- Configure as a Claude.ai custom connector.
- Walk through: tool discovery, tool invocation, session timeout, reconnect, token refresh.
- Write up findings: what worked, what surprised us, what needs tweaking.

**Out of scope.** Publicising the project, marketing, submitting to MCP directories.

**Acceptance.**
- Claude.ai can complete OAuth and list tools.
- All `forgejo-mcp` tools invocable from Claude.ai with expected results.
- A 30-minute idle session reconnects without manual intervention.
- A Forgejo token refresh occurs during an active session without breaking anything the user can see.
- Postmortem document captured in `docs/phase7-findings.md`.

---

## Cross-cutting conventions

- **Go version**: track the latest stable minor (update `go.mod` as needed).
- **Dependencies**: `stdlib + modernc.org/sqlite + golang.org/x/oauth2` baseline; every new dep needs a line in a `docs/deps.md` justifying it.
- **Linting**: `golangci-lint run` clean before merge.
- **Testing**: `go test -race ./...` clean before merge; prefer table-driven tests.
- **Logging**: structured JSON via `log/slog`. Never log tokens, even hashed.
- **Commits**: conventional commits (`feat:`, `fix:`, `chore:`…), atomic, referencing issue IDs once an issue tracker is in place.
- **Issue tracking**: set up `bd` (beads) inside this repo at the start of phase 1, so every phase's work lands as discrete issues.
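One way to back the "never log tokens" rule with the type system is the `slog.LogValuer` interface. In this sketch the `Secret` type name and the `[REDACTED]` placeholder are assumptions; the mechanism is standard `log/slog`.

```go
package main

import (
	"log/slog"
	"os"
)

// Secret wraps a token so that it can be passed to the logger by accident
// without its value ever being written out.
type Secret string

// LogValue implements slog.LogValuer; the built-in handlers resolve it
// before encoding, so only the placeholder reaches the log.
func (Secret) LogValue() slog.Value {
	return slog.StringValue("[REDACTED]")
}

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stderr, nil))
	token := Secret("example-token-value")
	logger.Info("token issued", "client_id", "example-client", "token", token)
}
```

This only protects values that are carried as `Secret`; a token converted back to `string` before logging would still leak, so the convention above remains the primary rule.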