docs: initial planning artifacts for fjmcp-broker
Establish project scope, architecture, and phased implementation plan for an OAuth 2.1 broker that fronts forgejo-mcp, delegating user authentication to Forgejo and spawning a per-session stdio forgejo-mcp subprocess scoped to each authenticated user's token. No code yet; planning only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
commit 2c7b50012c
4 changed files with 539 additions and 0 deletions
.gitignore (31 lines, vendored, normal file)
@@ -0,0 +1,31 @@
# Binaries
/fjmcp-broker
/dist/
*.exe

# Go build artifacts
*.test
*.out
/vendor/

# SQLite data files (broker token store)
*.db
*.db-journal
*.db-wal
*.db-shm
/data/

# Environment / secrets
.env
.env.local
*.pem
*.key

# Editor / OS
.idea/
.vscode/
.DS_Store
*.swp

# Logs
*.log
README.md (34 lines, normal file)
@@ -0,0 +1,34 @@
# forgejo-mcp-broker

OAuth 2.1 authorization server and MCP session broker for [forgejo-mcp](https://codeberg.org/goern/forgejo-mcp).

Lets MCP clients such as **Claude.ai** connect to a Forgejo instance through a single public HTTPS endpoint, with per-user authentication delegated to Forgejo's own OAuth2 provider. The broker handles the OAuth dance, then spawns a dedicated `forgejo-mcp --transport stdio` subprocess for each authenticated session, scoped to the authenticated user's Forgejo access token.

**Status:** Planning. No code yet. See [`docs/design.md`](docs/design.md) for the architecture and [`docs/plan.md`](docs/plan.md) for the phased implementation plan.

## How it fits

```
Claude.ai ──HTTPS──▶ Caddy ──▶ fjmcp-broker ──stdio──▶ forgejo-mcp ──▶ Forgejo API
                                  (this)               (one per user   (per-user
                                                        session)        token)
```

- **`fjmcp-broker`** (this project): one long-running process. Handles OAuth discovery, dynamic client registration, the authorization-code flow against Forgejo, session lifecycle, and stdio-to-streamable-HTTP bridging.
- **`forgejo-mcp`** (existing project): used as-is. Spawned per-session with the authenticated user's `FORGEJO_ACCESS_TOKEN` in the environment.
- **Caddy**: terminates TLS for the public hostname and reverse-proxies to the broker.

## Why a broker instead of adding OAuth to forgejo-mcp?

Process-level isolation. Each user's Forgejo token lives in exactly one subprocess, so the broker never needs to demultiplex tokens inside a single shared client. This keeps forgejo-mcp's `sync.Once` singleton-client pattern valid and avoids a refactor of every tool handler. Full trade-off in [`docs/design.md`](docs/design.md).

## Quick map

| File | What |
|---|---|
| [`docs/design.md`](docs/design.md) | Architecture, components, token flow, deployment, security |
| [`docs/plan.md`](docs/plan.md) | Seven-phase implementation plan with acceptance criteria |

## License

TBD: pick one before the first public release. Likely MIT or Apache-2.0 to match the Forgejo ecosystem.
docs/design.md (299 lines, normal file)
@@ -0,0 +1,299 @@
# Design: forgejo-mcp-broker

## 1. Problem

Claude.ai and other MCP clients that follow the MCP authorization spec expect to connect to an MCP server over **streamable HTTP** with an **OAuth 2.1** authorization flow, including:

- RFC 9728 protected-resource metadata (`/.well-known/oauth-protected-resource`)
- RFC 8414 authorization-server metadata (`/.well-known/oauth-authorization-server`)
- RFC 7591 dynamic client registration (`POST /oauth/register` in this design)
- PKCE + authorization code flow

`forgejo-mcp` speaks streamable HTTP but authenticates with a single shared Forgejo personal access token baked into the process at startup. It has no notion of per-user identity and cannot serve multiple users at once.

Forgejo is a capable OAuth2 provider (endpoints under `/login/oauth/*`, OIDC discovery at `/.well-known/openid-configuration`), but it **does not support RFC 7591 dynamic client registration**. Claude.ai cannot register itself as a client against a Forgejo instance directly.

Something has to sit in the middle: an authorization server that accepts dynamic registration from MCP clients and delegates user authentication to Forgejo.

## 2. Non-goals

- **Not a Forgejo OAuth proxy for arbitrary API use.** Only the MCP protocol surface is exposed.
- **Not multi-Forgejo.** One broker instance speaks to one Forgejo URL.
- **Not a drop-in replacement for `forgejo-mcp`.** It wraps and supervises `forgejo-mcp`; it does not replace it.
- **Not a horizontally-scaled service for public SaaS use.** Target is self-hosted / team-scale deployments (tens of concurrent sessions). Scaling further requires design changes (see section 9).

## 3. Architecture

The broker plays two roles simultaneously:

- **OAuth Authorization Server** to the MCP client (Claude.ai).
- **OAuth Client** of Forgejo.

Tokens issued to the MCP client are opaque strings minted by the broker. Each one maps internally to a real Forgejo access/refresh token pair, which the broker holds in its store. The Forgejo access token is passed to the subprocess via its environment.

```
                            ┌──────── container / pod ──────────┐
                            │                                   │
                            │   ┌──────────────────────────┐    │
                            │   │  fjmcp-broker :8080      │    │
Claude.ai ──HTTPS─▶ Caddy─▶ │   │                          │    │
                            │   │  • Discovery endpoints   │    │
                            │   │  • /oauth/register (DCR) │    │
                            │   │  • /oauth/authorize ─┐   │    │───▶ Forgejo /login/oauth/authorize
                            │   │  • /oauth/callback  ◀┘   │    │◀── code
                            │   │  • /oauth/token          │    │───▶ Forgejo /login/oauth/access_token
                            │   │  • /oauth/revoke         │    │
                            │   │  • /mcp (gated)          │    │
                            │   │  • Session registry      │    │
                            │   │  • Supervisor + reaper   │    │
                            │   └──────────┬───────────────┘    │
                            │              │ spawn + pipes      │
                            │              ▼                    │
                            │   ┌──────────────────────────┐    │
                            │   │  forgejo-mcp (stdio)     │ ──▶│──▶ Forgejo API
                            │   │  FORGEJO_ACCESS_TOKEN=…  │    │
                            │   │  one per active session  │    │
                            │   └──────────────────────────┘    │
                            └───────────────────────────────────┘
```

## 4. Component: OAuth authorization-server facade

### 4.1 Endpoints

| Method | Path | Purpose |
|---|---|---|
| `GET` | `/.well-known/oauth-protected-resource` | Advertise: I am a resource server, my AS is at this issuer. |
| `GET` | `/.well-known/oauth-authorization-server` | Advertise endpoints, PKCE required, supported scopes. |
| `POST` | `/oauth/register` | RFC 7591 dynamic client registration. Accept any well-formed request; persist and return `client_id`. |
| `GET` | `/oauth/authorize` | Validate PKCE + `redirect_uri` + `client_id`. Stash state. 302 to Forgejo's `/login/oauth/authorize`. |
| `GET` | `/oauth/callback` | Receive Forgejo's auth code. Exchange for Forgejo access+refresh tokens. Mint broker auth code. Redirect back to MCP client's `redirect_uri`. |
| `POST` | `/oauth/token` | Exchange broker auth code → broker access+refresh token. Persist mapping `broker_token → forgejo_token`. |
| `POST` | `/oauth/revoke` | Invalidate a broker token; revoke upstream Forgejo token if possible. |

### 4.2 Token store (SQLite)

One file, mounted as a volume for persistence across container restarts. Pure-Go driver: `modernc.org/sqlite` (no CGO, keeps the container image fully static).

Tables:

- **`clients`**: `client_id`, `client_secret` (nullable for public clients), `redirect_uris[]`, `created_at`, `last_used`, optional `metadata_json`.
- **`auth_codes`**: `code`, `client_id`, `redirect_uri`, `code_challenge`, `code_challenge_method`, `forgejo_access_token`, `forgejo_refresh_token`, `forgejo_token_expires_at`, `forgejo_user_id`, `forgejo_username`, `scopes`, `expires_at` (~10 min), `used_at`.
- **`access_tokens`**: `token_hash`, `client_id`, `forgejo_user_id`, `forgejo_username`, `scopes`, `expires_at`, `forgejo_access_token`, `forgejo_refresh_token`, `forgejo_token_expires_at`, `revoked_at`.
- **`refresh_tokens`**: `token_hash`, `access_token_hash`, `client_id`, `expires_at`, `revoked_at`.

Broker tokens are stored **hashed** (SHA-256). The plaintext leaves the broker exactly once, when handed to the MCP client.

Forgejo tokens are stored in cleartext (the broker must be able to use them to spawn subprocesses). This means the SQLite file is a sensitive secret at rest. Mitigations:

- Volume permissions locked to the broker's UID/GID.
- Consider OS-level encryption of the mount (LUKS, cloud KMS-backed volume) for production deployments.
- Optional: encrypt Forgejo tokens at the application layer with a key loaded from env. This adds complexity; decide in phase 2.

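The hashing rule for broker tokens can be sketched as follows. This is a minimal illustration rather than the project's implementation: the function names `mintToken` and `hashToken`, the 32-byte token size, and the base64url and hex encodings are all assumptions made for the example.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// mintToken returns a new opaque bearer token (handed to the MCP client once)
// together with the hex SHA-256 digest that would be stored as token_hash.
// Both names and the 32-byte size are illustrative assumptions.
func mintToken() (plaintext, hash string, err error) {
	buf := make([]byte, 32) // 256 bits from the OS CSPRNG
	if _, err := rand.Read(buf); err != nil {
		return "", "", fmt.Errorf("read random bytes: %w", err)
	}
	plaintext = base64.RawURLEncoding.EncodeToString(buf)
	return plaintext, hashToken(plaintext), nil
}

// hashToken derives the store lookup key for a presented bearer token.
func hashToken(plaintext string) string {
	sum := sha256.Sum256([]byte(plaintext))
	return hex.EncodeToString(sum[:])
}

func main() {
	tok, hash, err := mintToken()
	if err != nil {
		panic(err)
	}
	fmt.Println("hand to client:", tok)
	fmt.Println("persist in DB: ", hash)
	fmt.Println("lookup matches:", hashToken(tok) == hash)
}
```

Because the token is high-entropy random data, a plain SHA-256 digest is a reasonable lookup key here; a slow password hash would only be needed for low-entropy secrets.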
### 4.3 Forgejo OAuth app configuration (one-time, operator task)

1. Sign in to Forgejo as the operator / service account that should "own" this integration.
2. **Settings → Applications → OAuth2 Applications → Create application**.
3. Redirect URI: `https://<public-hostname>/oauth/callback` (the broker's public URL).
4. Save `client_id` and `client_secret` into broker env:
   - `FORGEJO_OAUTH_CLIENT_ID`
   - `FORGEJO_OAUTH_CLIENT_SECRET`
5. Pick the scope set. Forgejo scopes are coarse. A superset that matches `forgejo-mcp`'s current tool surface: `read:user write:repository write:issue write:notification read:organization`. Configurable via `FORGEJO_OAUTH_SCOPES`.

### 4.4 Public base URL

The broker must know its own public URL to emit correct redirect URIs and discovery metadata. Required config:

- `--public-url` / `FJMCP_BROKER_PUBLIC_URL`, e.g. `https://mcp.example.com`.

All issuer URLs in discovery documents are built from this value, **never** from the inbound `Host` or `X-Forwarded-*` headers. Publishing attacker-controlled issuer URLs is a classic OAuth vulnerability.

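As a rough sketch of that rule, the discovery document can be derived from the configured URL alone, with no request in scope at all. The type `asMetadata`, the function `buildASMetadata`, and the choice to reject non-HTTPS URLs are assumptions for illustration; the JSON field names follow RFC 8414 and the paths follow the endpoint table in section 4.1.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strings"
)

// asMetadata is a subset of the RFC 8414 authorization-server metadata.
type asMetadata struct {
	Issuer                        string   `json:"issuer"`
	AuthorizationEndpoint         string   `json:"authorization_endpoint"`
	TokenEndpoint                 string   `json:"token_endpoint"`
	RegistrationEndpoint          string   `json:"registration_endpoint"`
	RevocationEndpoint            string   `json:"revocation_endpoint"`
	ResponseTypesSupported        []string `json:"response_types_supported"`
	GrantTypesSupported           []string `json:"grant_types_supported"`
	CodeChallengeMethodsSupported []string `json:"code_challenge_methods_supported"`
}

// buildASMetadata derives every URL from the configured public URL.
// It never sees an *http.Request, so inbound headers cannot influence it.
func buildASMetadata(publicURL string) (asMetadata, error) {
	u, err := url.Parse(publicURL)
	if err != nil {
		return asMetadata{}, fmt.Errorf("parse public URL: %w", err)
	}
	if u.Scheme != "https" || u.Host == "" || u.RawQuery != "" || u.Fragment != "" {
		return asMetadata{}, fmt.Errorf("public URL must be https without query or fragment: %q", publicURL)
	}
	base := strings.TrimRight(u.String(), "/")
	return asMetadata{
		Issuer:                        base,
		AuthorizationEndpoint:         base + "/oauth/authorize",
		TokenEndpoint:                 base + "/oauth/token",
		RegistrationEndpoint:          base + "/oauth/register",
		RevocationEndpoint:            base + "/oauth/revoke",
		ResponseTypesSupported:        []string{"code"},
		GrantTypesSupported:           []string{"authorization_code", "refresh_token"},
		CodeChallengeMethodsSupported: []string{"S256"},
	}, nil
}

func main() {
	md, err := buildASMetadata("https://mcp.example.com")
	if err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(md, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Building the document once at startup and serving the cached bytes would make it structurally impossible for a handler to mix in request-derived values.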
### 4.5 Library choice

Two candidates for the AS implementation:

- **Hand-rolled minimal AS**: the flow is narrow (authorization code + PKCE + DCR + refresh + revoke). Probably 500–800 lines plus tests. Pro: no heavy dependency, full control of the security surface. Con: we own every edge case.
- **`github.com/ory/fosite`**: fully compliant OAuth 2.1 / OIDC building blocks. Pro: fewer footguns, wide adoption. Con: heavyweight API, larger binary, bigger attack surface from unused features.

**Leaning toward hand-rolled** because the flow is small and fosite adds complexity we don't need. Decision to be reconfirmed at start of phase 2.

## 5. Component: session multiplexer

### 5.1 Session state

```go
type Session struct {
	ID          string         // Mcp-Session-Id header value
	ForgejoUser string         // for logging / revocation
	Proc        *exec.Cmd      // the spawned forgejo-mcp child
	Stdin       io.WriteCloser // broker writes JSON-RPC here
	Stdout      io.ReadCloser  // broker reads JSON-RPC from here
	Stderr      io.ReadCloser  // drained to logs, prefixed with sid
	LastActive  atomic.Int64   // timestamp of the last request, read by the reaper
	Done        chan struct{}  // closed when the child exits

	forgejoTokenID string // ref to access_tokens row, for refresh
}
```

### 5.2 Spawn

On the first `initialize` request for a new session, after bearer-token validation:

```go
cmd := exec.CommandContext(ctx, brokerCfg.ForgejoMCPBinary,
	"--transport", "stdio",
	"--url", brokerCfg.ForgejoURL,
)
cmd.Env = append(os.Environ(),
	"FORGEJO_ACCESS_TOKEN="+session.ForgejoAccessToken,
	"FORGEJO_USER_AGENT=fjmcp-broker/"+version,
)
// The *Pipe methods return the broker's end of each pipe. Assigning the
// result back to cmd.Stdin/Stdout/Stderr would not compile.
stdin, err := cmd.StdinPipe()
if err != nil {
	return nil, err
}
stdout, err := cmd.StdoutPipe()
if err != nil {
	return nil, err
}
stderr, err := cmd.StderrPipe()
if err != nil {
	return nil, err
}
if err := cmd.Start(); err != nil {
	return nil, err
}
// stdin and stdout are kept on the Session (section 5.1) for the bridge.
go drainStderr(sid, stderr)
go waitReap(cmd) // must call Wait() to avoid zombies
```

`forgejo-mcp` runs its own `VerifyConnection()` at startup, which costs one round trip to Forgejo. Expect ~100–300 ms before the subprocess is ready to accept input. The first `initialize` response is the natural place for this latency to hide.

### 5.3 Bridge

MCP is JSON-RPC 2.0 over both transports, and message shapes are identical. The broker can forward message bodies without interpreting them; it only needs the envelope fields used for routing (`method` to detect `initialize`, `id` to match responses to requests).

Request path (Claude.ai → forgejo-mcp):

1. `POST /mcp` with `Authorization: Bearer <broker_token>` and (after first message) `Mcp-Session-Id: <sid>`.
2. Middleware: resolve the bearer token to its token-store record. 401 if missing, expired, or revoked.
3. Look up the session by `Mcp-Session-Id`. If none and method is `initialize`, create one. Otherwise 404.
4. Write the request body as one `\n`-terminated line to `stdin`. The stdio transport is newline-delimited, so the body must be compacted first if it contains embedded newlines.
5. Read one or more response lines from `stdout`, stream them to the HTTP response (SSE framing).
6. Bump `LastActive`.

If Caddy is in front, set `flush_interval -1` on its reverse-proxy directive so that SSE events are flushed as they are written; buffered responses break SSE.

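Steps 4 and 5 of the request path can be sketched as two small framing helpers. This is a hedged illustration only: the names `toStdioLine` and `toSSEFrame` are hypothetical, and the `event: message` label is an assumption about how the broker would frame responses.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// toStdioLine compacts a JSON-RPC request body into a single
// newline-terminated line suitable for the child's stdin. Valid JSON has no
// raw newlines inside strings, so the compacted form is always one line.
func toStdioLine(body []byte) ([]byte, error) {
	var buf bytes.Buffer
	if err := json.Compact(&buf, body); err != nil {
		return nil, fmt.Errorf("invalid JSON body: %w", err)
	}
	buf.WriteByte('\n')
	return buf.Bytes(), nil
}

// toSSEFrame wraps one line read from the child's stdout as a server-sent
// event. The blank line at the end terminates the event.
func toSSEFrame(line []byte) []byte {
	line = bytes.TrimRight(line, "\r\n")
	return []byte("event: message\ndata: " + string(line) + "\n\n")
}

func main() {
	req := []byte("{\n  \"jsonrpc\": \"2.0\",\n  \"id\": 1,\n  \"method\": \"tools/list\"\n}")
	line, err := toStdioLine(req)
	if err != nil {
		panic(err)
	}
	fmt.Printf("to child stdin: %q\n", line)

	resp := []byte("{\"jsonrpc\":\"2.0\",\"id\":1,\"result\":{}}\n")
	fmt.Printf("to HTTP client: %q\n", toSSEFrame(resp))
}
```

In a real handler each frame would be written to the `http.ResponseWriter` and followed by an explicit flush, so that the event reaches the client before the next line arrives.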
### 5.4 Lifecycle

| Event | Action |
|---|---|
| SSE stream closed by client | Start idle countdown. Don't kill immediately, because Claude.ai reconnects frequently. |
| Idle timeout exceeded (default 15 min) | `SIGTERM`; after 5 s grace, `SIGKILL`. Remove from registry. |
| Child exits (EOF on stdout) | Mark session dead. Tombstone the `sid` so late requests return 410 Gone. |
| Broker shutdown | Iterate sessions, `SIGTERM` all children, wait grace period, then `SIGKILL`. |
| Token revoked | Find sessions using that broker token, kill their children, remove sessions. |
| Forgejo token expired | See section 6. |

A reaper goroutine runs every 30 s and applies the idle-timeout rule.

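The idle-timeout rule applied by the reaper might look roughly like this. It is a sketch under assumptions: the trimmed-down `session` type, the function name `idleSessions`, and storing `LastActive` as Unix seconds are choices made for the example, not something the design fixes.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// session is a reduced stand-in for the Session struct in section 5.1.
type session struct {
	ID         string
	LastActive atomic.Int64 // Unix seconds of the last request (assumed unit)
}

// idleSessions returns the IDs of sessions whose last activity is more than
// timeoutSeconds before nowUnix. The caller would then stop those children.
func idleSessions(sessions []*session, nowUnix, timeoutSeconds int64) []string {
	var idle []string
	for _, s := range sessions {
		if nowUnix-s.LastActive.Load() > timeoutSeconds {
			idle = append(idle, s.ID)
		}
	}
	return idle
}

func main() {
	now := time.Now().Unix()
	timeout := int64((15 * time.Minute).Seconds())

	stale := &session{ID: "sid-stale"}
	stale.LastActive.Store(now - 20*60) // idle for 20 minutes
	fresh := &session{ID: "sid-fresh"}
	fresh.LastActive.Store(now - 30) // idle for 30 seconds

	fmt.Println("to reap:", idleSessions([]*session{stale, fresh}, now, timeout))
}
```

Keeping the selection a pure function of the clock makes the rule easy to test; the 30 s ticker and the actual `SIGTERM` escalation stay outside it.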
### 5.5 Do not try to resume sessions across child restarts

MCP's `initialize` handshake is stateful (protocol version negotiation, capability exchange). If a child crashes, the session is dead; the MCP client must re-initialize. Any attempt to persist and replay protocol state in the broker is a rathole. Don't go there.

## 6. Forgejo access-token rotation

Forgejo access tokens expire. The broker has the refresh token and must keep things working without forcing the user to re-authenticate.

Strategy:

- Track `forgejo_token_expires_at` in the token store.
- Background goroutine runs every minute. For any active session whose Forgejo token expires in less than 2 minutes: call Forgejo's refresh endpoint, update the store.
- **The child already holds the old token in its env.** After refresh: `SIGTERM` the child, spawn a new one with the new token, let the MCP client `initialize` again on its next request.

This causes a user-visible blip (~200 ms reconnect) once per Forgejo token lifetime. Acceptable default. A future optimisation could use a side-channel (e.g., `SIGHUP` handled by forgejo-mcp to re-read a token file) to avoid the blip. That is explicitly out of scope for v1.

## 7. Deployment

### 7.1 Container

Single container, multi-stage build. Both binaries ship in the final image; the broker `exec`s `forgejo-mcp` as a sibling.

```dockerfile
FROM docker.io/library/golang:1.23 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -trimpath -ldflags='-s -w' -o /out/fjmcp-broker ./cmd/broker
# forgejo-mcp is vendored as a submodule or fetched during build.
# CGO is disabled here too so the binary runs on the static distroless base.
RUN CGO_ENABLED=0 go install -trimpath -ldflags='-s -w' codeberg.org/goern/forgejo-mcp/v2@<pinned>

FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /out/fjmcp-broker /usr/local/bin/
COPY --from=build /go/bin/forgejo-mcp /usr/local/bin/
USER nonroot:nonroot
EXPOSE 8080
ENTRYPOINT ["/usr/local/bin/fjmcp-broker"]
```

Container labels include the build timestamp and git revision (`org.opencontainers.image.created` and `org.opencontainers.image.revision`, as listed under phase 6 of the plan).

### 7.2 Caddy

```caddy
mcp.example.com {
    encode zstd gzip

    reverse_proxy forgejo-mcp-broker:8080 {
        header_up Host {host}
        header_up X-Forwarded-Proto https
        header_up X-Forwarded-For {remote_host}
        flush_interval -1   # REQUIRED for SSE
    }
}
```

The broker itself serves plain HTTP; Caddy handles TLS with Let's Encrypt.

### 7.3 Config surface (all optional unless noted)

| Flag | Env | Required | Purpose |
|---|---|---|---|
| `--public-url` | `FJMCP_BROKER_PUBLIC_URL` | yes | Public issuer URL, e.g. `https://mcp.example.com` |
| `--listen` | `FJMCP_BROKER_LISTEN` | no | Listen addr, default `:8080` |
| `--forgejo-url` | `FORGEJO_URL` | yes | Upstream Forgejo instance URL |
| `--forgejo-oauth-client-id` | `FORGEJO_OAUTH_CLIENT_ID` | yes | Forgejo OAuth app client ID |
| `--forgejo-oauth-client-secret` | `FORGEJO_OAUTH_CLIENT_SECRET` | yes | Forgejo OAuth app client secret |
| `--forgejo-oauth-scopes` | `FORGEJO_OAUTH_SCOPES` | no | Space-separated, default covers the full tool surface |
| `--forgejo-mcp-binary` | `FJMCP_BROKER_MCP_BINARY` | no | Path to `forgejo-mcp`, default `/usr/local/bin/forgejo-mcp` |
| `--store-path` | `FJMCP_BROKER_STORE` | no | SQLite file path, default `/data/broker.db` |
| `--max-sessions` | `FJMCP_BROKER_MAX_SESSIONS` | no | Hard cap, default `100` |
| `--session-idle-timeout` | `FJMCP_BROKER_IDLE_TIMEOUT` | no | Default `15m` |
| `--debug` | `FJMCP_BROKER_DEBUG` | no | Verbose logging |

## 8. Security

- **Public-URL authority.** Never derive issuer URLs from inbound headers; always take them from config. Publishing the wrong issuer allows an attacker to redirect flows to endpoints they control.
- **PKCE required.** Reject authorize requests without `code_challenge`. Only `S256` method supported.
- **Token storage.** Broker access/refresh tokens stored as SHA-256 hashes. Forgejo tokens stored cleartext (they must be usable for subprocess spawning); file permissions and optional encrypted volume mitigate at-rest risk.
- **Subprocess environment.** Each subprocess sees only its own user's `FORGEJO_ACCESS_TOKEN`. On the same UID, `/proc/<pid>/environ` is readable. That is acceptable given a single-tenant container, but worth noting. A `--token-fd` flag on `forgejo-mcp` would eliminate this; defer unless the threat model demands it.
- **Rate limits.** `/oauth/register` and `/oauth/token` should have request limits to blunt abuse. Start with Caddy-level rate limits; move into the broker if finer control is needed.
- **Audit log.** Structured log line per: client registration, authorize start, authorize callback success/failure, token issuance, token revocation, session spawn, session reap, child crash. Include `client_id`, `forgejo_username`, and session id. Do **not** log tokens.
- **Dependencies.** Keep the dependency tree small and pinned. Review before adding any new dep.

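The PKCE rule can be made concrete with a short check at the token endpoint. This is a sketch, not the broker's code: `verifyPKCE` is a hypothetical helper, and the 43 to 128 character bound on the verifier is taken from RFC 7636. The sample values in `main` are the worked example from Appendix B of that RFC.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
)

// verifyPKCE reports whether verifier matches the code_challenge stored with
// the authorization code. Only the S256 method is accepted.
func verifyPKCE(method, challenge, verifier string) bool {
	if method != "S256" {
		return false
	}
	// RFC 7636 section 4.1: a verifier is 43 to 128 characters long.
	if len(verifier) < 43 || len(verifier) > 128 {
		return false
	}
	sum := sha256.Sum256([]byte(verifier))
	computed := base64.RawURLEncoding.EncodeToString(sum[:])
	// Constant-time comparison avoids leaking how many characters matched.
	return subtle.ConstantTimeCompare([]byte(computed), []byte(challenge)) == 1
}

func main() {
	const (
		verifier  = "dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk"
		challenge = "E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM"
	)
	fmt.Println("S256, correct verifier:", verifyPKCE("S256", challenge, verifier))
	fmt.Println("plain method rejected: ", !verifyPKCE("plain", verifier, verifier))
}
```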
## 9. Scaling notes

Single-instance design:

- **Sessions** are process-local, with no state sharing between broker instances. You can run exactly one broker pod.
- **Token store** in SQLite on a local volume, which can't be shared safely across instances.

Acceptable for self-hosted / team use (tens of concurrent sessions). To scale horizontally you'd need either session-affinity routing (sticky sessions) or a move of the session registry and token store to a shared service (Postgres, Redis). Out of scope for v1.

## 10. Open questions

1. **Hand-rolled AS vs. fosite vs. zitadel/oidc.** Revisit at start of phase 2 with a prototype to ground the decision.
2. **Per-user scope narrowing.** Forgejo OAuth lets the user approve or deny the requested scopes. Do we expose scope choice in our own consent screen (requires interstitial UI), or inherit Forgejo's consent screen 1:1 (simpler, probably fine)? Lean toward inheriting.
3. **Per-session vs. per-user forgejo-mcp process.** Current design: one child per **session**. Could also be one per **user** (multiple sessions share a child). Per-session wins on isolation; per-user wins on footprint. Stick with per-session unless measurements show a problem.
4. **Forgejo token rotation UX.** Accept a 200 ms reconnect blip, or invest in a no-restart rotation path via `--token-fd` or a signal-based re-read? Defer unless users complain.
5. **Observability surface.** Plain structured JSON logs to stderr for v1. Prometheus metrics (`/metrics`) are a natural follow-up: session count, spawn/reap rates, OAuth endpoint latencies, Forgejo refresh success rate.
6. **License.** MIT and Apache-2.0 both fit. Pick before the first tagged release.

## 11. Relationship to `forgejo-mcp`

The broker treats `forgejo-mcp` as an **opaque PAT-consuming stdio MCP server**. No API dependency beyond the CLI flags `--transport stdio --url <url>` and the `FORGEJO_ACCESS_TOKEN` env var.

Two optional hardenings to `forgejo-mcp` itself, both deferrable:

- **`--token-fd N`**: read the token from an inherited file descriptor instead of env. Removes the `/proc/<pid>/environ` leak path.
- **Verified clean exit on stdin EOF**: should already work via mcp-go's `ServeStdio` internal behavior, but worth an explicit test.

Neither is required for v1 of the broker. Both can be contributed upstream as independent PRs later.

docs/plan.md (175 lines, normal file)
@@ -0,0 +1,175 @@
# Implementation plan

Seven phases, each independently reviewable. Don't skip phases: the later ones depend on foundations from earlier ones, and each phase has a natural integration test that keeps the next phase honest.

See [`design.md`](design.md) for the architecture this plan implements.

## Phase 1: Skeleton

**Goal.** An empty binary that starts, logs, serves a health endpoint, opens its SQLite store, and shuts down cleanly.

**Scope.**
- `cmd/broker/main.go` with flag + env parsing using `flag` + `os.Getenv` (keep deps small; no cobra/viper yet).
- Package layout: `internal/config`, `internal/log`, `internal/store`, `internal/httpserver`.
- Config validation at startup: required fields present, public URL parseable, SQLite path writable.
- SQLite open + schema migration (embed SQL with `embed.FS`, apply in a transaction).
- `GET /healthz` returns `200 OK` with build info.
- Structured JSON logging to stderr.
- `SIGTERM` / `SIGINT` trigger orderly shutdown.

**Out of scope.** OAuth, MCP, subprocesses.

**Acceptance.**
- `go test ./...` passes (config parsing, migration applies cleanly).
- Binary starts with required env set; fails fast with a clear error when required env missing.
- `curl localhost:8080/healthz` returns `{"status":"ok","version":"...","git":"..."}`.
- Sending `SIGTERM` closes the listener and exits within 2 seconds.

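The flag-plus-env parsing in this phase could be sketched as below, using only the standard library. The `config` struct, the `parseConfig` name, and the injected `getenv` function are assumptions for illustration; only two of the options from the design's config table are shown. Flags take precedence over environment variables because the environment value is used as the flag's default.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
	"net/url"
)

// config holds a subset of the broker's settings (illustrative only).
type config struct {
	PublicURL string
	Listen    string
}

// parseConfig reads flags from args, falling back to getenv for defaults.
// getenv is injected so tests do not depend on the process environment.
func parseConfig(args []string, getenv func(string) string) (config, error) {
	envOr := func(key, fallback string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return fallback
	}
	var cfg config
	fs := flag.NewFlagSet("fjmcp-broker", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // the caller decides how to report errors
	fs.StringVar(&cfg.PublicURL, "public-url", envOr("FJMCP_BROKER_PUBLIC_URL", ""), "public issuer URL (required)")
	fs.StringVar(&cfg.Listen, "listen", envOr("FJMCP_BROKER_LISTEN", ":8080"), "listen address")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	if cfg.PublicURL == "" {
		return config{}, errors.New("--public-url or FJMCP_BROKER_PUBLIC_URL is required")
	}
	u, err := url.Parse(cfg.PublicURL)
	if err != nil || u.Scheme == "" || u.Host == "" {
		return config{}, fmt.Errorf("public URL %q is not an absolute URL", cfg.PublicURL)
	}
	return cfg, nil
}

func main() {
	// Demo with a fake environment; a real main would pass os.Args[1:] and os.Getenv.
	env := map[string]string{"FJMCP_BROKER_PUBLIC_URL": "https://mcp.example.com"}
	cfg, err := parseConfig([]string{"--listen", ":9090"}, func(k string) string { return env[k] })
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", cfg)
}
```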
## Phase 2: OAuth authorization-server facade

**Goal.** A fully functional OAuth 2.1 AS that delegates user auth to Forgejo. Testable end-to-end with curl and a real Forgejo instance, without any MCP code in the picture.

**Scope.**
- Discovery endpoints (`/.well-known/oauth-protected-resource`, `/.well-known/oauth-authorization-server`).
- DCR (`POST /oauth/register`).
- Authorize flow (`GET /oauth/authorize` → Forgejo → `/oauth/callback`).
- Token endpoint (`POST /oauth/token`): authorization code grant and refresh token grant.
- Revoke endpoint (`POST /oauth/revoke`).
- PKCE enforcement (S256 only; reject flows without it).
- Token store: `clients`, `auth_codes`, `access_tokens`, `refresh_tokens` tables.
- Forgejo upstream client: authorize URL builder, token exchange, refresh, userinfo.
- **Decision point**: hand-rolled vs. fosite vs. zitadel/oidc. Prototype the hand-rolled path first; swap if it balloons past ~1000 lines.

**Out of scope.** MCP endpoint, subprocess management.

**Acceptance.**
- Walk through the full flow with curl against a real Forgejo test instance:
  1. `POST /oauth/register` → get `client_id`.
  2. Browser hits `/oauth/authorize` with PKCE → bounces to Forgejo → consent → back to `/oauth/callback` → redirects to `redirect_uri` with code.
  3. `POST /oauth/token` with the code + verifier → receive broker access+refresh tokens.
  4. `POST /oauth/token` with `grant_type=refresh_token` → new access token.
  5. `POST /oauth/revoke` → subsequent uses of the token fail.
- Discovery documents validate against RFC 8414 / 9728 schemas.
- PKCE missing → 400. Non-S256 → 400. Wrong verifier → 400.
- Expired codes and tokens rejected.
- Broker tokens stored as SHA-256 hashes; their cleartext is never persisted. (Forgejo tokens are stored in cleartext, per `design.md` §4.2.)
- Test coverage on the AS handlers ≥ 80%.

## Phase 3: Subprocess supervisor

**Goal.** A reusable component that spawns, babysits, and reaps `forgejo-mcp` child processes. Zero knowledge of OAuth or MCP yet; it's a generic "managed stdio subprocess" abstraction.

**Scope.**
- `internal/supervisor` package: `type Child` with `Start`, `Stop(ctx)`, `Stdin() io.Writer`, `StdoutReader() *bufio.Reader`.
- Correct `Wait()` in a goroutine on every start, so no zombies.
- Graceful stop: `SIGTERM` → wait up to N seconds → `SIGKILL`.
- Stderr drainer: reads stderr line-by-line and logs with a prefix supplied at spawn time.
- Process death detection: closes `Done` channel; exposes `ExitErr()`.
- Optional startup health probe: wait for first newline on stdout with timeout, which catches "child exited immediately" early.

**Out of scope.** The registry. Per-session state. MCP-specific framing.

**Acceptance.**
- Unit tests with a tiny `echo`-loop helper binary:
  - Spawn → write line → read line → stop gracefully.
  - Kill-after-grace when child ignores SIGTERM.
  - `Done` closes when child exits on its own.
- Manual test: spawn a real `forgejo-mcp --transport stdio` with a test token; confirm clean startup and shutdown.
- No goroutine leaks (check with `goleak`).
- No FD leaks across 1000 spawn/stop cycles.

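The graceful-stop escalation (`SIGTERM`, wait, `SIGKILL`) can be sketched as follows. The sketch assumes a Unix-like system with `sleep` and `sh` on the `PATH`; `stopGracefully` and `spawnAndStop` are hypothetical names, and the settle delay exists only so the demo child has time to install its signal handling.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"syscall"
	"time"
)

// stopGracefully sends SIGTERM, waits up to grace for the child to exit, and
// escalates to SIGKILL otherwise. done must be closed by the goroutine that
// called cmd.Wait(). killed reports whether SIGKILL was needed.
func stopGracefully(cmd *exec.Cmd, done <-chan struct{}, grace time.Duration) (killed bool, err error) {
	if err := cmd.Process.Signal(syscall.SIGTERM); err != nil && !errors.Is(err, os.ErrProcessDone) {
		return false, fmt.Errorf("send SIGTERM: %w", err)
	}
	timer := time.NewTimer(grace)
	defer timer.Stop()
	select {
	case <-done:
		return false, nil // exited within the grace period
	case <-timer.C:
	}
	if err := cmd.Process.Kill(); err != nil && !errors.Is(err, os.ErrProcessDone) {
		return false, fmt.Errorf("send SIGKILL: %w", err)
	}
	<-done // Wait() has returned, so the child has been reaped
	return true, nil
}

// spawnAndStop starts a child, lets it settle, then stops it.
func spawnAndStop(settleMillis, graceMillis int, name string, args ...string) (bool, error) {
	cmd := exec.Command(name, args...)
	if err := cmd.Start(); err != nil {
		return false, fmt.Errorf("start child: %w", err)
	}
	done := make(chan struct{})
	go func() {
		_ = cmd.Wait() // always reap, otherwise the child stays a zombie
		close(done)
	}()
	time.Sleep(time.Duration(settleMillis) * time.Millisecond)
	return stopGracefully(cmd, done, time.Duration(graceMillis)*time.Millisecond)
}

func main() {
	killed, err := spawnAndStop(0, 2000, "sleep", "30")
	fmt.Println("well-behaved child: killed =", killed, "err =", err)

	killed, err = spawnAndStop(500, 200, "sh", "-c", `trap "" TERM; exec sleep 30`)
	fmt.Println("TERM-ignoring child: killed =", killed, "err =", err)
}
```

Waiting in exactly one goroutine and signalling completion through a channel keeps `Wait()` from being called twice, which `os/exec` does not allow.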
## Phase 4: Stdio-to-SSE bridge

**Goal.** A handler that takes an HTTP request with JSON-RPC body, pipes it to a supervised child's stdin, and streams the child's stdout back as an SSE-framed HTTP response.

**Scope.**
- `internal/bridge` package.
- Per-child reader goroutine that reads full JSON-RPC messages (newline-delimited) and dispatches them to registered response writers keyed by request id.
- SSE writer: writes `event:` + `data:` frames, flushes after each, handles client disconnect.
- Send timeout and backpressure: if the HTTP client is slow, don't OOM the broker.

**Out of scope.** Session identity. OAuth. Registry.

**Acceptance.**
- Unit tests against a mock `Child` that echoes input:
  - Request → response round trip.
  - Multiple concurrent requests on one child, correct id routing.
  - Client disconnect mid-stream cleanly stops forwarding.
- Integration test against a real `forgejo-mcp` child:
  - `initialize` handshake completes.
  - `tools/list` returns the known tool set.
  - `tools/call` against `get_forgejo_mcp_server_version` succeeds.

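Dispatching by request id only requires decoding the JSON-RPC envelope, not the payload. A minimal sketch follows, with `envelope` and `routeKey` as hypothetical names; using the raw JSON text of `id` as the map key is an assumption that keeps numeric and string ids distinct.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope captures only the JSON-RPC fields the bridge needs for routing.
// Params and results stay undecoded.
type envelope struct {
	ID     json.RawMessage `json:"id"`
	Method string          `json:"method"`
}

// routeKey inspects one line from the child's stdout. For a response it
// returns the raw JSON id (for example `7` or `"abc"`) and isResponse=true.
// Messages that carry a method (requests or notifications from the child)
// or no id are reported as non-responses.
func routeKey(line []byte) (key string, isResponse bool, err error) {
	var env envelope
	if err := json.Unmarshal(line, &env); err != nil {
		return "", false, fmt.Errorf("decode JSON-RPC envelope: %w", err)
	}
	if env.Method != "" || len(env.ID) == 0 || string(env.ID) == "null" {
		return "", false, nil
	}
	return string(env.ID), true, nil
}

func main() {
	// pending maps request ids to the HTTP response waiting for them.
	pending := map[string]string{"7": "HTTP request A", `"abc"`: "HTTP request B"}

	lines := []string{
		`{"jsonrpc":"2.0","id":"abc","result":{"tools":[]}}`,
		`{"jsonrpc":"2.0","method":"notifications/progress","params":{}}`,
		`{"jsonrpc":"2.0","id":7,"result":{}}`,
	}
	for _, l := range lines {
		key, ok, err := routeKey([]byte(l))
		switch {
		case err != nil:
			fmt.Println("malformed line:", err)
		case !ok:
			fmt.Println("not a response, handled separately")
		default:
			fmt.Printf("id %s -> %s\n", key, pending[key])
		}
	}
}
```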
## Phase 5: Glue (gated `/mcp` endpoint)

**Goal.** Everything wired. An authenticated Claude.ai-style client can connect, initialize a session, and call tools.

**Scope.**
- Session registry keyed by `Mcp-Session-Id`.
- Bearer-token middleware on `/mcp`: resolves to Forgejo access token via the store; rejects missing/expired.
- On `initialize` with no session: generate `sid`, spawn `forgejo-mcp` via supervisor with the user's Forgejo token, attach via bridge.
- On subsequent requests: look up session, dispatch via bridge.
- Reaper goroutine: idle timeout enforcement.
- Forgejo token rotation (Forgejo refresh + child respawn) per `design.md` §6.
- Token-revocation signal: kill any sessions backed by the revoked broker token.

**Out of scope.** Pretty logs, metrics, packaging.

**Acceptance.**
- End-to-end with curl, simulating a full MCP client:
  1. OAuth dance → broker access token.
  2. `POST /mcp` with `initialize` → session created, spawn visible in logs.
  3. `POST /mcp` with `tools/list` using `Mcp-Session-Id` → response from forgejo-mcp.
  4. Idle → child reaped after timeout.
  5. Revoke token → sessions torn down.
- Load test: 20 concurrent sessions stable for 10 minutes. No FD leaks, no zombies, no goroutine leaks.

## Phase 6: Packaging and deployment artifacts

**Goal.** One-command deploy.

**Scope.**
- `Containerfile` with multi-stage build, nonroot user, OCI labels (`org.opencontainers.image.created`, `.revision`), `/etc/build-info`.
- `Makefile` targets: `build`, `test`, `lint`, `image`, `image-push`.
- Example `Caddyfile` fragment in `deploy/caddy/`.
- Example `compose.yaml` in `deploy/compose/` that stands up broker + Caddy together.
- Example systemd unit (optional) for non-container deploys.
- `README.md` updated with concrete quick-start: "clone, set five env vars, `docker compose up`".

**Out of scope.** Helm chart, nixpkg, AUR (can follow later if there's demand).

**Acceptance.**
- `make image` produces an image under 50 MB.
- `docker compose up` → broker healthy, Caddy serving valid TLS on a test hostname.
- A fresh developer can go from clone to working Claude.ai connection in under 15 minutes following the README.

## Phase 7: Claude.ai end-to-end

**Goal.** Prove the whole thing works against the actual target client.

**Scope.**
- Deploy a reachable instance (staging Forgejo + public DNS + TLS).
- Configure as a Claude.ai custom connector.
- Walk through: tool discovery, tool invocation, session timeout, reconnect, token refresh.
- Write up findings: what worked, what surprised us, what needs tweaking.

**Out of scope.** Publicising the project, marketing, submitting to MCP directories.

**Acceptance.**
- Claude.ai can complete OAuth and list tools.
- All `forgejo-mcp` tools invocable from Claude.ai with expected results.
- A 30-minute idle session reconnects without manual intervention.
- A Forgejo token refresh occurs during an active session without breaking anything the user can see.
- Postmortem document captured in `docs/phase7-findings.md`.

---

## Cross-cutting conventions

- **Go version**: track the latest stable minor (update `go.mod` as needed).
- **Dependencies**: `stdlib + modernc.org/sqlite + golang.org/x/oauth2` baseline; every new dep needs a line in a `docs/deps.md` justifying it.
- **Linting**: `golangci-lint run` clean before merge.
- **Testing**: `go test -race ./...` clean before merge; prefer table-driven tests.
- **Logging**: structured JSON via `log/slog`. Never log tokens, even hashed.
- **Commits**: conventional commits (`feat:`, `fix:`, `chore:`…), atomic, referencing issue IDs once an issue tracker is in place.
- **Issue tracking**: set up `bd` (beads) inside this repo at the start of phase 1, so every phase's work lands as discrete issues.