Structured review of every OAuth/auth handler against the standard
attack catalog. Findings table added to design.md §8.2.
Two real issues found and fixed:
- Refresh-token replay race: tokenRefreshGrant read the row, validated
it, then minted a new pair before unconditionally revoking the old
refresh. Two concurrent /token requests with the same refresh would
both pass validation and both mint a fresh pair — token-quota
duplication and a hint to a stolen-refresh attacker. Fixed with the
same atomic UPDATE rows-affected pattern already used for auth-code
single-use. New TestToken_Refresh_ConcurrentReplay_OnlyOneSucceeds
races two goroutines and verifies exactly one wins.
- Permissive redirect_uri schemes: validateRedirectURI accepted any
non-empty scheme, including javascript: and data:. Tightened to
require https, http for loopback only, or a reverse-DNS private-use
scheme per RFC 8252 §7.1. TestValidateRedirectURI updated to cover
each variant including the rejected javascript:/data: cases.
Items deferred to backlog (already filed):
- Rate limits on /oauth/register and /oauth/token (-ttl)
- --token-fd to close the /proc/<pid>/environ window (-1n2)
- AES-GCM at-rest encryption of Forgejo tokens (-sd4)
Closes forgejo-mcp-broker-wgo. Phase 2 complete.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
22 KiB
Design: forgejo-mcp-broker
1. Problem
Claude.ai (and other MCP clients following the MCP authorization spec) expect to connect to an MCP server over streamable HTTP with an OAuth 2.1 authorization flow, including:
- RFC 9728 protected-resource metadata (
/.well-known/oauth-protected-resource) - RFC 8414 authorization-server metadata (
/.well-known/oauth-authorization-server) - RFC 7591 dynamic client registration (
POST /register) - PKCE + authorization code flow
forgejo-mcp speaks streamable HTTP but authenticates with a single shared Forgejo personal access token baked into the process at startup. It has no notion of per-user identity, and cannot serve multiple users at once.
Forgejo is a capable OAuth2 provider (endpoints under /login/oauth/*, OIDC discovery at /.well-known/openid-configuration) — but it does not support RFC 7591 dynamic client registration. Claude.ai cannot register itself as a client against a Forgejo instance directly.
We need something in the middle.
2. Non-goals
- Not a Forgejo OAuth proxy for arbitrary API use. Only the MCP protocol surface is exposed.
- Not multi-Forgejo. One broker instance speaks to one Forgejo URL.
- Not a drop-in replacement for
forgejo-mcp. It wraps and supervisesforgejo-mcp, it does not replace it. - Not a horizontally-scaled service for public SaaS use. Target is self-hosted / team-scale deployments (tens of concurrent sessions). Scaling further requires design changes (see section 9).
3. Architecture
The broker plays two roles simultaneously:
- OAuth Authorization Server to the MCP client (Claude.ai).
- OAuth Client of Forgejo.
Tokens issued to the MCP client are opaque strings minted by the broker. Each one maps internally to a real Forgejo access+refresh token, which the broker holds in its store and passes to subprocesses via environment.
┌──────── container / pod ─────────┐
│ │
│ ┌──────────────────────────┐ │
│ │ fjmcp-broker :8080 │ │
Claude.ai ──HTTPS─▶ Caddy─▶ │ │
│ │ • Discovery endpoints │ │
│ │ • /register (DCR) │ │
│ │ • /authorize ─┐ │ │───▶ Forgejo /login/oauth/authorize
│ │ • /callback ◀─┘ │ │◀── code
│ │ • /token │ │───▶ Forgejo /login/oauth/access_token
│ │ • /revoke │ │
│ │ • /mcp (gated) │ │
│ │ • Session registry │ │
│ │ • Supervisor + reaper │ │
│ └──────────┬───────────────┘ │
│ │ spawn + pipes │
│ ▼ │
│ ┌──────────────────────────┐ │
│ │ forgejo-mcp (stdio) │ ──▶│──▶ Forgejo API
│ │ FORGEJO_ACCESS_TOKEN=… │ │
│ │ one per active session │ │
│ └──────────────────────────┘ │
└───────────────────────────────────┘
4. Component: OAuth authorization-server facade
4.1 Endpoints
| Method | Path | Purpose |
|---|---|---|
GET |
/.well-known/oauth-protected-resource |
Advertise: I am a resource server, my AS is at this issuer. |
GET |
/.well-known/oauth-authorization-server |
Advertise endpoints, PKCE required, supported scopes. |
POST |
/oauth/register |
RFC 7591 dynamic client registration. Accept any well-formed request; persist and return client_id. |
GET |
/oauth/authorize |
Validate PKCE + redirect_uri + client_id. Stash state. 302 to Forgejo's /login/oauth/authorize. |
GET |
/oauth/callback |
Receive Forgejo's auth code. Exchange for Forgejo access+refresh tokens. Mint broker auth code. Redirect back to MCP client's redirect_uri. |
POST |
/oauth/token |
Exchange broker auth code → broker access+refresh token. Persist mapping broker_token → forgejo_token. |
POST |
/oauth/revoke |
Invalidate a broker token; revoke upstream Forgejo token if possible. |
4.2 Token store (SQLite)
One file, mounted as a volume for persistence across container restarts. Pure-Go driver: modernc.org/sqlite — no CGO, keeps the container image fully static.
Tables:
clients—client_id,client_secret(nullable for public clients),redirect_uris[],created_at,last_used, optionalmetadata_json.auth_codes—code,client_id,redirect_uri,code_challenge,code_challenge_method,forgejo_access_token,forgejo_refresh_token,forgejo_token_expires_at,forgejo_user_id,forgejo_username,scopes,expires_at(~10 min),used_at.access_tokens—token_hash,client_id,forgejo_user_id,forgejo_username,scopes,expires_at,forgejo_access_token,forgejo_refresh_token,forgejo_token_expires_at,revoked_at.refresh_tokens—token_hash,access_token_hash,client_id,expires_at,revoked_at.
Broker tokens are stored hashed (SHA-256) — the plaintext leaves the broker exactly once, when handed to the MCP client.
Forgejo tokens are stored in cleartext (the broker must be able to use them to spawn subprocesses). This means the SQLite file is a sensitive secret at rest. Mitigations:
- Volume permissions locked to the broker's UID/GID.
- Consider OS-level encryption of the mount (LUKS, cloud KMS-backed volume) for production deployments.
- Optional: encrypt Forgejo tokens at the application layer with a key loaded from env — adds complexity, decide in phase 2.
4.3 Forgejo OAuth app configuration (one-time, operator task)
- Sign in to Forgejo as the operator / service account that should "own" this integration.
- Settings → Applications → OAuth2 Applications → Create application.
- Redirect URI:
https://<public-hostname>/oauth/callback(the broker's public URL). - Save
client_idandclient_secretinto broker env:FORGEJO_OAUTH_CLIENT_IDFORGEJO_OAUTH_CLIENT_SECRET
- Pick the scope set. Forgejo scopes are coarse. A superset that matches
forgejo-mcp's current tool surface:read:user write:repository write:issue write:notification read:organization. Configurable viaFORGEJO_OAUTH_SCOPES.
4.4 Public base URL
The broker must know its own public URL to emit correct redirect URIs and discovery metadata. Required config:
--public-url/FJMCP_BROKER_PUBLIC_URL, e.g.https://mcp.example.com.
All issuer URLs in discovery documents are built from this value — never from the inbound Host or X-Forwarded-* headers. Publishing attacker-controlled issuer URLs is a classic OAuth vulnerability.
4.5 Library choice
Two candidates for the AS implementation:
- Hand-rolled minimal AS: the flow is narrow (authorization code + PKCE + DCR + refresh + revoke). Probably 500–800 lines plus tests. Pro: no heavy dependency, full control of the security surface. Con: we own every edge case.
github.com/ory/fosite: fully compliant OAuth 2.1 / OIDC building blocks. Pro: fewer footguns, wide adoption. Con: heavyweight API, larger binary, bigger attack surface from unused features.
Leaning toward hand-rolled because the flow is small and fosite adds complexity we don't need. Decision to be reconfirmed at start of phase 2.
5. Component: session multiplexer
5.1 Session state
type Session struct {
ID string // Mcp-Session-Id header value
ForgejoUser string // for logging / revocation
Proc *exec.Cmd // the spawned forgejo-mcp child
Stdin io.WriteCloser // broker writes JSON-RPC here
Stdout io.ReadCloser // broker reads JSON-RPC from here
Stderr io.ReadCloser // drained to logs, prefixed with sid
LastActive atomic.Int64
Done chan struct{}
forgejoTokenID string // ref to access_tokens row, for refresh
}
5.2 Spawn
On the first initialize request for a new session, after bearer-token validation:
cmd := exec.CommandContext(ctx, brokerCfg.ForgejoMCPBinary,
"--transport", "stdio",
"--url", brokerCfg.ForgejoURL,
)
cmd.Env = append(os.Environ(),
"FORGEJO_ACCESS_TOKEN="+session.ForgejoAccessToken,
"FORGEJO_USER_AGENT=fjmcp-broker/"+version,
)
cmd.Stdin, _ = cmd.StdinPipe()
cmd.Stdout, _ = cmd.StdoutPipe()
cmd.Stderr, _ = cmd.StderrPipe()
cmd.Start()
go drainStderr(sid, stderrPipe)
go waitReap(cmd) // must call Wait() to avoid zombies
forgejo-mcp runs its own VerifyConnection() at startup — one round trip to Forgejo. Expect ~100–300 ms before the subprocess is ready to accept input. The first initialize response is the natural place for this latency to hide.
5.3 Bridge
MCP is JSON-RPC 2.0 over both transports. Message shapes are identical. The broker can pipe messages opaquely without parsing them.
Request path (claude.ai → forgejo-mcp):
POST /mcpwithAuthorization: Bearer <broker_token>and (after first message)Mcp-Session-Id: <sid>.- Middleware: resolve token → session. 401 if missing/expired.
- Look up session. If none and method is
initialize, create one. Otherwise 404. - Write the request body as one
\n-terminated line tostdin. - Read one or more response lines from
stdout, stream them to the HTTP response (SSE framing). - Bump
LastActive.
If Caddy is in front, flush_interval -1 on its reverse-proxy directive is mandatory — default response buffering breaks SSE.
5.4 Lifecycle
| Event | Action |
|---|---|
| SSE stream closed by client | Start idle countdown. Don't kill immediately — Claude.ai reconnects frequently. |
| Idle timeout exceeded (default 15 min) | SIGTERM; after 5 s grace, SIGKILL. Remove from registry. |
| Child exits (EOF on stdout) | Mark session dead. Tombstone the sid so late requests return 410 Gone. |
| Broker shutdown | Iterate sessions, SIGTERM all children, wait grace period, then SIGKILL. |
| Token revoked | Find sessions using that broker token, kill their children, remove sessions. |
| Forgejo token expired | See section 6. |
A reaper goroutine runs every 30 s and applies the idle-timeout rule.
5.5 Do not try to resume sessions across child restarts
MCP's initialize handshake is stateful (protocol version negotiation, capability exchange). If a child crashes, the session is dead; the MCP client must re-initialize. Any attempt to persist and replay protocol state in the broker is a rathole. Don't go there.
6. Forgejo access-token rotation
Forgejo access tokens expire. The broker has the refresh token and must keep things working without forcing the user to re-authenticate.
Strategy:
- Track
forgejo_token_expires_atin the token store. - Background goroutine runs every minute. For any active session whose Forgejo token expires in less than 2 minutes: call Forgejo's refresh endpoint, update the store.
- The child already holds the old token in its env. After refresh:
SIGTERMthe child, spawn a new one with the new token, let the MCP clientinitializeagain on its next request.
This causes a user-visible blip (~200 ms reconnect) once per Forgejo token lifetime. Acceptable default. A future optimisation could use a side-channel (e.g., SIGHUP handled by forgejo-mcp to re-read a token file) to avoid the blip — explicitly out of scope for v1.
7. Deployment
7.1 Container
Single container, multi-stage build. Both binaries ship in the final image; the broker execs forgejo-mcp as a sibling.
FROM docker.io/library/golang:1.23 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -trimpath -ldflags='-s -w' -o /out/fjmcp-broker ./cmd/broker
# forgejo-mcp is vendored as a submodule or fetched during build:
RUN go install -trimpath -ldflags='-s -w' codeberg.org/goern/forgejo-mcp/v2@<pinned>
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /out/fjmcp-broker /usr/local/bin/
COPY --from=build /go/bin/forgejo-mcp /usr/local/bin/
USER nonroot:nonroot
EXPOSE 8080
ENTRYPOINT ["/usr/local/bin/fjmcp-broker"]
Container labels include build timestamp and git revision per the user's global standards.
7.2 Caddy
mcp.example.com {
encode zstd gzip
reverse_proxy forgejo-mcp-broker:8080 {
header_up Host {host}
header_up X-Forwarded-Proto https
header_up X-Forwarded-For {remote_host}
flush_interval -1 # REQUIRED for SSE
}
}
The broker itself terminates plain HTTP; Caddy handles TLS with Let's Encrypt.
7.3 Config surface (all optional unless noted)
| Flag | Env | Required | Purpose |
|---|---|---|---|
--public-url |
FJMCP_BROKER_PUBLIC_URL |
yes | Public issuer URL, e.g. https://mcp.example.com |
--listen |
FJMCP_BROKER_LISTEN |
Listen addr, default :8080 |
|
--forgejo-url |
FORGEJO_URL |
yes | Upstream Forgejo instance URL |
--forgejo-oauth-client-id |
FORGEJO_OAUTH_CLIENT_ID |
yes | Forgejo OAuth app credentials |
--forgejo-oauth-client-secret |
FORGEJO_OAUTH_CLIENT_SECRET |
yes | |
--forgejo-oauth-scopes |
FORGEJO_OAUTH_SCOPES |
Space-separated, default covers the full tool surface | |
--forgejo-mcp-binary |
FJMCP_BROKER_MCP_BINARY |
Path to forgejo-mcp, default /usr/local/bin/forgejo-mcp |
|
--store-path |
FJMCP_BROKER_STORE |
SQLite file path, default /data/broker.db |
|
--max-sessions |
FJMCP_BROKER_MAX_SESSIONS |
Hard cap, default 100 |
|
--session-idle-timeout |
FJMCP_BROKER_IDLE_TIMEOUT |
Default 15m |
|
--debug |
FJMCP_BROKER_DEBUG |
Verbose logging |
8. Security
8.1 Posture
- Public-URL authority. Never derive issuer URLs from inbound headers — always from config. Publishing the wrong issuer allows an attacker to redirect flows to endpoints they control.
- PKCE required. Reject authorize requests without
code_challenge. OnlyS256method supported. - Token storage. Broker access/refresh tokens stored as SHA-256 hashes. Forgejo tokens stored cleartext (they must be usable for subprocess spawning); file permissions and optional encrypted volume mitigate at-rest risk.
- Subprocess environment. Each subprocess sees only its own user's
FORGEJO_ACCESS_TOKEN. On the same UID,/proc/<pid>/environis readable — acceptable given single-tenant container, but worth noting. A--token-fdflag onforgejo-mcpwould eliminate this; defer unless threat model demands it. - Rate limits.
/oauth/registerand/oauth/tokenshould have request limits to blunt abuse. Start with Caddy-level rate limits; move into the broker if finer control is needed. - Audit log. Structured log line per: client registration, authorize start, authorize callback success/failure, token issuance, token revocation, session spawn, session reap, child crash. Include
client_id,forgejo_username, and session id. Do not log tokens. - Dependencies. Keep the dependency tree small and pinned. Review before adding any new dep.
8.2 Phase-2 attack-path review
Performed at the end of phase 2 (issue -wgo). Each row is an attack class checked against the implementation; "covered by" points at the test that exercises the defence (or notes that it's untested).
| # | Attack class | Defence | Covered by |
|---|---|---|---|
| 1 | Open redirect via crafted redirect_uri |
Exact-match against the URIs registered with the client; bare equality, no prefix or pattern matching. | TestAuthorize_MismatchedRedirectURI, TestToken_AuthCode_WrongRedirectURI |
| 2 | Auth-code replay (use the same code twice) | Atomic UPDATE auth_codes SET used_at=? WHERE used_at IS NULL with rows-affected check; second use sees 0 affected rows and returns invalid_grant. |
TestToken_AuthCode_Reuse |
| 3 | Auth-code substitution / mix-up across flows | PKCE S256: a code without the matching verifier can't be exchanged. code_verifier is required; verifyPKCE uses subtle.ConstantTimeCompare. |
TestToken_AuthCode_PKCEFails |
| 4 | PKCE downgrade (request plain or omit) |
code_challenge_method must equal "S256"; missing challenge or non-S256 method redirects to client with error=invalid_request. |
TestAuthorize_MissingPKCE_RedirectsWithError, TestAuthorize_NonS256_RedirectsWithError |
| 5 | Issuer-URL spoofing via Host header |
Discovery documents and redirect URIs derived from cfg.Issuer, never from the request. |
TestDiscovery_IssuerIgnoresHostHeader |
| 6 | Refresh-token replay (concurrent or sequential reuse) | Atomic UPDATE refresh_tokens SET revoked_at=? WHERE revoked_at IS NULL with rows-affected check before minting a new pair — same single-shot pattern as the auth-code grant. Fix landed in this review; previous code read+validated, then unconditionally revoked, racing two simultaneous refreshes. |
TestToken_Refresh_ConcurrentReplay_OnlyOneSucceeds, TestToken_Refresh_RevokedToken |
| 7 | Refresh expiry / revocation bypass | Bearer middleware checks expires_at and revoked_at on every request. Token rotation revokes old refresh on issue. |
TestRequireBearer_Expired/Revoked, TestToken_Refresh_RevokedToken, TestToken_Refresh_Expired |
| 8 | Token leakage in audit logs | Handler logs use slog.String("err", err.Error()) and structured fields (client_id, username, sid) — never the raw access/refresh token. Stored tokens are hashed. |
Manual review; potential future test using a captured slog handler. |
| 9 | Session fixation via attacker-supplied sid | Sids are minted server-side via crypto/rand; the registry never accepts a client-supplied sid. |
TestServe_NewSession_MintsSidAndDispatches |
| 10 | Sid hijack (legitimate sid reused under a different bearer) | Sids are bound to the bearer that minted them (BrokerTokenHash); cross-token access returns 403, not 410, so probing is distinguishable from "unknown sid". |
TestServe_TokenMismatch_403 |
| 11 | redirect_uri accepting code-execution schemes (javascript:, data:) |
validateRedirectURI rejects these explicitly. Non-https schemes must be either loopback http:// or a reverse-DNS private-use scheme per RFC 8252 §7.1. Tightened in this review. |
TestValidateRedirectURI |
| 12 | DCR-driven storage flooding | None at the broker level yet — relies on Caddy rate-limit. Tracked as backlog item -ttl. |
Not covered. |
| 13 | Missing PKCE verifier on /token |
code_verifier required; missing → invalid_request. |
TestToken_AuthCode_MissingFields |
| 14 | Authorization without state |
The broker preserves and echoes the client's state on every redirect (success + error paths) so the client can run its own CSRF check. |
TestAuthorize_RedirectsToForgejo, TestAuthorize_MissingPKCE_RedirectsWithError |
| 15 | Bearer-header parsing tricks (no scheme, empty token) | Middleware rejects missing header, wrong scheme, and empty token after Bearer . |
TestRequireBearer_NoHeader/WrongScheme/EmptyToken_401 |
Items deferred to follow-up issues: rate limits (-ttl), --token-fd to avoid /proc/<pid>/environ exposure (-1n2), AES-GCM at-rest encryption of Forgejo tokens (-sd4), structured-log assertion for the no-token-leakage check (no issue yet — file when wiring real audit logging).
9. Scaling notes
Single-instance design:
- Sessions are process-local — no state sharing between broker instances. You can run exactly one broker pod.
- Token store in SQLite on a local volume — can't be shared safely across instances.
Acceptable for self-hosted / team use (tens of concurrent sessions). To scale horizontally you'd need: session-affinity routing (sticky sessions), or move the session registry and token store to a shared service (Postgres, Redis). Out of scope for v1.
10. Open questions
- Hand-rolled AS vs. fosite vs. zitadel/oidc. Revisit at start of phase 2 with a prototype to ground the decision.
- Per-user scope narrowing. Forgejo OAuth lets the user approve or deny the requested scopes. Do we expose scope choice in our own consent screen (requires interstitial UI), or inherit Forgejo's consent screen 1:1 (simpler, probably fine)? Lean toward inheriting.
- Shared broker vs. per-user forgejo-mcp process. Current design: one child per session. Could also be one per user (multiple sessions share a child). Per-session wins on isolation; per-user wins on footprint. Stick with per-session unless measurements show a problem.
- Forgejo token rotation UX. Accept a 200 ms reconnect blip, or invest in a no-restart rotation path via
--token-fdor a signal-based re-read? Defer unless users complain. - Observability surface. Plain structured JSON logs to stderr for v1. Prometheus metrics (
/metrics) is a natural follow-up — session count, spawn/reap rates, OAuth endpoint latencies, Forgejo refresh success rate. - License. MIT and Apache-2.0 both fit. Pick before the first tagged release.
11. Relationship to forgejo-mcp
The broker treats forgejo-mcp as an opaque PAT-consuming stdio MCP server. No API dependency beyond the CLI flags --transport stdio --url <url> and the FORGEJO_ACCESS_TOKEN env var.
Two optional hardenings to forgejo-mcp itself, both deferrable:
--token-fd N: read the token from an inherited file descriptor instead of env. Removes the/proc/<pid>/environleak path.- Verified clean exit on stdin EOF: should already work via mcp-go's
ServeStdiointernal behavior, but worth an explicit test.
Neither is required for v1 of the broker. Both can be contributed upstream as independent PRs later.