Ole-Morten Duesund 5eeac663d8 feat(broker): wire OAuth + MCP session glue into main; e2e test (forgejo-mcp-broker-q6n)

cmd/broker/main.go now composes every phase-2-5 component into a live
binary:

  /healthz      → internal/httpserver
  /oauth/*      → internal/oauth.Server.Handler()
  /.well-known  → internal/oauth.Server.Handler()
  /mcp          → oauth.Authenticator.RequireBearer
                   over session.Registry.Handler()

The SpawnFunc passed to the registry composes supervisor + bridge: each
new MCP session forks `forgejo-mcp --transport stdio` with the user's
upstream token in env, wraps stdio with a bridge, and returns the
bridge's HandleSSE as the per-session http.Handler. The reaper is wired
with a refresh callback that calls forgejo.Client.Refresh and persists
rotated tokens back to access_tokens before the rotator swaps the
session's child.

cmd/broker/e2e_test.go is the gating local validation: builds the
binary, builds forgejo-mcp from the sibling repo (skipped if absent),
stands up a fake Forgejo, runs the broker, and walks
register → authorize → callback → token → /mcp initialize → tools/list.
This catches:

  - any component left unwired
  - the subprocess-context bug fixed in this commit (using a request
    context in supervisor.Start kills the child when the request that
    minted it returns; the fix is a long-lived childCtx)
  - the happy-path Mcp-Session-Id mint+reuse cycle that unit tests
    can't exercise without a real subprocess

docs/phase7-findings.md documents both the local automated validation
(this test) and the manual Claude.ai-side checklist (OAuth completes,
tool discovery, tool invocation, session reuse, idle reap, mid-session
token refresh, revocation). The Claude.ai half is fundamentally manual
and stays that way; the automated test catches the broker bugs that
would otherwise hide behind operator setup mistakes.

Closes forgejo-mcp-broker-q6n. Phase 7 — and the project's primary
implementation track — complete.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-27 17:55:18 +02:00

7.1 KiB

Raw Permalink Blame History

Phase 7: end-to-end findings

This document covers the validation that closes phase 7 (issue -q6n). The work splits naturally into two halves:

Local end-to-end — automated, runs against a fake Forgejo and a real forgejo-mcp child spawned through the live broker binary. Performed in this repo as TestE2E_FullOAuthAndMCPFlow under cmd/broker/e2e_test.go.
Public Claude.ai validation — manual, requires a real Forgejo instance and a publicly-reachable broker. Documented below as a step-by-step recipe so the operator can repeat it cleanly.

The local automated half was the gating prerequisite: without it, every manual step would hide both setup mistakes and broker bugs. With it green, the manual run becomes mostly a configuration exercise.

What the wiring shipped

cmd/broker/main.go now composes every component built across phases 2–5 into a single binary:

http.ServeMux
├── /healthz              ← internal/httpserver
├── /oauth/*              ← internal/oauth.Server.Handler()
├── /.well-known/*        ← internal/oauth.Server.Handler()
└── /mcp                  ← internal/oauth.Authenticator.RequireBearer
                            (gates internal/session.Registry.Handler())

The SpawnFunc passed to the session registry composes internal/supervisor + internal/bridge: each new MCP session forks a forgejo-mcp --transport stdio subprocess with the user's upstream Forgejo token in env, wires its stdio through a bridge, and returns the bridge's HandleSSE as the per-session HTTP handler.

The reaper is wired with a RefreshForgejo callback that calls the upstream Forgejo OAuth client and persists rotated tokens back to the access_tokens row. When a token comes within the rotation lead time of expiry, the rotator atomically swaps the session's child for one spawned with the fresh token; the sid is preserved and the client re-issues initialize on its next request.

Local end-to-end test

go test -v -run TestE2E_FullOAuthAndMCPFlow ./cmd/broker/

The test:

Builds the broker binary in TestMain.
Builds forgejo-mcp from the sibling repo at ../../../forgejo-mcp (skipped if not present; alternatively set FORGEJO_MCP_BIN).
Stands up a fake Forgejo on httptest.Server covering /api/v1/version, /api/v1/user, /login/oauth/access_token, and /login/oauth/userinfo.
Runs the broker with all CLI flags wired against the fake.
Walks through register → authorize → callback → token → /mcp initialize → tools/list.
Verifies that discovery (/.well-known/oauth-authorization-server) advertises the configured issuer.

It catches at least three classes of regression:

Wiring: any component not plumbed correctly into main.go fails a step rather than reporting "not implemented".
Subprocess-context: an early version of the wiring used the per-request HTTP context for supervisor.Start. Because exec.CommandContext ties process lifetime to the ctx, this killed the forgejo-mcp child as soon as the initialize request returned; the subsequent tools/list came back as 502. Fixed by introducing a long-lived childCtx in run().
Session lifecycle: 410 / 403 / 503 paths are exercised in unit tests but the happy-path 200 with Mcp-Session-Id minted on initialize, then reused on follow-up RPCs, is exercised here.

Public Claude.ai validation (manual)

This is the part the operator runs once. It needs:

a publicly-reachable hostname (mcp.example.com) with a real TLS cert (Caddy's automatic Let's Encrypt is sufficient — see deploy-caddy.md);
a real Forgejo instance you control;
a Claude.ai workspace where you can register a custom MCP connector.

Setup

Provision the broker container per deploy-podman.md. Confirm https://mcp.example.com/healthz returns {"status":"ok"}.

Confirm discovery surface:

curl -fsS https://mcp.example.com/.well-known/oauth-authorization-server | jq .issuer
# → "https://mcp.example.com"
curl -fsS https://mcp.example.com/.well-known/oauth-protected-resource | jq .resource
# → "https://mcp.example.com/mcp"

In Claude.ai, add a custom MCP connector pointing at https://mcp.example.com/mcp (or whatever the Claude.ai UI asks for — at the time of writing, Claude.ai detects the OAuth requirement from the protected-resource discovery doc and walks the user through the OAuth flow automatically).

Walkthrough checklist

Run through these in order. Each is a check the operator marks done.

OAuth completes. Claude.ai redirects to the broker's /oauth/authorize; the broker bounces to the Forgejo consent screen; Forgejo redirects back to /oauth/callback; Claude.ai shows the connector as authenticated. Expect the round trip to take a few seconds.
Tool discovery. Claude.ai's UI should list every tool the bundled forgejo-mcp exports — check at least get_forgejo_mcp_server_version, list_my_repos, list_repo_issues appear.
Tool invocation. Ask Claude something that exercises a tool (e.g. "list issues open on <your repo>"). Verify the tool runs and the response cites real data from your Forgejo instance.
Session reuse. Run two queries in a row. The broker should send one initialize per session, but successive calls should reuse the same forgejo-mcp child — visible in the broker's logs as no new "session reaped" / spawn entries between calls.
Idle timeout. Leave the conversation idle for 16 minutes (idle timeout default is 15m + 30s reaper tick). Check broker logs for "session reaped" matching your sid. The next query should trigger a fresh spawn.
Forgejo token refresh during active session. Lower --session-idle-timeout to something tolerable (say 1 h) and use a Forgejo token TTL shorter than that. Run the connector continuously until the rotator fires ("session rotated" in logs). Confirm the next query still works — Claude.ai re-issues initialize on the swapped child without user intervention.
Revocation. Hit POST /oauth/revoke with the access token. The next /mcp request from Claude.ai should fail with 401; Claude.ai should automatically re-authenticate.

Deliberately deferred

Items not checked in this validation, tracked in the backlog:

Rate limits on /oauth/register and /oauth/token (-ttl).
/proc/<pid>/environ token exposure via the Forgejo token in child env (-1n2).
At-rest encryption of Forgejo tokens (-sd4).
Prometheus /metrics for production observability (-7fn).
Forgejo Actions CI (-0sq).

All of these are P3 hardenings, not blockers for the Claude.ai integration itself.

Closing notes

Every phase-1-through-6 component is now exercised end-to-end against a real forgejo-mcp child. The only wiring that can't be tested without a real Claude.ai workspace is the actual MCP-discovery → OAuth-handoff sequence in Claude.ai's client UI; that's left as a manual one-time operator task using the checklist above.

7.1 KiB Raw Permalink Blame History Unescape Escape