forgejo-mcp-broker/docs/phase7-findings.md
Ole-Morten Duesund 5eeac663d8 feat(broker): wire OAuth + MCP session glue into main; e2e test (forgejo-mcp-broker-q6n)
cmd/broker/main.go now composes every phase-2-5 component into a live
binary:

  /healthz      → internal/httpserver
  /oauth/*      → internal/oauth.Server.Handler()
  /.well-known  → internal/oauth.Server.Handler()
  /mcp          → oauth.Authenticator.RequireBearer
                   over session.Registry.Handler()

The SpawnFunc passed to the registry composes supervisor + bridge: each
new MCP session forks `forgejo-mcp --transport stdio` with the user's
upstream token in env, wraps stdio with a bridge, and returns the
bridge's HandleSSE as the per-session http.Handler. The reaper is wired
with a refresh callback that calls forgejo.Client.Refresh and persists
rotated tokens back to access_tokens before the rotator swaps the
session's child.

cmd/broker/e2e_test.go is the gating local validation: builds the
binary, builds forgejo-mcp from the sibling repo (skipped if absent),
stands up a fake Forgejo, runs the broker, and walks
register → authorize → callback → token → /mcp initialize → tools/list.
This catches:

  - any component left unwired
  - the subprocess-context bug fixed in this commit (using a request
    context in supervisor.Start kills the child when the request that
    minted it returns; the fix is a long-lived childCtx)
  - the happy-path Mcp-Session-Id mint+reuse cycle that unit tests
    can't exercise without a real subprocess

docs/phase7-findings.md documents both the local automated validation
(this test) and the manual Claude.ai-side checklist (OAuth completes,
tool discovery, tool invocation, session reuse, idle reap, mid-session
token refresh, revocation). The Claude.ai half is fundamentally manual
and stays that way; the automated test catches the broker bugs that
would otherwise hide behind operator setup mistakes.

Closes forgejo-mcp-broker-q6n. Phase 7 — and the project's primary
implementation track — complete.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 17:55:18 +02:00

168 lines
7.1 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Phase 7: end-to-end findings
This document covers the validation that closes phase 7 (issue `-q6n`).
The work splits naturally into two halves:
1. **Local end-to-end** — automated, runs against a fake Forgejo and a
real forgejo-mcp child spawned through the live broker binary.
Performed in this repo as `TestE2E_FullOAuthAndMCPFlow` under
`cmd/broker/e2e_test.go`.
2. **Public Claude.ai validation** — manual, requires a real Forgejo
instance and a publicly-reachable broker. Documented below as a
step-by-step recipe so the operator can repeat it cleanly.
The local automated half was the gating prerequisite: without it, every
manual step would hide both setup mistakes and broker bugs. With it
green, the manual run becomes mostly a configuration exercise.
## What the wiring shipped
`cmd/broker/main.go` now composes every component built across phases
25 into a single binary:
```
http.ServeMux
├── /healthz ← internal/httpserver
├── /oauth/* ← internal/oauth.Server.Handler()
├── /.well-known/* ← internal/oauth.Server.Handler()
└── /mcp ← internal/oauth.Authenticator.RequireBearer
(gates internal/session.Registry.Handler())
```
The `SpawnFunc` passed to the session registry composes
`internal/supervisor` + `internal/bridge`: each new MCP session forks a
`forgejo-mcp --transport stdio` subprocess with the user's upstream
Forgejo token in env, wires its stdio through a bridge, and returns the
bridge's `HandleSSE` as the per-session HTTP handler.
The reaper is wired with a `RefreshForgejo` callback that calls the
upstream Forgejo OAuth client and persists rotated tokens back to the
`access_tokens` row. When a token comes within the rotation lead time of
expiry, the rotator atomically swaps the session's child for one
spawned with the fresh token; the sid is preserved and the client
re-issues `initialize` on its next request.
## Local end-to-end test
```bash
go test -v -run TestE2E_FullOAuthAndMCPFlow ./cmd/broker/
```
The test:
1. Builds the broker binary in TestMain.
2. Builds `forgejo-mcp` from the sibling repo at `../../../forgejo-mcp`
(skipped if not present; alternatively set `FORGEJO_MCP_BIN`).
3. Stands up a fake Forgejo on `httptest.Server` covering
`/api/v1/version`, `/api/v1/user`, `/login/oauth/access_token`, and
`/login/oauth/userinfo`.
4. Runs the broker with all CLI flags wired against the fake.
5. Walks through register → authorize → callback → token → /mcp
initialize → tools/list.
6. Verifies that discovery (`/.well-known/oauth-authorization-server`)
advertises the configured issuer.
It catches at least three classes of regression:
- **Wiring**: any component not plumbed correctly into main.go fails
a step rather than reporting "not implemented".
- **Subprocess-context**: an early version of the wiring used the
per-request HTTP context for `supervisor.Start`. Because
`exec.CommandContext` ties process lifetime to the ctx, this killed
the forgejo-mcp child as soon as the `initialize` request returned;
the subsequent `tools/list` came back as 502. Fixed by introducing
a long-lived `childCtx` in `run()`.
- **Session lifecycle**: 410 / 403 / 503 paths are exercised
in unit tests but the happy-path 200 with `Mcp-Session-Id` minted on
initialize, then reused on follow-up RPCs, is exercised here.
## Public Claude.ai validation (manual)
This is the part the operator runs once. It needs:
- a publicly-reachable hostname (`mcp.example.com`) with a real TLS
cert (Caddy's automatic Let's Encrypt is sufficient — see
`deploy-caddy.md`);
- a real Forgejo instance you control;
- a Claude.ai workspace where you can register a custom MCP connector.
### Setup
1. Provision the broker container per `deploy-podman.md`. Confirm
`https://mcp.example.com/healthz` returns `{"status":"ok"}`.
2. Confirm discovery surface:
```bash
curl -fsS https://mcp.example.com/.well-known/oauth-authorization-server | jq .issuer
# → "https://mcp.example.com"
curl -fsS https://mcp.example.com/.well-known/oauth-protected-resource | jq .resource
# → "https://mcp.example.com/mcp"
```
3. In Claude.ai, add a custom MCP connector pointing at
`https://mcp.example.com/mcp` (or whatever the Claude.ai UI asks for
— at the time of writing, Claude.ai detects the OAuth requirement
from the protected-resource discovery doc and walks the user
through the OAuth flow automatically).
### Walkthrough checklist
Run through these in order. Each is a check the operator marks done.
- [ ] **OAuth completes.** Claude.ai redirects to the broker's
`/oauth/authorize`; the broker bounces to the Forgejo consent screen;
Forgejo redirects back to `/oauth/callback`; Claude.ai shows the
connector as authenticated. Expect the round trip to take a few
seconds.
- [ ] **Tool discovery.** Claude.ai's UI should list every tool the
bundled `forgejo-mcp` exports — check at least
`get_forgejo_mcp_server_version`, `list_my_repos`, `list_repo_issues`
appear.
- [ ] **Tool invocation.** Ask Claude something that exercises a tool
(e.g. "list issues open on `<your repo>`"). Verify the tool runs and
the response cites real data from your Forgejo instance.
- [ ] **Session reuse.** Run two queries in a row. The broker should
send one initialize per session, but successive calls should reuse
the same forgejo-mcp child — visible in the broker's logs as no new
"session reaped" / spawn entries between calls.
- [ ] **Idle timeout.** Leave the conversation idle for 16 minutes
(idle timeout default is 15m + 30s reaper tick). Check broker logs
for `"session reaped"` matching your sid. The next query should
trigger a fresh spawn.
- [ ] **Forgejo token refresh during active session.** Lower
`--session-idle-timeout` to something tolerable (say 1 h) and use a
Forgejo token TTL shorter than that. Run the connector continuously
until the rotator fires (`"session rotated"` in logs). Confirm the
next query still works — Claude.ai re-issues `initialize` on the
swapped child without user intervention.
- [ ] **Revocation.** Hit `POST /oauth/revoke` with the access token.
The next /mcp request from Claude.ai should fail with 401; Claude.ai
should automatically re-authenticate.
### Deliberately deferred
Items not checked in this validation, tracked in the backlog:
- **Rate limits** on `/oauth/register` and `/oauth/token` (`-ttl`).
- **`/proc/<pid>/environ` token exposure** via the Forgejo token in
child env (`-1n2`).
- **At-rest encryption of Forgejo tokens** (`-sd4`).
- **Prometheus `/metrics`** for production observability (`-7fn`).
- **Forgejo Actions CI** (`-0sq`).
All of these are P3 hardenings, not blockers for the Claude.ai
integration itself.
## Closing notes
Every phase-1-through-6 component is now exercised end-to-end against a
real `forgejo-mcp` child. The only wiring that can't be tested without
a real Claude.ai workspace is the actual MCP-discovery → OAuth-handoff
sequence in Claude.ai's client UI; that's left as a manual one-time
operator task using the checklist above.