MCP tools

How LLMs invoke your plugin through the Plugin Gateway.

MCP (Model Context Protocol) is how LLMs in Adapstory discover and invoke capabilities your plugin exposes. The Plugin Gateway (BC-01) sits between the LLM and your code.

Anatomy of a tool call

LLM ──▶ Plugin Gateway (BC-01) ──▶ Your plugin (/mcp/<tool>)

           ├─ auth: service-account JWT scoped to (tenant, plugin)
           ├─ guardrails: content filter, prompt-injection check
           ├─ budget:     token + per-call cost check
           └─ telemetry:  OpenTelemetry span starts here

Your plugin sees a normal HTTP request (sketched below) with:

  • Authorization: Bearer <jwt> — the calling user's identity
  • X-Tenant-Id: <tenant> — the tenant context
  • X-Correlation-Id: <uuid> — joins the trace
  • X-Llm-Session: <id> — the conversation this call is part of
  • JSON body matching your schema
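
For reference, here is a minimal sketch of that request as a raw handler would see it, assuming a FastAPI app. The route path and manual wiring are illustrative; the SDK normally parses these headers and validates the body for you:

from fastapi import FastAPI, Header, Request

app = FastAPI()

@app.post("/mcp/hello_learner")
async def hello_learner_raw(
    request: Request,
    authorization: str = Header(...),      # Bearer <jwt>
    x_tenant_id: str = Header(...),        # tenant context
    x_correlation_id: str = Header(...),   # joins the trace
    x_llm_session: str = Header(...),      # conversation id
):
    args = await request.json()  # JSON body matching your declared schema
    ...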

Writing the handler

Keep handlers small. The pattern is:

  1. Validate args against your JSON-Schema (the SDK does this automatically).
  2. Call core APIs or the LLM Gateway for the actual work.
  3. Return a JSON result matching your returnSchema (if declared).

# src/hello_learner/tools.py
from adapstory_plugin_sdk import mcp_tool, LlmGateway, Context

@mcp_tool(name="hello_learner")
async def hello_learner(
    learner_name: str,
    *,
    ctx: Context,
    gateway: LlmGateway,
) -> dict:
    ctx.logger.info("greeting", learner=learner_name)
    response = await gateway.chat(
        model=ctx.plugin.defaultModel,
        messages=[
            {"role": "system", "content": "Be brief and encouraging."},
            {"role": "user",   "content": f"Say hi to {learner_name}."},
        ],
    )
    return {"message": response.text}

Streaming

For tools that emit partial results (summaries, generations), set streaming: true in the manifest (mirrored by streaming=True on the decorator) and return Server-Sent Events:

@mcp_tool(name="summarize_course", streaming=True)
async def summarize_course(course_id: str, *, gateway: LlmGateway):
    async for chunk in gateway.stream(
        model="claude-sonnet-4-6",
        messages=[...],
    ):
        yield { "delta": chunk.text }

The Gateway forwards the SSE stream to the LLM client, buffering as needed.
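
On the wire, each yielded dict becomes one SSE data event, roughly as below; the exact framing and any envelope fields are Gateway-defined, so this is illustrative only:

data: {"delta": "The course covers"}

data: {"delta": " three modules"}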

Cancellation

The Gateway sends an X-Llm-Cancel: <correlation-id> header when the upstream LLM drops. Respect it:

@mcp_tool(name="summarize_course", streaming=True)
async def summarize_course(course_id: str, *, ctx: Context, gateway: LlmGateway):
    async for chunk in gateway.stream(...):
        if ctx.cancelled:  # set by the SDK when X-Llm-Cancel arrives
            break
        yield {"delta": chunk.text}

The SDK also propagates the cancellation to in-flight Gateway calls.

Budgets and back-pressure

If BC-01 returns 429 Too Many Requests with an X-Llm-Budget-Reset header, do not retry. Return a degraded result (cached, heuristic, or empty) and let the client decide.
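
A sketch of that fallback. The tool name, the exception, and its import are assumptions (check your SDK for the actual error raised on budget exhaustion); everything else mirrors the handler pattern above:

from adapstory_plugin_sdk import mcp_tool, LlmGateway, Context
from adapstory_plugin_sdk import BudgetExceededError  # hypothetical name

@mcp_tool(name="course_overview")
async def course_overview(course_id: str, *, ctx: Context, gateway: LlmGateway) -> dict:
    try:
        response = await gateway.chat(
            model=ctx.plugin.defaultModel,
            messages=[{"role": "user", "content": f"Summarize course {course_id}."}],
        )
        return {"summary": response.text}
    except BudgetExceededError:
        # Do not retry; hand back a degraded result and let the client decide.
        return {"summary": "", "degraded": True}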

Per-tool rate limits (declared in manifest.rateLimit) are enforced by the Gateway. You don't need to implement them.
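
For illustration, a manifest fragment declaring both streaming and a rate limit; apart from the streaming and rateLimit keys named above, the surrounding structure and field names are assumptions about the manifest schema:

"tools": {
  "summarize_course": {
    "streaming": true,
    "rateLimit": { "perMinute": 30 }
  }
}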

Error contracts

Return structured errors so the LLM can recover:

from adapstory_plugin_sdk import ToolError

raise ToolError(
    code="learner_not_found",
    message="No learner with that ID in this tenant.",
    retryable=False,
    suggestion="Check that the learner has been enrolled.",
)

The Gateway converts this to a JSON response the LLM can reason about.
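
That response is roughly of this shape; the envelope key is an assumption (only the inner fields mirror the ToolError above), and the exact contract is Gateway-defined:

{
  "error": {
    "code": "learner_not_found",
    "message": "No learner with that ID in this tenant.",
    "retryable": false,
    "suggestion": "Check that the learner has been enrolled."
  }
}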

Observability contract

Every @mcp_tool-decorated handler automatically:

  • Starts an OpenTelemetry span plugin.{name}.tool.{tool_name} with attributes tenant.id, user.id, llm.session.id.
  • Emits a log line at INFO on entry + exit.
  • Reports duration + outcome to Prometheus via the OTLP collector.

You don't have to instrument any of this. Do add ctx.logger.info(...) with business context though.

Next: Lifecycle.
