# Observability
Metrics, traces, logs, and profiles for plugins in production.
Plugins get instrumentation for free, but knowing what's emitted — and how to read it — makes debugging an order of magnitude faster.
## The four pillars
| Pillar | Tool | Where to look |
|---|---|---|
| Metrics | Prometheus | grafana.adapstory.com → dashboard "Plugin Overview" |
| Traces | Tempo | grafana.adapstory.com → Explore → Tempo |
| Logs | Loki | grafana.adapstory.com → Explore → Loki |
| Profiles | Pyroscope | grafana.adapstory.com → Pyroscope, or Tempo → Profiles link |
All four are linked by `trace.id` and `tenant.id`.
## Auto-instrumentation
Every plugin pod gets:
- OTel sidecar — receives OTLP traces on port 4318.
- Alloy eBPF DaemonSet — CPU profiles (no code changes, interpreter tracer for Python/JVM/Go/V8/Ruby).
- Fluent Bit — tails container stdout, ships JSON to Loki.
- ServiceMonitor — scrapes `/metrics` on port 9090 if present (see the sketch after this list).
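The SDK's own metrics (next section) need no endpoint of their own, but if your plugin serves an extra `/metrics` endpoint on port 9090, the ServiceMonitor scrapes it with no further setup. A minimal sketch using the standard `prometheus_client` package; the metric name and label are illustrative:

```python
from prometheus_client import Counter, start_http_server

# Illustrative metric -- anything exposed on this endpoint gets scraped.
cache_hits = Counter("plugin_cache_hits_total", "Cache hits", ["tenant"])

# Serve /metrics on port 9090 so the platform's ServiceMonitor finds it.
start_http_server(9090)

cache_hits.labels(tenant="acme").inc()
```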
## What the SDK emits
Every `@mcp_tool` handler emits:
- Span `plugin.{name}.tool.{tool_name}` with attributes: `tenant.id`, `user.id`, `plugin.version`, `llm.session.id`, `llm.tokens.prompt`, `llm.tokens.completion`, `llm.model`, `llm.cost_usd`
- Metric `plugin_tool_duration_seconds` (histogram), labelled by `tool`, `tenant`, `version`, `outcome`
- Metric `plugin_llm_tokens_total` (counter), labelled by `tool`, `tenant`, `model`, `direction` (prompt/completion)
- Log events `plugin.tool.invoked` and `plugin.tool.completed` at INFO
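None of this requires code on your side; a complete instrumented handler is just the decorated function. A minimal sketch, assuming `mcp_tool` is imported from the SDK root the way `meter` and `tracer` are in the examples below:

```python
from adapstory_plugin_sdk import mcp_tool  # import path assumed, not confirmed

@mcp_tool
async def recommend_courses(user_id: str, limit: int = 5) -> list[dict]:
    # The SDK wraps this call: it opens the
    # plugin.{name}.tool.recommend_courses span, records
    # plugin_tool_duration_seconds with outcome=success/error,
    # and logs plugin.tool.invoked / plugin.tool.completed.
    candidates = await fetch_candidates(user_id)  # hypothetical helper
    return candidates[:limit]
```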
## SLOs the platform tracks for you
| SLO | Target | Window |
|---|---|---|
| Tool success rate | 99.5% | 30d |
| p99 tool latency | Declared in manifest | 30d |
| LLM budget burn | Below declared `gatewayBudget` | 1d rolling |
Burn-rate alerts page on-call if you breach a 14.4× burn rate over 1h or a 6× burn rate over 6h (the Google SRE multi-window, multi-burn-rate pattern). For a 30-day window, a 14.4× burn consumes about 2% of the error budget per hour (14.4 / 720h) and would exhaust it in roughly 50 hours. No setup required.
## Adding business metrics
Use the SDK's metrics helper:
```python
from adapstory_plugin_sdk import meter

counter = meter.create_counter(
    name="plugin.courses_recommended",
    description="Recommendations emitted per tenant",
    unit="1",
)

# e.g. inside a tool handler, where `course` is in scope
counter.add(1, attributes={"course_type": course.kind})
```

Metrics are auto-prefixed with your plugin name and tagged with `tenant.id`. They show up in Prometheus within ~30s.
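The `create_counter` signature above matches the standard OpenTelemetry `Meter` API, so if you need a distribution rather than a count, a histogram should work the same way. A sketch under that assumption (i.e. that the SDK's `meter` also exposes `create_histogram`):

```python
from adapstory_plugin_sdk import meter

# Assumes the SDK meter follows the standard OTel Meter API.
latency = meter.create_histogram(
    name="plugin.recommendation_latency",
    description="Time spent ranking candidates",
    unit="s",
)

latency.record(0.042, attributes={"course_type": "video"})
```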
## Adding traces
The OTel SDK is already configured. Just create spans:
```python
from adapstory_plugin_sdk import tracer

with tracer.start_as_current_span("rank_recommendations") as span:
    span.set_attribute("candidates", len(candidates))
    result = await rank(candidates)
    span.set_attribute("top_score", result[0].score)
```

The span automatically becomes a child of the incoming tool-call span.
## Debugging in production
The single most useful query: filter Loki for `{plugin="your-name", tenant_id="..."}` and tail while you reproduce. JSON logs mean you can pivot on any attribute.
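Because Fluent Bit tails stdout, any JSON object your plugin prints becomes a Loki log line whose keys you can filter on. A minimal standard-library sketch; the event and field names are illustrative:

```python
import json
import sys

def log_event(event: str, **attrs) -> None:
    # One JSON object per line; Fluent Bit ships it to Loki,
    # where every key becomes a filterable attribute.
    print(json.dumps({"event": event, **attrs}), file=sys.stdout, flush=True)

log_event("recommendation.debug", tenant_id="acme", candidates=42)
```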
Flow for "it's slow for tenant X":
- Grafana → dashboard "Plugin Overview" → filter by
tenant=X. - Find the slow percentile window → click a sample span → open in Tempo.
- In Tempo → "Profiles" button → see CPU flame graph for the same window.
- Cross-check
llm_cost_usdpanel — if it spikes, the model call is the bottleneck, not your code.
## Alerts you own
Plugins don't define their own Prometheus alerts. If you need a business alert (e.g., "recommendation quality dropped"), emit a metric and request a dashboard/alert through the platform team — we'll add it to `custom-alerts.yaml` in the monitoring repo.
That's it — you're production-grade.