Metrics
Counters and histograms ADK-TS records automatically, plus how to record your own measurements
ADK-TS records metrics for every agent invocation, tool execution, and LLM call. Metrics are exported via OTLP on a configurable interval and follow OpenTelemetry standards, so they work with Prometheus, Grafana, Datadog, and any other OTLP-compatible metrics backend.
Jaeger only supports traces
Jaeger does not ingest metrics. Use Grafana backed by a metrics store such as Prometheus, or a full OTLP backend like Datadog, for metrics visualization.
Automatic Metrics
Counters
Counters increment once per operation and reset on process restart:
| Metric | Labels | Description |
|---|---|---|
| adk.agent.invocations | agent.name, environment, status | Total agent invocations |
| adk.tool.executions | tool.name, agent.name, environment, status | Total tool executions |
| adk.llm.calls | model, agent.name, environment, status | Total LLM calls |
| adk.errors | error_type, context | Total errors |
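Because counters reset on process restart, a backend has to treat a drop in a cumulative value as a reset rather than a negative increase. The sketch below illustrates that idea in the way Prometheus-style `rate()` and `increase()` functions commonly handle it; `counterIncrease` and the sample values are hypothetical and are not part of ADK-TS.

```typescript
// Sum the increase of a cumulative counter across successive samples.
// A drop between two samples is treated as a process restart (counter back at 0),
// so the post-reset value counts in full as new increase.
function counterIncrease(samples: number[]): number {
  let total = 0;
  for (let i = 1; i < samples.length; i++) {
    const prev = samples[i - 1];
    const curr = samples[i];
    total += curr >= prev ? curr - prev : curr;
  }
  return total;
}

// Hypothetical readings of adk.agent.invocations; the process restarts after the second one.
// 10 -> 15 adds 5, 15 -> 3 is a reset that adds 3, 3 -> 8 adds 5.
console.log(counterIncrease([10, 15, 3, 8])); // 13
```

This is why dashboards usually query the rate of a counter rather than its raw value.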
Histograms
Histograms record distributions of values, enabling percentile queries:
| Metric | Unit | Labels | Description |
|---|---|---|---|
| adk.agent.duration | ms | agent.name, environment, status | Agent execution duration |
| adk.tool.duration | ms | tool.name, agent.name, environment, status | Tool execution duration |
| adk.llm.duration | ms | model, agent.name, environment, status | LLM call duration |
| adk.llm.tokens | tokens | model, agent.name, environment, status | Total tokens per LLM call |
| adk.llm.tokens.input | tokens | model, agent.name, environment, status | Input tokens per LLM call |
| adk.llm.tokens.output | tokens | model, agent.name, environment, status | Output tokens per LLM call |
The status label is either success or error. All metrics also pick up any resource attributes you set via resourceAttributes during initialization.
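To make the link between histograms and percentile queries concrete, here is a minimal sketch of how a backend can estimate a quantile from cumulative bucket counts by linear interpolation, similar in spirit to PromQL's `histogram_quantile`. The bucket boundaries and counts are invented for illustration; the boundaries ADK-TS actually uses are not stated here.

```typescript
// One cumulative bucket: how many observations were <= upperBound.
interface Bucket {
  upperBound: number; // use Infinity for the overflow bucket
  cumulativeCount: number;
}

// Estimate the q-quantile (0 <= q <= 1) from buckets sorted by upperBound.
function estimateQuantile(q: number, buckets: Bucket[]): number {
  if (buckets.length === 0) return NaN;
  const total = buckets[buckets.length - 1].cumulativeCount;
  if (total === 0) return NaN;

  const rank = q * total;
  let lowerBound = 0;
  let lowerCount = 0;
  for (const bucket of buckets) {
    if (bucket.cumulativeCount >= rank) {
      // The overflow bucket has no finite upper edge, so report its lower edge.
      if (!Number.isFinite(bucket.upperBound)) return lowerBound;
      const inBucket = bucket.cumulativeCount - lowerCount;
      if (inBucket === 0) return bucket.upperBound;
      // Assume observations are spread evenly inside the bucket.
      const fraction = (rank - lowerCount) / inBucket;
      return lowerBound + (bucket.upperBound - lowerBound) * fraction;
    }
    lowerBound = bucket.upperBound;
    lowerCount = bucket.cumulativeCount;
  }
  return lowerBound;
}

// Hypothetical adk.llm.duration buckets (milliseconds).
const llmDurationBuckets: Bucket[] = [
  { upperBound: 100, cumulativeCount: 20 },
  { upperBound: 500, cumulativeCount: 70 },
  { upperBound: 1000, cumulativeCount: 95 },
  { upperBound: Infinity, cumulativeCount: 100 },
];

console.log(estimateQuantile(0.5, llmDurationBuckets));
console.log(estimateQuantile(0.95, llmDurationBuckets));
```

The estimate is only as precise as the bucket layout: a percentile that falls inside a wide bucket is interpolated, not measured.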
Metric Export Interval
Metrics are buffered and exported periodically rather than per-operation. The default interval is 60 seconds. Shorten it for dashboards that need near-real-time data; lengthen it for production to reduce overhead:
await telemetryService.initialize({
  appName: "my-agent-app",
  otlpEndpoint: "http://localhost:4318/v1/traces",
  metricExportIntervalMs: 30000, // export every 30 seconds
});
ADK-TS derives the metrics endpoint from the traces endpoint by replacing /v1/traces with /v1/metrics. Both point to the same host and port, so a single otlpEndpoint value covers both.
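The path substitution described above can be sketched as a one-line string replacement. `deriveMetricsEndpoint` is a hypothetical helper written for illustration, not the function ADK-TS uses internally, and its handling of a trailing slash or an unexpected path is an assumption.

```typescript
// Derive an OTLP metrics URL from an OTLP traces URL by swapping the signal path.
// Assumption: an endpoint that does not end in /v1/traces is returned unchanged.
function deriveMetricsEndpoint(tracesEndpoint: string): string {
  return tracesEndpoint.replace(/\/v1\/traces\/?$/, "/v1/metrics");
}

console.log(deriveMetricsEndpoint("http://localhost:4318/v1/traces"));
// http://localhost:4318/v1/metrics
```

If your collector serves traces and metrics on different hosts or ports, a single derived endpoint will not fit, so check how your deployment is laid out before relying on this behavior.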
Recording Custom Metrics
For operations outside the automatic instrumentation — batch jobs, background workers, or business-level counters — you can record metrics directly:
import { telemetryService } from "@iqai/adk";

// Count a successful agent run
telemetryService.recordAgentInvocation({
  agentName: "research-agent",
  environment: "production",
  status: "success",
});

// Record how long the agent took
telemetryService.recordAgentDuration(1500, {
  agentName: "research-agent",
  environment: "production",
  status: "success",
});

// Count a tool execution
telemetryService.recordToolExecution({
  toolName: "search_web",
  agentName: "research-agent",
  environment: "production",
  status: "success",
});

// Record tool duration
telemetryService.recordToolDuration(450, {
  toolName: "search_web",
  agentName: "research-agent",
  environment: "production",
  status: "success",
});

// Count an LLM call
telemetryService.recordLlmCall({
  model: "gemini-2.5-flash",
  agentName: "research-agent",
  environment: "production",
  status: "success",
});

// Record LLM duration
telemetryService.recordLlmDuration(1200, {
  model: "gemini-2.5-flash",
  agentName: "research-agent",
  environment: "production",
  status: "success",
});

// Record token usage (input tokens, output tokens, then dimensions)
telemetryService.recordLlmTokens(150, 75, {
  model: "gemini-2.5-flash",
  agentName: "research-agent",
  environment: "production",
  status: "success",
});

// Record an error
telemetryService.recordError("tool", "search_web");
The recordError method increments adk.errors with error_type set to "agent", "tool", or "llm" and context set to the string you pass.
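In a background worker you usually want the counter and the duration recorded together, with the status label set from the outcome. The sketch below wraps a unit of work and does that. The `ToolMetricsRecorder` interface is inferred from the `recordToolExecution` and `recordToolDuration` calls shown above rather than taken from the library's type definitions, and `withToolMetrics` is a hypothetical helper.

```typescript
type Status = "success" | "error";

interface ToolDimensions {
  toolName: string;
  agentName: string;
  environment: string;
  status: Status;
}

// Assumed shape: the two tool-related methods as they are called in the examples above.
interface ToolMetricsRecorder {
  recordToolExecution(dimensions: ToolDimensions): void;
  recordToolDuration(durationMs: number, dimensions: ToolDimensions): void;
}

// Run `work`, then record one execution and one duration with the matching status.
async function withToolMetrics<T>(
  recorder: ToolMetricsRecorder,
  base: Omit<ToolDimensions, "status">,
  work: () => Promise<T>,
): Promise<T> {
  const start = Date.now();
  let status: Status = "success";
  try {
    return await work();
  } catch (err) {
    status = "error";
    throw err; // metrics are recorded in `finally`; the caller still sees the failure
  } finally {
    const dimensions: ToolDimensions = { ...base, status };
    recorder.recordToolExecution(dimensions);
    recorder.recordToolDuration(Date.now() - start, dimensions);
  }
}

// Usage, assuming telemetryService satisfies the interface above:
//   await withToolMetrics(
//     telemetryService,
//     { toolName: "search_web", agentName: "research-agent", environment: "production" },
//     () => runSearch(query), // runSearch is a placeholder for your own function
//   );
```

Taking the recorder as a parameter keeps the helper easy to test with a fake, and it avoids recording a duration without the matching execution count.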
Metric Name Constants
Use the METRICS constant when building dashboards or alert queries — it lists all metric names in one place:
import { METRICS } from "@iqai/adk";
METRICS.AGENT_INVOCATIONS; // "adk.agent.invocations"
METRICS.TOOL_EXECUTIONS; // "adk.tool.executions"
METRICS.LLM_CALLS; // "adk.llm.calls"
METRICS.ERRORS; // "adk.errors"
METRICS.AGENT_DURATION; // "adk.agent.duration"
METRICS.TOOL_DURATION; // "adk.tool.duration"
METRICS.LLM_DURATION; // "adk.llm.duration"
METRICS.LLM_TOKENS; // "adk.llm.tokens"
METRICS.LLM_INPUT_TOKENS; // "adk.llm.tokens.input"
METRICS.LLM_OUTPUT_TOKENS; // "adk.llm.tokens.output"
Common Dashboard Queries
These PromQL examples work with Grafana backed by Prometheus or any OTLP-to-Prometheus bridge:
# Agent invocation rate (per minute)
rate(adk_agent_invocations_total[5m]) * 60
# 95th percentile LLM latency across all models and agents
histogram_quantile(0.95, sum by (le) (rate(adk_llm_duration_bucket[5m])))
# Error rate across all agent operations
sum(rate(adk_errors_total[5m]))
# Token usage breakdown by model (tokens per second, averaged over 1h)
sum by (model) (rate(adk_llm_tokens_input_sum[1h]))
sum by (model) (rate(adk_llm_tokens_output_sum[1h]))
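The queries above use Prometheus-style names, while the metrics are defined with dotted OpenTelemetry names. The sketch below shows the naming convention these queries assume: dots become underscores, counters gain a `_total` suffix, and each histogram expands into `_bucket`, `_sum`, and `_count` series. Both helpers are hypothetical. Some exporters also append a unit suffix such as `_milliseconds`, so confirm the exact names in your backend before building alerts.

```typescript
type MetricKind = "counter" | "histogram";

// Translate a dotted OpenTelemetry metric name into the Prometheus-style name
// assumed by the queries above. Unit suffixes are deliberately not handled.
function toPrometheusName(otelName: string, kind: MetricKind): string {
  const base = otelName.replace(/[^a-zA-Z0-9_:]/g, "_");
  return kind === "counter" && !base.endsWith("_total") ? `${base}_total` : base;
}

// A histogram is exposed as three series families.
function histogramSeries(otelName: string): string[] {
  const base = toPrometheusName(otelName, "histogram");
  return [`${base}_bucket`, `${base}_sum`, `${base}_count`];
}

console.log(toPrometheusName("adk.agent.invocations", "counter"));
// adk_agent_invocations_total
console.log(histogramSeries("adk.llm.duration").join(", "));
// adk_llm_duration_bucket, adk_llm_duration_sum, adk_llm_duration_count
```

Use `_bucket` with `histogram_quantile` for percentiles, `_sum` for totals such as token volume, and `_sum` divided by `_count` for averages.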