Runtime

Performance

Optimization, monitoring, and scalability considerations

Performance optimization in the ADK Runtime involves multiple layers, from efficient event processing to resource management and service optimization.

Event Processing Optimization

Minimizing Event Overhead

Events are the core communication mechanism, so optimizing their creation and processing is crucial:

// Efficient event creation
class OptimizedAgent extends BaseAgent {
  protected async *runAsyncImpl(ctx: InvocationContext): AsyncGenerator<Event> {
    // Batch multiple state changes in a single event
    const actions = new EventActions({
      stateDelta: {
        "user_preference": "dark_mode",
        "session_count": ctx.session.events.length + 1,
        "last_activity": new Date().toISOString()
      }
    });

    // Single event with multiple changes instead of multiple events
    yield new Event({
      invocationId: ctx.invocationId,
      author: this.name,
      content: { parts: [{ text: "Settings updated successfully" }] },
      actions
    });
  }
}

Efficient Serialization

Keep event payloads reasonably sized for better performance:

// Avoid large content in events
class EfficientAgent extends BaseAgent {
  protected async *runAsyncImpl(ctx: InvocationContext): AsyncGenerator<Event> {
    const largeData = await this.processLargeDataset();

    // Store large data as artifact, reference in event
    if (ctx.artifactService) {
      await ctx.artifactService.saveArtifact({
        appName: ctx.session.appName,
        userId: ctx.session.userId,
        sessionId: ctx.session.id,
        filename: "analysis_results.json",
        artifact: {
          inlineData: {
            mimeType: "application/json",
            data: JSON.stringify(largeData)
          }
        }
      });

      // Event contains reference, not the data itself
      yield new Event({
        invocationId: ctx.invocationId,
        author: this.name,
        content: {
          parts: [{
            text: "Analysis complete. Results saved as analysis_results.json"
          }]
        }
      });
    }
  }
}

Streaming for User Experience

Use partial events for better perceived performance:

class StreamingAgent extends BaseAgent {
  protected async *runAsyncImpl(ctx: InvocationContext): AsyncGenerator<Event> {
    const steps = [
      "Analyzing request...",
      "Gathering information...",
      "Processing data...",
      "Generating response..."
    ];

    // Stream progress updates
    for (const step of steps) {
      yield new Event({
        invocationId: ctx.invocationId,
        author: this.name,
        content: { parts: [{ text: step }] },
        partial: true  // Not persisted, just for streaming
      });

      await this.performStep(step);
    }

    // Final result
    yield new Event({
      invocationId: ctx.invocationId,
      author: this.name,
      content: { parts: [{ text: "Here's your complete analysis..." }] },
      partial: false  // Persisted to session
    });
  }
}

Resource Management

Connection Pooling

Efficient management of expensive resources like LLM connections:

class ConnectionPool {
  private pools = new Map<string, any[]>();
  private maxPoolSize = 10;

  async getConnection(model: string): Promise<any> {
    const pool = this.pools.get(model) || [];

    if (pool.length > 0) {
      return pool.pop();
    }

    // Create new connection if pool is empty
    return await this.createNewConnection(model);
  }

  releaseConnection(model: string, connection: any): void {
    const pool = this.pools.get(model) || [];

    if (pool.length < this.maxPoolSize) {
      pool.push(connection);
      this.pools.set(model, pool);
    } else {
      // Pool is full, dispose connection
      this.disposeConnection(connection);
    }
  }

  private async createNewConnection(model: string): Promise<any> {
    // Placeholder: construct a client for the given model
    // (in practice, an SDK client with its own socket/auth handshake)
    return { model, call: async (request: any) => request };
  }

  private disposeConnection(connection: any): void {
    // Placeholder: close sockets and release resources
  }
}

// Use in LLM integration
const connectionPool = new ConnectionPool();

class OptimizedLlmAgent extends LlmAgent {
  async callLlm(request: LlmRequest): Promise<LlmResponse> {
    const connection = await connectionPool.getConnection(this.model);

    try {
      return await connection.call(request);
    } finally {
      connectionPool.releaseConnection(this.model, connection);
    }
  }
}

Memory Management

Proper cleanup and garbage collection optimization:

class MemoryEfficientAgent extends BaseAgent {
  private cache = new Map<string, any>();
  private maxCacheSize = 1000;

  protected async *runAsyncImpl(ctx: InvocationContext): AsyncGenerator<Event> {
    try {
      // WeakMap entries don't prevent their keys from being garbage collected
      const tempData = new WeakMap();

      // Process with memory awareness
      const result = await this.processWithMemoryManagement(ctx, tempData);

      yield new Event({
        invocationId: ctx.invocationId,
        author: this.name,
        content: { parts: [{ text: result }] }
      });

    } finally {
      // Explicit cleanup
      this.cleanupCache();
    }
  }

  private cleanupCache(): void {
    if (this.cache.size > this.maxCacheSize) {
      // Remove oldest entries (simple LRU)
      const entries = Array.from(this.cache.entries());
      const toRemove = entries.slice(0, Math.floor(this.maxCacheSize * 0.2));

      for (const [key] of toRemove) {
        this.cache.delete(key);
      }
    }
  }
}

Service Optimization

Optimize service calls and reduce latency:

class OptimizedSessionService extends BaseSessionService {
  private cache = new Map<string, Session>();
  private cacheExpiry = new Map<string, number>();
  private cacheTtl = 5 * 60 * 1000; // 5 minutes

  async getSession(appName: string, userId: string, sessionId: string): Promise<Session | null> {
    const cacheKey = `${appName}:${userId}:${sessionId}`;
    const now = Date.now();

    // Check cache first
    const cached = this.cache.get(cacheKey);
    const expiry = this.cacheExpiry.get(cacheKey);

    if (cached && expiry && expiry > now) {
      return cached;
    }

    // Load from storage
    const session = await this.loadFromStorage(appName, userId, sessionId);

    if (session) {
      // Cache the result
      this.cache.set(cacheKey, session);
      this.cacheExpiry.set(cacheKey, now + this.cacheTtl);
    }

    return session;
  }

  async appendEvent(session: Session, event: Event): Promise<Event> {
    // Batch events for better performance
    await this.batchEventAppend(session, event);

    // Invalidate cache
    const cacheKey = `${session.appName}:${session.userId}:${session.id}`;
    this.cache.delete(cacheKey);
    this.cacheExpiry.delete(cacheKey);

    return event;
  }

  // loadFromStorage and batchEventAppend are placeholders for the
  // backing store's read and buffered-write implementations.
}

Monitoring and Observability

Performance Metrics

Track key performance indicators for runtime health:

import { Registry, Histogram, Counter, Gauge } from 'prom-client';

class PerformanceMonitor {
  private registry = new Registry();

  private invocationDuration = new Histogram({
    name: 'adk_invocation_duration_seconds',
    help: 'Duration of invocations',
    labelNames: ['agent_name', 'status'],
    buckets: [0.1, 0.5, 1, 2, 5, 10, 30]
  });

  private eventCount = new Counter({
    name: 'adk_events_total',
    help: 'Total number of events generated',
    labelNames: ['event_type', 'agent_name']
  });

  private activeInvocations = new Gauge({
    name: 'adk_active_invocations',
    help: 'Number of currently active invocations'
  });

  constructor() {
    this.registry.registerMetric(this.invocationDuration);
    this.registry.registerMetric(this.eventCount);
    this.registry.registerMetric(this.activeInvocations);
  }

  startInvocation(agentName: string): () => void {
    const startTime = Date.now();
    this.activeInvocations.inc();

    return () => {
      const duration = (Date.now() - startTime) / 1000;
      this.invocationDuration.observe({ agent_name: agentName, status: 'success' }, duration);
      this.activeInvocations.dec();
    };
  }

  recordEvent(eventType: string, agentName: string): void {
    this.eventCount.inc({ event_type: eventType, agent_name: agentName });
  }
}

// Integration with Runner
class MonitoredRunner extends Runner {
  private monitor = new PerformanceMonitor();

  async *runAsync(params: any): AsyncGenerator<Event> {
    const endInvocation = this.monitor.startInvocation(this.agent.name);

    try {
      for await (const event of super.runAsync(params)) {
        this.monitor.recordEvent(
          event.content ? 'content' : 'action',
          event.author
        );
        yield event;
      }
    } finally {
      endInvocation();
    }
  }
}

Telemetry Integration

Built-in OpenTelemetry support for distributed tracing:

import { tracer } from '@iqai/adk';
import { SpanStatusCode } from '@opentelemetry/api';

class TracedAgent extends BaseAgent {
  protected async *runAsyncImpl(ctx: InvocationContext): AsyncGenerator<Event> {
    const span = tracer.startSpan(`agent.${this.name}.execute`, {
      attributes: {
        'agent.name': this.name,
        'invocation.id': ctx.invocationId,
        'user.id': ctx.session.userId
      }
    });

    try {
      span.addEvent('agent.started');

      // Your agent logic with span events
      span.addEvent('processing.started');
      const result = await this.processRequest(ctx);
      span.addEvent('processing.completed', {
        'result.length': result.length
      });

      yield new Event({
        invocationId: ctx.invocationId,
        author: this.name,
        content: { parts: [{ text: result }] }
      });

      span.setStatus({ code: SpanStatusCode.OK });

    } catch (error) {
      span.recordException(error as Error);
      span.setStatus({
        code: SpanStatusCode.ERROR,
        message: error instanceof Error ? error.message : 'Unknown error'
      });
      throw error;
    } finally {
      span.end();
    }
  }
}

Error Tracking

Comprehensive error monitoring and alerting:

class ErrorTracker {
  private errorCounts = new Map<string, number>();
  private errorThreshold = 10;
  private timeWindow = 5 * 60 * 1000; // 5 minutes

  recordError(error: Error, context: { agentName: string; invocationId: string }): void {
    const errorKey = `${context.agentName}:${error.constructor.name}`;
    const newCount = (this.errorCounts.get(errorKey) || 0) + 1;

    this.errorCounts.set(errorKey, newCount);

    // Alert if threshold exceeded
    if (newCount >= this.errorThreshold) {
      this.sendAlert(errorKey, newCount, error, context);
    }

    // Schedule the reset only for the first error in the window;
    // scheduling one timer per error would wipe the count prematurely
    if (newCount === 1) {
      setTimeout(() => {
        this.errorCounts.delete(errorKey);
      }, this.timeWindow);
    }
  }

  private sendAlert(errorKey: string, count: number, error: Error, context: any): void {
    console.error(`Alert: High error rate for ${errorKey}: ${count} errors`, {
      error: error.message,
      stack: error.stack,
      context
    });

    // Send to monitoring service
    // this.monitoringService.sendAlert(...)
  }
}

// Integration with agents
const errorTracker = new ErrorTracker();

class MonitoredAgent extends BaseAgent {
  protected async *runAsyncImpl(ctx: InvocationContext): AsyncGenerator<Event> {
    try {
      // Your agent logic
      yield new Event({
        invocationId: ctx.invocationId,
        author: this.name,
        content: { parts: [{ text: "Success!" }] }
      });
    } catch (error) {
      errorTracker.recordError(error as Error, {
        agentName: this.name,
        invocationId: ctx.invocationId
      });
      throw error;
    }
  }
}

Scalability Considerations

Horizontal Scaling

Design for stateless execution across multiple instances:

// Stateless agent design
class StatelessAgent extends BaseAgent {
  // Avoid instance variables that hold state
  // Use context and services for all state

  protected async *runAsyncImpl(ctx: InvocationContext): AsyncGenerator<Event> {
    // Get state from session, not instance variables
    const userPreferences = ctx.session.state?.get('user_preferences') || {};

    // Process stateless-ly
    const result = await this.processStateless(ctx.userContent, userPreferences);

    // Update state through context
    yield new Event({
      invocationId: ctx.invocationId,
      author: this.name,
      content: { parts: [{ text: result }] },
      actions: new EventActions({
        stateDelta: {
          'last_processed': new Date().toISOString()
        }
      })
    });
  }

  private async processStateless(input: any, preferences: any): Promise<string> {
    // Pure function - no side effects
    return `Processed: ${JSON.stringify(input)} with preferences: ${JSON.stringify(preferences)}`;
  }
}

Load Balancing

Distribute work across multiple Runner instances:

// Load balancer for Runner instances
class LoadBalancedRunnerPool {
  private runners: Runner[] = [];
  private currentIndex = 0;

  constructor(runnerConfigs: any[]) {
    this.runners = runnerConfigs.map(config => new Runner(config));
  }

  async *runAsync(params: any): AsyncGenerator<Event> {
    // Round-robin load balancing
    const runner = this.runners[this.currentIndex % this.runners.length];
    this.currentIndex++;

    for await (const event of runner.runAsync(params)) {
      yield event;
    }
  }
}

// Usage
const pool = new LoadBalancedRunnerPool([
  { appName: "app", agent: agent1, sessionService: service1 },
  { appName: "app", agent: agent2, sessionService: service2 },
  { appName: "app", agent: agent3, sessionService: service3 }
]);

Caching Strategies

Implement intelligent caching for better performance:

class CachingStrategy {
  private cache = new Map<string, { data: any; expiry: number; hits: number }>();
  private maxSize = 10000;
  private defaultTtl = 5 * 60 * 1000; // 5 minutes

  async get<T>(key: string, factory: () => Promise<T>, ttl?: number): Promise<T> {
    const cached = this.cache.get(key);
    const now = Date.now();

    if (cached && cached.expiry > now) {
      cached.hits++;
      return cached.data;
    }

    // Generate new data
    const data = await factory();

    // Store in cache
    this.cache.set(key, {
      data,
      expiry: now + (ttl || this.defaultTtl),
      hits: 1
    });

    // Cleanup if needed
    this.cleanup();

    return data;
  }

  private cleanup(): void {
    if (this.cache.size <= this.maxSize) return;

    // Remove expired entries first
    const now = Date.now();
    for (const [key, value] of this.cache.entries()) {
      if (value.expiry <= now) {
        this.cache.delete(key);
      }
    }

    // Remove least used entries if still over limit
    if (this.cache.size > this.maxSize) {
      const entries = Array.from(this.cache.entries())
        .sort((a, b) => a[1].hits - b[1].hits)
        .slice(0, Math.floor(this.maxSize * 0.2));

      for (const [key] of entries) {
        this.cache.delete(key);
      }
    }
  }
}

// Usage in agents
const cache = new CachingStrategy();

class CachedAgent extends BaseAgent {
  protected async *runAsyncImpl(ctx: InvocationContext): AsyncGenerator<Event> {
    const cacheKey = `agent_response:${this.name}:${JSON.stringify(ctx.userContent)}`;

    const response = await cache.get(cacheKey, async () => {
      return await this.expensiveOperation(ctx.userContent);
    });

    yield new Event({
      invocationId: ctx.invocationId,
      author: this.name,
      content: { parts: [{ text: response }] }
    });
  }
}

Best Practices

Performance Guidelines

  • Event Batching: Combine related state changes into single events
  • Lazy Loading: Load services and resources only when needed
  • Connection Reuse: Pool expensive connections (LLM, database)
  • Caching: Cache expensive computations with appropriate TTL
  • Monitoring: Track key metrics for performance insights
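The lazy-loading guideline can be sketched as a memoized accessor that defers construction until first use. `ExpensiveService` and `LazyServiceHolder` here are hypothetical names, not part of the ADK API:

```typescript
// Hypothetical expensive dependency; real code might wrap an LLM or DB client
class ExpensiveService {
  constructor() {
    // imagine a costly connection handshake here
  }

  query(input: string): string {
    return `result for ${input}`;
  }
}

// Construct the service on first use instead of at startup,
// and reuse the same instance for subsequent calls
class LazyServiceHolder {
  private service?: ExpensiveService;

  get(): ExpensiveService {
    if (!this.service) {
      this.service = new ExpensiveService();
    }
    return this.service;
  }
}

const holder = new LazyServiceHolder();
// No ExpensiveService exists until this first call
const answer = holder.get().query("status");
```

This keeps process startup cheap when a given invocation never touches the dependency, at the cost of a one-time latency hit on first use.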

Resource Optimization

  • Memory Management: Use WeakMap/WeakSet for automatic cleanup
  • CPU Efficiency: Avoid blocking operations in event loops
  • I/O Optimization: Batch database operations where possible
  • Network Efficiency: Minimize external service calls

Monitoring Strategy

  • Key Metrics: Track invocation duration, event counts, error rates
  • Distributed Tracing: Use OpenTelemetry for request correlation
  • Alerting: Set up alerts for performance degradation
  • Profiling: Regular performance profiling in production

Performance Testing

Regular performance testing with realistic workloads is essential for maintaining optimal runtime performance. Consider load testing with concurrent invocations and monitoring resource utilization patterns.
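A minimal concurrent load test can be sketched as follows. `runOnce` is a hypothetical stand-in for driving one invocation through a Runner and draining its event stream; the percentiles use the standard nearest-rank method:

```typescript
// Drives totalRequests "invocations" across `concurrency` workers
// and reports latency percentiles in milliseconds.
async function loadTest(
  runOnce: () => Promise<void>, // placeholder: e.g. drain runner.runAsync(...)
  concurrency: number,
  totalRequests: number
): Promise<{ p50: number; p95: number; max: number }> {
  const latencies: number[] = [];
  let issued = 0;

  async function worker(): Promise<void> {
    // Single-threaded JS: the check-and-increment below cannot interleave
    while (issued < totalRequests) {
      issued++;
      const start = Date.now();
      await runOnce();
      latencies.push(Date.now() - start);
    }
  }

  await Promise.all(Array.from({ length: concurrency }, () => worker()));

  latencies.sort((a, b) => a - b);
  const pick = (p: number) =>
    latencies[Math.min(latencies.length - 1, Math.ceil(p * latencies.length) - 1)];
  return { p50: pick(0.5), p95: pick(0.95), max: latencies[latencies.length - 1] };
}
```

Running this at increasing concurrency levels while watching the metrics from the monitoring section above reveals where latency degrades and which resource (connections, memory, CPU) saturates first.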