Architecture Reference

The Agentic Stack

A 7-layer architecture reference for the post-API enterprise. From infrastructure to governance — every layer an agentic system needs to survive production.

L0 Infrastructure · L1 Data & Memory · L2 Model Layer · L3 Agent Runtime · L4 Agent Communication · L5 Orchestration · L6 Governance & Compliance

The enterprise stack is inverting. For decades, architects thought in terms of presentation → business logic → data. In the agentic era, the stack is: infrastructure → data → models → runtime → communication → orchestration → governance. Each layer depends on the ones below it. Skip a layer, and your agents fail in production.

This reference maps all seven layers — the technologies, the patterns, the trade-offs, and the key decisions at each level. It is not a tutorial. It is an architecture map for technical leaders building systems that will still be running in 2028.

The Seven Layers

Each layer builds on the one below. Together, they form the complete architecture for agentic enterprise systems.

Layer 0

Infrastructure

The foundation beneath every agentic system. Traditional API gateways are evolving into AI Gateways — adding token metering, prompt sanitization, and model-aware rate limiting. Service mesh is being repurposed for agent-to-agent traffic with identity-aware routing. Edge compute brings inference closer to the user for sub-100ms latency.

Key Capabilities

API Gateways evolving into AI Gateways with token metering and prompt sanitization
Service mesh for agent-to-agent traffic with mTLS identity (Istio, Linkerd)
Edge compute for low-latency inference and real-time agent interactions
GPU orchestration and inference optimization at the infrastructure level

Key Technologies

Flex Gateway, Kong AI Gateway, Envoy, Cloudflare Workers, Istio, Linkerd

Key Insight

“The AI Gateway is the new API Gateway. If your gateway can't meter tokens, sanitize prompts, and enforce model-level policies — it's already legacy.”
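To make the gateway responsibilities above concrete, here is a minimal sketch of token metering and prompt sanitization. It is not the API of any product listed here. The names `TokenBudget`, `sanitize_prompt`, `estimate_tokens`, and `admit` are hypothetical, the four-characters-per-token estimate is a rough assumption, and the redaction pattern is illustrative only.

```python
import re
import time


class TokenBudget:
    """Token bucket that meters model tokens instead of request counts."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock
        self._available = float(capacity)
        self._updated = clock()

    def try_spend(self, tokens):
        # Refill according to elapsed time, capped at capacity.
        now = self._clock()
        refill = (now - self._updated) * self.refill_per_sec
        self._available = min(self.capacity, self._available + refill)
        self._updated = now
        if tokens > self._available:
            return False
        self._available -= tokens
        return True


# Illustrative pattern: API-key-like strings and US SSN-like numbers.
SECRET_PATTERN = re.compile(r"\b(sk-[A-Za-z0-9]{8,}|\d{3}-\d{2}-\d{4})\b")


def sanitize_prompt(prompt):
    """Redact obvious secrets before the prompt leaves the gateway."""
    return SECRET_PATTERN.sub("[REDACTED]", prompt)


def estimate_tokens(text):
    """Crude estimate (assumption): about four characters per token."""
    return max(1, len(text) // 4)


def admit(prompt, budget):
    """Return (allowed, sanitized_prompt) for one inbound request."""
    clean = sanitize_prompt(prompt)
    return budget.try_spend(estimate_tokens(clean)), clean
```

A production gateway would use the model's real tokenizer and keep one budget per tenant and per model. The point of the sketch is only that the limit is expressed in tokens, not in requests.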

Layer 1

Data & Memory

Agents need memory — both short-term context and long-term knowledge. Vector databases enable semantic retrieval over unstructured data. Knowledge graphs encode structured relationships for precise reasoning. Context stores give agents persistent memory across sessions. The key shift: data architecture must be agent-readable, not just human-queryable.

Key Capabilities

Vector databases for semantic retrieval (Pinecone, Weaviate, Qdrant, pgvector)
Knowledge graphs for structured relationships (Neo4j, Amazon Neptune)
Context stores for persistent agent memory across sessions and handoffs
RAG 2.0: hybrid retrieval with re-ranking, moving beyond naive vector search

Key Technologies

Pinecone, Weaviate, Qdrant, pgvector, Neo4j, Amazon Neptune

Key Insight

“RAG 1.0 was 'embed and pray.' RAG 2.0 combines dense + sparse retrieval, cross-encoder re-ranking, and knowledge graph grounding. Your retrieval pipeline is your agent's IQ ceiling.”
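The hybrid retrieval described above can be sketched in a few lines. This example assumes reciprocal rank fusion (RRF) as the way to merge the dense and sparse rankings, which is one common choice and not something this reference prescribes. `rerank_score` is a hypothetical stand-in for a cross-encoder, and the document IDs are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def hybrid_retrieve(query, dense, sparse, rerank_score, top_n=3):
    """Fuse dense and sparse results, then re-rank the top candidates."""
    candidates = reciprocal_rank_fusion([dense, sparse])[:top_n]
    return sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)


# Hypothetical results from a vector index and a keyword index.
dense_hits = ["a", "b", "c"]
sparse_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

Here `b` and `a` rise to the top because both retrievers returned them. The re-ranker then only has to score a short candidate list, which is what keeps cross-encoder cost manageable.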

Layer 2

Model Layer

The foundation models powering agentic reasoning. Model selection is no longer about picking 'the best model' — it's about routing tasks to the most cost-effective model for each capability. Semantic routers analyze intent and dispatch to specialized models. Multi-modal capabilities span text, vision, audio, and code generation.

Key Capabilities

Foundation models: GPT-4o, Claude 3.5/4, Gemini 2, Llama 4, Mistral Large
Model routing and semantic routing — send tasks to the most cost-effective model
Fine-tuning vs RAG decision framework for enterprise customization
Multi-modal capabilities across text, vision, audio, and code generation

Key Technologies

GPT-4o, Claude 4, Gemini 2, Llama 4, Mistral Large, Semantic Router

Key Insight

“Models are commoditizing. GPT-4o, Claude, and Gemini are converging in capability. The differentiator is now architecture — how you route, orchestrate, and govern models — not which model you pick.”
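Cost-aware routing can be illustrated with a small sketch. Note that this version routes on explicit capability tags, not on semantic intent analysis, so it is a simplification of what a semantic router does. The model names, prices, and the `ModelProfile` and `route` helpers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    name: str
    capabilities: frozenset
    cost_per_1k_tokens: float  # hypothetical price, not a real quote


# Made-up catalog; in practice this would come from configuration.
CATALOG = [
    ModelProfile("small-text", frozenset({"text"}), 0.10),
    ModelProfile("mid-multimodal", frozenset({"text", "vision"}), 0.50),
    ModelProfile("large-reasoning", frozenset({"text", "vision", "code"}), 2.00),
]


def route(required, catalog=CATALOG):
    """Pick the cheapest model that covers every required capability."""
    eligible = [m for m in catalog if set(required) <= m.capabilities]
    if not eligible:
        raise LookupError(f"no model supports {sorted(required)}")
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)
```

For example, `route({"text"})` returns the `small-text` profile, while `route({"code"})` falls through to `large-reasoning` because nothing cheaper covers it.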

Layer 3

Agent Runtime

The execution environment where agents come alive. The runtime provides the primitives — tool use, structured output, function calling, code interpretation, and guardrails. OpenAI's Agents SDK defines the new standard with its Agents, Tools, Handoffs, Guardrails, and Tracing abstractions. Microsoft's Semantic Kernel brings enterprise-grade patterns. Salesforce's AgentForce adds governed enterprise deployment.

Key Capabilities

OpenAI Agents SDK: Agents, Tools, Handoffs, Guardrails, Tracing primitives
Semantic Kernel (Microsoft): enterprise-grade agent framework with planner architecture
Salesforce AgentForce: governed enterprise agents with CRM-native deployment
The Assistants API sunset (August 2026) and the migration path to the Responses API and Agents SDK

Key Technologies

OpenAI Agents SDK, Semantic Kernel, AgentForce, Assistants API → Agents SDK

Key Insight

“OpenAI sunsetting the Assistants API in August 2026 is a signal: the industry is moving from 'stateful assistant threads' to 'composable agent primitives.' If you're still building on Assistants, start migrating now.”
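The runtime primitives named above (tools, guardrails, tracing) can be sketched without any framework. This is deliberately not the OpenAI Agents SDK or Semantic Kernel API. `ToolRuntime`, `GuardrailViolation`, and the `refund` tool are hypothetical names chosen for illustration.

```python
class GuardrailViolation(Exception):
    """Raised by a guardrail to block a tool call."""


class ToolRuntime:
    def __init__(self):
        self._tools = {}
        self._guardrails = []
        self.trace = []  # one entry per completed tool call

    def tool(self, fn):
        """Decorator: register a function as a callable tool."""
        self._tools[fn.__name__] = fn
        return fn

    def guardrail(self, check):
        """Decorator: register a check that runs before every tool call."""
        self._guardrails.append(check)
        return check

    def call(self, name, arguments):
        for check in self._guardrails:
            check(name, arguments)  # raises GuardrailViolation to block
        result = self._tools[name](**arguments)
        self.trace.append({"tool": name, "arguments": arguments, "result": result})
        return result


runtime = ToolRuntime()


@runtime.tool
def refund(order_id, amount):
    # Stand-in for a real side effect such as a payment API call.
    return {"order_id": order_id, "refunded": amount}


@runtime.guardrail
def cap_refunds(name, arguments):
    if name == "refund" and arguments.get("amount", 0) > 100:
        raise GuardrailViolation("refunds above 100 need human approval")
```

Calling `runtime.call("refund", {"order_id": "A1", "amount": 20.0})` succeeds and is traced. The same call with an amount of 500 raises `GuardrailViolation` before the tool runs.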

Layer 4

Agent Communication

The protocols that let agents talk to tools and to each other. MCP (Model Context Protocol) is the USB port for AI — a universal standard for connecting agents to data sources, tools, and services. A2A (Agent-to-Agent) is the LinkedIn for agents — enabling capability discovery and task delegation across organizational boundaries.

Key Capabilities

MCP: 97M monthly downloads — context passing, session memory, agent identity, access scope
MCP 2026 roadmap: Streamable HTTP, metadata discovery via .well-known, task lifecycle
A2A (Google): 150+ partners — Agent Cards for capability discovery, gRPC support (v0.3)
MCP vs A2A: complementary, not competing. MCP = tool access. A2A = agent coordination.

Key Technologies

MCP, A2A Protocol, Agent Cards, Streamable HTTP, gRPC

Key Insight

“MCP and A2A are not competing — they're complementary. MCP defines how agents access tools and context. A2A defines how agents find and coordinate with each other. Together, they form the communication backbone of the agentic enterprise.”
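The division of labor between the two protocols shows up in their message shapes. The sketch below builds an MCP tool invocation, which is a JSON-RPC 2.0 request using the `tools/call` method, and searches a list of simplified A2A-style Agent Cards for a skill. The card fields shown are a reduced subset and should be checked against the current A2A specification. The tool name, URLs, and skill IDs are hypothetical.

```python
import json
from itertools import count

_request_ids = count(1)


def mcp_tool_call(tool_name, arguments):
    """Serialize an MCP tool invocation as a JSON-RPC 2.0 request."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Simplified Agent Cards (assumption: subset of fields, made-up agents).
AGENT_CARDS = [
    {
        "name": "Billing Agent",
        "description": "Answers invoice questions",
        "url": "https://billing.example.com/a2a",
        "version": "1.0.0",
        "skills": [{"id": "invoice-lookup", "name": "Invoice lookup"}],
    },
    {
        "name": "Shipping Agent",
        "description": "Tracks parcels",
        "url": "https://shipping.example.com/a2a",
        "version": "1.0.0",
        "skills": [{"id": "track-parcel", "name": "Track parcel"}],
    },
]


def find_agents(skill_id, cards=AGENT_CARDS):
    """Return the endpoints of agents that advertise the given skill."""
    return [
        card["url"]
        for card in cards
        if any(skill["id"] == skill_id for skill in card["skills"])
    ]
```

The first function is about reaching a tool, the second about discovering a peer. That is the MCP versus A2A split in miniature.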

Layer 5

Orchestration

The patterns and frameworks that coordinate multi-agent systems. The supervisor-worker pattern has emerged as the dominant enterprise deployment model — a supervisor decomposes tasks, delegates to specialized workers, and synthesizes results. The framework landscape is maturing: LangGraph leads production workloads, CrewAI dominates rapid prototyping, and AutoGen excels at research and negotiation scenarios.

Key Capabilities

Supervisor-Worker pattern: the dominant enterprise deployment model
Framework landscape: LangGraph (40% production share), CrewAI (prototyping), AutoGen (research)
Deterministic vs probabilistic workflow routing for different risk tolerances
Anti-patterns: God Agent, context window stuffing, synchronous chains

Key Technologies

LangGraph, CrewAI, AutoGen, Supervisor-Worker, Human-in-the-Loop

Key Insight

“The three anti-patterns killing agentic projects: the God Agent (one agent doing everything), context window stuffing (dumping everything into the prompt), and synchronous chains (agents waiting on agents waiting on agents). Avoid all three.”
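A minimal sketch of the supervisor-worker pattern, written with plain `asyncio` and not with any of the frameworks named above. Workers run concurrently, which sidesteps the synchronous-chain anti-pattern. The worker functions and the static `decompose` plan are hypothetical. A real supervisor would typically ask a model to produce the plan.

```python
import asyncio


async def research_worker(task):
    await asyncio.sleep(0)  # stands in for a model or tool call
    return f"research:{task}"


async def pricing_worker(task):
    await asyncio.sleep(0)  # stands in for a model or tool call
    return f"pricing:{task}"


WORKERS = {"research": research_worker, "pricing": pricing_worker}


def decompose(request):
    """Hypothetical static plan: one subtask per specialized worker."""
    return [("research", request), ("pricing", request)]


async def supervisor(request):
    """Decompose the request, fan out to workers, then synthesize."""
    plan = decompose(request)
    # gather runs the workers concurrently and preserves plan order.
    results = await asyncio.gather(*(WORKERS[role](task) for role, task in plan))
    return " | ".join(results)


summary = asyncio.run(supervisor("widget"))
```

Each worker sees only its own subtask, not the full conversation, which also keeps context windows small.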

Layer 6

Governance & Compliance

The layer that determines whether your agentic systems survive contact with reality. Most of the EU AI Act's obligations become applicable on August 2, 2026, with some high-risk requirements phasing in later. Dynamic Agent Authorization (DAA) is replacing traditional IAM, because agents have capabilities, not job titles. Every agent action needs an audit trail. Every decision needs explainability. And only 28% of organizations have mature governance structures.

Key Capabilities

EU AI Act general application: August 2, 2026, with penalties up to €35M or 7% of global turnover for the most serious violations
Dynamic Agent Authorization (DAA) replacing traditional IAM for autonomous systems
Agent audit trails and decision explainability for regulatory compliance
Risk classification frameworks for agentic systems under the EU AI Act

Key Technologies

EU AI Act, Dynamic Agent Authorization, Audit Trails, Risk Classification

Key Insight

“Only 28% of organizations have mature governance structures for AI agents. With penalties up to €35M or 7% of global annual turnover, governance isn't a nice-to-have — it's existential. Build it into Layer 0, not as an afterthought at Layer 6.”
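One way to picture capability-based authorization with an audit trail is the sketch below. It is an illustration of the idea, not a reference implementation of Dynamic Agent Authorization. The `Grant` and `Authorizer` classes, the capability strings, and the agent IDs are hypothetical.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    agent_id: str
    capability: str    # e.g. "crm.read"
    expires_at: float  # epoch seconds; grants are short-lived


class Authorizer:
    def __init__(self, grants, clock=time.time):
        self._grants = list(grants)
        self._clock = clock
        self.audit_log = []  # every decision is recorded, allowed or not

    def authorize(self, agent_id, capability, reason):
        """Check for an unexpired matching grant and log the decision."""
        now = self._clock()
        allowed = any(
            g.agent_id == agent_id
            and g.capability == capability
            and g.expires_at > now
            for g in self._grants
        )
        self.audit_log.append(
            {
                "ts": now,
                "agent": agent_id,
                "capability": capability,
                "reason": reason,
                "allowed": allowed,
            }
        )
        return allowed
```

Denied requests are logged alongside allowed ones, and the caller must supply a `reason`. Those two details are what make the log usable for explainability and not just for access counting.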

How the Layers Connect

This isn't a waterfall. Agents traverse the stack dynamically. A single user request can touch all seven layers before a response comes back:

Infrastructure routes the request through the AI Gateway, applies rate limits, and sanitizes the prompt.

Data & Memory retrieves relevant context from vector stores and knowledge graphs via hybrid RAG.

Model Layer routes the enriched prompt to the optimal model based on task complexity and cost.

Agent Runtime executes tool calls, applies guardrails, and generates structured output.

Communication connects to external tools via MCP and discovers peer agents via A2A.

Orchestration coordinates the supervisor-worker flow, manages handoffs, and synthesizes results.

Governance logs every decision, enforces authorization policies, and maintains the audit trail.

The key insight: governance isn't just the top layer — it permeates every layer. Authorization checks happen at infrastructure (L0). Data access controls live at the memory layer (L1). Model usage policies are enforced at L2. Every layer has a governance surface.

Architects who treat this as a linear stack will build fragile systems. Architects who understand the cross-cutting concerns — observability, security, governance — will build systems that scale to production.

The Landscape in Numbers

97M+ monthly MCP downloads

150+ A2A enterprise partners

40% LangGraph production share

28% of orgs with mature AI governance

Go Deeper

Each pillar of the agentic stack has its own dedicated deep dive. Start with the layer that matters most to your architecture.

Protocols Deep Dive

Orchestration Patterns

Governance Framework