# O2: AI Agents Deep Dive
An LLM is a brain in a jar: impressive reasoning, zero ability to act. An AI agent wraps that brain with memory, tools, and a planning loop so it can observe the world, decide what to do, execute actions, and learn from results. This module covers the full spectrum from chatbot to multi-agent swarm. For the orchestration primitives agents build on, see O1: Semantic Kernel. For the tool protocols agents use, see O3: MCP & Tools.
## What Is an AI Agent?
Agent = LLM + Memory + Tools + Planning
| Component | Role | Example |
|---|---|---|
| LLM | Reasoning engine: understands language, generates plans | GPT-4o, Claude, Llama 3 |
| Memory | Short-term (conversation) + long-term (vector store) context | Chat history, Cosmos DB, Redis |
| Tools | Functions the agent can invoke to affect the real world | Search API, database query, email sender |
| Planning | Strategy for decomposing goals into steps | ReAct, Chain-of-Thought, Tree-of-Thought |
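To make the composition concrete, here is a minimal, hypothetical sketch of the four components bundled into one object (all names are illustrative; a real `llm` would wrap a model API rather than a lambda):

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    """The four components from the table above, bundled into one object."""
    llm: Callable[[str], str]                   # reasoning engine (stubbed here)
    memory: list = field(default_factory=list)  # short-term conversational context
    tools: dict = field(default_factory=dict)   # tool name -> callable
    planner: str = "react"                      # planning strategy identifier

    def remember(self, role: str, content: str) -> None:
        self.memory.append({"role": role, "content": content})

# Stub LLM and tool so the sketch runs without any API
agent = Agent(
    llm=lambda prompt: f"plan for: {prompt}",
    tools={"search": lambda q: f"results for {q}"},
)
agent.remember("user", "book a flight to Lisbon")
```

Remove any one field and the system degrades: no tools means it can only talk, no memory means it forgets mid-task, no planner means single-shot answers.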
:::tip The Autonomy Heuristic Talks → Assistant · Suggests → Copilot · Acts → Agent. If it waits for every instruction, it's an assistant. If it proactively suggests next steps, it's a copilot. If it takes action on your behalf, it's an agent. :::
## Chatbot vs Agent
| Dimension | Chatbot | Agent |
|---|---|---|
| Interaction | Reactive Q&A: user asks, bot answers | Goal-driven: user sets objective, agent pursues it |
| Decision making | Template matching or single LLM call | Multi-step reasoning with planning loops |
| Tool access | None or scripted integrations | Dynamic tool selection and chaining |
| State | Stateless or simple session memory | Rich short-term + long-term memory |
| Autonomy | Low: follows scripts | High: decomposes goals, adapts, retries |
| Error handling | "I don't understand" | Retries, alternative tools, escalation |
## Agent vs Copilot vs Assistant
| Trait | Assistant | Copilot | Agent |
|---|---|---|---|
| Autonomy | Low | Medium | High |
| Initiative | Responds to commands | Proactively suggests | Acts independently |
| Scope | Single task | Workflow augmentation | End-to-end goal completion |
| Human role | Driver | Co-driver | Passenger (with override) |
| Example | Siri setting a timer | GitHub Copilot suggesting code | Agent booking flights + hotel for a trip |
## The Evolution of AI Systems
Rule-Based Bot → LLM Chatbot → RAG Chatbot → Tool-Using Assistant → AI Agent → Multi-Agent System

- Rule-Based Bot: hard-coded if/else
- LLM Chatbot: free-form responses
- RAG Chatbot: grounded in your data
- Tool-Using Assistant: can call APIs and functions
- AI Agent: autonomous planning loop
- Multi-Agent System: agents collaborate
Each stage adds a capability: natural language → knowledge → action → autonomy → collaboration.
## The Core Agent Loop
Every agent, regardless of framework, runs a variation of this loop:
```text
1. OBSERVE - user goal or environment
2. THINK   - LLM reasons and plans
3. ACT     - execute the chosen tool/action
4. OBSERVE - check the results
5. REPEAT from step 2, or STOP
```
The ReAct pattern (Reason + Act) is the most common implementation: the LLM produces a Thought → Action → Observation cycle until the task is complete or a stop condition is met.
```python
# Simplified agent loop (pseudocode)
def agent_loop(goal: str, tools: list, max_hops: int = 10):
    memory = [{"role": "user", "content": goal}]
    for hop in range(max_hops):
        response = llm.chat(memory, tools=tools)
        if response.finish_reason == "stop":
            return response.content  # Done: model produced a final answer
        # Record the model's tool request, then execute it and feed the result back
        memory.append(response.message)
        result = execute_tool(response.tool_calls[0])
        memory.append({"role": "tool", "content": result})
    raise TimeoutError("Agent exceeded max hops")
```
Always set `max_hops` (or an equivalent iteration cap). Without it, an agent can loop forever, burning tokens and money. Start with 10 hops and increase only if your use case demands it.
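A hop cap alone does not bound cost, since a single hop can carry a huge context. A token budget checked before each hop closes that gap; the sketch below is illustrative (field names like `prompt`/`completion` are assumptions, not a specific SDK's schema):

```python
def within_budget(usage_log: list, max_total_tokens: int = 50_000) -> bool:
    """Return True while cumulative token usage is still under the budget."""
    spent = sum(u["prompt"] + u["completion"] for u in usage_log)
    return spent < max_total_tokens

# One entry appended per hop of the agent loop
usage = [
    {"prompt": 1_200, "completion": 300},   # hop 1
    {"prompt": 2_000, "completion": 500},   # hop 2
]
within_budget(usage)         # True: 4,000 tokens spent
within_budget(usage, 3_000)  # False: budget exhausted, stop the loop
```

Inside the loop, check `within_budget(usage)` alongside the hop counter and stop (or escalate to a human) when either limit trips.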
## Agent Frameworks Comparison
| Framework | Language | Strength | Pattern | Best For |
|---|---|---|---|---|
| AutoGen | Python | Multi-agent conversations, code execution | ConversableAgent + GroupChat | Research, coding tasks, multi-agent debate |
| CrewAI | Python | Task delegation, role-based agents | Crew → Agent → Task with delegation | Business workflows, content pipelines |
| LangChain | Python/JS | LCEL chains, extensive tool ecosystem | AgentExecutor, LangGraph for cycles | RAG, tool-heavy pipelines, prototyping |
| Semantic Kernel | C#/Python/Java | Enterprise-grade, Azure-native | Plugins + Planners + Filters | Production .NET/Java apps, Azure integration |
| Microsoft Agent Framework | Python | Production SDK, Azure Foundry integration | Agent Service with tools + state | Enterprise deployment with eval + monitoring |
:::info When to use what
- Prototyping → LangChain (fastest ecosystem, most examples)
- Multi-agent research → AutoGen (built for agent conversations)
- Enterprise .NET → Semantic Kernel (first-class C# support)
- Production Python → Microsoft Agent Framework (Foundry integration)
- Business process → CrewAI (intuitive role/task model) :::
## Multi-Agent Patterns
### Supervisor Pattern
One orchestrator agent delegates to specialist agents and aggregates results.
```text
            Supervisor
           /    |     \
          v     v      v
  Researcher  Coder  Reviewer
```
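The delegation step can be sketched in a few lines. This is a hypothetical minimal version: the specialists are stubbed as plain functions, where real ones would each run their own agent loop:

```python
# Specialists stubbed as plain functions; real ones would each wrap an LLM
SPECIALISTS = {
    "researcher": lambda task: f"notes on {task}",
    "coder":      lambda task: f"code for {task}",
    "reviewer":   lambda task: f"review of {task}",
}

def supervisor(goal: str) -> dict:
    """Delegate the goal to every specialist, then aggregate the results."""
    results = {name: run(goal) for name, run in SPECIALISTS.items()}
    results["summary"] = f"{len(SPECIALISTS)} specialists reported on: {goal}"
    return results

report = supervisor("add retry logic to the client")
```

A production supervisor would also decide *which* specialists to invoke and in what order, rather than fanning out to all of them unconditionally.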
### Swarm Pattern
Agents self-organize without central control. Each agent decides when to hand off to another.
### Pipeline Pattern
Sequential handoff: Agent A → Agent B → Agent C. Each agent transforms and passes output forward. Ideal for content generation (research → write → edit → publish).
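The pattern reduces to function composition over agent stages. A minimal sketch, with each stage stubbed as a string transformer (real stages would be full agents):

```python
def pipeline(stages, initial):
    """Sequential handoff: each stage consumes the previous stage's output."""
    output = initial
    for stage in stages:
        output = stage(output)
    return output

# Stub stages for a research -> write -> edit flow
research = lambda topic: f"facts({topic})"
write    = lambda facts: f"draft({facts})"
edit     = lambda draft: f"final({draft})"

pipeline([research, write, edit], "agents")  # 'final(draft(facts(agents)))'
```

Because each stage only sees the previous stage's output, pipelines are easy to test and monitor, but a bad early stage poisons everything downstream.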
### Debate Pattern
Two+ agents argue opposing positions. A judge agent synthesizes the best answer. Useful for complex analysis where multiple perspectives improve accuracy.
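The turn structure can be sketched as alternating pro/con agents that each see the transcript so far, followed by a judge. All three roles are stubbed here; in a real system each would be an LLM call with an opposing system prompt:

```python
def debate(question: str, rounds: int = 2) -> list:
    """Alternating pro/con turns; each side sees the transcript so far."""
    pro = lambda q, seen: f"pro[{len(seen)}]: {q}"
    con = lambda q, seen: f"con[{len(seen)}]: {q}"
    transcript = []
    for _ in range(rounds):
        transcript.append(pro(question, transcript))
        transcript.append(con(question, transcript))
    return transcript

def judge(transcript: list) -> str:
    """A judge agent would synthesize; this stub just reports the turn count."""
    return f"verdict after {len(transcript)} turns"

judge(debate("should we cache embeddings?"))  # 'verdict after 4 turns'
```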
## Production Guardrails
| Guardrail | Why | Implementation |
|---|---|---|
| Max hops | Prevent infinite loops | max_iterations=10 in agent config |
| Token budget | Control cost per request | max_tokens per hop + total budget |
| Timeout | Prevent hung agents | 60s per tool call, 5min per task |
| Audit trail | Compliance + debugging | Log every thought/action/observation |
| Human-in-the-loop | Safety for destructive actions | Require approval for writes/deletes |
| Sandboxing | Prevent code execution escapes | Docker containers for code agents |
| Content safety | Block harmful outputs | Azure Content Safety on every response |
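The human-in-the-loop row is the easiest to get wrong, so here is a minimal, hypothetical sketch of an approval gate around tool execution (the tool names and the injectable `approve` callback are illustrative):

```python
DESTRUCTIVE_TOOLS = {"delete_record", "send_email"}

def guarded_call(tool_name, tool_fn, *args, approve=input):
    """Require explicit human approval before any destructive tool runs."""
    if tool_name in DESTRUCTIVE_TOOLS:
        answer = approve(f"Agent wants to run {tool_name}{args}. Allow? [y/N] ")
        if answer.strip().lower() != "y":
            return f"BLOCKED: human denied {tool_name}"
    return tool_fn(*args)

# Read-only tools run freely; destructive ones wait for a human
guarded_call("search", lambda q: f"hits for {q}", "agents")
guarded_call("delete_record", lambda rid: f"deleted {rid}", "row-1",
             approve=lambda _: "n")  # blocked
```

Injecting `approve` as a callback keeps the gate testable and lets production route the prompt to a ticketing or chat system instead of stdin.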
For infrastructure to run agents at scale, see O5: AI Infrastructure. For evaluation of agent quality, see O4: Azure AI Foundry.
## Key Takeaways
- Agent = LLM + Memory + Tools + Planning: each component is necessary
- The agent loop (Observe → Think → Act) is universal across all frameworks
- Multi-agent patterns (Supervisor, Swarm, Pipeline, Debate) solve different coordination needs
- Production agents need guardrails: max hops, token budgets, timeouts, audit trails
- Choose your framework based on language, deployment target, and collaboration pattern, not hype