# O4: Azure AI Foundry
Azure AI Foundry is Microsoft's unified platform for building, evaluating, and deploying AI applications at enterprise scale. Think of it as the control plane for your entire AI lifecycle — from model selection through production monitoring. For the orchestration SDKs that build on Foundry, see O1: Semantic Kernel. For infrastructure underpinning Foundry deployments, see O5: AI Infrastructure.
:::tip Think of it this way
If Azure Resource Manager (ARM) is the control plane for Azure infrastructure, Azure AI Foundry is the control plane for AI workloads — model deployment, prompt management, evaluation, and safety all in one place.
:::
## Evolution

| Generation | Scope | Status |
|---|---|---|
| Azure ML Studio (2015) | ML training only | |
| Azure AI Studio (2023) | AI + ML unified | Preview |
| Azure AI Foundry (GA 2024+) | Full AI lifecycle platform | Production-ready |

Each generation expanded scope: ML-only → AI experiments → enterprise AI lifecycle management.
## Three Interfaces, One Platform
| Interface | Best For | Example |
|---|---|---|
| Portal (ai.azure.com) | Exploration, visual evaluation, prompt playground | Try models, compare outputs, review evals |
| Python SDK (azure-ai-projects) | Programmatic access, CI/CD integration, automation | Build evaluation pipelines, deploy models |
| CLI (az ml) | Scripting, infrastructure automation | Provision hubs/projects in pipelines |
All three interfaces manage the same underlying resources — choose based on your workflow.
## Hub and Project Model
Foundry uses a two-tier workspace hierarchy that separates shared infrastructure from team workspaces:
```
┌──────────────────────────────────────────┐
│                  AI Hub                  │
│   Shared: connections, compute,          │
│   networking, security policies          │
│                                          │
│   ┌──────────────┐   ┌──────────────┐    │
│   │  Project A   │   │  Project B   │    │
│   │  Team Alpha  │   │  Team Beta   │    │
│   │  - Endpoints │   │  - Endpoints │    │
│   │  - Evals     │   │  - Evals     │    │
│   │  - Flows     │   │  - Flows     │    │
│   └──────────────┘   └──────────────┘    │
└──────────────────────────────────────────┘
```
### Hub (Organization Level)
| Responsibility | What It Manages |
|---|---|
| Connections | Azure OpenAI, AI Search, Storage — shared across projects |
| Compute | Shared compute instances for training and inference |
| Networking | Private endpoints, VNet integration, firewall rules |
| Security | RBAC policies, managed identity, Key Vault integration |
| Governance | Content safety policies, model access controls |
### Project (Team Level)
| Responsibility | What It Manages |
|---|---|
| Endpoints | Model deployments specific to this team |
| Evaluations | Quality metrics, test datasets, evaluation runs |
| Prompt Flow | RAG/agent orchestration flows |
| Data | Datasets, indexes, and data connections |
| Artifacts | Prompt versions, flow snapshots, evaluation history |
:::info Hub → Project relationship
One Hub → many Projects. Projects inherit the Hub's connections and security. Teams get isolation (their own endpoints, evals, data) while sharing expensive infrastructure (compute, networking).
:::
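The inheritance rule above can be sketched in plain Python. This is a hypothetical model of the relationship, not the Foundry SDK: `Hub`, `Project`, and `get_connection` are illustrative names.

```python
from dataclasses import dataclass, field

@dataclass
class Hub:
    """Organization-level workspace owning shared resources."""
    name: str
    connections: dict = field(default_factory=dict)  # e.g. Azure OpenAI, AI Search

@dataclass
class Project:
    """Team-level workspace: isolated endpoints, inherited connections."""
    name: str
    hub: Hub
    endpoints: list = field(default_factory=list)  # team-scoped deployments

    def get_connection(self, key: str) -> str:
        # Projects do not own connections; they resolve through the parent Hub.
        return self.hub.connections[key]

hub = Hub("org-hub", connections={"aoai": "https://org-openai.openai.azure.com"})
team_a = Project("project-alpha", hub)
team_b = Project("project-beta", hub)

# Both teams resolve the same Hub connection while keeping separate endpoints.
print(team_a.get_connection("aoai") == team_b.get_connection("aoai"))  # True
```

The design mirrors the table split above: anything expensive or security-sensitive lives once on the Hub; anything team-specific lives on the Project.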
## Model Catalog
Foundry's Model Catalog provides access to 1,700+ models from multiple providers:
| Provider | Notable Models | Deployment Type |
|---|---|---|
| OpenAI | GPT-4o, GPT-4o-mini, o1, o3 | Managed (Azure OpenAI) |
| Meta | Llama 3.1 (8B/70B/405B) | Serverless API or Managed Compute |
| Mistral | Mistral Large, Mixtral | Serverless API |
| Cohere | Command R, Command R+ | Serverless API |
| Microsoft | Phi-3, Phi-3.5, Florence | Serverless API or Managed Compute |
### Deployment Types
| Type | How It Works | Billing | Best For |
|---|---|---|---|
| Serverless API | Pay-per-token, no infrastructure to manage | Token-based pricing | Variable/unpredictable workloads |
| Managed Compute | Dedicated VM(s) running the model | VM hourly rate | Consistent high throughput, custom models |
| Global | Microsoft-hosted, multi-region | Token-based pricing | Highest availability, lowest latency |
For GPU sizing and PTU vs PAYG decisions, see O5: AI Infrastructure.
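A rough way to choose between the two billing models is a break-even calculation. The prices below are illustrative assumptions for the sketch, not actual Azure rates:

```python
# Back-of-the-envelope comparison of serverless vs managed billing.
# Both rates are assumed for illustration only.
TOKEN_PRICE_PER_1K = 0.002      # serverless: $ per 1K tokens (assumed)
VM_HOURLY_RATE = 6.50           # managed compute: $ per GPU-VM hour (assumed)

def monthly_cost_serverless(tokens_per_month: int) -> float:
    """Pay-per-token: cost scales linearly with usage."""
    return tokens_per_month / 1_000 * TOKEN_PRICE_PER_1K

def monthly_cost_managed(vm_count: int = 1, hours: float = 730) -> float:
    """Dedicated VMs: flat cost regardless of token volume."""
    return vm_count * hours * VM_HOURLY_RATE

# Break-even: the monthly token volume at which a dedicated VM becomes cheaper.
break_even_tokens = monthly_cost_managed() / TOKEN_PRICE_PER_1K * 1_000
print(f"Break-even at ~{break_even_tokens / 1e9:.1f}B tokens/month")
```

Below the break-even volume, serverless wins; above it (or when throughput must be guaranteed), managed compute does, which is exactly the "variable workloads" vs "consistent high throughput" split in the table above.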
## Evaluation Pipelines

Foundry provides built-in evaluation for measuring AI quality — critical for production deployments:
### Built-in Metrics
| Metric | What It Measures | Scale | Target |
|---|---|---|---|
| Groundedness | Are claims supported by the provided context? | 1–5 | ≥ 4.0 |
| Relevance | Does the response address the user's question? | 1–5 | ≥ 4.0 |
| Coherence | Is the response logically structured and readable? | 1–5 | ≥ 4.0 |
| Fluency | Is the language natural and grammatically correct? | 1–5 | ≥ 4.0 |
| Similarity | How close is the response to a ground truth answer? | 1–5 | ≥ 3.5 |
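Assuming each evaluator scores every test example on the 1–5 scale, the dataset-level number is simply the mean of the per-row scores. A minimal sketch of that aggregation:

```python
from statistics import mean

# Assumed shape: one dict of evaluator scores per test example.
rows = [
    {"groundedness": 5, "relevance": 4},
    {"groundedness": 4, "relevance": 4},
    {"groundedness": 4, "relevance": 5},
]

# Dataset-level metric = mean of per-example scores, per evaluator.
metrics = {name: round(mean(r[name] for r in rows), 2) for name in rows[0]}
print(metrics)  # {'groundedness': 4.33, 'relevance': 4.33}
```

This is why the targets in the table are fractional: individual rows score whole numbers, but the reported metric is an average across the dataset.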
### Running Evaluations
```python
from azure.ai.projects import AIProjectClient
from azure.ai.evaluation import GroundednessEvaluator, RelevanceEvaluator
from azure.identity import DefaultAzureCredential

# conn_str and model_config are assumed to be defined:
# the project connection string (from the portal) and the judge-model config.
project = AIProjectClient.from_connection_string(
    conn_str=conn_str,
    credential=DefaultAzureCredential(),
)

# Create evaluators
groundedness = GroundednessEvaluator(model_config)
relevance = RelevanceEvaluator(model_config)

# Evaluate a dataset
results = project.evaluations.create(
    data="test_dataset.jsonl",
    evaluators={
        "groundedness": groundedness,
        "relevance": relevance,
    },
)

print(results.metrics)
# {"groundedness": 4.3, "relevance": 4.1}
```
:::warning
Never deploy an AI application without running evaluations first. Evaluation is not optional — it's the quality gate between development and production. Set minimum thresholds and fail the pipeline if they're not met.
:::
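A minimal version of such a gate, assuming metrics arrive as a name-to-score dict like the one printed by the evaluation example. `enforce_quality_gate` and the threshold values are illustrative, not a Foundry API:

```python
# Minimum acceptable dataset-level scores (assumed values for the sketch).
THRESHOLDS = {"groundedness": 4.0, "relevance": 4.0, "coherence": 4.0}

def enforce_quality_gate(metrics: dict) -> None:
    """Raise SystemExit if any metric misses its threshold, failing the CI job."""
    failures = [
        f"{name}: {metrics.get(name, 0.0):.2f} < {minimum:.2f}"
        for name, minimum in THRESHOLDS.items()
        if metrics.get(name, 0.0) < minimum
    ]
    if failures:
        # A non-zero exit code fails the pipeline and blocks deployment.
        raise SystemExit("Evaluation gate failed: " + "; ".join(failures))

enforce_quality_gate({"groundedness": 4.3, "relevance": 4.1, "coherence": 4.5})
print("Gate passed")
```

Running this as the last step of an evaluation pipeline makes the thresholds enforceable rather than advisory: a regression in any metric stops the deployment instead of shipping it.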
## Prompt Flow
Prompt Flow is Foundry's visual orchestration tool for building RAG and agent pipelines:
```
┌──────────┐    ┌─────────────┐    ┌────────────┐    ┌──────────┐
│  Input   │───►│  Embedding  │───►│   Search   │───►│   LLM    │───► Output
│ (query)  │    │ (vectorize) │    │ (retrieve) │    │(generate)│
└──────────┘    └─────────────┘    └────────────┘    └──────────┘
```
| Feature | Description |
|---|---|
| Visual editor | Drag-and-drop DAG for prompt chains |
| Variants | A/B test different prompts side by side |
| Bulk testing | Run flows against test datasets |
| Deployment | One-click deploy to managed endpoint |
| Tracing | Step-by-step execution trace for debugging |
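The four-node flow above can be sketched as ordinary Python functions. The stage bodies here are hypothetical stand-ins for the real embedding, retrieval, and generation calls a Prompt Flow DAG would wire together:

```python
def embed(query: str) -> list:
    # Stand-in for an embedding-model call that vectorizes the query.
    return [float(ord(c)) for c in query[:4]]

def search(vector: list) -> list:
    # Stand-in for a vector search against an index (e.g. AI Search).
    return ["doc-1: Foundry separates Hubs from Projects."]

def generate(query: str, context: list) -> str:
    # Stand-in for the LLM call that grounds its answer in retrieved context.
    return f"Answer to {query!r} using {len(context)} retrieved doc(s)"

def rag_flow(query: str) -> str:
    # Each node's output feeds the next, like edges in the DAG above.
    return generate(query, search(embed(query)))

print(rag_flow("What is a Hub?"))
```

Prompt Flow's value over hand-written chaining like this is the surrounding tooling: variants for A/B testing each node's prompt, bulk runs against datasets, and per-step tracing.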
## When to Use What
| Scenario | Use |
|---|---|
| Quick model experimentation | Portal playground |
| Building a RAG pipeline | Prompt Flow + AI Search connection |
| Automated evaluation in CI/CD | Python SDK + evaluation pipeline |
| Deploying to production | Managed endpoint with content safety |
| Multi-team AI development | Hub + per-team Projects |
## Key Takeaways

- Azure AI Foundry is the unified control plane for the AI lifecycle — build, evaluate, deploy, monitor
- The Hub/Project model separates shared infrastructure from team workspaces
- The Model Catalog provides 1,700+ models with serverless or managed deployment options
- Evaluation pipelines with built-in metrics are essential quality gates before production
- Prompt Flow provides visual orchestration for RAG and agent pipelines
- Use the Portal for exploration, SDK for automation, CLI for infrastructure scripting