Well-Architected Framework

FrootAI aligns every primitive, play, and protocol element to 6 pillars: the five pillars of the Azure Well-Architected Framework (WAF) plus Responsible AI. This is not optional; WAF alignment is enforced at the protocol level via fai-manifest.json.

The 6 Pillars

| Pillar | Key Principles | Example Enforcement |
| --- | --- | --- |
| 🛡️ Security | Identity, network, data protection, AI-specific security | Managed Identity, Key Vault, content safety filters |
| 🔄 Reliability | Retry, circuit breaker, health checks, graceful degradation | Exponential backoff, /health endpoints, cached fallbacks |
| 💰 Cost Optimization | Model routing, token budgets, right-sizing, FinOps | GPT-4o-mini triage, costPerQuery guardrails |
| ⚙️ Operational Excellence | CI/CD, observability, IaC, incident management | Structured logging, App Insights, Bicep templates |
| ⚡ Performance Efficiency | Caching, streaming, async patterns, bundle optimization | Response caching, SSE streaming, CDN for static assets |
| 🤖 Responsible AI | Content safety, groundedness, fairness, transparency | Azure AI Content Safety, groundedness ≥ 0.95, source citations |

Security

Every FrootAI solution enforces:

  • Never hardcode secrets; use Azure Managed Identity and Key Vault
  • RBAC with least privilege; Microsoft Entra ID for user authentication
  • Private endpoints for all PaaS services in production
  • Content safety filters on all AI endpoints
  • Rate limiting AI API calls per user/tenant
  • Input sanitization: validate and sanitize all prompts before sending to models
fai-manifest.json: Security WAF
{ "context": { "waf": ["security"] } }
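The per-user/tenant rate limiting rule above can be sketched as a sliding-window limiter. This is a minimal illustration, not FrootAI's implementation: the class name `SlidingWindowLimiter`, its parameters, and the in-memory storage are all assumptions, and a production service would likely keep the counters in a shared store.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window_s` seconds for each key (user or tenant).

    Hypothetical helper for illustration; state lives in process memory only.
    """

    def __init__(self, limit: int, window_s: float, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._calls = defaultdict(deque)  # key -> timestamps of recent allowed calls

    def allow(self, key: str) -> bool:
        now = self.clock()
        calls = self._calls[key]
        # Drop timestamps that have fallen out of the window.
        while calls and now - calls[0] >= self.window_s:
            calls.popleft()
        if len(calls) >= self.limit:
            return False
        calls.append(now)
        return True
```

A caller would check `limiter.allow(tenant_id)` before forwarding a prompt to the model and return an HTTP 429 response when it is `False`.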

Reliability

  • All external API calls must have retry logic with exponential backoff (3 retries: 1s/2s/4s with jitter)
  • Every service must expose a /health endpoint verifying downstream dependencies
  • If Azure OpenAI is unavailable, fall back to cached responses or static content
  • HTTP client timeouts: 30s for AI endpoints, 10s for search, 5s for metadata
config/guardrails.json: Reliability thresholds
{ "thresholds": { "coherence": 0.90, "groundedness": 0.95 } }
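The retry rule in this section (3 retries at 1s/2s/4s with jitter) could look roughly like the sketch below. The function name `retry_with_backoff`, the 25% jitter factor, and the choice to retry on any exception are assumptions made for illustration; real code would usually retry only on transient errors such as timeouts or HTTP 429/5xx.

```python
import random
import time


def retry_with_backoff(call, retries=3, base_delay=1.0, jitter=0.25,
                       sleep=time.sleep, rng=random.random):
    """Run `call`; on failure retry up to `retries` times with delays of 1s, 2s, 4s.

    Each delay is stretched by up to `jitter` (25% by default, an assumed value)
    so that many clients do not retry in lockstep.
    """
    for attempt in range(retries + 1):
        try:
            return call()
        except Exception:  # simplification: narrow this to transient errors in practice
            if attempt == retries:
                raise  # retries exhausted: surface the last error
            delay = base_delay * (2 ** attempt)
            sleep(delay + rng() * jitter * delay)
```

The `sleep` and `rng` parameters exist only so the schedule can be checked in tests without waiting.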

Cost Optimization

  • Use model routing: GPT-4o-mini for simple tasks, GPT-4o for complex reasoning
  • Implement token budgets per request via max_tokens in config
  • Cache frequent AI responses with TTL-based semantic deduplication
  • Set costPerQuery guardrails in fai-manifest.json
  • Default to the smallest viable SKU; scale up based on metrics, not assumptions
config/openai.json: Cost controls
{ "model": "gpt-4o", "max_tokens": 4096, "fallback_model": "gpt-4o-mini" }
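One way to read the model routing rule together with this config is sketched below. It assumes, without confirmation from the source, that `fallback_model` doubles as the cheaper model for simple tasks; the `task_complexity` label and the `choose_model` function are hypothetical and stand in for whatever triage step a play actually uses.

```python
import json

# Same values as the config/openai.json example above.
CONFIG = json.loads(
    '{"model": "gpt-4o", "max_tokens": 4096, "fallback_model": "gpt-4o-mini"}'
)


def choose_model(task_complexity: str, config: dict = CONFIG) -> dict:
    """Pick a model and token budget for one request.

    `task_complexity` is a hypothetical label ("simple" or "complex") produced
    by an upstream triage step; anything other than "simple" gets the larger model.
    """
    if task_complexity == "simple":
        model = config["fallback_model"]  # assumption: the smaller model handles simple tasks
    else:
        model = config["model"]
    return {"model": model, "max_tokens": config["max_tokens"]}
```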
💡 Use the FrootAI Cost Estimator to calculate monthly Azure costs for any solution play at dev or production scale.

Operational Excellence

  • All deployments must go through CI/CD pipelines; no manual deployments
  • Use conventional commits (feat:, fix:, docs:, chore:)
  • All infrastructure must be defined in Bicep/Terraform; no portal clicks
  • Structured logging with correlation IDs across all services
  • Application Insights for APM, distributed tracing, and custom AI metrics
# Validate consistency before every release
npm run validate:primitives
node engine/index.js fai-manifest.json --status
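The structured logging rule in this section can be illustrated with Python's standard `logging` module. This is a sketch under assumptions: the field names (`level`, `message`, `correlation_id`) and the helper names are illustrative, not a FrootAI schema.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object so log queries can filter by field."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            # Set per call via `extra={"correlation_id": ...}`; None if the caller omitted it.
            "correlation_id": getattr(record, "correlation_id", None),
        })


def new_correlation_id() -> str:
    """Create an ID once at the edge of the system and pass it to every downstream service."""
    return uuid.uuid4().hex
```

A service would attach the formatter to a handler and log with `logger.info("query done", extra={"correlation_id": cid})`, forwarding the same `cid` in outgoing request headers.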

Performance Efficiency

  • Target: < 3s for simple queries, < 10s for complex multi-step reasoning
  • Use streaming responses for AI chat interfaces
  • Implement response caching for repeated queries (semantic similarity > 0.95)
  • Parallelize independent AI calls (search + glossary lookup)
  • Use appropriate top_k for RAG scenarios (5–10 for most use cases)
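The parallelization rule above (search + glossary lookup) maps naturally onto `asyncio.gather`. In the sketch below, `search` and `glossary_lookup` are hypothetical stand-ins that simulate network latency with `asyncio.sleep`; only the `gather` pattern is the point.

```python
import asyncio


async def search(query: str) -> list[str]:
    await asyncio.sleep(0.05)  # stand-in for a search service call
    return [f"doc about {query}"]


async def glossary_lookup(query: str) -> dict[str, str]:
    await asyncio.sleep(0.05)  # stand-in for a glossary service call
    return {query: "definition"}


async def gather_context(query: str):
    # Both calls start immediately, so total latency is roughly that of the slower call
    # rather than the sum of the two.
    docs, terms = await asyncio.gather(search(query), glossary_lookup(query))
    return docs, terms
```

This only helps when the calls are truly independent; a call that needs the other's result still has to wait for it.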

Responsible AI

  • All user-facing AI responses must pass through Azure AI Content Safety
  • RAG responses must cite sources; never generate unsourced claims
  • Implement groundedness checks (score ≥ 0.95 on a 0–1 scale)
  • Always include "AI-generated" disclaimers on outputs
  • Critical decisions must have human-in-the-loop validation
config/guardrails.json: Responsible AI thresholds
{ "content_safety": { "hate": 0, "violence": 0, "self_harm": 0, "sexual": 0 } }
🚨 Content safety thresholds must be zero for all categories in production: zero tolerance for harmful content.
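A response gate built from these thresholds might look like the sketch below. It assumes each `content_safety` value is the maximum severity allowed for that category and that a groundedness score on a 0–1 scale is already available from an evaluator; the function name `passes_guardrails` and the merged `GUARDRAILS` dictionary are illustrative.

```python
# Values copied from the guardrails.json examples on this page, merged for illustration.
GUARDRAILS = {
    "thresholds": {"groundedness": 0.95},
    "content_safety": {"hate": 0, "violence": 0, "self_harm": 0, "sexual": 0},
}


def passes_guardrails(groundedness: float, severities: dict,
                      guardrails: dict = GUARDRAILS) -> bool:
    """Return True only if the response is grounded enough and no category exceeds its limit.

    Assumption: each content_safety value is the highest severity permitted.
    """
    if groundedness < guardrails["thresholds"]["groundedness"]:
        return False
    limits = guardrails["content_safety"]
    # A category missing from `severities` counts as a failure rather than as safe.
    return all(cat in severities and severities[cat] <= limit
               for cat, limit in limits.items())
```

Failing closed on missing categories keeps a partial or errored safety check from letting a response through.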

WAF in the FAI Protocol

Every primitive can declare WAF alignment in its frontmatter:

agents/fai-security-reviewer.agent.md
---
description: "Reviews code for OWASP LLM Top 10 vulnerabilities"
waf: ["security", "responsible-ai"]
plays: ["30-ai-security-hardening"]
---

The fai-manifest.json enforces play-level WAF pillars. The FAI Engine validates that a play's declared pillars are covered by its primitives:

fai-manifest.json
{ "context": { "waf": ["security", "reliability", "cost-optimization", "responsible-ai"] } }
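The coverage rule described above can be pictured as a set difference. The sketch below is not the FAI Engine's actual code: it assumes each primitive's frontmatter has been parsed into a dictionary with a `waf` list, and `uncovered_pillars` is a hypothetical name.

```python
def uncovered_pillars(play_waf, primitives):
    """Return the pillars a play declares that none of its primitives declare.

    `primitives` is a list of parsed frontmatter dicts, each with an optional "waf" list.
    """
    covered = set()
    for primitive in primitives:
        covered.update(primitive.get("waf", []))
    return sorted(set(play_waf) - covered)
```

An empty result means every declared pillar is backed by at least one primitive; any names returned are the gaps to fix before release.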

Valid WAF Pillar Values

These are the only valid values in waf arrays:

| Value | Pillar |
| --- | --- |
| security | Identity, network, data protection, AI security |
| reliability | Retry, circuit breaker, health checks, degradation |
| cost-optimization | Model routing, token budgets, right-sizing |
| operational-excellence | CI/CD, observability, IaC, incidents |
| performance-efficiency | Caching, streaming, async, optimization |
| responsible-ai | Content safety, groundedness, fairness |
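Because only these six strings are accepted, a small check can catch typos and casing mistakes before the engine runs. The helper below is an illustrative sketch; `invalid_waf_values` is a hypothetical name and the matching is deliberately exact and case-sensitive.

```python
VALID_WAF_PILLARS = frozenset({
    "security",
    "reliability",
    "cost-optimization",
    "operational-excellence",
    "performance-efficiency",
    "responsible-ai",
})


def invalid_waf_values(waf):
    """Return the entries of a `waf` array that are not valid pillar values, in order."""
    return [value for value in waf if value not in VALID_WAF_PILLARS]
```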

Next Steps

  • FAI Protocol: how WAF is enforced at the protocol level
  • Primitives: how each primitive type declares WAF alignment
  • PR Checklist: WAF validation in pull requests