Skip to Content
Solution PlaysPlay 97: Play 97 — AI Data Marketplace 📊

Play 97 — AI Data Marketplace 📊

Data commerce platform — dataset discovery with quality scoring, privacy-preserving sharing, synthetic data augmentation, license management, lineage tracking.

Build an AI data marketplace. 5-dimension quality scoring (completeness, consistency, accuracy, timeliness, uniqueness) grades every dataset, PII scanning + k-anonymity + differential privacy enable privacy-preserving sharing, Gaussian Copula generates synthetic data with verified zero real-record overlap, and Azure Purview tracks full data lineage.

Quick Start

cd solution-plays/97-ai-data-marketplace az deployment group create -g $RG -f infra/main.bicep -p infra/parameters.json code . # Use @builder to implement, @reviewer to audit, @tuner to optimize

Architecture

📐 Full architecture details

ServicePurpose
Azure OpenAI (gpt-4o)Dataset description + semantic search
Azure AI Search (Standard)Dataset catalog with vector + semantic search
Azure PurviewData lineage tracking + governance
Cosmos DB (Serverless)Listings, transactions, reviews
Azure StorageDataset files, samples, synthetic outputs
Azure FunctionsQuality scoring + privacy scanning

Pre-Tuned Defaults

  • Quality: 5 dimensions · weighted composite · auto-delist below 40 · weekly refresh
  • Privacy: Presidio PII scan · k=5 anonymity · Gaussian Copula synthetic · ε=1.0 differential privacy
  • Search: Vector + semantic · facets (category, privacy, license) · quality + freshness boost
  • Licensing: 5 license types · attribution tracking · usage limits · 80/20 revenue split

DevKit (AI-Assisted Development)

PrimitiveWhat It Does
agent.mdRoot orchestrator with builder→reviewer→tuner handoffs
copilot-instructions.mdData marketplace domain (quality, privacy, synthetic, lineage)
3 agentsBuilder (gpt-4o), Reviewer (gpt-4o-mini), Tuner (gpt-4o-mini)
3 skillsDeploy (225+ lines), Evaluate (110+ lines), Tune (225+ lines)
4 prompts/deploy, /test, /review, /evaluate with agent routing

Cost Estimate

ServiceDev/moProd/moEnterprise/mo
Azure Machine Learning$0 (Basic)$450 (Standard)$1,400 (Standard HA)
Azure Blob Storage$5 (LRS Hot)$120 (ZRS Hot + Cool)$400 (GRS Hot + Cool + Archive)
Azure API Management$5 (Consumption)$700 (Standard)$2,800 (Premium)
Azure Cosmos DB$5 (Serverless)$230 (4000 RU/s)$700 (12000 RU/s)
Azure Functions$0 (Consumption)$200 (Premium EP2)$500 (Premium EP3)
Key Vault$1 (Standard)$5 (Standard)$20 (Premium HSM)
Application Insights$0 (Free)$40 (Pay-per-GB)$120 (Pay-per-GB)
Total$16$1,745$5,940

💰 Full cost breakdown

vs. Play 62 (Federated Learning Pipeline)

AspectPlay 62Play 97
FocusTrain models without sharing dataShare/sell datasets with privacy
PrivacyFederated (data stays local)Anonymize/synthesize before sharing
OutputTrained modelQuality-scored dataset listing
GovernanceDifferential privacy on gradientsPII scanning + lineage tracking

📖 Full documentation · 🌐 frootai.dev/solution-plays/97-ai-data-marketplace  · 📦 FAI Protocol

FAI Manifest

FieldValue
Play97-ai-data-marketplace
Version1.0.0
KnowledgeT3-Production-Patterns, T2-Responsible-AI, O4-Azure-AI
WAF Pillarssecurity, responsible-ai, cost-optimization, operational-excellence
Last updated on