Vavi Labs — Agentic AI Consulting

The portfolio.

Products

Arbiter — build-time governance making every agent-assisted architecture call visible before it compounds.
BatSwing — vision-based batting analysis for cricket academies.
Creative Collab OS — AI-assisted creativity with better context, taste, and judgment.

Explore products →

Consulting

Case study — Agentic AI for Shipment Exception Management, a logistics & supply-chain control tower.
Reference implementation — 6-agent architecture, 7-layer guardrail stack, eval-first engineering, built on LangGraph and FastAPI, deployed on AWS.

Explore consulting → Read the whitepaper →

Plugins, Skills

Agentic AI Engineering — full lifecycle plugin: system design, eval harness, production deployment. /agentic-ai-engineering
Production MLOps — end-to-end workflows: experiment tracking, deployment pipelines, monitoring & alerting. /production-mlops

Browse dev-tools →

Courses & Training

Tech Abstractions — AI/ML interview-prep platform, with a few dozen users on the waitlist.
Corporate Training — 5 course series.

View all courses → Browse training decks →

Fine-tuning

AI Sitcom Scriptwriter — reasoning-first screenplay generation with on-brand humor and character voice.
AI Feynman Kannada Tutor — reasoning-first physics tutor in Kannada, combining SFT and RAG.

See training results →

Explainers

Illustrated explainers and deep dives with interactive visualisations.

Illustrated Coding Agent · Illustrated LLM Inference · Statistics for MLOps · Illustrated RLHF Guide

Read the series →

Shipped products.

In active use, in beta, or on the waitlist.

BatSwing

One phone, one side-on capture, one branded report: cricket academies send families a clear picture of the player's batting in English and Hindi. One academy generates dozens of swing-analysis reports every week.

View BatSwing →

Creative Collab OS

AI-assisted creativity with better context, taste, and judgment. Five distinct crafts (comic writing, songwriting, and more) each map a few real creative angles before you commit to one. In beta with creatives and content creators.

View Creative Collab OS →

Arbiter

Build-time governance making every agent-assisted architecture call visible before it compounds — a review queue, decision detail, and outcome loop, with a trace-oriented view from coding agent to review queue. Currently in beta.

View Arbiter →

Consulting

Case Study: Agentic AI for Shipment Exception Management

A production-grade reference implementation built on LangGraph, FastAPI, and LiteLLM — from the business problem through a runnable deployment on AWS.

The Problem — Four structural failure modes of manual exception handling at scale
Agentic Control Tower — 6-agent architecture with exception lifecycle state machine
Data Layer — Bronze → Silver → Gold pipeline with circuit breaker patterns
Trust & Safety — 7-layer guardrail stack, HITL design, four "never" rules
Eval-First Engineering — Four-pillar eval framework, pass@k vs pass^k
Implementation & Deployment — LangGraph wiring, Terraform, AWS cost model
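
The pass@k vs pass^k distinction named above can be illustrated with a short sketch. It assumes the commonly used combinatorial estimators computed from n recorded trials with c successes; the function names are hypothetical, and the case study's own eval harness may define or compute these metrics differently.

```python
from math import comb


def _check(n: int, c: int, k: int) -> None:
    # Guard against inputs for which the estimators are undefined.
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that AT LEAST ONE of k trials succeeds.

    n: trials actually run, c: trials that passed, k: attempts allowed.
    Computed as 1 minus the chance that a random size-k subset of the
    n trials contains only failures.
    """
    _check(n, c, k)
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_hat_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that ALL k trials succeed (pass^k).

    Computed as the chance that a random size-k subset of the n trials
    contains only successes. This is the reliability-oriented metric.
    """
    _check(n, c, k)
    return comb(c, k) / comb(n, k)
```

For an agent that passes 7 of 10 recorded trials, `pass_at_k(10, 7, 3)` is about 0.99 while `pass_hat_k(10, 7, 3)` is about 0.29: the same agent looks nearly perfect when any one of three attempts may succeed, and unreliable when all three must succeed.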

View engagement →

AI/ML Interview Prep Platform

11 free chapters · 18 practice scenarios · scored on 4 rubric dimensions.

Tech Abstractions Interview Prep

A structured interview preparation platform covering the technical abstractions that matter most in modern AI engineering roles — from distributed systems to LLM internals. Grounded in the same material as the Illustrated Explainer series.

Read course chapters — 11 free chapters covering production systems, not toy examples
Attempt practice questions — 18 scenarios, each with a 3-rung follow-up ladder from mid-level to staff difficulty
Get scored — Automated scoring across 4 rubric dimensions; expert answers unlock after submission
Progress dashboard — Domain breakdown, score trend, and weak-area callouts
Interview Simulator (coming soon) — Live mock interview session with an AI interviewer, real-time rubric updates

Open platform →

View interview prep page →

Plugins and Skills for Coding Agents

2 plugins.

Agentic AI Engineering

A complete skill plugin covering the full agentic AI engineering lifecycle — from system design to evaluation harness to production deployment.

Agent architecture — Decision records for every design call
Harness engineering — Loop design and tool permission modeling
Context engineering — Patterns for grounding agent behavior
Evaluation harness — Scaffolding built before agent code
Production readiness — Checklist before rollout

View plugin →

Production MLOps

End-to-end MLOps workflows for teams running ML in production — from experiment tracking to deployment pipelines to monitoring and alerting.

Experiment tracking — Versioning across training runs
Model registry — Promotion workflows to production
Deployment pipelines — Scaffolding for shipping models
Feature store — Design patterns for shared features
Monitoring — Drift detection and incident playbooks

View plugin →

Browse all dev-tools →

Corporate Training

5 course series · 79 chapters total.

Leadership · Strategy

Agentic AI for Leaders

A series for senior executives and engineers on what agentic AI actually is, how to make the right architectural choices, and how to move from pilot to production. Covers strategic framing, build-vs-buy decisions, product strategy, pricing, and governance.

Scope — 2 modules · 7 chapters
Audience — Executive / Senior

Book a training session →

Engineering · Technical

Engineering AI Agents

The definitive technical course for engineers and architects building production-grade agentic AI systems. Covers cognitive architecture, memory systems, tool use, agentic design patterns, multi-agent coordination, evaluation frameworks, guardrails, observability, security, and cost/latency optimization.

Scope — 6 modules · 22 chapters
Audience — Engineer / Architect

Book a training session →

MLOps · Production

MLOps Production Guide

A workbook-style course for ML engineers who need to ship, operate, and scale ML systems in production. Covers problem framing through monitoring and incident response — the operational decisions that determine whether a model creates real business value.

Scope — 8 modules · 22 chapters
Audience — ML Engineer

Book a training session →

Systems · Engineering

Harness Engineering

A systems course on the non-model layer that makes coding agents reliable: loop mechanics, state management, permissions modeling, verification pipelines, and human-in-the-loop workflow design. The engineering discipline behind trustworthy agents.

Scope — 4 modules · 14 chapters
Audience — Senior Engineer

Book a training session →

Infrastructure · Research

LLM Inference Engineering

A practical series on how large language model inference works in production: tokens, decode loops, KV caches, attention mechanics, continuous batching, memory management, quantization, and the economics of serving at scale.

Scope — 4 modules · 14 chapters
Audience — Engineer / Researcher

Book a training session →

Browse training decks → View self-paced courses → Illustrated explainers →

Fine-tuning SLMs and dataset creation for domain-specific tasks

AI Sitcom Scriptwriter

Reinforcement fine-tuning (RFT)

Figure: AI Sitcom Scriptwriter score distribution (boxplot).

Teaching an open-source LLM to write The Office — reasoning-first screenplay generation with on-brand humor, character voice, and multi-step setups. SFT on reasoning traces + screenplay pairs, then reinforcement fine-tuning (RFT) with PPO, judged by an LLM-as-judge across 8 weighted metrics.

Dataset on HuggingFace ↗ SFT Model ↗ RFT Model ↗

View case study →

AI Feynman Kannada Tutor

Fine-tuned tutor model

Figure: AI Feynman Kannada Tutor score distribution (boxplot).

Multi-stage fine-tuning pipeline creating a reasoning-first physics tutor in Kannada — combining SFT and RAG for intuitive, grounded explanations. Multi-stage SFT (language → domain → grounding) across a 4-model progression, evaluated with LLM-as-judge on a 0–5 scale.

Dataset on HuggingFace ↗ Kannada General SFT ↗ Kannada Physics SFT ↗

View case study →

Agentic AI · GenAI · MLOps, treated as systems engineering.
