Vavi Labs builds AI and ML workflows with the parts that matter in production: problem framing, architecture, evals, guardrails, observability, deployment paths, and team training. This site is a working portfolio of the products, tools, models, and artifacts behind that approach.
Illustrated explainers and deep dives with interactive visualisations.
Illustrated Coding Agent · Illustrated LLM Inference · Statistics for MLOps · Illustrated RLHF Guide
Read the series →
One phone, one side-on capture, one branded report: cricket academies send families a clear picture of the player's batting in English and Hindi. Dozens of swing-analysis reports weekly for one academy.
View BatSwing →
AI-assisted creativity with better context, taste, and judgment: five distinct crafts (comic writing, song writing, and more), each mapping a few real creative angles before you commit to one. In beta with creatives and content creators.
View Creative Collab OS →
Build-time governance that makes every agent-assisted architecture call visible before it compounds, through a review queue, decision detail, and outcome loop, with a trace-oriented view from coding agent to review queue. Currently in beta.
View Arbiter →
A production-grade reference implementation built on LangGraph, FastAPI, and LiteLLM, covering everything from the business problem through a runnable deployment on AWS.





A structured interview preparation platform covering the technical abstractions that matter most in modern AI engineering roles — from distributed systems to LLM internals. Grounded in the same material as the Illustrated Explainer series.
A complete skill plugin covering the full agentic AI engineering lifecycle — from system design to evaluation harness to production deployment.
End-to-end MLOps workflows for teams running ML in production — from experiment tracking to deployment pipelines to monitoring and alerting.

A series for senior executives and engineers on what agentic AI actually is, how to make the right architectural choices, and how to move from pilot to production. Covers strategic framing, build-vs-buy decisions, product strategy, pricing, and governance.

The definitive technical course for engineers and architects building production-grade agentic AI systems. Covers cognitive architecture, memory systems, tool use, agentic design patterns, multi-agent coordination, evaluation frameworks, guardrails, observability, security, and cost/latency optimization.

A workbook-style course for ML engineers who need to ship, operate, and scale ML systems in production. Covers problem framing through monitoring and incident response — the operational decisions that determine whether a model creates real business value.

A systems course on the non-model layer that makes coding agents reliable: loop mechanics, state management, permissions modeling, verification pipelines, and human-in-the-loop workflow design. The engineering discipline behind trustworthy agents.

A practical series on how large language model inference works in production: tokens, decode loops, KV caches, attention mechanics, continuous batching, memory management, quantization, and the economics of serving at scale.
Teaching an open-source LLM to write The Office — reasoning-first screenplay generation with on-brand humor, character voice, and multi-step setups. SFT on reasoning traces + screenplay pairs, then reinforcement fine-tuning (RFT) with PPO, judged by an LLM-as-judge across 8 weighted metrics.
View case study →
Multi-stage fine-tuning pipeline creating a reasoning-first physics tutor in Kannada — combining SFT and RAG for intuitive, grounded explanations. Multi-stage SFT (language → domain → grounding) across a 4-model progression, evaluated with LLM-as-judge on a 0–5 scale.
View case study →