
Book Three / The Differentiator
On-Device AI for Flutter
Private, Offline, and Free at Inference
Run real AI on the phone — no servers, no API bills, nothing leaves the device. Local LLMs, offline RAG, vision and speech, performance under heat and memory pressure, and a capstone that works in airplane mode.
$29 · 143 pages · fixed-layout PDF · code verified June 2026
What’s inside
- Run Gemma on a phone with flutter_gemma and LiteRT
- Fully-offline RAG — a private knowledge base that never phones home
- Survive real hardware: memory, battery, and thermal throttling
- Ship a fully-offline feature that works on a plane, at zero cost
On-Device LLMs | Offline RAG | Vision · Speech | Hybrid Routing | Custom Models
How it’s organized
- 1. Why On-Device: the case, models, runtimes
- 2. Run a Model: Gemma, platform AI, delivery
- 3. Core Tasks: text, vision, audio, RAG
- 4. Production: performance, hybrid, custom
- 5. Capstone: offline habit insights

