Overview
Applied ML 2026 is the successor to the 2024 Applied Machine Learning book, written from scratch for the LLM era. Where the earlier volume was a 2024 snapshot of 25 worked projects across healthcare, bioinformatics, and vision/robotics — and has been archived to the shelf below — this book is built differently, one reproducible project at a time.
This is an in-progress release. The preface and Chapter 1 are published now; further chapters are published as each project's rig lands, with runnable code and honest numbers.
Design principles
Every project in this book has to pass the same bar:
- LLM-first. The starting point is a foundation model plus retrieval, with fine-tuning as a targeted intervention. Projects are organised around the patterns the LLM era actually uses — retrieval, tool use, evaluation, distillation, quantisation, cost — rather than around architectures-in-isolation.
- Eval-first. Every chapter's first code cell is the metric, not the model. The eval is the specification; without one, you cannot tell whether anything is working.
- Reproducible rigs. Every project ships with an `experiments/run.py` that you can clone and run in under a second on a laptop. The script emits `metrics.json`, and the chapter's numbers are pulled live from that file. Webpage and PDF cannot drift.
- No API keys required to reproduce. Every rig uses only the Python standard library or a small, well-pinned dependency set. Projects that benefit from a real LLM are designed so the base rig is deterministic, with the LLM call as a clearly marked optional extension. This keeps the bar at a $200 laptop.
- Honest failure modes. Each project has a section documenting what fails, with numbers. Negative results are not a footnote.
- Grounded in Mathematical Awakening. Where the math shows up, this book points back to the relevant chapter rather than re-teaching. The two books are meant to sit together.
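The eval-first and reproducible-rig principles above can be sketched together in a few lines. This is a minimal illustration, not code from the book: the filenames `experiments/run.py` and `metrics.json` come from the text, but the exact-match metric, the toy dataset, and the `solve` lookup table are invented stand-ins.

```python
import json

def exact_match(pred: str, gold: str) -> float:
    # Eval-first: the metric is defined before any model appears.
    return float(pred.strip() == gold.strip())

def solve(question: str) -> str:
    # Hypothetical stand-in solver: a lookup table with one wrong
    # answer, so the metric has something non-trivial to measure.
    return {"2+2": "4", "3*3": "9", "10-7": "2"}.get(question, "")

def run() -> dict:
    dataset = [("2+2", "4"), ("3*3", "9"), ("10-7", "3")]
    score = sum(exact_match(solve(q), g) for q, g in dataset) / len(dataset)
    return {"exact_match": round(score, 4), "n": len(dataset)}

if __name__ == "__main__":
    # The chapter's numbers are pulled from this file, so prose cannot drift.
    with open("metrics.json", "w") as f:
        json.dump(run(), f, indent=2)
```

The point of the pattern is the direction of dependency: the prose quotes the JSON file, never the other way around, so a stale number in a chapter is impossible by construction.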
What's in now
Chapter 1 — Evaluating pass@k, and what it doesn't tell you. A 120-problem synthetic grade-school math benchmark with three difficulty tiers. Four solvers of controllable accuracy and controllable sample-level correlation expose how pass@k actually behaves:
- why pass@k = pass@1 for a deterministic solver (and what that means for papers that report pass@10 on greedy decoding);
- how the classical bound holds under independence and breaks under correlated sampling;
- how a 0.5 stickiness correlation costs ~6 percentage points at pass@5 relative to an iid oracle of the same base rate;
- how to read pass@k alongside bootstrap 95% CIs and per-difficulty breakdowns, and why headline pass@k numbers without either are half a number.
The rig is under 400 lines of standard-library Python, runs in one second, and produces the numbers that appear in the chapter.
What's coming next
As each project lands with a runnable rig, the chapter joins the book. The active queue:
- Retrieval for QA: BM25 vs embeddings vs hybrid on a small hand-built dataset.
- Distillation: a synthetic teacher-student setup to measure how much reasoning ability transfers via SFT on generated traces.
- Quantisation and the memory-quality trade-off: int8, int4, and the ablations that matter.
- Tool use: a tiny agent that calls three deterministic tools, with traceable logs and a cost ceiling.
- Domain plumbing: one healthcare, one bioinformatics, one robotics project, each built on the LLM-first primitives above.
Chapters only land when the rig does. That constraint is deliberate.
Read online vs. PDF
The web reader splits the book into its current chapters — preface and Chapter 1 — and renders inline math and code listings directly in the browser. The PDF is the same content in a single downloadable file and grows as chapters are added.