Book · 1 chapter

Applied ML 2026: Reproducible Projects for the LLM Era

An in-progress companion volume to Mathematical Awakening, being written from scratch in the LLM era. Each project ships with a reproducible rig (experiments/run.py → metrics.json) in the tradition of the rest of the Aresalab papers: LLM-first patterns (retrieval, tool-use, evals, distillation, quantisation), domain depth (healthcare, bioinformatics, robotics), and honest failure modes. Published chapter-by-chapter as each project lands with a running experiment.

Author
Yevheniy Chuba
Date
2026
Chapters
1
Format
PDF + HTML
Download PDF

Overview

Applied ML 2026 is the successor to the 2024 Applied Machine Learning book, written from scratch for the LLM era. Where the earlier volume was a 2024 snapshot of 25 worked projects across healthcare, bioinformatics, and vision/robotics — and has been archived to the shelf below — this book is built differently, one reproducible project at a time.

This is an in-progress release. The preface and Chapter 1 are published now; further chapters are added as each project's rig lands with runnable code and honest numbers.

Design principles

Every project in this book has to pass the same bar:

  • LLM-first. The starting point is a foundation model plus retrieval, with fine-tuning as a targeted intervention. Projects are organised around the patterns the LLM era actually uses — retrieval, tool use, evaluation, distillation, quantisation, cost — rather than around architectures-in-isolation.
  • Eval-first. Every chapter's first code cell is the metric, not the model. The eval is the specification; without one, you cannot tell whether anything is working.
  • Reproducible rigs. Every project ships with an experiments/run.py that you can clone and run in under a second on a laptop. The script emits metrics.json, and the chapter's numbers are pulled live from that file. Webpage and PDF cannot drift.
  • No API keys required to reproduce. Every rig uses only the Python standard library or a small, well-pinned dependency set. Projects that benefit from a real LLM are designed so the base rig is deterministic, with the LLM call as a clearly-marked optional extension. This keeps the bar at a $200 laptop.
  • Honest failure modes. Each project has a section documenting what fails, with numbers. Negative results are not a footnote.
  • Grounded in Mathematical Awakening. Where the math shows up, this book points back to the relevant chapter rather than re-teaching. The two books are meant to sit together.
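The rig contract described above — a script you clone and run, which emits a `metrics.json` the chapter reads its numbers from — can be sketched as follows. The file name `experiments/run.py` comes from the book; the toy metric and function names here are invented for illustration only:

```python
import json
import random


def run_experiment(seed: int = 0) -> dict:
    # A toy, fully deterministic "experiment": seeded RNG, standard library
    # only, finishes well under a second. Real rigs would compute real metrics.
    rng = random.Random(seed)
    trials = [rng.random() < 0.7 for _ in range(1000)]
    return {"accuracy": sum(trials) / len(trials), "n": len(trials), "seed": seed}


if __name__ == "__main__":
    # The chapter's build pulls its numbers from this file, so prose and
    # results cannot drift apart.
    with open("metrics.json", "w") as f:
        json.dump(run_experiment(), f, indent=2)
```

Because the seed is fixed and only the standard library is used, every clone produces byte-identical metrics — which is the whole point of the "no API keys" bar.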

What's in now

Chapter 1 — Evaluating pass@k, and what it doesn't tell you. A 120-problem synthetic grade-school math benchmark with three difficulty tiers. Four solvers of controllable accuracy and controllable sample-level correlation expose how pass@k actually behaves:

  • why pass@k = pass@1 for a deterministic solver (and what that means for papers that report pass@10 on greedy decoding);
  • how the classical bound 1 − (1 − p)^k holds under independence and breaks under correlated sampling;
  • how a 0.5 stickiness correlation costs ~6 percentage points at pass@5 relative to an iid oracle of the same base rate;
  • how to read pass@k alongside bootstrap 95% CIs and per-difficulty breakdowns, and why headline pass@k numbers without either are half a number.
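The first two bullets can be made concrete with the standard unbiased pass@k estimator from the code-generation literature (whether the chapter's rig uses exactly this form is not stated here):

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k from n samples of which c are correct.

    Computes 1 - C(n - c, k) / C(n, k): one minus the probability that a
    random size-k subset of the n samples contains no correct answer.
    """
    if n - c < k:
        return 1.0  # too few failures to fill a k-subset: some sample passes
    return 1.0 - comb(n - c, k) / comb(n, k)


# Deterministic solver: all samples agree, so c is 0 or n per problem and
# pass@k collapses to pass@1 for every k.
assert pass_at_k(10, 10, 5) == 1.0
assert pass_at_k(10, 0, 5) == 0.0

# Under independent sampling with per-sample success p, the expected
# pass@k is the classical bound 1 - (1 - p)**k; correlation breaks this.
```

Note that `pass_at_k(n, c, 1)` reduces to `c / n`, i.e. plain accuracy.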

The rig is under 400 lines of standard-library Python, runs in one second, and produces the numbers that appear in the chapter.
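How correlated sampling erodes pass@k can be illustrated with a small simulation. The chapter's exact stickiness construction is not specified here; the Markov-style sampler below (each draw repeats the previous outcome with probability `stickiness`) is one plausible model, and the size of the gap it produces depends on the construction and base rate, so it need not match the chapter's ~6-point figure:

```python
import random


def simulate_pass_at_k(p: float, k: int, stickiness: float,
                       trials: int = 20000, seed: int = 0) -> float:
    # Estimate pass@k for a "sticky" sampler: with probability `stickiness`
    # each draw copies the previous draw's outcome instead of sampling
    # Bernoulli(p) afresh. stickiness = 0.0 recovers iid sampling.
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        prev = rng.random() < p
        ok = prev
        for _ in range(k - 1):
            cur = prev if rng.random() < stickiness else (rng.random() < p)
            ok = ok or cur
            prev = cur
        hits += ok
    return hits / trials


iid = simulate_pass_at_k(0.4, 5, 0.0)     # close to 1 - 0.6**5 ≈ 0.922
sticky = simulate_pass_at_k(0.4, 5, 0.5)  # strictly lower: repeats waste samples
```

Positively correlated samples raise the probability that all k attempts fail together, so the measured pass@k sits below the independence bound at the same base rate.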

What's coming next

As each project lands with a runnable rig, the chapter joins the book. The active queue:

  • Retrieval for QA: BM25 vs embeddings vs hybrid on a small hand-built dataset.
  • Distillation: a synthetic teacher-student setup to measure how much reasoning ability transfers via SFT on generated traces.
  • Quantisation and the memory-quality trade-off: int8, int4, and the ablations that matter.
  • Tool use: a tiny agent that calls three deterministic tools, with traceable logs and a cost ceiling.
  • Domain plumbing: one healthcare, one bioinformatics, one robotics project, each built on the LLM-first primitives above.

Chapters only land when the rig does. That constraint is deliberate.

Read online vs. PDF

The web reader splits the book into its current chapters — preface and Chapter 1 — and renders inline math and code listings directly in the browser. The PDF is the same content in a single downloadable file and grows as chapters are added.

Read online

2 chapters

A web reader for the book: every chapter as its own page, with math and code rendered inline. Cite it by chapter URL, or grab the full PDF above if you want a clean offline copy.

  1. Preface
  2. Chapter 1: Evaluating pass@k, and what it doesn't tell you

How to cite

Pre-Zenodo · no DOI yet
@book{chuba2026applied,
  title        = {Applied ML 2026: Reproducible Projects for the LLM Era},
  author       = {Chuba, Yevheniy},
  year         = {2026},
  institution  = {YoreAI},
  publisher    = {YoreAI},
  url          = {https://aresalab.com/books/applied-ml-2026},
  note         = {Accessed via Aresalab}
}