What we're imagining.
Three artifacts compose the Kōzu lab — fine-tuned models, the apparatus that trains them, and the datasets that shape them.
Intelligence, distilled.
Deimos A4
A 4.66B reasoning specialist built on Qwen3.5-4B. A terse, internal chain-of-thought yields ~60% fewer tokens, ~36% faster inference, and +40 pt average accuracy on hard math versus the base model. Distilled from 4,338 shortest-correct traces curated from Deimos-A1.
Europa
Our upcoming medium-sized model, designed to explore how intelligence scales per parameter while further refining the experimental reasoning techniques from Deimos.
Ganymede
Our upcoming flagship model — scaling the techniques we've refined into a model tested and validated for real-world scenarios where reasoning efficiency is key.
Instruments of the craft.
Hadron
An LLM distillation framework built around NousResearch's AutoReason tournament refinement. A single teacher answers, critiques, adversarially revises, synthesizes, and blind-Borda ranks itself until "do nothing" wins twice. Distillation labels measurably beat the teacher's own single-shot output. Full reasoning traces per role — drop them straight into process-supervision fine-tuning.
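The refinement loop above can be sketched in miniature. This is an illustrative Python sketch, not Hadron's implementation: `revise` and `rank` are hypothetical stand-ins for teacher-model calls, stubbed deterministically so the control flow is runnable. The core idea shown is the one named above: blind ballots are Borda-scored each round, and the loop halts once "do nothing" (keep the current answer) wins twice.

```python
def borda_scores(ballots):
    """Each ballot ranks candidate indices best-to-worst; Borda awards
    (n - 1 - position) points per placement."""
    n = len(ballots[0])
    scores = [0] * n
    for ballot in ballots:
        for pos, cand in enumerate(ballot):
            scores[cand] += n - 1 - pos
    return scores

def tournament(initial_answer, revise, rank, max_rounds=10):
    """Each round: propose a revision, collect blind ballots over
    [keep current ("do nothing"), take revision], Borda-score them.
    Stop once "do nothing" wins two rounds in a row."""
    current = initial_answer
    noop_wins = 0
    for _ in range(max_rounds):
        candidate = revise(current)
        ballots = rank([current, candidate])  # index 0 = do nothing
        scores = borda_scores(ballots)
        if scores[0] >= scores[1]:
            noop_wins += 1
            if noop_wins == 2:
                break
        else:
            noop_wins = 0
            current = candidate
    return current
```

With stub judges that favor the revision until the answer reaches a fixed length, the loop converges and then terminates on two consecutive "do nothing" wins.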
Tokamak
Extracts reasoning traces from LLM conversations and compresses them into a super token-efficient stream of internal chain-of-thought and concise outputs. Built for generating tight, high-signal training data from long, branching dialogues — without losing the reasoning that got you there.
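One way to picture the compression step: keep only the turns on the path that led to the accepted answer, and reduce each kept turn to its reasoning line. A minimal Python sketch, assuming a hypothetical turn schema (`id`, `parent`, `reasoning`, `text`) — not Tokamak's actual format:

```python
def compress_trace(turns, accepted_id):
    """Walk parent links from the accepted turn back to the root,
    dropping abandoned branches, then emit the surviving reasoning
    lines in order plus the final answer text."""
    by_id = {t["id"]: t for t in turns}
    path = []
    cur = by_id[accepted_id]
    while cur is not None:
        path.append(cur)
        cur = by_id.get(cur["parent"])  # root has parent=None
    path.reverse()
    cot = " -> ".join(t["reasoning"] for t in path if t["reasoning"])
    return {"cot": cot, "answer": path[-1]["text"]}
```

A dead-end sibling branch simply never appears on the parent path, so it costs zero tokens in the compressed stream.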
Stellarator
A control plane for fine-tuning and reinforcement-learning workloads on Tinker. Sandbox runs feed a structured pre-flight gate before promotion to scale, with cost projections, budgets, and live alert streams threaded through every step. A Rust supervisor handles per-job polling and websocket fan-out; an integrated research subsystem cites HF papers, arXiv, and code examples per run.
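The supervisor pattern described above — one poller per job feeding a single fan-out stage — can be sketched compactly. This is a hypothetical Python/asyncio sketch, not Stellarator's Rust code; `fetch_status` stands in for a Tinker API call, and list appends stand in for websocket sends:

```python
import asyncio

async def poll_job(job_id, fetch_status, updates, interval=0.01, rounds=3):
    """Poll one job's status on a fixed interval, pushing each update
    onto a shared queue."""
    for _ in range(rounds):
        updates.put_nowait((job_id, await fetch_status(job_id)))
        await asyncio.sleep(interval)

async def fan_out(updates, subscribers, total):
    """Broadcast each queued update to every subscriber."""
    for _ in range(total):
        item = await updates.get()
        for sub in subscribers:
            sub.append(item)  # stand-in for a websocket send

async def main():
    updates = asyncio.Queue()
    subs = [[], []]  # two connected listeners

    async def fetch_status(job_id):  # stand-in for the real status call
        return "running"

    await asyncio.gather(
        poll_job("job-a", fetch_status, updates, rounds=2),
        poll_job("job-b", fetch_status, updates, rounds=2),
        fan_out(updates, subs, total=4),
    )
    return subs
```

The queue decouples per-job polling cadence from broadcast, so a slow listener never stalls a poller.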
Signal, isolated.
Quark
Our first dataset — built for concise chain-of-thought reasoning (CCoT) and token efficiency. Packs additional reasoning steps inside the same output footprint, so models think further per token instead of spending tokens to think.
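The "think further per token" framing can be made concrete with a toy steps-per-token metric. A hypothetical illustration only — whitespace splitting stands in for real tokenization, and the metric is not from the Quark source:

```python
def steps_per_token(trace, sep="; "):
    """Ratio of discrete reasoning steps to (crudely tokenized) length:
    higher means more reasoning packed into the same footprint."""
    steps = [s for s in trace.split(sep) if s]
    tokens = trace.split()
    return len(steps) / len(tokens)

# Same three-step derivation, verbose vs. concise.
verbose = ("First, we note that x equals 2; "
           "Next, we substitute x into the equation; "
           "Finally, we simplify to get 8")
concise = "x=2; sub into eq; simplify: 8"
```

Both traces carry three steps, but the concise form spends far fewer tokens doing it — the density CCoT training optimizes for.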
Lepton
Theoretical: a drafter dataset tuned for speculative decoding, training tiny draft models to predict well enough to meaningfully accelerate large reasoning LLMs.
Muon
Theoretical: an ephemeral-context dataset focused on fast prompt routing and activation speed in sparse Mixture-of-Experts architectures.
Baryon
Theoretical: a composite reinforcement-learning dataset rich in agentic chain-of-thought structures — reasoning lift without expanding parameter count.
Neutrino
Theoretical: a pruning-aware sparsification dataset. Trains models to hold accuracy while safely dropping inactive weights.
Boson · Gluon · Photon
Names reserved for datasets that haven't earned their scope yet. When one does, it gets promoted out of this cell.