What we're imagining.
Three kinds of artifact make up the Kōzu lab: fine-tuned models, the apparatus that trains them, and the datasets that shape them.
Intelligence, distilled.
Deimos A4
A 4.66B-parameter reasoning specialist built on Qwen3.5-4B. Its terse internal chain-of-thought yields ~60% fewer tokens, ~36% faster inference, and +40 points average accuracy on hard math versus the base model. Distilled from 4,338 shortest-correct traces curated from Deimos-A1.
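One way to picture "shortest-correct" curation is the minimal sketch below. It is not the pipeline used for Deimos: the `Trace` fields, the exact-match correctness check, and the token counts are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    # Hypothetical record layout; the real trace format is not published here.
    problem_id: str
    reasoning: str
    answer: str
    n_tokens: int


def shortest_correct(traces, gold):
    """For each problem, keep the correct trace with the fewest tokens.

    `gold` maps problem_id -> reference answer. Correctness is assumed to be
    an exact string match, which is a simplification.
    """
    best = {}
    for t in traces:
        if t.answer != gold.get(t.problem_id):
            continue  # drop incorrect traces and traces with no reference answer
        current = best.get(t.problem_id)
        if current is None or t.n_tokens < current.n_tokens:
            best[t.problem_id] = t
    return best


traces = [
    Trace("p1", "long derivation", "42", 310),
    Trace("p1", "short derivation", "42", 95),
    Trace("p1", "very short but wrong", "41", 20),
    Trace("p2", "only attempt, wrong", "7", 50),
]
selected = shortest_correct(traces, {"p1": "42", "p2": "8"})
```

Here `selected` keeps only the 95-token trace for `p1`; `p2` has no correct trace and is left out entirely.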
Europa
Our upcoming medium-sized model, designed to explore how intelligence scales per parameter while further refining the experimental reasoning techniques from Deimos.
Ganymede
Our upcoming flagship model — scaling the techniques we've refined into a model tested and validated for real-world scenarios where reasoning efficiency is key.
Instruments of the craft.
Hadron
An LLM distillation framework built around NousResearch's AutoReason tournament refinement. A single teacher answers, critiques, adversarially revises, synthesizes, and ranks its own candidates by blind Borda count until "do nothing" wins twice. The resulting distillation labels measurably beat the teacher's own single-shot output. Full reasoning traces are kept per role, ready to drop straight into process-supervision fine-tuning.
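The ranking-and-stopping idea can be sketched roughly as follows. This is neither Hadron's nor AutoReason's actual code: `propose` and `judge` are hypothetical callables standing in for the teacher's roles, the `"keep"` label stands in for "do nothing", and treating "wins twice" as two consecutive wins is an assumption. Blinding (hiding which candidate is the incumbent) is left to the `judge` callable.

```python
from collections import defaultdict


def borda_winner(rankings):
    """Return the Borda-count winner.

    `rankings` is a list of orderings, each listing candidate labels best-first.
    A candidate in position `pos` of an n-item ranking earns n - 1 - pos points.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for pos, label in enumerate(ranking):
            scores[label] += n - 1 - pos
    # Ties break toward the label that sorts first, so results are deterministic.
    return min(scores, key=lambda label: (-scores[label], label))


def refine(initial, propose, judge, max_rounds=10, patience=2):
    """Iterate until the incumbent ("keep") wins `patience` rounds in a row."""
    incumbent, streak = initial, 0
    for _ in range(max_rounds):
        # propose() returns {label: candidate}; the incumbent always competes.
        candidates = {"keep": incumbent, **propose(incumbent)}
        winner = borda_winner(judge(candidates))
        if winner == "keep":
            streak += 1
            if streak >= patience:
                break
        else:
            incumbent, streak = candidates[winner], 0
    return incumbent


# Toy usage: "revision" appends "!", and the judge prefers it until length 3.
def toy_propose(text):
    return {"revised": text + "!"}


def toy_judge(candidates):
    if len(candidates["keep"]) < 3:
        return [["revised", "keep"]]
    return [["keep", "revised"]]


result = refine("x", toy_propose, toy_judge)
```

In the toy run the incumbent is replaced twice, then "keep" wins two rounds in a row and the loop stops with `result == "x!!"`.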
Tokamak
Extracts reasoning traces from LLM conversations and compresses them into a highly token-efficient stream of internal chain-of-thought and concise outputs. Built for generating tight, high-signal training data from long, branching dialogues without losing the reasoning that got you there.
Stellarator
A control plane for fine-tuning and reinforcement-learning workloads on Tinker. Sandbox runs feed a structured pre-flight gate before promotion to scale, with cost projections, budgets, and live alert streams threaded through every step. A Rust supervisor handles per-job polling and WebSocket fan-out; an integrated research subsystem cites Hugging Face papers, arXiv, and code examples per run.
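As a rough illustration of what a pre-flight gate with a cost projection could look like, consider the sketch below. It is not Stellarator's implementation and does not use any Tinker API: the `SandboxResult` fields, the linear cost extrapolation, and the two checks are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class SandboxResult:
    # Hypothetical summary of a small sandbox run.
    steps: int
    cost_usd: float
    loss_finite: bool


def preflight(sandbox, planned_steps, budget_usd):
    """Project full-run cost from a sandbox run and list reasons to block.

    Cost is extrapolated linearly in step count, which is a simplification.
    Returns (projected_cost_usd, reasons); an empty list means the gate passes.
    """
    if sandbox.steps <= 0:
        raise ValueError("sandbox run must cover at least one step")
    projected = sandbox.cost_usd / sandbox.steps * planned_steps
    reasons = []
    if not sandbox.loss_finite:
        reasons.append("sandbox loss was not finite")
    if projected > budget_usd:
        reasons.append(
            f"projected ${projected:.2f} exceeds budget ${budget_usd:.2f}"
        )
    return projected, reasons


sandbox = SandboxResult(steps=100, cost_usd=2.0, loss_finite=True)
projected, reasons = preflight(sandbox, planned_steps=10_000, budget_usd=150.0)
```

With these numbers the projection is about $200 against a $150 budget, so `reasons` holds one entry and the run would not be promoted.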
Signal, isolated.
Quark
Our first dataset — built for concise chain-of-thought reasoning (CCoT) and token efficiency. Packs additional reasoning steps inside the same output footprint, so models think further per token instead of spending tokens to think.
Photon
A compliance-hardening supervised fine-tuning (SFT) dataset family for large language models. Rather than bolting on guardrails at inference time, Photon bakes decision-tree-structured compliance rationales directly into model weights during fine-tuning — making compliant behavior a first-class model capability, not an afterthought. Each domain variant is called an isotope. Isotopes share a common schema and training philosophy but target distinct regulatory and security surfaces.