Kronos is a self-evolving training pipeline targeting Opus-4.6-parity on coding benchmarks at $1/M output tokens, 2M context, 150–200 tok/s.
All scripts, datasets, and intermediate checkpoints are public.
Seed capital: €100
Spent so far: €5.50
Round-3 PIVOT base model: Qwen2.5-Coder-1.5B
Evaluation benchmarks: HumanEval+ / LCB
Status (May 13, 2026)
Round-0 DONE: Qwen2.5-Coder-1.5B + LoRA (r=32) fine-tuned on 30K CodeFeedback rows for 625 steps. Loss fell 0.678→0.489 and mean token accuracy rose 0.813→0.841. Adapter at jaivial/kronos-round0-qwen15coder-lora.
Round-1A SHELVED: Qwen2.5-Coder-7B SFT was cap-killed twice on Kaggle P100. Both v2 (MLP, r=32) and v3 (attention, r=16) stopped at ~27% of one epoch under the 12h cap. The 7B path needs an H100 and is deferred until a Lambda grant or a paid burst.
Round-3 PIVOT cap-killed: GRPO RL on the R0 1.5B base completed zero steps in 12h on Kaggle P100 (cycle-106). Per-step time was ~640s with 8 candidates × 512 tokens × grad-accum 8. GRPO is infeasible on the free P100 regardless of base size.
Round-3 Branch B DPO LAUNCH-READY: O(N) preference-pair training fits the 12h cap with a 9h margin. So far, 11 chosen/rejected pairs generated from the Qwen3-Coder-480B teacher are available at jaivial/kronos-r3b-dpo-pairs. Launch waits on the Kaggle weekly quota reset (~May 18–19).
Round-2 wired: distillation (Branch A) from the Qwen3-Coder-480B-A35B teacher, plus more-SFT and diagnose alternates. Replays once R3 metrics are in and an H100 is unlocked.
Operating stack: HF Pro (~€5.50/mo, prorated) for 480B teacher access; free Kaggle P100 for training. The R6 serving stack (vLLM + DFlash speculative decoding) is sketched at the architecture level. PLAN.md §0.2 freezes paid GPU spend of ≥€10 until ≥€50 of external funding lands or the DPO checkpoint validates.
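For the Branch B DPO run in the status list above, the per-pair objective can be sketched as follows. This is a minimal illustration of the standard DPO loss, not the project's training script: the function names, beta = 0.1, and the log-probability values in the usage note are assumptions made for illustration.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_pair_loss(
    policy_chosen_lp: float,
    policy_rejected_lp: float,
    ref_chosen_lp: float,
    ref_rejected_lp: float,
    beta: float = 0.1,  # assumed value; tune per run
) -> float:
    """Standard DPO loss for one chosen/rejected pair.

    Each argument is the summed log-probability of a full response
    under the trainable policy or the frozen reference model.
    """
    chosen_logratio = policy_chosen_lp - ref_chosen_lp
    rejected_logratio = policy_rejected_lp - ref_rejected_lp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) equals softplus(-margin)
    return softplus(-margin)


def dpo_batch_loss(pairs: list[tuple[float, float, float, float]], beta: float = 0.1) -> float:
    """Mean loss over pairs; work grows linearly with the number of pairs."""
    return sum(dpo_pair_loss(*p, beta=beta) for p in pairs) / len(pairs)
```

When the policy still equals the reference, the margin is zero and the loss is log 2. Calling dpo_pair_loss(-8.0, -14.0, -10.0, -12.0) models a policy that has moved toward the chosen response, and the loss drops below log 2. Each pair needs only forward passes over two fixed responses, with no sampling of new candidates, which is consistent with the O(N) note above.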
Approach
Cheap-first ladder: Kaggle/Colab free GPUs → grant credits → small paid GPU bursts → revenue-funded scale.
Distillation Round 2: the Qwen3-Coder-480B teacher fixes ~50% of student failures on LCB-medium (validated empirically in cycles 38–39).
RL Round 3: GRPO + binary code-execution reward on the residual systemic failures that distillation can't fix.
Speculative decoding Round 6: DFlash block-diffusion drafts for 2–3× throughput at serve time.
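The RL Round 3 item in the list above (GRPO with a binary code-execution reward) can be sketched in two small pieces: a pass/fail reward from running a candidate against its tests, and the group-relative advantage that GRPO derives from those rewards. This is a hedged sketch, not the Kronos implementation: binary_reward, group_advantages, the 10-second timeout, and eps are illustrative names and values, and a bare subprocess is not a sandbox for untrusted model output.

```python
import statistics
import subprocess
import sys


def binary_reward(candidate_src: str, test_src: str, timeout_s: float = 10.0) -> float:
    """Return 1.0 if the candidate passes its tests, else 0.0.

    The candidate and the tests are concatenated and run in a fresh
    Python process; a non-zero exit code or a timeout counts as failure.
    """
    program = candidate_src + "\n\n" + test_src
    try:
        result = subprocess.run(
            [sys.executable, "-c", program],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return 0.0
    return 1.0 if result.returncode == 0 else 0.0


def group_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one prompt's group of sampled candidates.

    Uses the population standard deviation; some implementations use the
    sample standard deviation instead.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    tests = "assert add(2, 3) == 5"
    candidates = [
        "def add(a, b):\n    return a + b",  # passes
        "def add(a, b):\n    return a - b",  # fails
    ]
    rewards = [binary_reward(src, tests) for src in candidates]
    print(rewards, group_advantages(rewards))
```

With a binary reward, a group in which every candidate fails (or every candidate passes) has zero variance, so all advantages are zero and that prompt contributes no gradient signal. This is one reason the group size, 8 candidates in the run described above, matters for problems the model rarely solves.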
Early access
Want to try the model when Round-3 lands? Drop your email.
Paid early-access tier (€10–20/mo with usage cap) opens after the first Opus-4.6-comparable checkpoint.