Autonomous LLM-driven hyperparameter optimization for GPU pretraining research. Claude designs experiments, trains GPT-2-scale models, and decides what to try next.
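
The design, train, decide cycle described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the project's actual code: `propose_stub` is a hypothetical stand-in for the LLM call (it just perturbs the best learning rate seen so far), and `train_stub` is a hypothetical stand-in for a real GPU training run (it returns a synthetic validation loss). All names here are invented for illustration.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Trial:
    config: dict      # hyperparameters used for this run
    val_loss: float   # validation loss reported by the run


def train_stub(config: dict) -> float:
    """Hypothetical stand-in for a training run.

    Returns a synthetic loss that is lowest (3.0) at lr = 3e-4.
    """
    return 3.0 + (math.log10(config["lr"]) - math.log10(3e-4)) ** 2


def propose_stub(history: list, rng: random.Random) -> dict:
    """Hypothetical stand-in for the LLM's decision step.

    Starts from a fixed guess, then perturbs the best config seen so far.
    """
    if not history:
        return {"lr": 1e-2}
    best = min(history, key=lambda t: t.val_loss)
    factor = 10 ** rng.uniform(-0.5, 0.5)  # move up to half a decade either way
    return {"lr": best.config["lr"] * factor}


def run_loop(n_trials: int, seed: int = 0) -> list:
    """Repeat: propose a config, train with it, record the result."""
    rng = random.Random(seed)
    history = []
    for _ in range(n_trials):
        config = propose_stub(history, rng)
        loss = train_stub(config)
        history.append(Trial(config, loss))
    return history


if __name__ == "__main__":
    trials = run_loop(20)
    best = min(trials, key=lambda t: t.val_loss)
    print(f"best lr={best.config['lr']:.2e} val_loss={best.val_loss:.3f}")
```

In a real setup, the proposer would presumably receive the full trial history as context and return the next configuration, while the training function would launch an actual run and report its metrics; the loop structure stays the same.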