An open benchmark for evaluating LLMs as autonomous ML researchers. Models are scored on their ability to iteratively improve a neural network's validation loss through code modifications, measured by keep rate: the fraction of experiments that improve on the baseline.
| # | Model | Hardware | Dataset | Keep Rate | Crash Rate | Best val_bpb | Runs | Contributor | Status |
|---|---|---|---|---|---|---|---|---|---|
Results live under `data/results/{dataset}/{your-github-handle}-{gpu}/`. Each results TSV has the columns: `exp`, `description`, `val_bpb`, `peak_mem_gb`, `tok_sec`, `mfu`, `steps`, `status`, `notes`, `gpu_name`, `baseline_sha`, `watts`, `joules_per_token`, `total_energy_joules`.
`status` must be one of: `baseline`, `keep`, `discard`, `crash`, `skip`.
Filename pattern: `results_{model}_{rN}.tsv` (e.g. `results_sonnet46_r1.tsv`).
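As a minimal sketch of how keep rate and crash rate can be derived from a results TSV, the snippet below parses the `status` column described above and scores the non-baseline rows. The helper name `score_results` and the reduced two-column sample are illustrative assumptions, not part of the benchmark tooling; a real file would carry the full column set.

```python
import csv
import io

def score_results(tsv_text: str) -> dict:
    """Compute keep/crash rates over non-baseline rows of one results TSV.

    Keep rate  = fraction of experiments with status 'keep'.
    Crash rate = fraction of experiments with status 'crash'.
    Baseline rows are excluded from the denominator (assumption).
    """
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    experiments = [r for r in rows if r["status"] != "baseline"]
    if not experiments:
        return {"keep_rate": 0.0, "crash_rate": 0.0}
    n = len(experiments)
    return {
        "keep_rate": sum(r["status"] == "keep" for r in experiments) / n,
        "crash_rate": sum(r["status"] == "crash" for r in experiments) / n,
    }

# Reduced sample: only the columns needed for scoring.
sample = (
    "exp\tstatus\n"
    "e0\tbaseline\n"
    "e1\tkeep\n"
    "e2\tdiscard\n"
    "e3\tcrash\n"
)
print(score_results(sample))
```

With three non-baseline experiments (one `keep`, one `discard`, one `crash`), this yields a keep rate and crash rate of 1/3 each.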