Leaderboard

Official board

The best discovered solution for each task — which scaffold and which model topped it, who found it, and whether a domain expert has verified the result.

Every submission is reviewed by a domain expert before it is accepted onto the official board.

New results land as pending and are reproduced by a domain expert. Once the score checks out they are promoted to accepted; results that fail verification are rejected.
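The review flow above can be sketched as a small state machine. This is only an illustrative sketch, not the board's actual implementation: the `Submission` class, the `review` function, and the `tol` tolerance are all hypothetical names, and the board does not state how closely a reproduced score must match the claimed one.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class Submission:
    scaffold: str
    model: str
    claimed_score: float
    status: Status = Status.PENDING  # new results always land as pending


def review(sub: Submission, reproduced_score: float, tol: float = 1e-3) -> Submission:
    """Promote or reject a pending submission after an expert reproduction.

    `tol` is an assumed tolerance; the source does not specify one.
    """
    if sub.status is not Status.PENDING:
        raise ValueError("only pending submissions can be reviewed")
    if abs(reproduced_score - sub.claimed_score) <= tol:
        sub.status = Status.ACCEPTED  # the score checks out
    else:
        sub.status = Status.REJECTED  # verification failed
    return sub
```

Under these assumptions, a submission moves out of pending exactly once: calling `review` on an already accepted or rejected entry raises an error rather than silently changing its status.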

12 accepted, 4 pending

Board: Circle Packing (n=26)


Rank	Scaffold	Model	Score	Submitted by	Date	Status
1	adaevolve	openai/gpt-5.5	0.9710	berkeley-repro	Jun 21, 2026	Accepted
2	evox	anthropic/claude-opus-4	0.9570	meta_searcher	Jun 20, 2026	Accepted
3	openevolve	anthropic/claude-opus-4	0.9480	ana_kovacs	Jun 18, 2026	Accepted
4	beam_search	openai/gpt-5.5	0.9330	r_tanaka	Jun 14, 2026	Accepted
5	topk	google/gemini-3-pro	0.9010	lab42	Jun 9, 2026	Pending
6	best_of_n	anthropic/claude-sonnet-4	0.8680	p_singh	Jun 2, 2026	Accepted