Official board
The best discovered solution for each task: which scaffold and model produced the top score, who found it, and whether a domain expert has verified the result.
Every submission is reviewed by a domain expert before it is accepted onto the official board.
New results land as pending and are reproduced by a domain expert. Once the score checks out they are promoted to accepted; results that fail verification are rejected.
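| Rank | Scaffold | Model | Score | Submitted by | Date | Status |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | openevolve | openai/gpt-4o-mini | 0.7930 | hub_demo | Jun 22, 2026 | Accepted |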
| 2 | adaevolve | openai/gpt-4o-mini | 0.7310 | hub_demo | Jun 20, 2026 | Accepted |
| 3 | evox | google/gemini-2.0-flash-001 | 0.7040 | j_almeida | Jun 18, 2026 | Pending |
| 4 | best_of_n | deepseek/deepseek-chat | 0.6880 | k_lindqvist | Jun 16, 2026 | Accepted |
| 5 | beam_search | openai/gpt-4o-mini | 0.6420 | first_try | Jun 11, 2026 | Pending |
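
The review flow described above the table (pending, then accepted or rejected after an expert reproduces the score) can be sketched as a small state transition. This is an illustrative sketch only, not the board's actual implementation: the `Submission` class, the `review` function, and the `tolerance` threshold are all hypothetical names and values, since the source does not say how closely a reproduced score must match.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "Pending"
    ACCEPTED = "Accepted"
    REJECTED = "Rejected"


@dataclass
class Submission:
    scaffold: str
    model: str
    score: float                      # score claimed by the submitter
    status: Status = Status.PENDING   # new results land as pending


def review(sub: Submission, reproduced_score: float, tolerance: float = 1e-3) -> Submission:
    """Promote or reject a pending submission after an expert reproduction.

    `tolerance` is an assumed threshold, not something the board documents.
    """
    if sub.status is not Status.PENDING:
        raise ValueError("only pending submissions can be reviewed")
    matches = abs(sub.score - reproduced_score) <= tolerance
    sub.status = Status.ACCEPTED if matches else Status.REJECTED
    return sub


# Hypothetical entry, not a row from the board.
entry = Submission("example_scaffold", "example/model", 0.7000)
review(entry, reproduced_score=0.7004)
print(entry.status.value)
```

Keeping the status as an enum with a single allowed transition out of `PENDING` means an already accepted or rejected entry cannot be silently re-reviewed.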