SkyDiscover/evox
EvoX
Co-evolves the search strategy with the solutions: the parent/context selection policy is itself LLM-written code, scored by windowed improvement and hot-swapped on stagnation.
About
EvoX (EvoX: Meta-Evolution for Automated Discovery, UC Berkeley) treats the search strategy of LLM-driven evolutionary search as an evolvable object rather than a fixed harness. The active strategy is an executable class whose `add()`/`sample()` methods decide which parent to mutate, which variation-operator label to attach (free-form exploration, structural divergence, or local refinement — problem-specific operator texts generated once per run by a guide LLM), and which inspiration set to show the proposer. Solutions and the strategy that breeds them evolve on two interleaved timescales: solutions every iteration, the strategy only when progress stalls.
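To make the `add()`/`sample()` contract concrete, here is a minimal sketch of what a strategy class could look like. Everything in it is an illustrative assumption rather than the reference implementation: the `Candidate` record, the `OPERATORS` label names, the `GreedySeedStrategy` class, and the return shape of `sample()` are all hypothetical.

```python
import random
from dataclasses import dataclass, field

# Hypothetical labels for the three operator kinds named in the text.
OPERATORS = ("explore", "diverge", "refine")


@dataclass
class Candidate:
    """Hypothetical solution record: program text plus its evaluator score."""
    code: str
    score: float


@dataclass
class GreedySeedStrategy:
    """Illustrative strategy: best-scoring parent, random operator label,
    next-best k candidates as the inspiration set."""
    k: int = 2
    rng: random.Random = field(default_factory=lambda: random.Random(0))
    population: list = field(default_factory=list)

    def add(self, candidate: Candidate) -> None:
        # Called whenever a newly evaluated solution enters the population.
        self.population.append(candidate)

    def sample(self):
        # Decide the parent, the operator label, and the inspiration set.
        if not self.population:
            raise ValueError("population is empty")
        ranked = sorted(self.population, key=lambda c: c.score, reverse=True)
        parent = ranked[0]
        operator = self.rng.choice(OPERATORS)
        inspirations = ranked[1:1 + self.k]
        return parent, operator, inspirations
```

An evolved strategy would replace the bodies of `add()` and `sample()` with its own policy while keeping the same two entry points.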
Every deployed strategy is scored over the window it ran for. The score is a log-weighted, horizon-normalized improvement, `J = (s_end - s_start) * (1 + ln(1 + max(0, s_start))) / sqrt(W)`, and every finalized strategy is recorded in a strategy history H. Switching is demand-driven: when the best score stagnates (consecutive iterations with gain at or below an absolute threshold) for `W` iterations, a strong model rewrites the argmax-J strategy from H, conditioned on a population-state descriptor φ that captures the score distribution, top-k structure, recent execution trace, and parent/context reuse ratios.
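The scoring formula and the stagnation trigger can be sketched in a few lines. The formula follows the expression for `J` given above; the function and class names, the default threshold of `1e-6`, and the dictionary layout of the history records are assumptions made for illustration.

```python
import math


def strategy_score(s_start: float, s_end: float, window: int) -> float:
    """J = (s_end - s_start) * (1 + ln(1 + max(0, s_start))) / sqrt(W)."""
    if window <= 0:
        raise ValueError("window must be positive")
    weight = 1.0 + math.log(1.0 + max(0.0, s_start))
    return (s_end - s_start) * weight / math.sqrt(window)


class StagnationCounter:
    """Counts consecutive iterations whose best-score gain is at or below
    an absolute threshold; reports True once the run reaches `window`."""

    def __init__(self, window: int, threshold: float = 1e-6):
        self.window = window
        self.threshold = threshold  # assumed default, not from the source
        self.stalled = 0

    def update(self, prev_best: float, new_best: float) -> bool:
        gain = new_best - prev_best
        # Any gain above the threshold resets the consecutive count.
        self.stalled = self.stalled + 1 if gain <= self.threshold else 0
        return self.stalled >= self.window


def pick_meta_parent(history: list) -> dict:
    """Deterministic argmax-J choice over the strategy history H."""
    return max(history, key=lambda record: record["J"])
```

The log weight means the same absolute gain counts for more when it starts from a higher score, and dividing by `sqrt(W)` keeps scores comparable across different window lengths.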
A rewrite is never trusted blindly: the candidate is validated by a behavioral test suite (`Valid(·)`), and on success the entire current population is migrated into the new strategy (never reset), with the previous strategy kept as a runtime fallback. If a deployed strategy throws at runtime, the population restores the fallback (or, in the worst case, the always-valid seed), and the failed evolution is counted but never scored. Any failure during a rewrite leaves the current strategy in place.
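The validate, migrate, and fall-back flow described above might be organized roughly as follows. This is a sketch under stated assumptions, not the SkyDiscover code: `StrategyManager`, `has_interface` (a stand-in for `Valid(·)`), and the two demo strategies are hypothetical, and strategies are modeled as zero-argument factories so that migration can replay the whole population into a fresh instance.

```python
class StrategyManager:
    """Hot-swaps strategies while keeping the population and a fallback."""

    def __init__(self, seed_factory, validate):
        self.seed_factory = seed_factory      # assumed always valid
        self.validate = validate              # stand-in for Valid(.)
        self.population = []
        self.current_factory = seed_factory
        self.fallback_factory = None
        self.current = seed_factory()
        self.failed_evolutions = 0

    def _build(self, factory):
        # Migration: replay every existing solution into a fresh strategy.
        strategy = factory()
        for item in self.population:
            strategy.add(item)
        return strategy

    def add(self, item):
        self.population.append(item)
        self.current.add(item)

    def try_swap(self, candidate_factory) -> bool:
        try:
            ok = self.validate(candidate_factory)
            migrated = self._build(candidate_factory) if ok else None
        except Exception:
            migrated = None
        if migrated is None:
            return False  # rewrite failed: current strategy stays in place
        self.fallback_factory = self.current_factory
        self.current_factory, self.current = candidate_factory, migrated
        return True

    def sample(self):
        try:
            return self.current.sample()
        except Exception:
            self.failed_evolutions += 1  # counted, never scored
        # Restore the fallback first, then the seed as a last resort.
        for factory in (self.fallback_factory, self.seed_factory):
            if factory is None:
                continue
            try:
                restored = self._build(factory)
                result = restored.sample()
            except Exception:
                continue
            self.current, self.current_factory = restored, factory
            self.fallback_factory = None
            return result
        raise RuntimeError("no working strategy, including the seed")


def has_interface(factory) -> bool:
    """Toy validity check: the instance exposes callable add() and sample()."""
    instance = factory()
    return all(callable(getattr(instance, name, None)) for name in ("add", "sample"))


class FirstItem:
    """Demo seed strategy: always samples the oldest item."""
    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def sample(self):
        return self.items[0]


class Broken(FirstItem):
    """Demo strategy that passes the toy check but fails at runtime."""
    def sample(self):
        raise RuntimeError("boom")
```

Swapping in `Broken` succeeds because the toy check only inspects the interface; the first call to `sample()` then fails, the manager rebuilds `FirstItem` from the full population, and sampling continues. For brevity, only `sample()` is guarded here; a fuller version would presumably guard `add()` as well.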
This scaffold is a faithful port of the SkyDiscover reference implementation (`search/evox/`), which is treated as ground truth wherever it diverges from the paper. Notably, J uses the code's `(1 + ln(...))` weight, stagnation is a per-iteration consecutive counter rather than fixed windows, the meta-parent is the deterministic argmax-J strategy, and the horizon normalizer is fixed at the switch interval even when a strategy outlives it (an intentional bonus for long-lived improving strategies).
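A small numeric comparison shows why fixing the normalizer at the switch interval acts as a bonus for long-lived strategies. The switch interval of 25, the lifetime of 100 iterations, and the score values are made-up numbers, and the lifetime-scaled variant is a hypothetical alternative shown only for contrast.

```python
import math


def j_score(s_start: float, s_end: float, normalizer_window: int) -> float:
    # Same form as J above, with the normalizer passed in explicitly.
    weight = 1.0 + math.log(1.0 + max(0.0, s_start))
    return (s_end - s_start) * weight / math.sqrt(normalizer_window)


SWITCH_INTERVAL = 25  # hypothetical W
LIFETIME = 100        # iterations the strategy actually ran before stalling

# As ported: the normalizer stays at the switch interval.
fixed = j_score(0.0, 1.0, SWITCH_INTERVAL)

# Hypothetical alternative: normalize by the true lifetime instead.
scaled = j_score(0.0, 1.0, LIFETIME)

print(f"fixed normalizer:  {fixed:.3f}")
print(f"scaled normalizer: {scaled:.3f}")
```

With these numbers the fixed normalizer yields a score twice as large as the lifetime-scaled one, so a strategy that keeps improving past `W` iterations is not penalized for having run longer.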
Composition
6/6 blocks
The six components this scaffold snaps together. Each block names its concrete implementation.
- Population: `evolved_strategy_store`
The set of candidate solutions in play — the gene pool the search evolves over.
- Selection: `evolved_strategy_sampler`
Decides which genomes survive and reproduce — tournament, elitism, novelty, or your own policy.
- Prompt: `operator_labeled_default`
Assembles the context handed to the model — parents, feedback, instructions, examples.
- Proposer: `diff`
The LLM-driven variation operator — proposes new candidates by mutation and crossover.
- Evaluator: `task`
Scores each candidate against the task — the fitness signal that drives selection.
- Memory: `strategy_history`
Persists discoveries across generations — archives, islands, and lineage for the search.
Source
EvoX: Meta-Evolution for Automated Discovery (UC Berkeley); reference implementation in SkyDiscover