SkyDiscover/evox

EvoX

Co-evolves the search strategy with the solutions: the parent/context selection policy is itself LLM-written code, scored by windowed improvement and hot-swapped on stagnation.

Test-time search · Apache-2.0

About

EvoX (EvoX: Meta-Evolution for Automated Discovery, UC Berkeley) treats the search strategy of LLM-driven evolutionary search as an evolvable object rather than a fixed harness. The active strategy is an executable class whose `add()`/`sample()` methods decide which parent to mutate, which variation-operator label to attach (free-form exploration, structural divergence, or local refinement — problem-specific operator texts generated once per run by a guide LLM), and which inspiration set to show the proposer. Solutions and the strategy that breeds them evolve on two interleaved timescales: solutions every iteration, the strategy only when progress stalls.
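To make the `add()`/`sample()` contract concrete, here is a minimal sketch of what a strategy class could look like. Only the two method names and the three operator categories come from the description above; `Candidate`, `SeedStrategy`, the label strings, the 0.7 exploitation probability, and the shape of the returned tuple are assumptions made for illustration, not the reference implementation's API.

```python
import random
from dataclasses import dataclass, field

# The three operator categories named above; the exact strings are illustrative.
OPERATORS = ("free_form_exploration", "structural_divergence", "local_refinement")


@dataclass
class Candidate:
    code: str
    score: float


@dataclass
class SeedStrategy:
    """Hypothetical baseline strategy: mostly-greedy parent, top-k inspirations."""
    population: list = field(default_factory=list)
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def add(self, candidate: Candidate) -> None:
        # Admission policy: this sketch keeps every candidate.
        self.population.append(candidate)

    def sample(self, k: int = 2):
        """Return (parent, operator_label, inspirations)."""
        if not self.population:
            raise ValueError("population is empty")
        ranked = sorted(self.population, key=lambda c: c.score, reverse=True)
        # Parent: usually the best candidate, occasionally a uniformly random one.
        parent = ranked[0] if self.rng.random() < 0.7 else self.rng.choice(ranked)
        # Operator: refine the current best, otherwise pick an exploratory label.
        if parent is ranked[0]:
            label = "local_refinement"
        else:
            label = self.rng.choice(OPERATORS[:2])
        # Inspiration set: the top-k candidates other than the parent.
        inspirations = [c for c in ranked if c is not parent][:k]
        return parent, label, inspirations
```

An evolved strategy would replace the bodies of `add()` and `sample()` while keeping the same interface, which is what lets a rewritten policy be swapped in without touching the rest of the loop.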

Every deployed strategy is scored over the window it ran for. The score is a log-weighted, horizon-normalized improvement, `J = (s_end - s_start) * (1 + ln(1 + max(0, s_start))) / sqrt(W)`, and every finalized strategy is recorded in a strategy history H. Switching is demand-driven: when the best score stagnates (consecutive iterations with gain at or below an absolute threshold) for `W` iterations, a strong model rewrites the argmax-J strategy from H, conditioned on a population-state descriptor φ that captures the score distribution, top-k structure, recent execution trace, and parent/context reuse ratios.
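The scoring rule and the meta-parent choice can be sketched directly from the formula above. The function and variable names here (`window_score`, `pick_meta_parent`, the tuple layout of the history) are hypothetical; only the expression for J and the argmax selection come from the text.

```python
import math


def window_score(s_start: float, s_end: float, W: int) -> float:
    """J = (s_end - s_start) * (1 + ln(1 + max(0, s_start))) / sqrt(W)."""
    if W <= 0:
        raise ValueError("W must be positive")
    # Log weight: the same absolute gain counts for more at a higher starting score.
    weight = 1.0 + math.log1p(max(0.0, s_start))
    return (s_end - s_start) * weight / math.sqrt(W)


def pick_meta_parent(history):
    """Deterministic argmax-J over a history of (strategy_name, J) pairs."""
    if not history:
        raise ValueError("strategy history is empty")
    return max(history, key=lambda entry: entry[1])


# A gain of 0.1 scores higher when it starts from 9.0 than when it starts from 0.0.
low = window_score(0.0, 0.1, W=4)
high = window_score(9.0, 9.1, W=4)
```

The `max(0, s_start)` clamp keeps the logarithm defined for negative scores, in which case the weight falls back to 1 and J reduces to the raw gain divided by `sqrt(W)`.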

A rewrite is never trusted blindly: the candidate is validated by a behavioral test suite (`Valid(·)`), and on success the entire current population is migrated into the new strategy, never reset, with the previous strategy kept as a runtime fallback. If a deployed strategy throws at runtime, the population restores the fallback (or, in the worst case, the always-valid seed), and the failed evolution is counted but never scored. Any failure during the rewrite itself leaves the current strategy in place.
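The validate, migrate, and fall-back flow described above might look roughly like the following. This is a sketch under stated assumptions: `StrategyRuntime`, `FifoSeed`, and the `validate` callable (a stand-in for `Valid(·)`) are hypothetical names, and strategies are stored as factories so that a restored strategy can be rebuilt from the full current population.

```python
class FifoSeed:
    """Hypothetical always-valid seed strategy: returns the oldest item."""

    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def sample(self):
        return self.items[0]


class StrategyRuntime:
    def __init__(self, seed_factory, validate):
        self.seed_factory = seed_factory    # builds the always-valid seed
        self.validate = validate            # stand-in for Valid(.)
        self.population = []
        self.active_factory = seed_factory
        self.fallback_factory = None
        self.active = seed_factory()
        self.failed_evolutions = 0

    def _build(self, factory):
        # Migrate the whole population into a fresh strategy instance.
        strategy = factory()
        for item in self.population:
            strategy.add(item)
        return strategy

    def add(self, item):
        self.population.append(item)
        self.active.add(item)

    def try_deploy(self, candidate_factory):
        """Deploy only if the candidate validates and accepts the population."""
        try:
            if not self.validate(candidate_factory):
                raise ValueError("candidate failed validation")
            candidate = self._build(candidate_factory)
        except Exception:
            self.failed_evolutions += 1     # counted, never scored
            return False                    # current strategy stays in place
        self.fallback_factory = self.active_factory
        self.active_factory, self.active = candidate_factory, candidate
        return True

    def sample(self):
        try:
            return self.active.sample()
        except Exception:
            self.failed_evolutions += 1
        # Restore the previous strategy first, then the seed as a last resort.
        for factory in (self.fallback_factory, self.seed_factory):
            if factory is None:
                continue
            try:
                restored = self._build(factory)
                result = restored.sample()
            except Exception:
                continue
            self.active_factory, self.active = factory, restored
            self.fallback_factory = None
            return result
        raise RuntimeError("seed strategy failed; it is assumed to be always valid")
```

Keeping the population in the runtime rather than only inside the strategy is a design choice of this sketch: it means a restore never loses candidates that were added while the failing strategy was active.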

This scaffold is a faithful port of the SkyDiscover reference implementation (`search/evox/`), which is treated as ground truth wherever it diverges from the paper. Notably, J uses the code's `(1 + ln(...))` weight, stagnation is a per-iteration consecutive counter rather than fixed windows, the meta-parent is the deterministic argmax-J strategy, and the horizon normalizer is fixed at the switch interval even when a strategy outlives it (an intentional bonus for long-lived improving strategies).
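The per-iteration consecutive counter mentioned above can be illustrated with a small sketch. The class name, the handling of the very first score, and the reset after a trigger are assumptions of this example rather than details confirmed by the reference code.

```python
class StagnationCounter:
    """Counts consecutive iterations whose best-score gain is at or below a threshold."""

    def __init__(self, switch_interval: int, threshold: float = 0.0):
        if switch_interval <= 0:
            raise ValueError("switch_interval must be positive")
        self.switch_interval = switch_interval
        self.threshold = threshold
        self.best = None
        self.stalled = 0

    def update(self, score: float) -> bool:
        """Record one iteration; return True when a strategy switch is due."""
        if self.best is None:
            # Assumption: the first observation only initializes the baseline.
            self.best = score
            return False
        gain = score - self.best
        if gain > self.threshold:
            self.stalled = 0          # real progress resets the counter
        else:
            self.stalled += 1         # gain at or below the absolute threshold
        self.best = max(self.best, score)
        if self.stalled >= self.switch_interval:
            self.stalled = 0
            return True
        return False


counter = StagnationCounter(switch_interval=3, threshold=0.01)
signals = [counter.update(s) for s in [1.0, 1.0, 1.005, 1.5, 1.5, 1.5, 1.5]]
```

In this run the jump to 1.5 resets the counter, so the switch fires only after three further flat iterations; a fixed-window scheme would instead check progress at predetermined boundaries regardless of where the last improvement fell.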

Composition

6/6 blocks

The six components this scaffold snaps together. Each block names its concrete implementation.

  • Population: `evolved_strategy_store`

    The set of candidate solutions in play: the gene pool the search evolves over.

  • Selection: `evolved_strategy_sampler`

    Decides which genomes survive and reproduce: tournament, elitism, novelty, or your own policy.

  • Prompt: `operator_labeled_default`

    Assembles the context handed to the model: parents, feedback, instructions, examples.

  • Proposer: `diff`

    The LLM-driven variation operator, which proposes new candidates by mutation and crossover.

  • Evaluator: `task`

    Scores each candidate against the task, producing the fitness signal that drives selection.

  • Memory: `strategy_history`

    Persists discoveries across generations: archives, islands, and lineage for the search.

Tags

meta-evolution, strategy-evolution, coevolution, skydiscover

Source

EvoX: Meta-Evolution for Automated Discovery (UC Berkeley); reference implementation in SkyDiscover