SkyDiscover/adaevolve

AdaEvolve

Hierarchical adaptive search: G-signal exploration intensity, UCB island allocation, and LLM meta-guidance on stagnation.

Test-time search · Apache-2.0

About

AdaEvolve reframes LLM-driven program evolution as hierarchical adaptive optimization driven by one quantity — the accumulated fitness-improvement signal G, an Adam-style second moment of normalized improvements. Rather than fixing exploration knobs by hand, the method lets each island measure how productively it has been improving and adapts its own search behaviour accordingly, then coordinates those islands with a bandit and rescues the global search with meta-guidance when it stalls.
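As a rough illustration of the signal described above, the sketch below keeps an exponential moving average of squared, normalized improvements, which is one reading of "Adam-style second moment". Updating only when a candidate beats the island's best follows this port's stated behaviour; the class name `IslandSignal`, the decay `rho`, the `eps` guard, and normalizing by the previous best are assumptions made for the example, not details taken from the reference code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IslandSignal:
    """Hypothetical per-island accumulator for the improvement signal G."""

    rho: float = 0.9               # assumed decay factor, analogous to Adam's beta2
    eps: float = 1e-8              # assumed guard against division by zero
    g: float = 0.0                 # accumulated signal G
    best: Optional[float] = None   # best fitness seen so far on this island

    def observe(self, fitness: float) -> float:
        """Fold one evaluated candidate into G and return the current G."""
        if self.best is None:
            # First evaluation only establishes the baseline.
            self.best = fitness
            return self.g
        if fitness > self.best:
            # Normalized improvement relative to the previous best (assumed form).
            delta = (fitness - self.best) / (abs(self.best) + self.eps)
            self.g = self.rho * self.g + (1.0 - self.rho) * delta ** 2
            self.best = fitness
        # No improvement: G is left untouched, i.e. not decayed during stagnation.
        return self.g
```

With these assumed defaults, an island that improves from 1.0 to 2.0 ends up with G close to 0.1, and a later non-improving candidate leaves that value unchanged.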

Level 1 maps each island's G to an exploration intensity that splits parent sampling into explore, exploit, and balanced modes over per-island quality-diversity archives: an island that keeps finding big improvements drives its intensity down toward exploitation, while a saturated island pushes intensity back up to explore.

Level 2 allocates iterations across islands with a decayed-reward UCB bandit. Rewards are normalized against the global best rather than each island's local best, which fixes the bias that would otherwise starve a strong island, and both rewards and visit counts decay so the bandit tracks recent productivity. Ring migration shares elite programs between islands, and new islands are spawned from heterogeneous quality/diversity/Pareto presets when global productivity collapses.

Level 3 detects stagnation via a windowed global-improvement rate and asks a guide LLM for breakthrough "paradigm" ideas that are injected into prompts and applied to the global best until they are exhausted.
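The Level 2 allocation can be sketched as a UCB bandit whose reward sums and visit counts both shrink geometrically before every update. This is a minimal sketch under stated assumptions: the class `DecayedUCB`, the decay factor, the exploration constant `c`, and the exact bonus term are hypothetical and may differ from the reference implementation. The one property taken from the description is that the reward is the gain over the global best, normalized by the global best.

```python
import math


class DecayedUCB:
    """Hypothetical decayed-reward UCB over islands (illustrative only)."""

    def __init__(self, n_islands: int, decay: float = 0.95, c: float = 1.0):
        self.decay = decay                    # assumed decay for rewards and visits
        self.c = c                            # assumed exploration constant
        self.rewards = [0.0] * n_islands      # decayed reward sums
        self.visits = [0.0] * n_islands       # decayed visit counts

    def select(self) -> int:
        """Pick the island to run next."""
        # Try every island once before trusting the scores.
        for i, v in enumerate(self.visits):
            if v == 0.0:
                return i
        total = sum(self.visits)

        def score(i: int) -> float:
            mean = self.rewards[i] / self.visits[i]
            bonus = self.c * math.sqrt(math.log(total + 1.0) / self.visits[i])
            return mean + bonus

        return max(range(len(self.visits)), key=score)

    def update(self, island: int, new_fitness: float, global_best: float) -> None:
        """Record the outcome of one iteration spent on `island`."""
        # Decay everything first so old evidence fades for all islands.
        self.rewards = [r * self.decay for r in self.rewards]
        self.visits = [v * self.decay for v in self.visits]
        # Reward is measured against the global best, not the island's own best.
        gain = max(0.0, new_fitness - global_best)
        self.rewards[island] += gain / (abs(global_best) + 1e-8)
        self.visits[island] += 1.0
```

Because the reward is zero unless a candidate beats the global best, an island that only improves on its own weak local best earns nothing here, which is the point of normalizing globally.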

This Galapagos implementation is a faithful port of the SkyDiscover reference implementation of AdaEvolve, which it treats as ground truth wherever the paper and code diverge. Following the code, G is updated only on improvement events (it is not decayed during stagnation), island spawning triggers on global productivity (improvements per evaluation below a threshold) rather than on a G threshold, and parent selection uses the 3-mode probability split P(explore) = I, P(exploit) = (1 − I)·0.7, P(balanced) = (1 − I)·0.3.
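The 3-mode split above is simple enough to write out directly. In the sketch below, `mode_probabilities` implements exactly the stated formula, while `intensity_from_g` is a hypothetical mapping from G to intensity: its functional form and the bounds `i_min` and `i_max` are assumptions chosen only to reproduce the qualitative behaviour (large G gives low intensity, small G gives high intensity).

```python
import math
import random


def intensity_from_g(g: float, i_min: float = 0.1, i_max: float = 0.7) -> float:
    """Assumed mapping: intensity falls from i_max toward i_min as G grows."""
    return i_min + (i_max - i_min) / (1.0 + math.sqrt(max(g, 0.0)))


def mode_probabilities(intensity: float) -> dict:
    """P(explore) = I, P(exploit) = (1 - I) * 0.7, P(balanced) = (1 - I) * 0.3."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie in [0, 1]")
    rest = 1.0 - intensity
    return {
        "explore": intensity,
        "exploit": rest * 0.7,
        "balanced": rest * 0.3,
    }


def sample_mode(intensity: float, rng: random.Random) -> str:
    """Draw one parent-sampling mode according to the 3-mode split."""
    probs = mode_probabilities(intensity)
    modes = list(probs)
    return rng.choices(modes, weights=[probs[m] for m in modes], k=1)[0]
```

For example, an intensity of 0.4 gives probabilities of 0.4, 0.42, and 0.18 for explore, exploit, and balanced respectively.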

Two ablation switches from upstream are deliberately not exposed: the quality-diversity archive is always on (the legacy capped-list mode is not ported), and the proposer is hard-wired to SEARCH/REPLACE diffs with a full-rewrite fallback (the upstream default), so the diff-vs-rewrite knobs are not surfaced.
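To make the hard-wired proposer behaviour concrete, here is a minimal sketch of applying SEARCH/REPLACE diffs with a full-rewrite fallback. The marker syntax (`<<<<<<< SEARCH`, `=======`, `>>>>>>> REPLACE`), the function name `apply_proposal`, and the rule of treating a response without diff blocks as a complete program are all assumptions for illustration; the actual proposer may parse responses differently.

```python
import re

# Assumed block syntax; the real proposer's markers may differ.
DIFF_BLOCK = re.compile(
    r"<<<<<<< SEARCH\n(.*?)\n=======\n(.*?)\n>>>>>>> REPLACE",
    re.DOTALL,
)


def apply_proposal(parent: str, response: str) -> str:
    """Return the child program produced by an LLM response (hypothetical helper)."""
    blocks = DIFF_BLOCK.findall(response)
    if not blocks:
        # Fallback: no diff blocks found, so treat the response as a full rewrite.
        # An empty response keeps the parent unchanged.
        return response.strip() or parent
    child = parent
    for search, replace in blocks:
        if search not in child:
            # Skip blocks whose SEARCH text does not match the current program.
            continue
        # Apply each block to its first occurrence only.
        child = child.replace(search, replace, 1)
    return child
```

A response containing one block that searches for `    return 1` and replaces it with `    return 2` would turn a parent `def f():\n    return 1\n` into the same function returning 2.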

Composition

6/6 blocks

The six components this scaffold snaps together. Each block names its concrete implementation.

  • Population: qd_island_archipelago

    The set of candidate solutions in play; the gene pool the search evolves over.

  • Selection: adaptive_intensity_ucb

    Decides which genomes survive and reproduce, whether by tournament, elitism, novelty, or your own policy.

  • Prompt: adaevolve_template

    Assembles the context handed to the model: parents, feedback, instructions, examples.

  • Proposer: diff

    The LLM-driven variation operator; proposes new candidates by mutation and crossover.

  • Evaluator: task

    Scores each candidate against the task; this is the fitness signal that drives selection.

  • Memory: paradigm_tactics

    Persists discoveries across generations: archives, islands, and lineage for the search.

Tags

adaptive, ucb, islands, quality-diversity, meta-guidance, diff-evolution

Source

AdaEvolve: Adaptive LLM-Driven Zeroth-Order Optimization (UC Berkeley); reference implementation in SkyDiscover