SkyDiscover/adaevolve
AdaEvolve
Hierarchical adaptive search: G-signal exploration intensity, UCB island allocation, and LLM meta-guidance on stagnation.
About
AdaEvolve reframes LLM-driven program evolution as hierarchical adaptive optimization driven by one quantity — the accumulated fitness-improvement signal G, an Adam-style second moment of normalized improvements. Rather than fixing exploration knobs by hand, the method lets each island measure how productively it has been improving and adapts its own search behaviour accordingly, then coordinates those islands with a bandit and rescues the global search with meta-guidance when it stalls.
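As a rough illustration of what an "Adam-style second moment of normalized improvements" could look like, here is a minimal sketch. The class name `ImprovementSignal`, the decay rate `rho`, and the choice to normalize each improvement by the magnitude of the previous best are all assumptions made for this example; they are not taken from the SkyDiscover code. The sketch only touches G when a new best is found, matching the port's note that G is updated on improvement events only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImprovementSignal:
    """Hypothetical per-island accumulator for the signal G."""

    rho: float = 0.9              # assumed decay rate of the moving average
    g: float = 0.0                # accumulated signal G
    best: Optional[float] = None  # best fitness seen so far on this island

    def update(self, fitness: float) -> float:
        """Record one evaluation and return the current G."""
        if self.best is None:
            # First evaluation only sets the baseline.
            self.best = fitness
            return self.g
        if fitness > self.best:
            # Assumed normalization: improvement relative to |previous best|.
            delta = (fitness - self.best) / max(abs(self.best), 1e-12)
            # Second-moment style update: moving average of squared improvements.
            self.g = self.rho * self.g + (1.0 - self.rho) * delta ** 2
            self.best = fitness
        # No improvement: G is left untouched (no decay during stagnation).
        return self.g
```

Under these assumptions, an island that keeps producing large relative gains accumulates a large G, while an island that stops improving simply holds its last value.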
Level 1 maps each island's G to an exploration intensity that splits parent sampling into explore, exploit, and balanced modes over per-island quality-diversity archives: an island that keeps finding big improvements drives its intensity down toward exploitation, while a saturated island pushes intensity back up to explore.
Level 2 allocates iterations across islands with a decayed-reward UCB bandit. Rewards are normalized against the global best rather than each island's local best, which fixes the bias that would otherwise starve a strong island, and both rewards and visit counts decay so the bandit tracks recent productivity. Ring migration shares elite programs between islands, and new islands are spawned from heterogeneous quality/diversity/Pareto presets when global productivity collapses.
Level 3 detects stagnation via a windowed global-improvement rate and asks a guide LLM for breakthrough "paradigm" ideas that are injected into prompts and applied to the global best until they are exhausted.
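The Level 2 island allocation described above can be sketched as a small decayed-reward UCB bandit. This is an illustrative sketch, not the reference code: the class name `DecayedUCB`, the decay factor, the exploration constant `c`, and the exact form of the confidence bonus are assumptions. What it takes from the description is that rewards are divided by the global best and that both rewards and visit counts decay on every update.

```python
import math


class DecayedUCB:
    """Hypothetical decayed-reward UCB allocator over islands."""

    def __init__(self, n_islands: int, decay: float = 0.95, c: float = 1.4):
        self.decay = decay                  # assumed decay factor
        self.c = c                          # assumed exploration constant
        self.rewards = [0.0] * n_islands    # decayed reward sums
        self.visits = [0.0] * n_islands     # decayed visit counts

    def select(self) -> int:
        """Return the index of the island to run next."""
        # Visit every island once before scoring.
        for i, v in enumerate(self.visits):
            if v == 0.0:
                return i
        total = sum(self.visits)

        def score(i: int) -> float:
            mean = self.rewards[i] / self.visits[i]
            # log(total + 1) keeps the bonus non-negative for small decayed totals.
            bonus = self.c * math.sqrt(math.log(total + 1.0) / self.visits[i])
            return mean + bonus

        return max(range(len(self.visits)), key=score)

    def update(self, island: int, improvement: float, global_best: float) -> None:
        """Credit an island with an improvement, normalized by the global best."""
        # Decay all statistics so old productivity fades.
        self.rewards = [r * self.decay for r in self.rewards]
        self.visits = [v * self.decay for v in self.visits]
        # Normalize against the global best, not the island's local best.
        reward = max(improvement, 0.0) / max(abs(global_best), 1e-12)
        self.rewards[island] += reward
        self.visits[island] += 1.0
```

Dividing by the global best means the same absolute gain earns the same reward on every island, so an island with a low local best cannot look artificially productive.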
This Galapagos implementation is a faithful port of the SkyDiscover reference implementation of AdaEvolve, which it treats as ground truth wherever the paper and code diverge. Following the code, G is updated only on improvement events (it is not decayed during stagnation), island spawning triggers on global productivity (improvements per evaluation below a threshold) rather than on a G threshold, and parent selection uses the 3-mode probability split P(explore) = I, P(exploit) = (1 − I)·0.7, P(balanced) = (1 − I)·0.3.
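The three-mode split quoted above is simple enough to write out directly. In the sketch below the probabilities come straight from the formula in the text; the function names `mode_probabilities` and `sample_mode` and the mode labels are hypothetical, chosen for this example.

```python
import random


def mode_probabilities(intensity: float) -> dict:
    """Return P(explore), P(exploit), P(balanced) for an intensity I in [0, 1]."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie in [0, 1]")
    return {
        "explore": intensity,
        "exploit": (1.0 - intensity) * 0.7,
        "balanced": (1.0 - intensity) * 0.3,
    }


def sample_mode(intensity: float, rng: random.Random) -> str:
    """Draw one parent-selection mode according to the split above."""
    probs = mode_probabilities(intensity)
    modes = list(probs)
    weights = [probs[m] for m in modes]
    return rng.choices(modes, weights=weights, k=1)[0]
```

For example, an intensity of 0.4 gives roughly 0.40 explore, 0.42 exploit, and 0.18 balanced, and the three probabilities sum to 1 for any valid intensity.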
Two ablation switches from upstream are deliberately not exposed: the quality-diversity archive is always on (the legacy capped-list mode is not ported), and the proposer is hard-wired to SEARCH/REPLACE diffs with a full-rewrite fallback (the upstream default), so the diff-vs-rewrite knobs are not surfaced.
Composition
6/6 blocks. The six components this scaffold snaps together. Each block names its concrete implementation.
- Population: qd_island_archipelago
The set of candidate solutions in play — the gene pool the search evolves over.
- Selection: adaptive_intensity_ucb
Decides which genomes survive and reproduce — tournament, elitism, novelty, or your own policy.
- Prompt: adaevolve_template
Assembles the context handed to the model — parents, feedback, instructions, examples.
- Proposer: diff
The LLM-driven variation operator — proposes new candidates by mutation and crossover.
- Evaluator: task
Scores each candidate against the task — the fitness signal that drives selection.
- Memory: paradigm_tactics
Persists discoveries across generations — archives, islands, and lineage for the search.
Source
AdaEvolve: Adaptive LLM-Driven Zeroth-Order Optimization (UC Berkeley); reference implementation in SkyDiscover