
The evolutionary loop

Galapagos models every evolutionary-search method — from a plain mutate-and-select loop to a multi-agent, self-modifying system — as a composition of six components over one unit type, the Genome. A method is simply a choice of which implementation fills each of the six slots; the loop itself never changes.

                    ┌──────────────── Memory ────────────────┐
                    │  free-form knowledge: notes, skills,    │
                    │  scratchpad, landscape, tactics         │
                    └────▲───────────────────────────┬────────┘
                    read │                      write │
   ┌────────────┐  ┌─────┴───────────┐  ┌─────────────┐  ┌──────────┐  ┌───────────┐
   │ Population │─▶│ SelectionPolicy │─▶│ PromptBuilder │─▶│ Proposer │─▶│ Evaluator │
   └─────▲──────┘  └─────────────────┘  └─────────────┘  └──────────┘  └─────┬─────┘
         └──────────────────────── new scored Genome ◀──────────────────────┘

The loop in one sentence

Each iteration: select parents from the Population → build a prompt from them and Memory → propose a candidate → evaluate it with a deterministic verifiable function → add the scored Genome back to the Population (per the selection policy) → optionally write what was learned to Memory. Repeat until the end condition. The GalapagosScaffold orchestrates this loop.
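The steps above can be sketched as a toy loop. This is a minimal illustration, not the GalapagosScaffold implementation: `ToyGenome`, `evaluate`, and `run_loop` are hypothetical names, the "proposer" is a random perturbation rather than a model call, and the task (maximise a score that peaks at 3.0) is invented for the example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyGenome:
    """Hypothetical stand-in for a Genome: a number plus its scores."""
    content: float
    scores: dict = field(default_factory=dict)

    @property
    def fitness(self) -> float:
        return self.scores["combined_score"]


def evaluate(genome: ToyGenome) -> dict:
    """Deterministic scorer: the best possible score is 0.0, at content == 3.0."""
    return {"combined_score": -(genome.content - 3.0) ** 2}


def run_loop(iterations: int = 200, seed: int = 0) -> ToyGenome:
    rng = random.Random(seed)
    memory: list[str] = []           # free-form notes (the Memory role)
    seed_genome = ToyGenome(content=0.0)
    seed_genome.scores = evaluate(seed_genome)
    population = [seed_genome]       # the Population role

    for _ in range(iterations):
        # 1. Select a parent (here: greedily, the current best).
        parent = max(population, key=lambda g: g.fitness)
        # 2. Build a "prompt" from the selection and recent Memory.
        prompt = {"parent": parent, "notes": memory[-3:]}
        # 3. Propose a candidate (here: a random perturbation of the parent).
        child = ToyGenome(content=prompt["parent"].content + rng.gauss(0.0, 0.5))
        # 4. Evaluate it with the deterministic scorer.
        child.scores = evaluate(child)
        # 5. Add the scored candidate back to the Population.
        population.append(child)
        # 6. Optionally record what was learned.
        if child.fitness > parent.fitness:
            memory.append(f"improved to {child.fitness:.4f}")

    return max(population, key=lambda g: g.fitness)


best = run_loop()
print(best.content, best.fitness)
```

Swapping the greedy `max` for another selection rule, or the perturbation for a model call, changes the method without touching the shape of the loop.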

The six components

Each role is an abstract base class in galapagos.components, with one or more shipped implementations. The components own the behaviour; the orchestrator only drives the loop and enforces the budget.

| Component | Role | Shipped implementations |
|---|---|---|
| Population | The candidate store: holds Genomes and owns the topology. | InMemoryPopulation, IslandPopulation |
| SelectionPolicy | The active, stateful policy: picks parents + inspirations and adapts over time. | ExploreExploitPolicy, UCBBanditPolicy, IdentityPolicy |
| PromptBuilder | The renderer: formats the selection + Memory into {system, user}. Pure formatting. | DefaultPromptBuilder |
| Proposer | The variation operator: turns a prompt into a new Genome. | DiffProposer, CrossoverProposer |
| Evaluator | The deterministic verifiable scorer: maps a Genome to metrics. Supplied by the task. | SubprocessEvaluator |
| Memory | The cross-cutting, free-form knowledge store. Optional. | NullMemory, ScratchpadMemory |

See Core components for each role in depth, the per-method coverage matrix, and how each conceptual role maps to a shipped class.
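To make the "one abstract base class per role" idea concrete, here is a rough sketch of what six such interfaces could look like. The class names mirror the roles in the table, but every method name and signature below (`add`, `all`, `select`, `build`, `propose`, `evaluate`, `read`, `write`) is an assumption made for illustration; the real definitions live in galapagos.components and may differ.

```python
from abc import ABC, abstractmethod


class Population(ABC):
    @abstractmethod
    def add(self, genome): ...      # store a scored candidate

    @abstractmethod
    def all(self): ...              # return every stored candidate


class SelectionPolicy(ABC):
    @abstractmethod
    def select(self, population): ...   # pick parents and inspirations


class PromptBuilder(ABC):
    @abstractmethod
    def build(self, selection, memory): ...   # return {"system": ..., "user": ...}


class Proposer(ABC):
    @abstractmethod
    def propose(self, prompt): ...      # turn a prompt into a new candidate


class Evaluator(ABC):
    @abstractmethod
    def evaluate(self, genome): ...     # map a candidate to a metrics dict


class Memory(ABC):
    @abstractmethod
    def read(self): ...

    @abstractmethod
    def write(self, note): ...


class ToyNullMemory(Memory):
    """Hypothetical no-op memory: reads nothing, discards every write."""

    def read(self):
        return []

    def write(self, note):
        pass
```

Because each role is abstract, instantiating `Memory()` directly raises `TypeError`, while a concrete subclass such as `ToyNullMemory` can be dropped into the slot.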

The unit of evolution: Genome

Everything flowing through the loop is a Genome — one candidate solution plus everything needed to select, evaluate, and trace it: content, id, parent_id, lineage ("a → b → c"), scores, metadata, artifacts. Its fitness is genome.scores["combined_score"], exposed as genome.fitness. See Genome.
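As a rough picture of that record, the fields listed above could be modelled with a dataclass like the one below. The field names come from the text; the types, the defaults, and the `child_of` helper are assumptions for illustration, and `GenomeSketch` is a hypothetical name, not the library class.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class GenomeSketch:
    content: str                        # the candidate solution itself
    id: str
    parent_id: Optional[str] = None     # None for a seed candidate
    lineage: str = ""                   # e.g. "a → b → c"
    scores: dict[str, float] = field(default_factory=dict)
    metadata: dict[str, Any] = field(default_factory=dict)
    artifacts: dict[str, Any] = field(default_factory=dict)

    @property
    def fitness(self) -> float:
        # Fitness is read from the combined score.
        return self.scores["combined_score"]


def child_of(parent: GenomeSketch, content: str, new_id: str) -> GenomeSketch:
    """Hypothetical helper: derive a child and extend the parent's lineage."""
    return GenomeSketch(
        content=content,
        id=new_id,
        parent_id=parent.id,
        lineage=f"{parent.lineage} → {new_id}",
    )


a = GenomeSketch(content="x = 1", id="a", lineage="a")
b = child_of(a, content="x = 2", new_id="b")
b.scores["combined_score"] = 0.5
print(b.lineage, b.fitness)  # a → b 0.5
```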

A method = which implementation fills each slot

The same six slots express every reference method. Differences between methods are differences in which implementation fills a slot — not differences in architecture. For example:

| Slot | OpenEvolve | AdaEvolve |
|---|---|---|
| Population | map_elites_islands (islands + MAP-Elites) | qd_island_archipelago (islands + dynamic spawning) |
| SelectionPolicy | three_tier_explore_exploit | adaptive_intensity_ucb (UCB island routing + adaptive intensity) |
| PromptBuilder | openevolve_template | adaevolve_template (+ tactic injection) |
| Proposer | diff | diff |
| Evaluator | task (from the task) | task (from the task) |
| Memory | (none) | paradigm_tactics (breakthrough tactics) |

AdaEvolve is OpenEvolve with four slots swapped — the defining two are an adaptive SelectionPolicy (a UCB bandit over islands with adaptive exploration intensity) and a Memory of paradigm-breakthrough tactics, alongside variant Population (dynamic island spawning) and PromptBuilder (tactic injection) implementations. Nothing about the loop changes; only the slots do.
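The "four slots swapped" claim can be checked mechanically by writing each method as a mapping from slot to implementation name. The names are taken from the table above; the dictionary layout and the `swapped_slots` helper are illustrative assumptions, not a Galapagos configuration format.

```python
OPENEVOLVE = {
    "Population": "map_elites_islands",
    "SelectionPolicy": "three_tier_explore_exploit",
    "PromptBuilder": "openevolve_template",
    "Proposer": "diff",
    "Evaluator": "task",
    "Memory": None,  # OpenEvolve uses no Memory
}

ADAEVOLVE = {
    "Population": "qd_island_archipelago",
    "SelectionPolicy": "adaptive_intensity_ucb",
    "PromptBuilder": "adaevolve_template",
    "Proposer": "diff",
    "Evaluator": "task",
    "Memory": "paradigm_tactics",
}


def swapped_slots(a: dict, b: dict) -> list:
    """Return the slots whose implementation differs between two methods."""
    return [slot for slot in a if a[slot] != b[slot]]


print(swapped_slots(OPENEVOLVE, ADAEVOLVE))
# ['Population', 'SelectionPolicy', 'PromptBuilder', 'Memory']
```

The Proposer and Evaluator slots are identical in both methods, which is why they do not appear in the result.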

Agent-driven scaffolds are no exception: when an autonomous agent picks its own parent, the slot is just the IdentityPolicy (it returns the whole Population and defers the choice to the Proposer). Meta-scaffolds — a search whose Proposer runs an inner search — are just nesting, because a GalapagosScaffold itself satisfies the Proposer interface.
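Both ideas fit in a few lines of toy code. In this sketch candidates are plain integers whose value doubles as their fitness, and all three classes (`ToyIdentityPolicy`, `ToyAgentProposer`, `ToyScaffold`) are hypothetical stand-ins with assumed method names; they show the shape of the pattern, not the library's actual interfaces.

```python
class ToyIdentityPolicy:
    """Returns the whole population and leaves the choice to the proposer."""

    def select(self, population):
        return list(population)


class ToyAgentProposer:
    """Stand-in for an agent: picks its own parent, then improves it by 1."""

    def propose(self, candidates):
        parent = max(candidates)
        return parent + 1


class ToyScaffold:
    """A tiny search loop that also exposes propose(), so it can be nested."""

    def __init__(self, policy, proposer, iterations):
        self.policy = policy
        self.proposer = proposer
        self.iterations = iterations

    def run(self, population):
        population = list(population)  # work on a copy
        for _ in range(self.iterations):
            candidates = self.policy.select(population)
            population.append(self.proposer.propose(candidates))
        return max(population)

    def propose(self, candidates):
        # Acting as a proposer: run an inner search seeded with the candidates.
        return self.run(candidates)


inner = ToyScaffold(ToyIdentityPolicy(), ToyAgentProposer(), iterations=3)
outer = ToyScaffold(ToyIdentityPolicy(), inner, iterations=2)  # nested search

print(inner.run([0]))  # 3
print(outer.run([0]))  # 6
```

Each outer iteration triggers a full three-step inner search, so two outer iterations raise the best candidate from 0 to 6.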

Running the loop

The whole loop is reachable through the card protocol: load a model, a scaffold, and a task from cards, then call run.

import galapagos as gx

# Build the run configuration, then load the model, scaffold, and task.
config   = gx.GalapagosConfig.from_config(scaffold_name="openevolve")
model    = gx.GalapagosModel.from_card(name="openai/gpt-4o-mini", host="openrouter")
scaffold = gx.GalapagosScaffold.from_card(name="openevolve", config=config, model=model)
task     = gx.GalapagosTask.from_card(name="circle_packing")

# Run the evolutionary loop on the task and report the outcome.
result = scaffold.run(task=task)
print(result.best_score, result.summary)

result is a RunResult carrying best (the top Genome), best_score, history, and a summary dict (cost, iterations, …).

The model is a frozen callable; the scaffold provides all the intelligence (selection + prompting). The shipped scaffolds are openevolve, adaevolve, evox, and the four SkyDiscover baselines: best_of_n, best_of_n_attempts, topk, and beam_search.

Cards: the single source of truth

Every task, scaffold, model, and discovery is described by a YAML card. Cards power both the local library and the Hub: you call from_card to load a card and run galapagos submit to share one, and a single schema validates both directions.

See also

  • Core components — the six interfaces in depth, with the coverage matrix.
  • Genome — the unit of evolution.
  • Cards — the YAML protocol that loads and submits everything.
  • Models — the model the Proposer calls.