Run a scaffold

Running a discovery method takes four objects (a model, a config, a scaffold, and a task) and one call. Everything else is optional tuning.

import galapagos as gx

model    = gx.GalapagosModel.from_card(name="openai/gpt-5.5", host="openrouter")
config   = gx.GalapagosConfig.from_config(scaffold_name="openevolve")
scaffold = gx.GalapagosScaffold.from_card(name="openevolve", config=config, model=model)
task     = gx.GalapagosTask.from_card(name="circle_packing")

result   = scaffold.run(task=task)
print(result.best_score)        # best combined_score found
print(result.best.content)      # the winning program (a string)

scaffold.run(task=...) drives the six-component loop: select parents from the Population → build a prompt → propose a candidate → evaluate it → add the scored Genome back → repeat until the budget is spent.
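The loop above can be sketched without any LLM at all. The following is a toy illustration only, not the Galapagos implementation: ToyGenome, evaluate, and run_loop are hypothetical names, a random perturbation stands in for the prompt-and-propose step, and the only budget is an iteration cap.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyGenome:
    """Hypothetical stand-in for a Genome: a candidate plus its score."""
    content: float
    fitness: float


def evaluate(x: float) -> float:
    """Toy evaluator: higher is better, with a maximum of 0.0 at x == 3."""
    return -(x - 3.0) ** 2


def run_loop(seed: float, max_iterations: int, rng: random.Random) -> ToyGenome:
    # The population starts with the evaluated seed.
    population = [ToyGenome(seed, evaluate(seed))]
    for _ in range(max_iterations):                       # budget: iteration cap only
        parent = max(population, key=lambda g: g.fitness)  # select a parent (greedy here)
        step = rng.uniform(-1.0, 1.0)                      # stand-in for prompt + proposal
        candidate = parent.content + step                  # propose a candidate
        child = ToyGenome(candidate, evaluate(candidate))  # evaluate it
        population.append(child)                           # add the scored genome back
    return max(population, key=lambda g: g.fitness)


best = run_loop(seed=0.0, max_iterations=200, rng=random.Random(0))
print(best.content, best.fitness)
```

Because the seed stays in the population, the returned genome can never score below the seed, which matches the intuition that a run reports the best candidate seen so far.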

The four objects

Model

GalapagosModel.from_card(name=..., host=...) resolves a hosted, OpenAI-compatible endpoint. The supported hosts are openai, openrouter, togetherai (alias together), litellm, vllm, huggingface (alias hf), azure, bedrock, anthropic, and google. The API key is read from the environment: OPENAI_API_KEY for all hosts (see the host/env-var tables in Models).

model = gx.GalapagosModel.from_card(name="openai/gpt-4o-mini", host="openrouter")

Every run calls a live LLM, so set an OpenRouter key first: Galapagos reads it from OPENAI_API_KEY (see Installation).
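Since a missing key only surfaces once the first model call fails, it can help to check the environment up front. A minimal sketch, assuming nothing about Galapagos beyond the variable name stated above; require_api_key is a hypothetical helper, not part of the library.

```python
import os


def require_api_key(var: str = "OPENAI_API_KEY") -> str:
    """Return the key from the environment, or raise with a readable message."""
    key = os.environ.get(var, "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set; export it before starting a run")
    return key
```

Calling require_api_key() at the top of a script fails fast, before any scaffold or task is constructed.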

Config

GalapagosConfig.from_config(scaffold_name=...) loads the scaffold's bundled default config; pass path="cfg.yaml" instead to load your own. Read and override tunables with dotted paths:

config = gx.GalapagosConfig.from_config(scaffold_name="openevolve")
config.set("database.num_islands", 8)
config.set("budget.max_iterations", 200)
config.get("budget.max_iterations")     # -> 200

The budget section maps onto the stopping conditions (see below).
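For readers unfamiliar with dotted-path access, the idea is that "budget.max_iterations" addresses the max_iterations key inside the budget section. The sketch below shows one way such lookups can work on a plain nested dict; set_path and get_path are hypothetical helpers for illustration, and how GalapagosConfig implements get and set internally is not shown in this page.

```python
from typing import Any


def set_path(tree: dict, dotted: str, value: Any) -> None:
    """Set tree['a']['b'] for dotted == 'a.b', creating missing sections."""
    *parents, leaf = dotted.split(".")
    node = tree
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def get_path(tree: dict, dotted: str, default: Any = None) -> Any:
    """Read tree['a']['b'] for dotted == 'a.b'; return default if a key is missing."""
    node: Any = tree
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


cfg: dict = {"budget": {"max_iterations": 100}}
set_path(cfg, "database.num_islands", 8)      # creates the 'database' section
set_path(cfg, "budget.max_iterations", 200)   # overrides an existing value
print(get_path(cfg, "budget.max_iterations"))  # 200
```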

Scaffold

Three ways to construct one:

# 1. by name via the registry (the base class dispatches)
scaffold = gx.GalapagosScaffold.from_card(name="openevolve", config=config, model=model)

# 2. a concrete subclass loads its own card + defaults (config/model optional)
scaffold = gx.OpenEvolveScaffold.from_card(model=model)

# 3. build-your-own from components (see "Write your own scaffold")
scaffold = gx.GalapagosScaffold.from_card(population=..., selection_policy=..., proposer=...)

List the runnable scaffolds at any time:

gx.available_scaffolds()    # every bundled card: ['adaevolve', 'beam_search', 'best_of_n', 'best_of_n_attempts', 'evox', 'meta_harness', 'openevolve', 'topk']
gx.registered_scaffolds()   # the runnable subset — the same eight

All bundled scaffolds are runnable

The catalog ships eight cards (adaevolve, beam_search, best_of_n, best_of_n_attempts, evox, meta_harness, openevolve, and topk), and every one has a runnable Python controller. GalapagosScaffold.from_card(name="nope", ...) raises a KeyError that lists the runnable set.

Task

GalapagosTask.from_card(name=...) loads the problem statement, the seed program, and the Evaluator. The Evaluator is supplied by the task, not the scaffold — so any scaffold runs against any task.

task = gx.GalapagosTask.from_card(name="circle_packing")
task.context            # the problem text injected into prompts
task.runnable           # True iff it ships a seed + evaluator.py
task.status             # 'stable'
task.initial_genome()   # the seed Genome

The catalog bundles 64 runnable tasks; circle_packing, function_minimization, and playground_sphere are the canonical quickstart examples. See the task catalog.

The budget

The run stops as soon as any configured bound is hit. Set them on the config's budget section, or override the iteration count inline on run:

config.set("budget.max_iterations", 100)   # cap on iterations
config.set("budget.target_score", 1.0)     # stop early once reached
config.set("budget.max_usd", 5.0)          # hard $ ceiling (live model calls)
config.set("budget.patience", 30)          # stop after N iters with no best-score gain
config.set("budget.wallclock_s", 600)      # stop after N seconds

result = scaffold.run(task=task, max_iterations=50)   # inline override of max_iterations
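The "first bound hit wins" rule can be pictured as a single check run once per iteration. This is a sketch under stated assumptions, not the library's code: ToyBudget and stop_reason are hypothetical names, unset bounds are modeled as None, every comparison is assumed to be inclusive (>=), and the order of the checks is arbitrary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyBudget:
    """Hypothetical mirror of the budget section; None means 'no bound'."""
    max_iterations: Optional[int] = None
    target_score: Optional[float] = None
    max_usd: Optional[float] = None
    patience: Optional[int] = None
    wallclock_s: Optional[float] = None


def stop_reason(budget: ToyBudget, *, iteration: int, best_score: float,
                cost_usd: float, stale_iters: int, elapsed_s: float) -> Optional[str]:
    """Return the name of the first bound that is hit, or None to keep going."""
    checks = [
        ("max_iterations", budget.max_iterations, iteration),
        ("target_score", budget.target_score, best_score),
        ("max_usd", budget.max_usd, cost_usd),
        ("patience", budget.patience, stale_iters),
        ("wallclock_s", budget.wallclock_s, elapsed_s),
    ]
    for name, bound, observed in checks:
        if bound is not None and observed >= bound:
            return name
    return None


budget = ToyBudget(max_iterations=100, target_score=1.0, max_usd=5.0,
                   patience=30, wallclock_s=600)
early = stop_reason(budget, iteration=12, best_score=0.4, cost_usd=0.1,
                    stale_iters=3, elapsed_s=20.0)
done = stop_reason(budget, iteration=12, best_score=1.2, cost_usd=0.1,
                   stale_iters=3, elapsed_s=20.0)
print(early, done)  # None target_score
```

Leaving a field as None simply removes that bound, so a budget with only max_iterations set behaves like a plain iteration cap.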

Reading RunResult

run returns a RunResult:

result = scaffold.run(task=task)

result.best          # the best Genome (or None)
result.best_score    # result.best.fitness, or -inf
result.history       # list[Genome] — the seed + every evaluated child, in order
result.run_dir       # run directory (if the scaffold set one)
result.summary       # a dict, e.g.:
{
    "scaffold": "openevolve",
    "task": "circle_packing",
    "iterations": 100,        # loop steps taken
    "evaluations": 94,        # genomes evaluated = 1 (seed) + iterations - no_diff
    "best_score": 2.61,       # best combined_score
    "cost_usd": 0.42,         # accumulated model spend
    "no_diff": 7,             # wasted steps where the Proposer returned a no-op
    "population_size": 40,    # genomes currently in the Population
}

The winning artifact is result.best.content (a string of source code). Its metric dict is result.best.scores and the headline number is result.best.fitness (== result.best.scores["combined_score"]).
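The summary fields are related by the accounting rule noted in the example (evaluations = 1 + iterations - no_diff), so a few derived ratios are easy to compute. The sketch below works on a plain dict shaped like the example summary; check_summary is a hypothetical helper, and the values are the illustrative ones from above rather than real run output.

```python
def check_summary(summary: dict) -> dict:
    """Derive a consistency flag and two ratios from a summary-shaped dict."""
    expected = 1 + summary["iterations"] - summary["no_diff"]
    return {
        # Does the evaluation count match the stated accounting rule?
        "accounting_ok": summary["evaluations"] == expected,
        # Fraction of loop steps wasted on no-op proposals.
        "no_op_rate": summary["no_diff"] / max(summary["iterations"], 1),
        # Average model spend per evaluated genome.
        "usd_per_evaluation": summary["cost_usd"] / max(summary["evaluations"], 1),
    }


summary = {
    "scaffold": "openevolve",
    "task": "circle_packing",
    "iterations": 100,
    "evaluations": 94,
    "best_score": 2.61,
    "cost_usd": 0.42,
    "no_diff": 7,
    "population_size": 40,
}
report = check_summary(summary)
print(report)
```

A high no_op_rate suggests that many iterations spent budget without producing a new candidate.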

A short, cheap run

Every run calls a live LLM and spends budget. Keep an exploratory run small by starting on the tiny playground_sphere task and capping the iteration count:

import galapagos as gx

model    = gx.load_model("openai/gpt-4o-mini", host="openrouter")
scaffold = gx.OpenEvolveScaffold.from_card(model=model)
task     = gx.load_task("playground_sphere")    # the fastest task

result   = scaffold.run(task=task, max_iterations=20)
print(result.best_score)                         # ideally higher than the seed score

The CLI

The galapagos console script wraps the same flow. --model is required, and the run reads your OpenRouter key from OPENAI_API_KEY.

# a short run on the smallest task
galapagos run --scaffold openevolve --task playground_sphere \
    --model openai/gpt-4o-mini --host openrouter --iters 20

# a longer run on circle_packing
galapagos run --scaffold openevolve --task circle_packing \
    --model openai/gpt-5.5 --host openrouter --iters 100

# point at a custom config YAML, set the seed
galapagos run --scaffold adaevolve --task function_minimization \
    --model openai/gpt-4o-mini --host openrouter --config my_config.yaml --seed 7

# inspect the catalogs
galapagos scaffold list
galapagos task list

galapagos run prints the final best_score and the summary JSON. Use galapagos submit to validate a card.