Python API¶

The package is galapagos (pip install name: galapagos; editable: pip install -e .). The public surface is small — import everything from the top level.

import galapagos as gx

# primary classes
gx.GalapagosModel, gx.GalapagosConfig, gx.GalapagosScaffold, gx.GalapagosTask
gx.AdaEvolveScaffold, gx.BeamSearchScaffold, gx.BestOfNScaffold, gx.EvoXScaffold, gx.OpenEvolveScaffold, gx.TopKScaffold

# records
gx.Genome, gx.Selection, gx.EvalResult, gx.RunState, gx.RunResult, gx.Budget

# registry + functional loaders
gx.available_scaffolds, gx.available_tasks, gx.registered_scaffolds
gx.AutoScaffold, gx.register_scaffold
gx.load_scaffold, gx.load_model, gx.load_task, gx.load_config

Primary classes¶

`GalapagosModel`¶

GalapagosModel.from_card(name=None, host=None, *, path=None, **kw) -> GalapagosModel

Resolve a model from a name + host (or a model-card YAML at path). Hosts: openai, openrouter, togetherai (alias together), litellm, vllm, huggingface (alias hf), azure, bedrock, anthropic, google. The API key is read from OPENAI_API_KEY. Subclasses implement generate(prompt: Prompt) -> Generation.

model = gx.GalapagosModel.from_card(name="openai/gpt-4o-mini", host="openrouter")
model = gx.load_model("openai/gpt-4o-mini", host="openrouter")   # functional alias

`GalapagosConfig`¶

GalapagosConfig.from_config(scaffold_name=None, *, path=None, **overrides) -> GalapagosConfig
cfg.get(dotted, default=None) -> Any
cfg.set(dotted, value) -> GalapagosConfig
cfg.section(name) -> dict
cfg.as_dict() -> dict

A thin config object over a nested dict, accessed by dotted paths. from_config(scaffold_name="openevolve") loads the scaffold's bundled defaults; from_config(path="cfg.yaml") loads a file.

cfg = gx.GalapagosConfig.from_config(scaffold_name="openevolve")
cfg.set("database.num_islands", 8).set("budget.max_iterations", 200)
cfg.get("budget.max_iterations")     # -> 200
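To make the dotted-path idea concrete, here is a minimal standalone sketch of how get/set over a nested dict can work. The helper names `dotted_get` and `dotted_set` are hypothetical and are not part of galapagos; the library's own implementation may differ.

```python
from typing import Any


def dotted_get(data: dict, dotted: str, default: Any = None) -> Any:
    """Walk a nested dict along a dotted path; return default if a key is missing."""
    node: Any = data
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def dotted_set(data: dict, dotted: str, value: Any) -> dict:
    """Set a value at a dotted path, creating intermediate dicts as needed."""
    *parents, leaf = dotted.split(".")
    node = data
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value
    return data  # returned so calls can be chained or nested


# Illustrative data only; not the bundled openevolve defaults.
cfg_data = {"budget": {"max_iterations": 100}}
dotted_set(cfg_data, "database.num_islands", 8)
dotted_set(cfg_data, "budget.max_iterations", 200)
```

Returning the container from the setter is what makes chained calls such as `cfg.set(...).set(...)` possible.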

`GalapagosScaffold`¶

The orchestrator that drives the six components around the loop.

GalapagosScaffold.from_card(name=None, *, path=None, config=None, model=None,
                            population=None, selection_policy=None, prompt_builder=None,
                            proposer=None, evaluator=None, memory=None, seed=None, **kw) -> GalapagosScaffold
scaffold.run(task=None, *, max_iterations=None) -> RunResult

Three load modes: by name (from_card("openevolve", model=...) — dispatches via the registry), subclass defaults (OpenEvolveScaffold.from_card() — loads its own card + config + default model), and build-your-own (pass component instances / module.Class paths / .py paths to the six role kwargs).

@classmethod
def build_components(cls, config, model) -> dict   # the five scaffold-side components (override in a subclass)

# adaptation hooks (no-ops by default):
def before_step(self) -> None: ...                 # before selection
def after_step(self, child: Genome, result) -> None: ...   # after eval (result is None on a no-op)
def periodic(self) -> None: ...                    # once per iteration, after the step

The seven runnable subclasses are AdaEvolveScaffold, BeamSearchScaffold, BestOfNScaffold, EvoXScaffold, MetaHarnessScaffold, OpenEvolveScaffold, and TopKScaffold. See Write your own scaffold.
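The hook ordering described above can be pictured with a toy loop. This is a sketch under assumptions: `ToyLoop`, `CountingLoop`, and `step` are hypothetical stand-ins, and the real `GalapagosScaffold.run` may do more work between hooks.

```python
class ToyLoop:
    """Hypothetical stand-in for a scaffold; records the order in which hooks fire."""

    def __init__(self):
        self.calls = []

    def before_step(self):
        self.calls.append("before_step")

    def step(self):
        # Placeholder for: select -> build prompt -> propose -> evaluate.
        self.calls.append("step")
        return "child", {"combined_score": 1.0}

    def after_step(self, child, result):
        self.calls.append("after_step")

    def periodic(self):
        self.calls.append("periodic")

    def run(self, max_iterations):
        for _ in range(max_iterations):
            self.before_step()              # before selection
            child, result = self.step()
            self.after_step(child, result)  # after eval
            self.periodic()                 # once per iteration, after the step


class CountingLoop(ToyLoop):
    """Overrides one hook to count steps that produced no result."""

    def __init__(self):
        super().__init__()
        self.no_ops = 0

    def step(self):
        self.calls.append("step")
        return "child", None  # simulate a no-op proposal

    def after_step(self, child, result):
        super().after_step(child, result)
        if result is None:
            self.no_ops += 1


loop = ToyLoop()
loop.run(max_iterations=1)
counting = CountingLoop()
counting.run(max_iterations=3)
```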

`GalapagosTask`¶

GalapagosTask.from_card(name=None, *, path=None) -> GalapagosTask
task.context -> str               # the problem statement injected into prompts
task.status -> str                # 'stable' | 'experimental' | 'spec' | 'external'
task.runnable -> bool             # True iff it ships a seed + evaluator
task.initial_genome() -> Genome   # the seed Genome
task.evaluator -> Evaluator | None  # a SubprocessEvaluator over the task's evaluator.py

The task supplies the Evaluator (not the scaffold), so any scaffold runs against any task.

The six components¶

Import these from galapagos.components (from galapagos.components import ...). Every component is an abstract base class with shipped implementations.

Population¶

class Population:                       # the candidate store
    def add(self, genome: Genome) -> bool: ...      # returns whether admitted
    def query(self, spec: dict | None = None) -> list[Genome]: ...
    def all(self) -> list[Genome]: ...
    def best(self) -> Genome | None: ...

Impl	Purpose
`InMemoryPopulation(capacity=1000)`	A bounded top-k / leaderboard list kept sorted by fitness.
`IslandPopulation(num_islands=4, migration_interval=25, migration_rate=2, descriptor=None)`	Islands of MAP-Elites cells with periodic ring migration.
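As an illustration of the bounded, fitness-sorted store idea, the sketch below keeps at most `capacity` candidates and reports whether each one was admitted. `ToyGenome` and `ToyTopK` are hypothetical names; this is not the `InMemoryPopulation` source, and its tie-breaking rule is an assumption.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class ToyGenome:
    content: str
    fitness: float


@dataclass
class ToyTopK:
    capacity: int = 3
    _items: list = field(default_factory=list)  # kept sorted, best first

    def add(self, genome: ToyGenome) -> bool:
        # Negate fitness so bisect (ascending) yields descending-fitness order.
        keys = [-g.fitness for g in self._items]
        idx = bisect.bisect_right(keys, -genome.fitness)
        if idx >= self.capacity:
            return False  # no better than the worst entry of a full store
        self._items.insert(idx, genome)
        del self._items[self.capacity:]  # evict anything past capacity
        return True

    def best(self):
        return self._items[0] if self._items else None

    def all(self):
        return list(self._items)


pop = ToyTopK(capacity=2)
admitted = [
    pop.add(ToyGenome("a", 1.0)),
    pop.add(ToyGenome("b", 3.0)),
    pop.add(ToyGenome("c", 2.0)),
    pop.add(ToyGenome("d", 0.5)),
]
```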

SelectionPolicy¶

class SelectionPolicy:                  # the active, stateful policy
    def select(self, population, state: RunState | None = None) -> Selection: ...
    def observe(self, genome: Genome, state: RunState | None = None) -> None: ...

Impl	Purpose
`ExploreExploitPolicy(seed=0, explore_ratio=0.3, num_inspirations=3)`	Explore/exploit split + fitness-weighted exploit; diverse inspirations.
`UCBBanditPolicy(seed=0, num_islands=4, c=1.4, num_inspirations=2)`	UCB1 routing over islands; mirrors posteriors into `state.signals['ucb']`.
`IdentityPolicy()`	Delegated/agent-driven: returns the whole population, defers the choice to the Proposer.
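For readers unfamiliar with UCB1, the following standalone sketch shows the textbook rule for choosing among arms (islands, in this setting). The function name `ucb1_pick` is hypothetical, and whether `UCBBanditPolicy` uses exactly this scoring formula is an assumption; the default `c=1.4` is close to the √2 constant of textbook UCB1.

```python
import math


def ucb1_pick(counts, totals, c=1.4):
    """Return the arm index maximising mean reward + c * sqrt(ln(N) / n)."""
    # Try every arm once before trusting the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm

    total_pulls = sum(counts)

    def score(arm):
        mean = totals[arm] / counts[arm]
        bonus = c * math.sqrt(math.log(total_pulls) / counts[arm])
        return mean + bonus

    return max(range(len(counts)), key=score)
```

The bonus term shrinks as an arm is visited more often, so rarely visited islands are revisited even when their average reward is lower.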

PromptBuilder¶

class PromptBuilder:                    # the renderer (pure formatting, no selection)
    def build(self, selection: Selection, memory=None, state: RunState | None = None) -> Prompt: ...

Impl	Purpose
`DefaultPromptBuilder(system_message=None, max_inspiration_chars=600, include_memory=True)`	The canonical multi-section template (task → metrics → feedback → inspirations → memory → current program).

Proposer¶

class Proposer:                         # the variation operator
    def propose(self, prompt, env: Env) -> Genome: ...

Env(model, selection, evaluator=None, memory=None, state=None) is the toolbox handed to a Proposer.

Impl	Purpose
`DiffProposer()`	One LLM call → SEARCH/REPLACE diff (or full rewrite) applied to the parent; no-op detection.
`CrossoverProposer(novelty_threshold=0.9, recent=12)`	Crossover + token-Jaccard novelty rejection (one resample on near-duplicates).
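Token-Jaccard novelty rejection can be sketched as follows. The whitespace tokenisation, the `>=` comparison, and the helper names are assumptions made for illustration; `CrossoverProposer` may tokenise and compare differently.

```python
def token_jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the whitespace-token sets of two strings."""
    ta, tb = set(a.split()), set(b.split())
    if not ta and not tb:
        return 1.0  # two empty programs are treated as identical
    return len(ta & tb) / len(ta | tb)


def is_near_duplicate(candidate: str, recent: list, threshold: float = 0.9) -> bool:
    """True if the candidate is too similar to any recently seen program."""
    return any(token_jaccard(candidate, other) >= threshold for other in recent)
```

A proposer built this way would resample once when `is_near_duplicate` returns True, matching the "one resample on near-duplicates" behaviour described in the table.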

Helper: apply_edit(parent_code, response) -> (new_code, changed).
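A rough sketch of what a helper with this return shape might do is shown below. The SEARCH/REPLACE marker syntax used here is an assumption (this page does not specify the format), `apply_edit_sketch` is a hypothetical name, and the full-rewrite fallback is deliberately left out.

```python
import re

# Assumed marker format; the format galapagos expects is not specified on this page.
_BLOCK = re.compile(
    r"<<<<<<< SEARCH\n(.*?)\n=======\n(.*?)\n>>>>>>> REPLACE",
    re.DOTALL,
)


def apply_edit_sketch(parent_code: str, response: str) -> tuple[str, bool]:
    """Apply each SEARCH/REPLACE block once; report whether anything changed."""
    new_code = parent_code
    for search, replace in _BLOCK.findall(response):
        if search in new_code:
            new_code = new_code.replace(search, replace, 1)
    # changed == False is the no-op case: no block matched the parent.
    return new_code, new_code != parent_code


parent = "def f(x):\n    return x + 1\n"
response = (
    "<<<<<<< SEARCH\n"
    "    return x + 1\n"
    "=======\n"
    "    return x + 2\n"
    ">>>>>>> REPLACE"
)
new_code, changed = apply_edit_sketch(parent, response)
```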

Evaluator¶

class Evaluator:                        # the pure scorer (supplied by the task)
    def evaluate(self, genome: Genome) -> EvalResult: ...

Impl	Purpose
`SubprocessEvaluator(evaluator_path, timeout=120, suffix=".py")`	Runs the task's `evaluator.py` in an isolated subprocess.

Task evaluator contract: evaluate(program_path) -> dict with at least combined_score (float), and optional validity / status / per_instance / artifacts.text_feedback.
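A minimal evaluator.py that satisfies this contract might look like the sketch below. The `solve` function, the test cases, and the nesting of `text_feedback` under an `artifacts` key are assumptions for illustration; only the `evaluate(program_path) -> dict` shape and the `combined_score` key come from the contract above.

```python
import importlib.util
import tempfile
from pathlib import Path


def evaluate(program_path: str) -> dict:
    """Load the candidate program and score a hypothetical solve() function."""
    spec = importlib.util.spec_from_file_location("candidate", program_path)
    module = importlib.util.module_from_spec(spec)
    try:
        spec.loader.exec_module(module)
        cases = [(1, 2), (5, 10), (0, 0)]  # (input, expected output)
        passed = [1.0 if module.solve(x) == want else 0.0 for x, want in cases]
    except Exception as exc:
        # A crashing candidate still returns a well-formed result.
        return {
            "combined_score": 0.0,
            "artifacts": {"text_feedback": f"candidate crashed: {exc!r}"},
        }
    return {
        "combined_score": sum(passed) / len(passed),
        "per_instance": passed,
    }


# Try it on a throwaway candidate written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    program = Path(tmp) / "program.py"
    program.write_text("def solve(x):\n    return 2 * x\n")
    result = evaluate(str(program))
```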

Memory¶

class Memory:                           # free-form knowledge (optional)
    def read(self, spec: dict | None = None) -> str: ...
    def write(self, knowledge: str, **meta) -> None: ...

Impl	Purpose
`NullMemory()`	The empty memory — the default.
`ScratchpadMemory(max_notes=8)`	A rolling meta-scratchpad of distilled design insights.
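The rolling behaviour of a bounded scratchpad can be sketched with a `deque`. `ToyScratchpad` is a hypothetical class that mirrors the `read`/`write` interface above; the bullet formatting of `read` is an assumption, not the output of `ScratchpadMemory`.

```python
from collections import deque


class ToyScratchpad:
    """Keeps only the most recent max_notes notes."""

    def __init__(self, max_notes: int = 8):
        self._notes = deque(maxlen=max_notes)  # oldest note drops off automatically

    def write(self, knowledge: str, **meta) -> None:
        self._notes.append(knowledge)

    def read(self, spec=None) -> str:
        return "\n".join(f"- {note}" for note in self._notes)


pad = ToyScratchpad(max_notes=2)
for note in ("try caching", "avoid recursion", "vectorise the inner loop"):
    pad.write(note)
```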

Records¶

from galapagos import Genome, Selection, EvalResult, RunState, RunResult, Budget

`Genome`¶

@dataclass
class Genome:
    content: str                        # the artifact being evolved (code, prompts, config, ...)
    id: str                             # auto-assigned 'g000001'
    parent_id: str | None = None
    lineage: str = ""
    scores: dict[str, float] = {}       # filled by the Evaluator
    metadata: dict = {}                 # selection data: island, cell, generation, embeddings, ...
    artifacts: dict = {}                # evaluator side-output (text_feedback, traces)

    @property
    def fitness(self) -> float          # scores['combined_score'], else mean of numeric scores, else -inf
    def child(self, content, **metadata) -> Genome   # descendant with lineage wired up
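The fallback order documented for `fitness` can be written out as a standalone function. This is a sketch of the stated rule, not the library source; the name `fitness_of` is hypothetical, and excluding booleans from "numeric scores" is an assumption.

```python
import math


def fitness_of(scores: dict) -> float:
    """combined_score if present, else the mean of numeric scores, else -inf."""
    if "combined_score" in scores:
        return float(scores["combined_score"])
    numeric = [
        v for v in scores.values()
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numeric:
        return sum(numeric) / len(numeric)
    return -math.inf
```

Returning -inf for an unscored genome means it always sorts below any evaluated candidate.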

`Selection`¶

@dataclass
class Selection:
    parent: Genome | None               # the parent to mutate (None => delegated selection)
    inspirations: list[Genome] = []     # context-only inspirations
    pool: list[Genome] = []             # the full visible population (for delegated selection)

`EvalResult`¶

@dataclass
class EvalResult:
    metrics: dict[str, float] = {}      # must contain 'combined_score'
    artifacts: dict = {}
    valid: bool = True                  # gates admission
    per_instance: list[float] | None = None   # per-test-case success vector
    text_feedback: str | None = None    # surfaced into later prompts

    @property
    def combined_score(self) -> float

`RunState`¶

@dataclass
class RunState:
    iteration: int = 0
    cost_usd: float = 0.0
    prompt_tokens: int = 0
    completion_tokens: int = 0
    best: Genome | None = None
    run_dir: str | None = None
    task_context: str = ""
    signals: dict = {}                  # adaptive policies stash cross-cutting state here
    started_at: float

    def record_cost(self, cost_usd, prompt_tokens=0, completion_tokens=0) -> None
    @property
    def elapsed_s(self) -> float

`Budget`¶

@dataclass
class Budget:
    max_iterations: int = 100
    max_usd: float | None = None
    target_score: float | None = None
    patience: int | None = None         # stop after N iters with no best-score gain
    wallclock_s: float | None = None

The run stops as soon as any configured bound is hit. Built from the config's budget section.
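The "any configured bound" rule can be sketched as a single check that ignores bounds left at None. `ToyBudget` mirrors the fields above, while `stop_reason`, its keyword arguments, and the use of `>=` comparisons are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyBudget:
    max_iterations: int = 100
    max_usd: Optional[float] = None
    target_score: Optional[float] = None
    patience: Optional[int] = None
    wallclock_s: Optional[float] = None


def stop_reason(budget, *, iteration, cost_usd, best_score,
                iters_since_gain, elapsed_s):
    """Return the name of the first bound that is hit, or None to keep running."""
    if iteration >= budget.max_iterations:
        return "max_iterations"
    if budget.max_usd is not None and cost_usd >= budget.max_usd:
        return "max_usd"
    if budget.target_score is not None and best_score >= budget.target_score:
        return "target_score"
    if budget.patience is not None and iters_since_gain >= budget.patience:
        return "patience"
    if budget.wallclock_s is not None and elapsed_s >= budget.wallclock_s:
        return "wallclock_s"
    return None


budget = ToyBudget(max_iterations=10, max_usd=1.0, patience=3)
base = dict(iteration=2, cost_usd=0.1, best_score=0.5,
            iters_since_gain=0, elapsed_s=1.0)
```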

`RunResult`¶

@dataclass
class RunResult:
    best: Genome | None
    summary: dict = {}                  # {scaffold, task, iterations, evaluations, best_score, cost_usd, no_diff, population_size}
    run_dir: str | None = None
    history: list[Genome] = []          # the seed + every evaluated child, in order

    @property
    def best_score(self) -> float       # best.fitness, or -inf

Loaders & registry¶

load_model(name=None, host=None, *, path=None, base_url=None, **kw) -> GalapagosModel
load_config(scaffold_name=None, *, path=None, **overrides) -> GalapagosConfig
load_scaffold(name=None, *, path=None, model=None, config=None, **kw) -> GalapagosScaffold
load_task(name=None, *, path=None) -> GalapagosTask

available_scaffolds() -> list[str]      # all bundled scaffold cards (all runnable)
available_tasks() -> list[str]          # all bundled task cards
registered_scaffolds() -> list[str]     # the runnable subset (== available_scaffolds() today)

@register_scaffold("name")              # decorator: wire a Scaffold subclass to its card name
AutoScaffold.from_card(name, ...)       # name -> runnable scaffold (used internally by load_scaffold)

The functional load_* functions are thin aliases for the corresponding classmethods: from_card for models, scaffolds, and tasks, and from_config for load_config.

Cards¶

from galapagos.cards.registry import (
    load_scaffold_card, load_task_card, available_scaffolds, available_tasks,
)
from galapagos.cards.schema import ScaffoldCard, TaskCard, ModelCard, VerificationCard

See Scaffold & Task cards for the schemas.
