default/best_of_n

Best-of-N

Give the LLM N valid attempts at the same parent before committing to the global best, then repeat.

Test-time search · Apache-2.0

About

Best-of-N is a test-time search baseline that deliberately exploits one program state at a time. It picks a parent and reuses it until N valid children have been produced from it — N independent variations from a single starting point — and only then commits to the current global best and repeats the cycle. Inspirations (context programs shown alongside the parent) are re-sampled fresh from the top pool at every step, regardless of where the reuse cycle stands.
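The cycle described above can be sketched as a small loop. This is a minimal illustration, not the Galapagos or SkyDiscover code: the `(name, score)` tuple shape, the `propose` callback, and the `top_k` and `num_inspirations` parameters are all assumptions made for the example.

```python
import random
from typing import Callable, List, Optional, Tuple

Program = Tuple[str, float]  # hypothetical shape: (name, score)


def best_of_n_loop(
    seed: Program,
    propose: Callable[[Program, List[Program]], Optional[Program]],
    n: int,
    steps: int,
    top_k: int = 3,             # assumed size of the "top pool"
    num_inspirations: int = 2,  # assumed number of context programs
    rng: Optional[random.Random] = None,
) -> Tuple[List[Program], List[str]]:
    """Reuse one parent until it has n valid children, then recommit."""
    rng = rng or random.Random(0)
    pool: List[Program] = [seed]
    parent = seed
    valid_children = 0
    parents_used: List[str] = []

    for _ in range(steps):
        # Inspirations are drawn fresh from the top pool on every step,
        # independent of how far the reuse cycle has progressed.
        top = sorted(pool, key=lambda p: p[1], reverse=True)[:top_k]
        inspirations = rng.sample(top, k=min(num_inspirations, len(top)))

        parents_used.append(parent[0])
        child = propose(parent, inspirations)  # None models a failed attempt
        if child is None:
            continue

        pool.append(child)
        valid_children += 1
        if valid_children >= n:
            # Commit to the current global best and start a new cycle.
            parent = max(pool, key=lambda p: p[1])
            valid_children = 0

    return pool, parents_used
```

With a `propose` that always succeeds and raises the score by one, `n=2` keeps the seed as parent for two steps and then switches to its best child for the next two.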

This scaffold is a faithful port of SkyDiscover's `BestOfNDatabase` from the UC Berkeley Sky Computing Lab. In Galapagos that single class is split along the standard component seam: a flat keep-all `InMemoryPopulation` stores every scored program and re-derives the global best on demand, while a stateful `BestOfNPolicy` owns the parent-reuse counter. As in the original, only a validly scored child advances the counter: SkyDiscover increments it inside `add()`, which never runs for an error result, so a parse or evaluation failure is a free retry that does not spend the per-parent budget.
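The split between a stateless store and a stateful policy might look roughly like the sketch below. The class and method names (`KeepAllPopulation`, `ReusePolicy`, `on_valid_child`) are hypothetical stand-ins chosen for illustration; they are not the actual `InMemoryPopulation` or `BestOfNPolicy` interfaces, which this page does not show.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Program:
    code: str
    score: float


@dataclass
class KeepAllPopulation:
    """Flat store: keeps every scored program, derives the best on demand."""
    programs: List[Program] = field(default_factory=list)

    def add(self, program: Program) -> None:
        self.programs.append(program)

    def best(self) -> Program:
        return max(self.programs, key=lambda p: p.score)


@dataclass
class ReusePolicy:
    """Owns the parent-reuse counter; holds no programs of its own."""
    n: int
    parent: Optional[Program] = None
    valid_children: int = 0

    def select(self, population: KeepAllPopulation) -> Program:
        # Recommit to the global best only once n valid children exist.
        if self.parent is None or self.valid_children >= self.n:
            self.parent = population.best()
            self.valid_children = 0
        return self.parent

    def on_valid_child(self) -> None:
        # Called only for scored children; a failed attempt never reaches
        # this hook, so it does not spend the per-parent budget.
        self.valid_children += 1
```

In this sketch the caller is responsible for invoking `on_valid_child()` right after `population.add(child)` and for skipping both when the attempt errored, which mirrors the free-retry behavior described above.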

The single tuning knob is N. Larger N deepens exploitation of one program state, spending more of the budget refining variations before moving on; N=1 advances to a new best after every valid child and so approaches the behavior of Top-K. If you instead want a strictly fixed per-parent budget where every attempt counts whether or not it scored, the `best_of_n_attempts` sibling spends one budget unit per selection rather than per valid child.
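To make the difference between the two accounting rules concrete, the sketch below counts how many selections each parent receives for a given sequence of attempt outcomes. It models only what the paragraph above states; the function name and the `count_failures` flag are assumptions for the example, not options exposed by either scaffold.

```python
from typing import List


def selections_per_parent(
    outcomes: List[bool], n: int, count_failures: bool
) -> List[int]:
    """Return how many selections each successive parent received.

    outcomes[i] is True when attempt i produced a validly scored child.
    count_failures=False models spending budget per valid child;
    count_failures=True models spending budget per selection.
    """
    lengths: List[int] = []
    spent = 0      # budget units used on the current parent
    current = 0    # selections made with the current parent
    for ok in outcomes:
        current += 1
        if ok or count_failures:
            spent += 1
        if spent >= n:
            lengths.append(current)
            spent = 0
            current = 0
    if current:
        lengths.append(current)  # unfinished final cycle
    return lengths


outcomes = [True, False, False, True, True, False, True]
per_valid_child = selections_per_parent(outcomes, n=2, count_failures=False)
per_selection = selections_per_parent(outcomes, n=2, count_failures=True)
```

For this outcome sequence with N=2, counting per valid child gives `[4, 3]` selections per parent, while counting per selection gives `[2, 2, 2, 1]`: failures stretch a cycle in the first mode and consume it in the second.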

Composition

5/6 blocks

The six components this scaffold snaps together. Each block names its concrete implementation.

Population: keep_all
Selection: best_of_n_reuse
Prompt: default
Proposer: diff
Evaluator: task
Memory: none

Population (keep_all): The set of candidate solutions in play, the gene pool the search evolves over.
Selection (best_of_n_reuse): Decides which genomes survive and reproduce, whether by tournament, elitism, novelty, or your own policy.
Prompt (default): Assembles the context handed to the model: parents, feedback, instructions, examples.
Proposer (diff): The LLM-driven variation operator, which proposes new candidates by mutation and crossover.
Evaluator (task): Scores each candidate against the task, providing the fitness signal that drives selection.

Source

SkyDiscover (UC Berkeley Sky Computing Lab) — best_of_n search strategy

Quick facts

Downloads: 0

License: Apache-2.0

Default model: not specified

Controller: BestOfNScaffold

Use this scaffold

example.py

from galapagos import GalapagosScaffold

scaffold = GalapagosScaffold.from_card(name="best_of_n")
result = scaffold.run(task="<task_name>")
