Best-of-N
Give the LLM N valid attempts at the same parent before committing to the global best, then repeat.
"""Best-of-N — give the LLM N attempts at the same parent before committing to the global best.
Ported from SkyDiscover's ``BestOfNDatabase`` (UC Berkeley Sky Computing Lab). One module per component:
    population.py        -> BestOfNPopulation     (keep-all archive)
    selection_policy.py  -> BestOfNPolicy         (reuse parent N times, then switch to the global best)
    prompt_builder.py    -> BestOfNPromptBuilder  (the default multi-section template)
    proposer.py          -> BestOfNProposer       (SEARCH/REPLACE diff)
    evaluator.py         -> BestOfNEvaluator      (task-supplied)
    memory.py            -> BestOfNMemory         (none)
    scaffold.py          -> BestOfNScaffold       (the orchestrator that composes the six)
A single parent is reused until N **valid** children have been produced from it — N independent
variations from one starting point — before the search commits to the current global best and
repeats. Faithful to SkyDiscover, a failed (parse/invalid) attempt is a free retry that does not
spend the budget; the ``best_of_n_attempts`` variant instead spends one unit per attempt. Larger N
deepens exploitation of a single program state; with N=1 the parent is re-selected after every valid
child, so the search approaches Top-K.
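
The rule above can be sketched as follows. This is a minimal illustration under stated assumptions,
not the actual ``BestOfNPolicy`` code: ``ParentTracker``, ``select`` and ``report`` are hypothetical
names, scores are assumed to be "higher is better", and a failed attempt is modeled as a score of
``None``.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ParentTracker:
    # Hypothetical helper for illustration only; not part of this package.
    n: int                                                  # valid children required per parent
    scores: Dict[str, float] = field(default_factory=dict)  # program id -> score (assumed: higher is better)
    parent: Optional[str] = None
    valid_children: int = 0

    def select(self) -> str:
        # Commit to the global best on first use, or once the current
        # parent has produced N valid children; otherwise reuse the parent.
        if self.parent is None or self.valid_children >= self.n:
            self.parent = max(self.scores, key=self.scores.get)
            self.valid_children = 0
        return self.parent

    def report(self, child_id: str, score: Optional[float]) -> None:
        # A failed (parse/invalid) attempt is a free retry: nothing is counted.
        if score is None:
            return
        self.scores[child_id] = score
        self.valid_children += 1


tracker = ParentTracker(n=2, scores={"seed": 0.1})
first_parent = tracker.select()   # only "seed" exists, so it becomes the parent
tracker.report("a", None)         # failed attempt, does not count toward N
tracker.report("b", 0.5)          # first valid child
tracker.report("c", 0.3)          # second valid child, N reached
next_parent = tracker.select()    # switches to the current global best
```

In this sketch the counter only advances on valid children. Under the ``best_of_n_attempts`` variant
described above, ``report`` would presumably advance it on every attempt, valid or not.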
"""
from __future__ import annotations
from ...config import GalapagosConfig
from ...models import GalapagosModel
from ..base_scaffold import GalapagosScaffold
from ..registry import register_scaffold
# one module per component (the Best-of-N scaffold method)
from .memory import BestOfNMemory
from .population import BestOfNPopulation
from .prompt_builder import BestOfNPromptBuilder
from .proposer import BestOfNProposer
from .selection_policy import BestOfNPolicy
@register_scaffold("best_of_n")
class BestOfNScaffold(GalapagosScaffold):
    name = "best_of_n"

    @classmethod
    def build_components(cls, config: GalapagosConfig, model: GalapagosModel | None) -> dict:
        seed = int(config.seed)
        sel = config.selection_policy
        return {
            "population": BestOfNPopulation(capacity=config.population.capacity),
            "selection_policy": BestOfNPolicy(
                seed=seed,
                n=int(sel.best_of_n),
                num_inspirations=int(sel.num_inspirations),
            ),
            "prompt_builder": BestOfNPromptBuilder(),
            "proposer": BestOfNProposer(),
            "memory": BestOfNMemory(),
        }