Best-of-N (attempt-counted)
Best-of-N that rotates the parent every N attempts — failed/invalid tries spend the budget too.
# Best-of-N (attempt-counted) (`best_of_n_attempts`)
> Best-of-N that rotates the parent every N attempts — failed/invalid tries spend the budget too.
## Overview
Best-of-N (attempt-counted) is the strictly-budgeted variant of Best-of-N. Like its sibling, it picks a parent program, reuses that same parent for several consecutive iterations, and then switches to the current global best before repeating the cycle. The only behavioral difference is how the per-parent budget is counted: this variant spends one unit of the N budget on every attempt, whether or not the attempt produces a valid, scorable child.
In the faithful `best_of_n` scaffold (a port of SkyDiscover's BestOfNDatabase), the parent-reuse counter advances only when a valid child is added, so a parse or evaluation failure is a free retry on the same parent. Here, the counter advances at selection time instead, which means a failure still consumes one of the parent's N attempts. The practical effect is that the parent rotates on a fixed cadence of exactly N iterations, regardless of how many of those iterations actually yielded usable children.
Everything else is shared with `best_of_n`: a flat keep-all in-memory population sorted by fitness, the standard multi-section default prompt, a SEARCH/REPLACE diff proposer with full-rewrite fallback, and no memory. Choose `best_of_n` when you want to match SkyDiscover's database semantics where failures are not charged; choose this variant when you want a strictly fixed per-parent compute budget and predictable parent turnover.
## Algorithm
The loop maintains a single piece of adaptive state — the current parent id and a `uses` counter — inside the SelectionPolicy. On each iteration:
1. **Select** — If no parent is committed yet, or the previous parent is gone, or its budget is spent (`uses >= n`), commit to the current global best in the population and reset `uses` to 0; otherwise reuse the same parent. Sample up to `num_inspirations` context programs from the top pool, excluding the parent. Then increment `uses` immediately — this is the attempt-counted rule: the budget is spent at selection, before the outcome is known.
2. **Prompt** — Render the default multi-section message (task, metrics, feedback, inspirations, current program).
3. **Propose** — One LLM call produces a diff (or full rewrite) applied to the parent.
4. **Evaluate** — The task-supplied evaluator scores the child.
5. **Admit** — The child is added to the keep-all population. `observe` is a no-op here, because the budget was already charged in `select`.
```text
state: current_parent_id = None, uses = 0
for each iteration:
members = population.all()
# _choose_parent
current = member with id == current_parent_id (or None)
if current is None or uses >= n:
parent = argmax fitness over members # commit to global best
current_parent_id = parent.id; uses = 0
else:
parent = current # reuse same parent
inspirations = random sample of <=num_inspirations from top pool, excluding parent
uses += 1 # ATTEMPT-COUNTED: charge now, valid or not
prompt = default_builder(parent, inspirations)
child = proposer(prompt) # diff / full rewrite; may fail to parse
score = evaluate(child) # may fail
population.add(child) # failures still cost one of the N attempts
# observe(child) does nothing — budget already spent
```
Contrast with `best_of_n`, where `uses += 1` lives in `observe` and runs only when the child is valid, so failures do not advance the counter.
## Components
A Galapagos scaffold composes six components. This scaffold reuses five of them verbatim from `best_of_n` and swaps in only the attempt-counted SelectionPolicy.
| Slot | Implementation | Role |
|---|---|---|
| Population | `BestOfNPopulation` (`InMemoryPopulation`, keep-all) | Fitness-sorted archive; finds the global best and supplies the context sample |
| SelectionPolicy | `BestOfNAttemptsPolicy` | Reuse one parent for N consecutive attempts, charging the budget per `select` (valid or not) |
| PromptBuilder | `BestOfNPromptBuilder` (`DefaultPromptBuilder`) | Standard multi-section template: task, metrics, feedback, inspirations, current program |
| Proposer | `BestOfNProposer` (`DiffProposer`) | One LLM call producing a SEARCH/REPLACE diff or full rewrite, with no-op detection |
| Evaluator | task-supplied | Scores each proposed child against the task |
| Memory | `BestOfNMemory` (`NullMemory`) | None — no cross-candidate knowledge is kept |
## Configuration
- `selection_policy.best_of_n` (5) — number of consecutive attempts to reuse one parent before switching to the current global best; here every attempt counts toward this budget.
- `selection_policy.num_inspirations` (4) — number of context programs sampled from the top pool, excluding the parent.
- `population.capacity` (null) — uncapped keep-all by default (matches SkyDiscover, no cap); set an int to bound the archive.
- `general.max_iterations` (100) — total search iterations.
- `seed` (0) — seeds the policy RNG for parent/inspiration sampling.
## When to use
Use this scaffold when you want a strictly fixed compute budget per parent and a predictable parent-rotation cadence of exactly N iterations — useful when failures are expensive or you want to bound time spent on any single starting point. Prefer the faithful `best_of_n` when you want SkyDiscover-equivalent semantics, where parse/eval failures are free retries and a parent is held until N *valid* children appear. If you instead want to always exploit the single best program with fresh context rather than reusing a fixed parent for a window, `topk` (greedy top-1) is the better neighbor.
## Source
Galapagos variant of SkyDiscover's `best_of_n` search strategy (the attempt-counted counterpart of the faithful `best_of_n` port of SkyDiscover's BestOfNDatabase).