Maintain a fixed-width beam of promising programs; expand one per step, prune by fitness+diversity.
# Beam Search (`beam_search`)
> Maintain a fixed-width beam of promising programs; expand one per step, prune by fitness+diversity.
## Overview
Beam Search keeps a fixed-width beam of the most promising programs. Every admitted program joins the beam, and whenever the beam grows past its width it is pruned back to `beam_width` by fitness — optionally diversity-aware, so the beam does not collapse onto a single lineage. Each step one parent is drawn from the beam by one of four strategies (best, stochastic, round_robin, or diversity_weighted), and the program is expanded into a single child; inspirations passed as context are the global top programs.
This scaffold is a faithful port of SkyDiscover's `BeamSearchDatabase` (UC Berkeley Sky Computing Lab). In Galapagos its two beam-specific axes map cleanly onto two components: the beam structure, diversity-aware admission, and search-tree depth tracking live in a `BeamPopulation`, while the parent choice lives in a `BeamSelectionPolicy`. Both share one fitness notion — `combined_score` with an optional exponential depth penalty — and one cheap, embedding-free diversity metric (character n-gram Jaccard distance between programs), keeping pruning and selection mutually consistent.
The result is a search that is greedier than a flat population sampler but cheaper and more parallel-friendly than a full evolutionary loop. A keep-all store underneath the beam means every program is retained, so inspirations can always be sourced from the global best even after a candidate has been pruned out of the active beam.
## Algorithm
Each iteration follows the standard select -> prompt -> propose -> evaluate -> admit cycle, specialized to a beam:
1. **Select** — draw one parent from the current beam by `strategy`. `best` takes the highest-scoring member; `round_robin` cycles the beam in score order; `stochastic` samples a softmax over score at `temperature`; `diversity_weighted` samples a softmax over `(1-w)·score + w·avg_distance_to_recently_expanded`. The policy remembers recently expanded parents (last 50, with a 10-deep lookback for the diversity term) as its adaptive state.
2. **Prompt** — build the canonical multi-section message (task -> metrics -> feedback -> inspirations -> current program), where inspirations are the global top `num_inspirations` programs, excluding the chosen parent.
3. **Propose** — one LLM call yields a SEARCH/REPLACE diff (or full-rewrite fallback) applied to the parent, producing one child, with no-op detection.
4. **Evaluate** — the task-supplied evaluator scores the child.
5. **Admit** — the child is retained in the keep-all store and appended to the beam. If the beam now exceeds `beam_width`, prune: by pure fitness when `diversity_weight == 0`, or by a greedy fitness+diversity objective otherwise.
```text
beam = [] # ids of the active beam (subset of a keep-all store)
expanded = [] # recently expanded parents (selection state)
for step in range(max_iterations):
members = sorted(beam, by score) # score = combined_score * exp(-depth_penalty * depth)
parent = pick(members, strategy) # best | round_robin | stochastic | diversity_weighted
expanded.append(parent)
inspirations = global_top(num_inspirations, exclude=parent)
prompt = build(task, parent, inspirations)
child = apply_diff(llm(prompt), parent) # one child per expanded parent
score(child)
store.add(child) # keep-all; child also enters the beam
depth[child] = depth[parent] + 1
if len(beam) > beam_width:
if diversity_weight > 0:
beam = greedy_diverse(beam, beam_width) # (1-w)*fitness + w*min_dist_to_selected
else:
beam = top_by_fitness(beam, beam_width)
```
The diversity-aware prune (`greedy_diverse`) always keeps the single best member, then greedily adds whichever remaining candidate maximizes `(1-w)·fitness + w·min_distance_to_already_selected`, so the beam stays both strong and spread out across distinct lineages.
## Components
A Galapagos scaffold composes six components; Beam Search wires them as follows:
| Slot | Implementation | Role |
|---|---|---|
| Population | `BeamPopulation` (kind: beam) | Keep-all store plus a fixed-width beam; diversity-aware prune and search-tree depth tracking. |
| SelectionPolicy | `BeamSelectionPolicy` (kind: beam) | Picks the parent from the beam by best / stochastic / round_robin / diversity_weighted. |
| PromptBuilder | `BeamSearchPromptBuilder` (kind: default) | Canonical multi-section template; adds nothing beyond the default context. |
| Proposer | `BeamSearchProposer` (kind: diff) | One LLM call -> SEARCH/REPLACE diff (full-rewrite fallback) -> one child per parent. |
| Evaluator | task-supplied | Scores each child with the task's evaluation function. |
| Memory | `NullMemory` (kind: none) | None; Beam Search keeps no cross-candidate free-form knowledge. |
## Configuration
Keys this scaffold actually reads:
- `population.beam_width` (5) — number of candidates kept in the active beam.
- `population.beam_diversity_weight` (0.3) — weight `w` of the diversity term in pruning; 0 = pure fitness. Also read by the selection policy.
- `population.beam_depth_penalty` (0.0) — exponential fitness penalty per search-tree depth; 0 = off.
- `selection_policy.beam_selection_strategy` (`diversity_weighted`) — parent draw: `best | stochastic | round_robin | diversity_weighted`.
- `selection_policy.beam_temperature` (1.0) — softmax temperature for `stochastic` / `diversity_weighted`.
- `selection_policy.num_inspirations` (4) — number of global top programs supplied as context.
- `general.max_iterations` (100) — number of search steps.
- `seed` (0) — seeds the selection policy's RNG.
## When to use
Reach for Beam Search when you want bounded, directed exploration: it commits compute to a small set of strong lineages (the beam) rather than the whole history, which keeps it cheaper and more focused than a full evolutionary population while staying greedier than flat best-of-n sampling. Turn up `beam_diversity_weight` (or use the `diversity_weighted` strategy) when a fitness-only beam keeps collapsing onto one near-duplicate lineage. If you instead want pure greedy hill-climbing on the single best program, `topk` (greedy top-1 plus context) is simpler; if you want independent samples with no beam structure at all, `best_of_n` is the lighter baseline.
## Source
Ported from SkyDiscover's (UC Berkeley Sky Computing Lab) `BeamSearchDatabase` search strategy; this is a faithful Galapagos port, not an original method.