Meta-Harness/meta_harness

Meta-Harness

A minimal outer loop that delegates selection AND mutation to a skill-steered proposer over an append-only candidate history, returning a (score x cost) Pareto frontier.

Test-time searchMIT

"""Meta-Harness PromptBuilder component — the serialized filesystem-``D`` view + the SKILL.md steering, adapted from the coding-agent CLI to the galapagos chat contract. One file per component (see scaffold.py). The reference proposer is a Claude Code session with Read/Glob/Grep/Bash over ``D``, steered by ``.claude/skills/meta-harness/SKILL.md`` (injected verbatim into the system prompt via ``--append-system-prompt``). A chat proposer cannot browse, so this builder serializes the exact slice the skill's reading list points the agent at: * **system** — a real Agent-Skills SKILL.md, loaded at runtime from ``skills/meta-harness/SKILL.md`` (dir-per-skill, mirroring the reference's ``.claude/skills/meta-harness/``; ``skills/`` instead of ``.claude/`` because the package tree and the hub Files tab exclude dot-directories). The skill text is the search's PRIMARY HYPERPARAMETER (the paper's practical-tips appendix: editing it moved results more than any loop constant), so it lives in an editable file, never in code: the researcher workflow is to copy the bundled SKILL.md, edit it, and point the scaffold's ``skill:`` config key at the copy (absolute and cwd-relative paths honored; bundled-relative is the default). The file is read ONCE at construction (missing/unparseable file fails fast, never mid-run); its ``---`` frontmatter is parsed with ``yaml.safe_load`` (exposed as :attr:`skill_name` / :attr:`skill_description`); the documented sentinel tokens (``{candidates_per_proposal}``, ``{exploitation_axes}``) are substituted via ``str.replace`` ONLY — never ``str.format``, which would explode on the markdown's other braces — and the WHOLE substituted file (frontmatter included) is injected under the reference claude_wrapper's exact banner: ``"Follow these skill instructions:\\n\\n## Skill: <name>\\n<content>\\n\\n"`` (the reference injects the full file verbatim, frontmatter and all, so this port does too). * **the steering content** — the SKILL.md is a minimal-diff instantiation, for galapagos task programs, of the UNION of the official repo's two domain skills (text_classification's and terminal_bench_2's SKILL.md — both bundled VERBATIM, byte-identical, under ``skills/meta-harness/references/``): produce EXACTLY k candidates; no early stopping ("Do NOT write 'the frontier is optimal'"); the anti-parameter-tuning rules with the six exploitation axes and the rotation/variety rule; the "identical except constants => rewrite" self-critique; the anti-overfitting rules; the <=30-line report per candidate. Only two classes of edit (scaffold.py sanctioned adaptations): the CLI-specific mechanics — file writes/``pending_eval.json`` become the in-response ``### CANDIDATE <i>: <name>`` format (report lines + ONE fenced python block, full program, EVOLVE-BLOCK markers preserved); the mandatory ``/tmp`` prototyping step becomes a mandatory "reason through 2-3 variants" step (no exec tool in a chat proposer) — and minimal domain-noun substitutions inside the constraint blocks (memory systems/agents → programs, the ``predict()``/``learn_from_batch()`` method pair → the EVOLVE-BLOCK region, the six exploitation-axis labels). Domain-neutral phrasing is kept VERBATIM — including the published-approach list "(DSPy, OPRO, Reflexion, CEIL, etc.)" and terminal_bench_2's anti-overfitting rule "**Never mention task names** in candidate code, prompts, or comments." Every section and steering sentence of both originals is mapped 1:1; nothing is dropped. * **user**, in order — task context; "iteration t of N" (``render_task_prompt`` analogue); the EVOLUTION SUMMARY table (one row per ever-evaluated candidate: name | iteration | combined_score | cost | outcome — the ``evolution_summary.jsonl`` analogue; ALL rows, capped only when huge and never below the most recent 50); the Pareto frontier with both objectives (``frontier_val.json``); the most recent R prior candidate reports (the ``reports/`` compression layer); sampled execution traces — ``text_feedback`` excerpts stratified errors-first then successes, drawn via the builder's seeded rng and clipped to ``trace_max_chars`` (the paper's ablation: raw traces beat summaries — so excerpts are replayed, not summarized); full source of ``top_k_sources`` frontier members (the copy-then-edit pool); and FINALLY the current best program as the LAST fenced python block (the galapagos hard invariant — diff Proposers locate the mutation target there). Every injected string (task context, names, reports, traces, frontier sources) is fence-sanitized — any run of three-plus backticks collapses to two — so adversarial content can never shift the fence pairing; the parent program itself is rendered raw (it must round-trip exactly). Determinism: the trace sample is drawn from ``self.rng`` (``random.Random(seed)``, seeded from the config seed) and :meth:`build` runs exactly once per loop iteration, so two seeded runs render byte-identical prompts. """ from __future__ import annotations import random import re from pathlib import Path import yaml from ...components.prompt import PromptBuilder from ...models.base import Prompt from ...records import RunState, Selection from .population import display_name from .selection_policy import EXPLOITATION_AXES _FENCE_RUN = re.compile(r"`{3,}") # any run of >=3 backticks (a fence opener/closer) def _defence(text: str) -> str: """Neutralize fence markers in injected free text (task context, reports, traces, summary names, frontier sources): collapse any run of three-plus backticks to two, so the current-program-is-the-last-fenced-block invariant holds even against adversarial content.""" return _FENCE_RUN.sub("``", text) # ---------------------------------------------------------------------------------------------- # The SKILL.md steering — a real Agent-Skills file at skills/meta-harness/SKILL.md (dir-per-skill, # mirroring the reference's .claude/skills/meta-harness/), loaded at runtime. It is a minimal-diff # instantiation of the official repo's two domain skills (text_classification + terminal_bench_2, # both bundled VERBATIM under skills/meta-harness/references/): every section and steering # sentence of both originals survives; two things are adapted (both declared in scaffold.py): the # CLI-specific mechanics (file paths, Write/Edit tools, pending_eval.json, /tmp prototyping, # subagent delegation) become the chat contract, and domain nouns are minimally substituted # (memory systems/agents -> programs, predict()/learn_from_batch() -> the EVOLVE-BLOCK region). # Domain-neutral phrasing stays verbatim — including the published-approach list # "(DSPy, OPRO, Reflexion, CEIL, etc.)" and terminal_bench_2's "Never mention task names" # anti-overfitting rule. The skill's output-format example is rendered as # 4-space-indented text — deliberately NOT a fenced block, so the user message's parent program # stays the LAST real fence. # ---------------------------------------------------------------------------------------------- _SCAFFOLD_DIR = Path(__file__).resolve().parent DEFAULT_SKILL = "skills/meta-harness/SKILL.md" # ``---``-delimited YAML frontmatter (the Agent Skills format) followed by the markdown body _FRONTMATTER = re.compile(r"\A---\s*\n(.*?)\n---\s*\n", re.DOTALL) def _resolve_skill_path(skill: str) -> Path: """Resolve the configured ``skill`` path: absolute paths as-is; relative paths against the scaffold package dir first (where the bundled default lives), then the cwd — the researcher workflow is copy the bundled SKILL.md, edit it, point the ``skill:`` config key at the copy.""" candidate = Path(skill).expanduser() tried = [candidate] if candidate.is_absolute() else [_SCAFFOLD_DIR / candidate, candidate] for path in tried: if path.is_file(): return path raise FileNotFoundError( f"Meta-Harness skill file not found: {skill!r} " f"(tried: {', '.join(str(p.resolve()) for p in tried)})") def _load_skill(skill: str) -> tuple[Path, dict, str]: """Read a SKILL.md once and split its Agent-Skills frontmatter. Returns ``(path, frontmatter_dict, full_raw_text)``; any problem (missing file, missing/unparseable frontmatter, no ``name``) raises HERE — i.e. at builder construction, never mid-run.""" path = _resolve_skill_path(skill) raw = path.read_text(encoding="utf-8") match = _FRONTMATTER.match(raw) if match is None: raise ValueError(f"{path}: not a valid SKILL.md — missing '---'-delimited YAML frontmatter") try: meta = yaml.safe_load(match.group(1)) except yaml.YAMLError as exc: raise ValueError(f"{path}: unparseable SKILL.md frontmatter: {exc}") from exc if not isinstance(meta, dict) or not str(meta.get("name") or "").strip(): raise ValueError(f"{path}: SKILL.md frontmatter must be a YAML mapping with a 'name' field") return path, meta, raw class MetaHarnessPromptBuilder(PromptBuilder): """Renders the chat-port of the proposer session: SKILL.md steering (system) + the serialized filesystem view (user), with the current best program as the LAST fenced python block.""" def __init__(self, candidates_per_proposal: int = 3, top_k_sources: int = 3, reports_in_prompt: int = 6, trace_errors: int = 2, trace_successes: int = 1, trace_max_chars: int = 1500, summary_max_rows: int = 200, seed: int = 0, skill: str = DEFAULT_SKILL): self.candidates_per_proposal = max(1, int(candidates_per_proposal)) self.top_k_sources = int(top_k_sources) self.reports_in_prompt = int(reports_in_prompt) self.trace_errors = int(trace_errors) self.trace_successes = int(trace_successes) self.trace_max_chars = int(trace_max_chars) self.summary_max_rows = int(summary_max_rows) self.rng = random.Random(seed) # trace sampling — the builder's only randomness # the SKILL.md steering: read ONCE here (no per-iteration IO; load errors fail fast at # construction), frontmatter exposed, sentinel tokens substituted via str.replace over # EXACTLY the set documented in the HTML comment at the top of the SKILL.md body (never # str.format — the markdown's other braces must survive) self.skill_path, frontmatter, raw = _load_skill(skill) self.skill_name: str = str(frontmatter["name"]) self.skill_description: str = str(frontmatter.get("description") or "") substituted = raw for token, value in (("{candidates_per_proposal}", str(self.candidates_per_proposal)), ("{exploitation_axes}", ", ".join(EXPLOITATION_AXES))): substituted = substituted.replace(token, value) # the reference claude_wrapper injects the WHOLE file verbatim — frontmatter included — # under this exact banner ("Follow these skill instructions:\n\n## Skill: <name>\n" + # content + "\n\n"); match it byte-for-byte self._system_text = (f"Follow these skill instructions:\n\n" f"## Skill: {self.skill_name}\n{substituted}\n\n") # ---- system --------------------------------------------------------------------------------- def _system(self) -> str: return self._system_text # ---- user ----------------------------------------------------------------------------------- def build(self, selection: Selection, memory=None, state: RunState | None = None) -> Prompt: parent = selection.parent system = self._system() if parent is None: # delegated selection (not used by Meta-Harness, kept for safety) return Prompt(system=system, user=(state.task_context if state else "")) sig = (state.signals.get("meta_harness", {}) if state is not None else {}) or {} rows = list(memory.rows()) if (memory is not None and hasattr(memory, "rows")) else [] reports = list(memory.reports()) if (memory is not None and hasattr(memory, "reports")) else [] sections: list[str] = [] if state and state.task_context: sections.append("# Task\n" + _defence(state.task_context)) sections.append(self._header(sig, state)) sections.append(self._summary_table(rows)) sections.append(self._frontier(sig)) report_block = self._reports(reports) if report_block: sections.append(report_block) traces = self._traces(rows) if traces: sections.append(traces) sources = self._sources(selection) if sources: sections.append(sources) # the current best program — the LAST fenced python block (galapagos hard invariant) sections.append( f"## Current best program — {_defence(display_name(parent))} " f"(combined_score={parent.fitness:.4f})\n" "The frontier's top member. Any prior candidate above is an equally valid base.\n\n" f"```python\n{parent.content}\n```" ) sections.append( # fence-free closing instruction (nothing may re-open a fence here) f"Now produce exactly {self.candidates_per_proposal} candidates in the format from " "the instructions: a '### CANDIDATE <i>: <snake_case_name>' header, the <=30-line " "report, then ONE complete fenced python program with the EVOLVE-BLOCK markers " "preserved — for each candidate." ) return Prompt(system=system, user="\n\n".join(sections)) # ---- section builders ------------------------------------------------------------------------- @staticmethod def _header(sig: dict, state: RunState | None) -> str: """The ``render_task_prompt`` analogue: iteration t of N + the axis rotation hint.""" iteration = sig.get("iteration", state.iteration if state else 0) total = sig.get("max_iterations") of_n = f" of {total}" if total else "" lines = [f"Run iteration {iteration}{of_n} of the evolution loop."] hint = sig.get("axis_hint") if hint: lines.append(f"Exploitation-axis rotation hint for this round: {hint} " "(pick different axes if recent candidates clustered on one).") return "\n".join(lines) def _summary_table(self, rows: list[dict]) -> str: """The ``evolution_summary.jsonl`` analogue: ALL ever-evaluated candidate rows, capped at ``summary_max_rows`` only when huge (never below the most recent 50).""" if len(rows) > self.summary_max_rows: rows = rows[-max(self.summary_max_rows, 50):] lines = ["## EVOLUTION SUMMARY — every evaluated candidate", "name | iteration | combined_score | cost | outcome"] if not rows: lines.append("(no candidates evaluated yet)") for r in rows: lines.append(f"{_defence(str(r['name']))} | {r['iteration']} | {r['score']:.4f} | " f"{r['cost']:g} | {r['outcome']}") return "\n".join(lines) @staticmethod def _frontier(sig: dict) -> str: """``frontier_val.json["_pareto"]``: names + BOTH objective values, score-desc.""" entries = sig.get("frontier") or [] lines = ["## Pareto frontier (maximize combined_score, minimize cost)"] if not entries: lines.append("(empty)") for i, e in enumerate(entries, 1): lines.append(f"{i}. {_defence(str(e['name']))}: " f"combined_score={e['combined_score']:.4f}, cost={e['cost']:g}") return "\n".join(lines) def _reports(self, reports: list[dict]) -> str: """The ``reports/`` compression layer: the most recent R <=30-line candidate reports.""" recent = reports[-self.reports_in_prompt:] if not recent: return "" blocks = [f"### iteration {r['iteration']} — {_defence(str(r['name']))}\n" f"{_defence(str(r['report']))}" for r in recent] return (f"## Prior candidate reports (most recent {len(recent)})\n\n" + "\n\n".join(blocks)) def _traces(self, rows: list[dict]) -> str: """Sampled execution traces, stratified errors-first then successes (the chat reconstruction of "deep-read failed AND successful trajectories"), drawn via ``self.rng`` and clipped to ``trace_max_chars``.""" traced = [r for r in rows if r.get("trace")] errors = [r for r in traced if r["outcome"] == "failed"] successes = [r for r in traced if r["outcome"] != "failed"] picked = (self.rng.sample(errors, min(self.trace_errors, len(errors))) + self.rng.sample(successes, min(self.trace_successes, len(successes)))) if not picked: return "" blocks = [] for r in picked: text = str(r["trace"]) if len(text) > self.trace_max_chars: text = text[: self.trace_max_chars] + "\n... (truncated)" blocks.append(f"### {_defence(str(r['name']))} (iteration {r['iteration']}, " f"outcome: {r['outcome']})\n{_defence(text)}") return "## Sampled execution traces (errors first)\n\n" + "\n\n".join(blocks) def _sources(self, selection: Selection) -> str: """Full source of ``top_k_sources`` frontier members — the copy-then-edit pool (SKILL.md Step 3.1). The parent is excluded here (it is rendered LAST, raw); these display copies are fence-sanitized so genome content can never flip the fence parity.""" pool = [g for g in selection.inspirations if g is not selection.parent] pool = pool[: self.top_k_sources] if not pool: return "" blocks = [f"### {_defence(display_name(g))} (combined_score={g.fitness:.4f})\n" f"```python\n{_defence(g.content)}\n```" for g in pool] return "## Frontier program sources (copy-then-edit pool)\n\n" + "\n\n".join(blocks)