Models¶

A model is whatever turns a prompt into text. GalapagosModel.from_card loads one from a model name + a host; the host selects an OpenAI-compatible endpoint and resolves its base_url. The model is the LLM the Proposer calls each step.

import galapagos as gx

model = gx.GalapagosModel.from_card(name="openai/gpt-5.5", host="openrouter")
# functional alias:
model = gx.load_model("openai/gpt-5.5", host="openrouter")

One client (APIModel) covers every hosting platform — only the base_url (and which env vars it reads) changes. The client is built lazily, so importing Galapagos never requires credentials.

A model is loaded from a Model Card — name, model_path, host, temperature, max_tokens — exactly like every other Galapagos artifact.

The host table¶

host selects the endpoint. The platform's host vocabulary spans managed routers, self-hosted servers, and direct provider APIs. The hosts below are wired into the shipped loader; each resolves a base_url, and an explicit base_url= argument always overrides it.

`host`	Resolved `base_url`	Notes
`openrouter`	`https://openrouter.ai/api/v1`	Default when `host` is omitted.
`openai`	`None` → falls back to `OPENAI_BASE_URL` (or the OpenAI default)	The vanilla OpenAI API.
`togetherai` (alias `together`)	`https://api.together.xyz/v1`	Together AI.
`litellm`	`LITELLM_BASE_URL` env, else `http://localhost:4000`	A LiteLLM proxy (any provider behind it).
`vllm`	`VLLM_BASE_URL` env, else `http://localhost:8000/v1`	A local vLLM OpenAI server.
`huggingface` (alias `hf`)	`HF_ENDPOINT_URL` env	A TGI / HF Inference Endpoint.
`azure`	`AZURE_OPENAI_BASE_URL` env	Azure OpenAI.
`bedrock`	`BEDROCK_BASE_URL` env	Bedrock via an OpenAI-compatible proxy.
`anthropic`	`ANTHROPIC_BASE_URL` env, else `https://api.anthropic.com/v1`	Anthropic's OpenAI-compatible endpoint.
`google`	`GOOGLE_BASE_URL` env, else `https://generativelanguage.googleapis.com/v1beta/openai`	The Gemini OpenAI-compatible endpoint.

Anthropic & Google

anthropic and google are first-class hosts wired into the shipped loader, reached via each provider's OpenAI-compatible endpoint (override with ANTHROPIC_BASE_URL / GOOGLE_BASE_URL). They are also reachable through a router — host="openrouter" (which fronts both) or host="litellm" — since every Galapagos model speaks the OpenAI chat-completions API through the single APIModel client.

Every host routes through the same APIModel. Authentication is the standard OpenAI env var:

Env var	Used for
`OPENAI_API_KEY`	the API key for all hosts (defaults to `"EMPTY"` for keyless local servers).
`OPENAI_BASE_URL`	the fallback `base_url` when the host resolves to `None` (i.e. `host="openai"`).
`LITELLM_BASE_URL` / `VLLM_BASE_URL` / `HF_ENDPOINT_URL` / `AZURE_OPENAI_BASE_URL` / `BEDROCK_BASE_URL` / `ANTHROPIC_BASE_URL` / `GOOGLE_BASE_URL`	per-host `base_url` overrides (see the table).

gx.GalapagosModel.from_card(name="meta-llama/Llama-3-70B", host="togetherai")
gx.load_model("gpt-5.5", host="azure")              # base_url from AZURE_OPENAI_BASE_URL

from_card / load_model also accept temperature, max_tokens, and reasoning_effort, which flow through to the underlying client.

The three mandated load forms¶

Every model comes from one of three places — a Hugging Face endpoint, a hosting platform, or your own local vLLM server:

1. Hugging Face2. Hosting platform3. Local vLLM

A model served behind an OpenAI-compatible HF endpoint (TGI / Inference Endpoint):

# export HF_ENDPOINT_URL=https://<your-endpoint>.endpoints.huggingface.cloud/v1
model = gx.GalapagosModel.from_card(name="Qwen/Qwen3-8B", host="hf")

A managed router / aggregator (OpenRouter, OpenAI, Together AI, Azure, Bedrock, LiteLLM):

model = gx.GalapagosModel.from_card(name="openai/gpt-5.5", host="openrouter")

Your own vLLM server (or any OpenAI-compatible endpoint), addressed by base_url:

model = gx.GalapagosModel.from_card(
    name="Qwen/Qwen3-8B", host="vllm",
    base_url="http://localhost:8000/v1",       # explicit > VLLM_BASE_URL > default
)

Loading from a model card¶

A ModelCard YAML can carry the name/host/defaults so a run is fully reproducible from disk:

model = gx.GalapagosModel.from_card(path="my_model_card.yaml")

name, host, temperature, max_tokens, and reasoning_effort are read from the card; explicit arguments override the card.

Roles and the scaffold default¶

A ScaffoldCard declares a default model and the roles it plays:

model:
  default: openai/gpt-5
  host: openrouter
  roles: [propose]          # a reflective method, e.g., declares [propose, reflect, merge]

When you call a scaffold's from_card() without passing model=, the scaffold builds this default itself. Methods that use distinct models per role (e.g. one model diagnoses, another implements) declare multiple roles; the Proposer reads them off the resolved model.

Coding agents are not load_model models

An agent-as-operator method drives a CLI coding agent. That agent is the Proposer's variation operator, exposed through env, not something you pass to from_card. See Core components — Proposer.

Models¶

The host table¶

The three mandated load forms¶

Loading from a model card¶

Roles and the scaffold default¶

See also¶