Models¶
A model is whatever turns a prompt into text. GalapagosModel.from_card loads one from a model
name + a host; the host selects an OpenAI-compatible endpoint and resolves its base_url. The
model is the LLM the Proposer calls each step.
```python
import galapagos as gx

model = gx.GalapagosModel.from_card(name="openai/gpt-5.5", host="openrouter")
# functional alias:
model = gx.load_model("openai/gpt-5.5", host="openrouter")
```
One client (APIModel) covers every hosting platform — only the base_url (and which env vars it
reads) changes. The client is built lazily, so importing Galapagos never requires credentials.
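The lazy construction described above can be illustrated with a minimal sketch. `LazyClient` and its `factory` argument are hypothetical names chosen for this example, not part of the Galapagos API; the point is only that nothing touching credentials runs until the client is first used.

```python
from typing import Any, Callable, Optional


class LazyClient:
    """Hypothetical wrapper: defers building the real client until first access."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        # Store the recipe only; no credentials are read here.
        self._factory = factory
        self._client: Optional[Any] = None

    @property
    def client(self) -> Any:
        # Build on first use, then reuse the same instance.
        if self._client is None:
            self._client = self._factory()
        return self._client
```

Under this pattern, creating the wrapper at import time is cheap and safe; a missing API key would only surface when `client` is first read.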
A model is loaded from a Model Card — name, model_path, host,
temperature, max_tokens — exactly like every other Galapagos artifact.
The host table¶
host selects the endpoint. The platform's host vocabulary spans managed routers, self-hosted
servers, and direct provider APIs. The hosts below are wired into the shipped loader; each resolves a
base_url, and an explicit base_url= argument always overrides it.
| host | Resolved base_url | Notes |
|---|---|---|
| openrouter | https://openrouter.ai/api/v1 | Default when host is omitted. |
| openai | None → falls back to OPENAI_BASE_URL (or the OpenAI default) | The vanilla OpenAI API. |
| togetherai (alias together) | https://api.together.xyz/v1 | Together AI. |
| litellm | LITELLM_BASE_URL env, else http://localhost:4000 | A LiteLLM proxy (any provider behind it). |
| vllm | VLLM_BASE_URL env, else http://localhost:8000/v1 | A local vLLM OpenAI server. |
| huggingface (alias hf) | HF_ENDPOINT_URL env | A TGI / HF Inference Endpoint. |
| azure | AZURE_OPENAI_BASE_URL env | Azure OpenAI. |
| bedrock | BEDROCK_BASE_URL env | Bedrock via an OpenAI-compatible proxy. |
| anthropic | ANTHROPIC_BASE_URL env, else https://api.anthropic.com/v1 | Anthropic's OpenAI-compatible endpoint. |
| google | GOOGLE_BASE_URL env, else https://generativelanguage.googleapis.com/v1beta/openai | The Gemini OpenAI-compatible endpoint. |
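The resolution rules in the host table can be sketched as a small lookup. This is an illustration, not the shipped loader: `resolve_base_url`, `_HOSTS`, and `_ALIASES` are hypothetical names, and returning `None` when an env-only host has no variable set is an assumption, since the source does not say what happens in that case.

```python
import os
from typing import Mapping, Optional

# Hypothetical mirror of the host table: host -> (env var consulted, fixed default).
_HOSTS = {
    "openrouter": (None, "https://openrouter.ai/api/v1"),
    "openai": (None, None),  # None: the caller falls back to OPENAI_BASE_URL
    "togetherai": (None, "https://api.together.xyz/v1"),
    "litellm": ("LITELLM_BASE_URL", "http://localhost:4000"),
    "vllm": ("VLLM_BASE_URL", "http://localhost:8000/v1"),
    "huggingface": ("HF_ENDPOINT_URL", None),
    "azure": ("AZURE_OPENAI_BASE_URL", None),
    "bedrock": ("BEDROCK_BASE_URL", None),
    "anthropic": ("ANTHROPIC_BASE_URL", "https://api.anthropic.com/v1"),
    "google": ("GOOGLE_BASE_URL",
               "https://generativelanguage.googleapis.com/v1beta/openai"),
}
_ALIASES = {"together": "togetherai", "hf": "huggingface"}


def resolve_base_url(host: str = "openrouter",
                     base_url: Optional[str] = None,
                     env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the base_url for a host; an explicit base_url always wins."""
    if base_url is not None:
        return base_url
    env = os.environ if env is None else env
    key = _ALIASES.get(host, host)
    if key not in _HOSTS:
        raise ValueError(f"unknown host: {host!r}")
    env_var, default = _HOSTS[key]
    # A non-empty env var takes precedence over the fixed default.
    if env_var and env.get(env_var):
        return env[env_var]
    return default  # assumption: None when nothing is configured
```

Passing `env` explicitly keeps the sketch easy to exercise without touching the real process environment.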
Anthropic & Google
anthropic and google are first-class hosts wired into the shipped loader, reached via each
provider's OpenAI-compatible endpoint (override with ANTHROPIC_BASE_URL / GOOGLE_BASE_URL).
They are also reachable through a router — host="openrouter" (which fronts both) or
host="litellm" — since every Galapagos model speaks the OpenAI chat-completions API through
the single APIModel client.
Every host routes through the same APIModel. Authentication is the standard OpenAI env var:
| Env var | Used for |
|---|---|
| OPENAI_API_KEY | the API key for all hosts (defaults to "EMPTY" for keyless local servers). |
| OPENAI_BASE_URL | the fallback base_url when the host resolves to None (i.e. host="openai"). |
| LITELLM_BASE_URL / VLLM_BASE_URL / HF_ENDPOINT_URL / AZURE_OPENAI_BASE_URL / BEDROCK_BASE_URL / ANTHROPIC_BASE_URL / GOOGLE_BASE_URL | per-host base_url overrides (see the table). |
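As a rough sketch of how the two OpenAI variables in this table combine, the helper below picks the API key and the final base_url. `resolve_auth` is a hypothetical name for illustration only; the real client may read these variables differently.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_auth(host_base_url: Optional[str],
                 env: Optional[Mapping[str, str]] = None) -> Tuple[str, Optional[str]]:
    """Return (api_key, base_url) following the env var table."""
    env = os.environ if env is None else env
    # Keyless local servers still need a placeholder key, hence "EMPTY".
    api_key = env.get("OPENAI_API_KEY", "EMPTY")
    # Only a host that resolved to None (host="openai") consults OPENAI_BASE_URL.
    if host_base_url is not None:
        base_url = host_base_url
    else:
        base_url = env.get("OPENAI_BASE_URL")
    # A None base_url here leaves the choice to the client's own default.
    return api_key, base_url
```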
```python
gx.GalapagosModel.from_card(name="meta-llama/Llama-3-70B", host="togetherai")
gx.load_model("gpt-5.5", host="azure")  # base_url from AZURE_OPENAI_BASE_URL
```
from_card / load_model also accept temperature, max_tokens, and reasoning_effort, which
flow through to the underlying client.
The three mandated load forms¶
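Every model comes from one of three places: a Hugging Face endpoint, a hosting platform, or your own local vLLM server.
- Hugging Face endpoint: a model served behind an OpenAI-compatible HF endpoint (TGI / Inference Endpoint), selected with host="huggingface".
- Hosting platform: a managed router / aggregator (OpenRouter, OpenAI, Together AI, Azure, Bedrock, LiteLLM), selected with the matching host value from the host table.
- Local vLLM: your own vLLM OpenAI server, selected with host="vllm".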
Loading from a model card¶
A ModelCard YAML can carry the name, host, and defaults, so a run is fully
reproducible from disk.
name, host, temperature, max_tokens, and reasoning_effort are read from the card; explicit
arguments override the card.
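The precedence rule (card values first, explicit arguments on top) can be sketched as a plain dictionary merge. `merge_card` and `CARD_FIELDS` are hypothetical names, and treating `None` as "argument not provided" is an assumption made for this example.

```python
from typing import Any, Dict

# Fields the text says are read from the card.
CARD_FIELDS = ("name", "host", "temperature", "max_tokens", "reasoning_effort")


def merge_card(card: Dict[str, Any], **overrides: Any) -> Dict[str, Any]:
    """Start from the card's values, then let explicit arguments win."""
    settings = {key: card[key] for key in CARD_FIELDS if key in card}
    # Assumption: an override of None means "not passed", so the card value stays.
    settings.update({k: v for k, v in overrides.items() if v is not None})
    return settings
```

For instance, a card with `temperature: 0.2` merged with `temperature=0.7` yields 0.7, while every field not overridden keeps its card value.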
Roles and the scaffold default¶
A ScaffoldCard declares a default model and the roles it plays:
```yaml
model:
  default: openai/gpt-5
  host: openrouter
  roles: [propose]  # a reflective method, e.g., declares [propose, reflect, merge]
```
When you call a scaffold's from_card() without passing model=, the scaffold builds this default
itself. Methods that use distinct models per role (e.g. one model diagnoses, another implements)
declare multiple roles; the Proposer reads them off the resolved model.
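The fallback to the card's default model can be pictured roughly as follows. This is a sketch under stated assumptions: `resolve_scaffold_model` is a hypothetical helper, `load` stands in for a loader such as `load_model`, and returning the roles alongside the model is a simplification (the text says the Proposer reads roles off the resolved model).

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


def resolve_scaffold_model(model_block: Dict[str, Any],
                           load: Callable[..., Any],
                           model: Optional[Any] = None) -> Tuple[Any, List[str]]:
    """Use the caller's model if given; otherwise build the card's default."""
    roles = list(model_block.get("roles", []))
    if model is None:
        # No model= was passed, so the scaffold builds its declared default.
        model = load(model_block["default"], host=model_block.get("host"))
    return model, roles
```

Injecting `load` as a parameter keeps the sketch independent of any real client and makes the default path easy to check with a stub.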
Coding agents are not load_model models
An agent-as-operator method drives a CLI coding agent. That agent is the Proposer's variation
operator, exposed through env, not something you pass to from_card. See
Core components — Proposer.
See also¶
- Cards — Model Card — the ModelCard schema and the host vocabulary.
- Core components — Proposer — the Proposer that calls the model.
- The evolutionary loop — where the model sits in the loop.