Skip to content

Models

A model is whatever turns a prompt into text. GalapagosModel.from_card loads one from a model name + a host; the host selects an OpenAI-compatible endpoint and resolves its base_url. The model is the LLM the Proposer calls each step.

import galapagos as gx

model = gx.GalapagosModel.from_card(name="openai/gpt-5.5", host="openrouter")
# functional alias:
model = gx.load_model("openai/gpt-5.5", host="openrouter")

One client (APIModel) covers every hosting platform — only the base_url (and which env vars it reads) changes. The client is built lazily, so importing Galapagos never requires credentials.

A model is loaded from a Model Cardname, model_path, host, temperature, max_tokens — exactly like every other Galapagos artifact.


The host table

host selects the endpoint. The platform's host vocabulary spans managed routers, self-hosted servers, and direct provider APIs. The hosts below are wired into the shipped loader; each resolves a base_url, and an explicit base_url= argument always overrides it.

host Resolved base_url Notes
openrouter https://openrouter.ai/api/v1 Default when host is omitted.
openai None → falls back to OPENAI_BASE_URL (or the OpenAI default) The vanilla OpenAI API.
togetherai (alias together) https://api.together.xyz/v1 Together AI.
litellm LITELLM_BASE_URL env, else http://localhost:4000 A LiteLLM proxy (any provider behind it).
vllm VLLM_BASE_URL env, else http://localhost:8000/v1 A local vLLM OpenAI server.
huggingface (alias hf) HF_ENDPOINT_URL env A TGI / HF Inference Endpoint.
azure AZURE_OPENAI_BASE_URL env Azure OpenAI.
bedrock BEDROCK_BASE_URL env Bedrock via an OpenAI-compatible proxy.
anthropic ANTHROPIC_BASE_URL env, else https://api.anthropic.com/v1 Anthropic's OpenAI-compatible endpoint.
google GOOGLE_BASE_URL env, else https://generativelanguage.googleapis.com/v1beta/openai The Gemini OpenAI-compatible endpoint.

Anthropic & Google

anthropic and google are first-class hosts wired into the shipped loader, reached via each provider's OpenAI-compatible endpoint (override with ANTHROPIC_BASE_URL / GOOGLE_BASE_URL). They are also reachable through a router — host="openrouter" (which fronts both) or host="litellm" — since every Galapagos model speaks the OpenAI chat-completions API through the single APIModel client.

Every host routes through the same APIModel. Authentication is the standard OpenAI env var:

Env var Used for
OPENAI_API_KEY the API key for all hosts (defaults to "EMPTY" for keyless local servers).
OPENAI_BASE_URL the fallback base_url when the host resolves to None (i.e. host="openai").
LITELLM_BASE_URL / VLLM_BASE_URL / HF_ENDPOINT_URL / AZURE_OPENAI_BASE_URL / BEDROCK_BASE_URL / ANTHROPIC_BASE_URL / GOOGLE_BASE_URL per-host base_url overrides (see the table).
gx.GalapagosModel.from_card(name="meta-llama/Llama-3-70B", host="togetherai")
gx.load_model("gpt-5.5", host="azure")              # base_url from AZURE_OPENAI_BASE_URL

from_card / load_model also accept temperature, max_tokens, and reasoning_effort, which flow through to the underlying client.


The three mandated load forms

Every model comes from one of three places — a Hugging Face endpoint, a hosting platform, or your own local vLLM server:

A model served behind an OpenAI-compatible HF endpoint (TGI / Inference Endpoint):

# export HF_ENDPOINT_URL=https://<your-endpoint>.endpoints.huggingface.cloud/v1
model = gx.GalapagosModel.from_card(name="Qwen/Qwen3-8B", host="hf")

A managed router / aggregator (OpenRouter, OpenAI, Together AI, Azure, Bedrock, LiteLLM):

model = gx.GalapagosModel.from_card(name="openai/gpt-5.5", host="openrouter")

Your own vLLM server (or any OpenAI-compatible endpoint), addressed by base_url:

model = gx.GalapagosModel.from_card(
    name="Qwen/Qwen3-8B", host="vllm",
    base_url="http://localhost:8000/v1",       # explicit > VLLM_BASE_URL > default
)


Loading from a model card

A ModelCard YAML can carry the name/host/defaults so a run is fully reproducible from disk:

model = gx.GalapagosModel.from_card(path="my_model_card.yaml")

name, host, temperature, max_tokens, and reasoning_effort are read from the card; explicit arguments override the card.


Roles and the scaffold default

A ScaffoldCard declares a default model and the roles it plays:

model:
  default: openai/gpt-5
  host: openrouter
  roles: [propose]          # a reflective method, e.g., declares [propose, reflect, merge]

When you call a scaffold's from_card() without passing model=, the scaffold builds this default itself. Methods that use distinct models per role (e.g. one model diagnoses, another implements) declare multiple roles; the Proposer reads them off the resolved model.

Coding agents are not load_model models

An agent-as-operator method drives a CLI coding agent. That agent is the Proposer's variation operator, exposed through env, not something you pass to from_card. See Core components — Proposer.


See also