metalworks talks to LLMs through the ChatModel protocol. You rarely construct adapters by hand — you name a model and metalworks resolves it.

Model refs

A model ref is provider:model-id or provider/model (the slash form matches the convention used by OpenRouter, LiteLLM, and most agent runtimes):
from metalworks import Metalworks

Metalworks(model="anthropic/claude-opus-4-6")
Metalworks(model="openai:gpt-5")
Metalworks(model="google/gemini-3-pro")
| Ref | Routes to | Needs |
| --- | --- | --- |
| anthropic/<id> | native Anthropic SDK | ANTHROPIC_API_KEY |
| openai/<id> | native OpenAI SDK | OPENAI_API_KEY |
| google/<id> (or gemini/<id>) | native Google SDK | GOOGLE_API_KEY / GEMINI_API_KEY, or Vertex AI (below) |
| openrouter/<vendor/model> | OpenRouter | OPENROUTER_API_KEY |
| openai-compatible/<id> | your OPENAI_BASE_URL endpoint | OPENAI_API_KEY + OPENAI_BASE_URL |
| meta-llama/llama-3-70b (any unknown vendor) | OpenRouter (the whole ref is the id) | OPENROUTER_API_KEY |
A bare slash ref with a known provider prefix, like anthropic/claude-opus, always routes to the native SDK; it never silently lands on OpenRouter.
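The routing rules in the table can be sketched as a small parser. This is an illustration only, not metalworks' own resolver: `parse_model_ref`, `Route`, and `KNOWN_PROVIDERS` are hypothetical names, and sending an unknown prefix written with a colon to OpenRouter is an assumption (the table only shows the slash form for that case).

```python
from typing import NamedTuple

# Prefixes the table maps to a dedicated backend (assumed to be the full set).
KNOWN_PROVIDERS = {
    "anthropic", "openai", "google", "gemini", "openrouter", "openai-compatible",
}


class Route(NamedTuple):
    provider: str  # backend that serves the call
    model_id: str  # id handed to that backend


def parse_model_ref(ref: str) -> Route:
    """Split 'provider:model-id' or 'provider/model' per the table above."""
    # Cut at whichever separator comes first, so 'openrouter/vendor/model'
    # is split only at its first slash.
    positions = [i for i in (ref.find(":"), ref.find("/")) if i != -1]
    if not positions:
        raise ValueError(f"not a model ref: {ref!r}")
    cut = min(positions)
    prefix, rest = ref[:cut], ref[cut + 1:]
    if not prefix or not rest:
        raise ValueError(f"not a model ref: {ref!r}")
    if prefix in KNOWN_PROVIDERS:
        # 'gemini/<id>' is an alias for the Google route.
        return Route("google" if prefix == "gemini" else prefix, rest)
    # Unknown vendor: the whole ref becomes the OpenRouter model id.
    return Route("openrouter", ref)
```

For example, `parse_model_ref("meta-llama/llama-3-70b")` keeps the full string as the model id, while `parse_model_ref("anthropic/claude-opus-4-6")` stays on the native route.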

No ref? Inferred from your keys

With no model, the provider is inferred from the first key present, checked in this order: Anthropic, OpenAI, Google. So Metalworks() with only OPENAI_API_KEY set uses OpenAI. If none of those is set, a lone OPENROUTER_API_KEY is the recognized single-key fallback: Metalworks() then talks to OpenRouter’s OpenAI-compatible endpoint, so one key reaches many models. A native key always wins over the OpenRouter key. You can also pin a default in ~/.config/metalworks/metalworks.toml:
provider = "anthropic"
model = "claude-opus-4-6"
Precedence: explicit model= ref > config file > first present key.
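The inference order for the no-ref case can be sketched as follows. `infer_provider` and `NATIVE_KEYS` are hypothetical names, not part of metalworks. Treating GEMINI_API_KEY as a Google key during inference is an assumption, and Vertex mode is left out for brevity. An explicit model= ref bypasses all of this.

```python
from typing import Mapping, Optional

# Native providers in the order they are checked; OpenRouter is only a fallback.
NATIVE_KEYS = [
    ("anthropic", ("ANTHROPIC_API_KEY",)),
    ("openai", ("OPENAI_API_KEY",)),
    ("google", ("GOOGLE_API_KEY", "GEMINI_API_KEY")),  # GEMINI_API_KEY: assumed
]


def infer_provider(
    env: Mapping[str, str], config_provider: Optional[str] = None
) -> Optional[str]:
    """Pick the provider used when no model= ref is given."""
    # A default pinned in the config file beats key inference.
    if config_provider:
        return config_provider
    # Otherwise the first native key present wins.
    for provider, names in NATIVE_KEYS:
        if any(env.get(name) for name in names):
            return provider
    # A lone OpenRouter key is the single-key fallback.
    if env.get("OPENROUTER_API_KEY"):
        return "openrouter"
    return None
```

Passing `os.environ` as `env` would apply the same order to the real environment; a plain dict keeps the sketch easy to test.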

Google via Vertex AI

The Google chat and embedding adapters can authenticate through Vertex AI (Application Default Credentials, e.g. a service account) instead of an API key. Set GOOGLE_GENAI_USE_VERTEXAI=true and provide a project and location:
export GOOGLE_GENAI_USE_VERTEXAI=true
export VERTEX_PROJECT_ID=...        # or GOOGLE_CLOUD_PROJECT
export VERTEX_LOCATION=us-central1  # or GOOGLE_CLOUD_LOCATION (default us-central1)
# credentials: GOOGLE_APPLICATION_CREDENTIALS=/path/to/sa.json, or ambient gcloud ADC
With Vertex mode on, provider inference routes to Google even when no GOOGLE_API_KEY is set. The project is required (VERTEX_PROJECT_ID or GOOGLE_CLOUD_PROJECT); the location defaults to us-central1.
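A minimal sketch of how those variables could be read, under stated assumptions: `vertex_settings` and `VertexSettings` are hypothetical names, the VERTEX_* variables are assumed to take priority over their GOOGLE_CLOUD_* alternatives, and only the literal value true (case-insensitive) is treated as enabling Vertex mode.

```python
from typing import Mapping, NamedTuple, Optional


class VertexSettings(NamedTuple):
    project: str
    location: str


def vertex_settings(env: Mapping[str, str]) -> Optional[VertexSettings]:
    """Return Vertex settings when Vertex mode is on, or None when it is off."""
    if env.get("GOOGLE_GENAI_USE_VERTEXAI", "").strip().lower() != "true":
        return None
    # The project is required; either variable name may supply it.
    project = env.get("VERTEX_PROJECT_ID") or env.get("GOOGLE_CLOUD_PROJECT")
    if not project:
        raise ValueError(
            "Vertex mode needs VERTEX_PROJECT_ID or GOOGLE_CLOUD_PROJECT"
        )
    # The location falls back to us-central1 when neither variable is set.
    location = (
        env.get("VERTEX_LOCATION")
        or env.get("GOOGLE_CLOUD_LOCATION")
        or "us-central1"
    )
    return VertexSettings(project, location)
```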

Any OpenAI-compatible endpoint

This is the “bring your own model” path. Any server that speaks the OpenAI chat-completions API — OpenRouter, vLLM, LM Studio, Together, Groq, a local runtime — works with no new adapter:
from metalworks import Metalworks
from metalworks.llm.adapters.openai import OpenAIChatModel

local = OpenAIChatModel(
    model_id="llama-3.1-70b",
    base_url="http://localhost:1234/v1",   # your endpoint
    api_key_env="LOCAL_LLM_KEY",           # the env var holding its key
    native_structured=False,               # use the schema-in-prompt ladder
)
Metalworks(chat=local).research("...", subreddits=["..."])
native_structured=False routes structured calls straight to the schema-in-prompt tier of the ladder, which is the safe choice for endpoints whose JSON-schema support varies. Leave it True if your endpoint enforces response_format reliably.
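To make "schema-in-prompt" concrete, here is a generic sketch of the idea: the JSON Schema is pasted into the prompt and the reply is parsed and checked on the client side. It does not reproduce metalworks' actual ladder, whose tiers are not described here; `schema_prompt` and `parse_reply` are hypothetical helpers, and the check covers only required keys.

```python
import json


def schema_prompt(task: str, schema: dict) -> str:
    """Embed the schema in the prompt instead of relying on response_format."""
    return (
        f"{task}\n\n"
        "Reply with a single JSON object that matches this JSON Schema:\n"
        f"{json.dumps(schema, indent=2)}"
    )


def parse_reply(reply: str, schema: dict) -> dict:
    """Extract the JSON object from a reply and check its required keys."""
    # Models often wrap the object in prose, so take the outermost braces.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object in reply")
    data = json.loads(reply[start:end + 1])
    missing = [key for key in schema.get("required", []) if key not in data]
    if missing:
        raise ValueError(f"reply is missing required keys: {missing}")
    return data
```

Because nothing here depends on the server enforcing a schema, the same approach works against any chat-completions endpoint, at the cost of handling malformed replies yourself.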

Fast vs main model

The research and discovery pipelines use a cheap “fast” model for triage and filtering and a capable model for synthesis and generation. Set both:
Metalworks(model="anthropic/claude-opus-4-6", fast_model="anthropic/claude-haiku-4-5")
If you set only model, the fast slot falls back to it. Resolve a pair directly with metalworks.config.resolve_models(model, fast_model).
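The fallback rule amounts to the following. `resolve_pair` is a hypothetical stand-in written for illustration; it is not the body of metalworks.config.resolve_models, and the return shape of that function is not documented here.

```python
from typing import Optional, Tuple


def resolve_pair(model: str, fast_model: Optional[str] = None) -> Tuple[str, str]:
    """Return (main, fast); the fast slot reuses the main model when unset."""
    return model, fast_model or model
```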

Embeddings

The pipeline embeds Reddit comments to cluster demand. You don’t configure this separately — it resolves from your environment, and never requires its own key:
| Present | Embeddings used |
| --- | --- |
| GOOGLE_API_KEY / Vertex | Google embeddings |
| else OPENAI_API_KEY | OpenAI embeddings |
| neither | local model: fastembed (BAAI/bge-small-en-v1.5, 384-dim), no key |
So a chat-only provider (Anthropic, OpenRouter, a local LLM) just works: embeddings fall back to the local model, downloaded once to the Hugging Face cache, then fully offline. A Google or OpenAI key is used automatically when present (higher quality, no download).
metalworks models warm          # pre-download the local model before your first run
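The selection order in the table can be sketched like this. `pick_embedding_backend` is a hypothetical helper whose return values are illustrative labels, and detecting Vertex mode through GOOGLE_GENAI_USE_VERTEXAI=true is an assumption carried over from the Vertex AI section.

```python
from typing import Mapping


def pick_embedding_backend(env: Mapping[str, str]) -> str:
    """Choose an embedding backend from whatever credentials are present."""
    vertex_on = env.get("GOOGLE_GENAI_USE_VERTEXAI", "").strip().lower() == "true"
    if env.get("GOOGLE_API_KEY") or vertex_on:
        return "google"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    # No usable key: fall back to the local fastembed model.
    return "fastembed:BAAI/bge-small-en-v1.5"
```

Note that an Anthropic or OpenRouter key never appears in the checks, which is why chat-only setups end up on the local model.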
Override explicitly by injecting a provider:
from metalworks import Metalworks
from metalworks.embeddings.adapters.openai import OpenAIEmbedding
Metalworks(embeddings=OpenAIEmbedding())     # force a specific embedding backend
Embedding vectors from different models live in incompatible spaces. metalworks stamps each cached index with an identity and refuses to mix them — switching embedding backend on an existing .metalworks/ project triggers a clear EmbeddingModelMismatch rather than silently degrading retrieval. Re-run research to rebuild the index under the new model.
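The stamping idea can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the stamp filename, the identity fields, and the local `EmbeddingModelMismatch` class, which only stands in for the error named above. metalworks' real on-disk format may differ.

```python
import json
from pathlib import Path


class EmbeddingModelMismatch(RuntimeError):
    """Stand-in for the error metalworks raises; defined here for the sketch."""


def check_or_stamp(index_dir: Path, identity: dict) -> None:
    """Stamp a new index with its embedding identity, or verify an existing stamp."""
    stamp = index_dir / "embedding_identity.json"  # hypothetical filename
    if stamp.exists():
        stored = json.loads(stamp.read_text())
        if stored != identity:
            # Vectors from different models are not comparable, so refuse to mix.
            raise EmbeddingModelMismatch(
                f"index was built with {stored}, current backend is {identity}"
            )
        return
    index_dir.mkdir(parents=True, exist_ok=True)
    stamp.write_text(json.dumps(identity, sort_keys=True))
```

Comparing the whole identity, rather than only the vector dimension, also catches two different models that happen to share a dimension.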

Check the resolution

metalworks doctor        # resolved chat + embedding models, keys found, actionable hints
metalworks models list   # the same, plus a provider × key × extra reachability matrix