LLM Providers
Runra Runtime supports multiple LLM providers. Configure them once and any agent adapter can use any provider. This decouples your agent choice from your model choice.
Built-in Providers
| Provider | Key | Models |
|---|---|---|
| OpenAI | openai | gpt-4o, gpt-4o-mini, gpt-5, o1, o3, o4-mini |
| Anthropic | anthropic | claude-sonnet-4-20250514, claude-opus-4-20250514, claude-3-5-haiku-20241022 |
| Gemini | gemini | gemini-2.5-pro, gemini-2.5-flash |
| OpenRouter | openrouter | Any OpenRouter model |
| Ollama | ollama | Any local Ollama model |
| vLLM | vllm | Any vLLM-served model |
| Custom | custom | OpenAI-compatible endpoint |
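Because each provider is identified by a plain string key, the provider can be chosen at startup rather than hard-coded. The sketch below validates a raw string against the keys in the table above. The helper names (`isProviderKey`, `resolveProviderKey`) and the `RUNRA_LLM_PROVIDER` variable mentioned afterwards are assumptions made for illustration, not part of the runtime API.

```typescript
// Provider keys taken from the table above.
const PROVIDER_KEYS = [
  "openai",
  "anthropic",
  "gemini",
  "openrouter",
  "ollama",
  "vllm",
  "custom",
] as const;

type ProviderKey = (typeof PROVIDER_KEYS)[number];

// Hypothetical helper: narrows an arbitrary string to a known provider key.
function isProviderKey(value: string): value is ProviderKey {
  return PROVIDER_KEYS.some((key) => key === value);
}

// Hypothetical helper: returns the fallback when nothing is set,
// and throws on a value that is not one of the built-in keys.
function resolveProviderKey(
  raw: string | undefined,
  fallback: ProviderKey = "openai",
): ProviderKey {
  if (raw === undefined || raw.trim() === "") return fallback;
  const key = raw.trim().toLowerCase();
  if (!isProviderKey(key)) {
    throw new Error(
      `Unknown LLM provider "${raw}". Expected one of: ${PROVIDER_KEYS.join(", ")}`,
    );
  }
  return key;
}
```

A caller could pass `resolveProviderKey(process.env.RUNRA_LLM_PROVIDER)` as the `provider` value, so a typo in the environment fails fast at startup instead of surfacing later as a failed request.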
Configuration
OpenAI

    import { Runra } from "@runra/runtime";

    const runra = new Runra({
      llm: {
        provider: "openai",
        config: {
          apiKey: process.env.OPENAI_API_KEY,
          model: "gpt-5",
          baseUrl: "https://api.openai.com/v1", // optional
          organization: "org-abc123", // optional
          temperature: 0.7, // optional (default: 0.3)
          maxTokens: 4096, // optional
          requestTimeoutMs: 30000, // optional (default: 30000)
        },
      },
      agent: {
        provider: "claude-code",
        config: {
          model: "gpt-5", // Override: agent uses this model via the OpenAI provider
        },
      },
      // ... sandbox and observability config
    });

Anthropic
    const runra = new Runra({
      llm: {
        provider: "anthropic",
        config: {
          apiKey: process.env.ANTHROPIC_API_KEY,
          model: "claude-sonnet-4-20250514",
          maxTokens: 8192,
          thinking: {
            type: "enabled",
            budgetTokens: 4000,
          },
        },
      },
      agent: {
        provider: "claude-code",
        config: {
          model: "claude-sonnet-4-20250514",
          permissionMode: "auto-approve",
        },
      },
    });

Gemini
    const runra = new Runra({
      llm: {
        provider: "gemini",
        config: {
          apiKey: process.env.GEMINI_API_KEY,
          model: "gemini-2.5-pro",
          temperature: 0.3,
          maxOutputTokens: 8192,
          safetySettings: [
            { category: "HARM_CATEGORY_DANGEROUS_CONTENT", threshold: "BLOCK_ONLY_HIGH" },
          ],
        },
      },
    });

OpenRouter
Use OpenRouter to access hundreds of models through a single API key:

    const runra = new Runra({
      llm: {
        provider: "openrouter",
        config: {
          apiKey: process.env.OPENROUTER_API_KEY,
          model: "anthropic/claude-sonnet-4-20250514",
          baseUrl: "https://openrouter.ai/api/v1",
          appName: "my-runra-agent", // Required by OpenRouter
          headers: {
            // Optional extra headers
            "HTTP-Referer": "https://myapp.com",
          },
        },
      },
    });

You can switch models without changing providers:
    // Swap models by changing one value
    const llmConfig = {
      provider: "openrouter",
      config: {
        apiKey: process.env.OPENROUTER_API_KEY,
        // model: "anthropic/claude-sonnet-4-20250514",
        // model: "openai/gpt-5",
        // model: "google/gemini-2.5-pro",
        model: "meta-llama/llama-4-maverick",
      },
    };

Local Models
Ollama

Run models locally with Ollama:

    const runra = new Runra({
      llm: {
        provider: "ollama",
        config: {
          model: "llama3.3:70b",
          baseUrl: "http://localhost:11434", // Ollama default
          temperature: 0.1,
          numCtx: 32768, // Context window size
        },
      },
    });

Make sure Ollama is running and the model is pulled:

    # Start the Ollama server (skip this if it is already running)
    ollama serve

    # In another terminal, pull the model
    ollama pull llama3.3:70b

vLLM

Use a vLLM server for high-throughput local inference:
    const runra = new Runra({
      llm: {
        provider: "vllm",
        config: {
          model: "meta-llama/Llama-4-Maverick-17B-128E-Instruct",
          baseUrl: "http://localhost:8000/v1",
          apiKey: "not-needed", // vLLM typically doesn't require auth
          maxTokens: 4096,
        },
      },
    });

Provider Configuration Reference
    interface LLMConfig {
      /** Provider key */
      provider: "openai" | "anthropic" | "gemini" | "openrouter" | "ollama" | "vllm" | "custom";

      /** Provider-specific configuration */
      config: {
        /** API key for the provider */
        apiKey?: string;

        /** Override base URL (for proxies, local models, custom endpoints) */
        baseUrl?: string;

        /** Model name / ID */
        model: string;

        /** Temperature (0.0 - 2.0). Lower = more deterministic */
        temperature?: number;

        /** Maximum tokens per response */
        maxTokens?: number;

        /** Request timeout in milliseconds */
        requestTimeoutMs?: number;

        /** Maximum retries on transient errors */
        maxRetries?: number;

        /** Additional HTTP headers */
        headers?: Record<string, string>;
      };
    }

Custom OpenAI-Compatible Provider
Any service with an OpenAI-compatible API can be used as a provider:

    const runra = new Runra({
      llm: {
        provider: "custom",
        config: {
          baseUrl: "https://my-proxy.internal/v1",
          apiKey: process.env.MY_PROXY_KEY,
          model: "custom-fine-tuned-model",
          temperature: 0.3,
          headers: {
            "X-Custom-Header": "value",
          },
        },
      },
    });

Writing a Full Custom Provider
Implement the LLMProvider interface for complete control:

    import { Runra } from "@runra/runtime";
    import type { LLMProvider } from "@runra/runtime";

    interface LLMRequest {
      messages: Array<{ role: string; content: string | Array<unknown> }>;
      tools?: Array<{ name: string; description: string; input_schema: unknown }>;
      systemPrompt?: string;
      temperature?: number;
      maxTokens?: number;
    }

    interface LLMResponse {
      content: string;
      toolCalls?: Array<{ id: string; name: string; input: unknown }>;
      stopReason: "end_turn" | "tool_use" | "max_tokens" | "stop";
      usage: {
        inputTokens: number;
        outputTokens: number;
      };
    }

    class MyCustomLLMProvider implements LLMProvider {
      readonly id = "my-custom-provider";

      private apiKey!: string;
      private baseUrl!: string;

      async initialize(config: Record<string, unknown>): Promise<void> {
        this.apiKey = config.apiKey as string;
        this.baseUrl = (config.baseUrl as string) || "https://my-api.example.com";
      }

      async chat(request: LLMRequest): Promise<LLMResponse> {
        const response = await fetch(`${this.baseUrl}/chat`, {
          method: "POST",
          headers: {
            "Authorization": `Bearer ${this.apiKey}`,
            "Content-Type": "application/json",
          },
          body: JSON.stringify({
            model: "my-model",
            messages: this.buildMessages(request),
            tools: request.tools,
            temperature: request.temperature ?? 0.3,
            max_tokens: request.maxTokens ?? 4096,
          }),
        });

        // Surface HTTP errors instead of parsing an error body as a completion
        if (!response.ok) {
          throw new Error(`LLM request failed: ${response.status} ${response.statusText}`);
        }

        const data = await response.json();
        return this.parseResponse(data);
      }

      // Prepend the system prompt (if any) to the conversation messages
      private buildMessages(request: LLMRequest) {
        const messages: Array<{ role: string; content: string | Array<unknown> }> = [];
        if (request.systemPrompt) {
          messages.push({ role: "system", content: request.systemPrompt });
        }
        messages.push(...request.messages);
        return messages;
      }

      // Assumes the endpoint returns an OpenAI-style chat completion body
      private parseResponse(data: any): LLMResponse {
        const choice = data.choices[0];
        const message = choice.message;

        return {
          content: message.content || "",
          toolCalls: message.tool_calls?.map((tc: any) => ({
            id: tc.id,
            name: tc.function.name,
            input: JSON.parse(tc.function.arguments),
          })),
          stopReason: this.mapStopReason(choice.finish_reason),
          usage: {
            inputTokens: data.usage?.prompt_tokens || 0,
            outputTokens: data.usage?.completion_tokens || 0,
          },
        };
      }

      // Map OpenAI-style finish reasons onto the stopReason values declared above
      private mapStopReason(reason: string): LLMResponse["stopReason"] {
        if (reason === "tool_calls") return "tool_use";
        if (reason === "length") return "max_tokens";
        return "stop";
      }

      async dispose(): Promise<void> {}
    }

    // Register and use
    Runra.registerLLMProvider("my-custom-provider", () => new MyCustomLLMProvider());

Fallback Providers
Configure fallbacks when your primary provider is unavailable:

    const runra = new Runra({
      llm: {
        provider: "openai",
        config: {
          apiKey: process.env.OPENAI_API_KEY,
          model: "gpt-5",
        },
        fallbacks: [
          {
            provider: "anthropic",
            config: {
              apiKey: process.env.ANTHROPIC_API_KEY,
              model: "claude-sonnet-4-20250514",
            },
          },
          {
            provider: "openrouter",
            config: {
              apiKey: process.env.OPENROUTER_API_KEY,
              model: "meta-llama/llama-4-maverick",
            },
          },
        ],
      },
    });

Fallbacks trigger when the primary provider returns 5xx errors or times out. They’re tried in order.
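The fallback rule described above (move to the next provider only on a 5xx response or a timeout, in the configured order) can be sketched as follows. This is an illustration of the behavior, not the runtime's actual implementation: `ChatProvider`, `FailureInfo`, `shouldFallBack`, and `chatWithFallbacks` are hypothetical names, and the error shape (a `status` or `timedOut` property on the thrown value) is an assumption.

```typescript
// Hypothetical provider shape for illustration only.
interface ChatProvider {
  name: string;
  chat(prompt: string): Promise<string>;
}

// Assumed shape of the extra fields a provider attaches to a thrown error.
type FailureInfo = { status?: number; timedOut?: boolean };

// Only server-side failures (5xx) and timeouts move on to the next provider.
function shouldFallBack(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const { status, timedOut } = err as FailureInfo;
  if (timedOut === true) return true;
  return typeof status === "number" && status >= 500 && status <= 599;
}

// Try the primary first, then each fallback in order.
async function chatWithFallbacks(
  providers: ChatProvider[],
  prompt: string,
): Promise<{ provider: string; reply: string }> {
  let lastError: unknown = new Error("No providers configured");
  for (const provider of providers) {
    try {
      const reply = await provider.chat(prompt);
      return { provider: provider.name, reply };
    } catch (err) {
      // Other errors (for example a 401 or 400) are rethrown immediately,
      // since switching providers would only hide a configuration problem.
      if (!shouldFallBack(err)) throw err;
      lastError = err;
    }
  }
  // Every provider failed with a 5xx or timeout: surface the last failure.
  throw lastError;
}
```

Under these assumptions, a primary that fails with status 503 hands the request to the first fallback, while a 401 from the primary is raised to the caller without trying the others.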
Environment Variables
All providers can read credentials from environment variables:
| Provider | Environment Variable |
|---|---|
| OpenAI | OPENAI_API_KEY |
| Anthropic | ANTHROPIC_API_KEY |
| Gemini | GEMINI_API_KEY |
| OpenRouter | OPENROUTER_API_KEY |
If apiKey is not set in config, the runtime reads it from the environment automatically.
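The lookup order this implies (an explicit `apiKey` in config first, then the provider's environment variable) can be sketched as below. `resolveApiKey` and `API_KEY_ENV_VARS` are hypothetical names used for illustration, and treating an empty string as "not set" is an assumption; the runtime's own resolution logic may differ.

```typescript
// Environment variable names from the table above.
const API_KEY_ENV_VARS: Record<string, string> = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  gemini: "GEMINI_API_KEY",
  openrouter: "OPENROUTER_API_KEY",
};

// Hypothetical helper: an explicit config value wins, otherwise fall back
// to the provider's environment variable (if the provider has one).
function resolveApiKey(
  provider: string,
  configuredKey: string | undefined,
  env: Record<string, string | undefined>,
): string | undefined {
  if (configuredKey) return configuredKey; // empty string counts as unset here
  const varName = API_KEY_ENV_VARS[provider];
  return varName ? env[varName] : undefined;
}
```

Passing the environment as an argument (for example `resolveApiKey("openai", undefined, process.env)`) keeps the function easy to test. Providers without an entry in the table, such as `ollama`, simply resolve to `undefined` when no key is configured.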
Next Steps
- Agent Adapters: connect AI agents
- Runtime Architecture: full stack overview
- Configuration Reference: all environment variables