LiteLLM
Overview
LiteLLM is an open-source Python library and proxy server that provides a unified, OpenAI-compatible interface to 100+ LLM providers. Built by BerriAI, it translates API calls into each provider's native format, letting developers switch models without changing application code. It offers both a Python SDK for direct integration and a standalone proxy server (LLM Gateway) for centralized management with virtual keys, rate limiting, and spend tracking.
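The "unified interface" idea can be sketched in a few lines: one OpenAI-style request gets translated into each provider's payload shape. This is an illustrative model only, not LiteLLM's actual internals, and the provider payload formats below are abbreviated, not exact schemas:

```python
# Simplified sketch of the unified-interface idea behind LiteLLM:
# one OpenAI-style request translated into per-provider payloads.
# Payload shapes are abbreviated illustrations, not exact schemas.

def to_provider_payload(provider: str, model: str, messages: list[dict]) -> dict:
    if provider == "openai":
        # OpenAI chat format: messages passed through as-is
        return {"model": model, "messages": messages}
    if provider == "anthropic":
        # Anthropic separates the system prompt from the message list
        system = [m["content"] for m in messages if m["role"] == "system"]
        rest = [m for m in messages if m["role"] != "system"]
        return {"model": model, "system": " ".join(system), "messages": rest}
    raise ValueError(f"unsupported provider: {provider}")

messages = [
    {"role": "system", "content": "Be terse."},
    {"role": "user", "content": "Hello"},
]
openai_payload = to_provider_payload("openai", "gpt-4o", messages)
anthropic_payload = to_provider_payload("anthropic", "claude-sonnet", messages)
```

In the real SDK this translation happens behind a single call such as `litellm.completion(model="anthropic/...", messages=...)`, so the caller only ever writes the OpenAI-style shape.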
Formidability
Score: 7/10
LiteLLM is the most direct open-source competitor to OpenRouter's core routing functionality. Its massive provider coverage (100+ providers) and OpenAI-compatible API make it a credible alternative for teams willing to self-host. The March 2026 feature push significantly expanded its platform surface: a built-in Chat UI, agent management with RBAC/budgets, MCP integration with OAuth 2.1 PKCE, a Responses WebSocket API, and new guardrails (PIPEDA, CrowdStrike). It still requires significant DevOps investment to run in production (2-4 weeks setup, 10-20 hrs/month maintenance), and the open-core model means enterprise features require paid licenses. Development cadence is extraordinary: multiple releases per day, 1,005+ contributors, and day-0 model support for GPT-5.4 and Gemini 3.1. The project has also migrated its build tooling from Poetry to uv and dropped Python 3.9 support (minimum 3.10) in v1.83.10.
Markets
- Primary: Engineering teams and platform teams who want to self-host an LLM gateway
- Secondary: Enterprises needing on-prem/private cloud LLM routing with compliance controls
- Geographic: Global (open-source, self-hosted)
Products
- Python SDK — unified API to call any LLM provider with consistent request/response format
- Proxy Server (LLM Gateway) — centralized gateway with auth, virtual keys, rate limiting, spend tracking (now with Redis-backed distributed budget enforcement and multiple concurrent budget windows per key/team), Responses WebSocket API, health-check-driven routing, order-based deployment fallback priorities
- Chat UI — built-in ChatGPT-like web interface with MCP tools and streaming (new March 2026)
- Agent Management — agent RBAC, health checks, budget/TPM/RPM limits per agent and session
- MCP Integration — BYOK MCP servers with OAuth 2.1 PKCE, admin tool overrides, Google Search API, OpenAPI MCP servers, non-admin submission with review workflow, token auth, AWS SigV4 for Bedrock AgentCore, zero trust auth pattern, per-user OAuth token storage for interactive flows, per-server initialization instructions exposed from gateway
- Enterprise — SSO, RBAC, admin UI, guardrails (PIPEDA, CrowdStrike AIDR, Prisma AIRS, DynamoAI, PromptGuard; RestrictedPython sandbox for custom guardrail code) with project-level scoping and system message skip option, audit logs + export, Prometheus metrics (optimized: 18-bucket latency histograms), Hashicorp Vault, tool policies, Datadog metrics tracing, Public AI Hub (shareable model/agent catalog), AWS KMS v2 key decryption (Beta), web crawler blocking, per-model rate limits for teams, AWS GovCloud mode, bulk team permissions API, safety_identifier compliance tracking
- Token Counting — public `count_tokens()` API with OpenAI-compatible token counting
- Anthropic Files API — native support for Anthropic's file upload API
- SageMaker Nova — Amazon Nova models via SageMaker endpoint (added March 2026)
- Agent Framework Integrations — LangGraph, Pydantic AI Agents, Manus listed as supported agent frameworks; AgentCore A2A-native agent support (JSON-RPC envelope preservation); Anthropic advisor tool type support
- Speech/Audio — ElevenLabs and Deepgram provider support
- Image Generation/Editing — Black Forest Labs (FLUX), Stability AI, Recraft, Fal AI, RunwayML
- Embeddings — Voyage AI, Jina AI dedicated embedding providers
- EU Sovereign AI — Nscale provider for EU-sovereign deployments
- Nvidia NIM — dedicated NVIDIA inference platform support
- Docker Model Runner — local Docker-based model deployment support
- Triton Inference Server — NVIDIA Triton self-hosted inference support
- Milvus — vector store integration for RAG workflows
- Oracle OCI — Oracle Cloud Infrastructure AI provider support
- Additional Providers — Fal AI, RunwayML, GitHub Copilot, Morph, RAGFlow, Heroku, Snowflake, Codestral API (Mistral AI), Abliteration, Petals, and others (90+ in extended list)
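The health-check-driven routing and order-based deployment fallback priorities listed above can be sketched as a simple selection rule. This is an illustrative model only; field names such as `order` and `healthy` are assumptions for the sketch, not LiteLLM's exact configuration schema:

```python
# Illustrative sketch of order-based fallback: pick the healthy
# deployment with the lowest `order` value; raise if none remain.
# Field names are assumptions, not LiteLLM's actual schema.

def pick_deployment(deployments: list[dict]) -> dict:
    healthy = [d for d in deployments if d["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy deployments for this model group")
    # Lower `order` means higher priority; min() breaks ties by position.
    return min(healthy, key=lambda d: d["order"])

deployments = [
    {"name": "azure-gpt4o-east", "order": 1, "healthy": False},
    {"name": "openai-gpt4o", "order": 2, "healthy": True},
    {"name": "bedrock-claude", "order": 3, "healthy": True},
]
chosen = pick_deployment(deployments)  # skips the unhealthy primary
```

The same shape generalizes to the proxy's real behavior: health checks mark deployments in or out, and requests cascade down the priority order rather than failing on the first outage.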
Pricing
| Tier | Cost | Key Features |
|---|---|---|
| Open Source | $0 | Routing, load balancing, basic logging, 100+ providers |
| Enterprise Basic | $250/mo | Prometheus metrics, guardrails, JWT auth, SSO, audit logs |
| Enterprise Premium | $30,000/yr | Full compliance, managed support, dedicated channels |
| Managed by LiteLLM | Custom | LiteLLM hosts and maintains the proxy for you |
Also available on AWS Marketplace. Free 30-day enterprise trial available.
URLs to Monitor
| URL | Label | Notes |
|---|---|---|
| https://litellm.ai | Homepage | Landing page (JS-rendered) |
| https://docs.litellm.ai | Docs Home | Documentation hub |
| https://docs.litellm.ai/docs/providers | Providers | Supported providers list |
| https://docs.litellm.ai/docs/proxy/configs | Proxy Config | Proxy configuration reference |
| https://docs.litellm.ai/docs/proxy/enterprise | Enterprise | Enterprise features |
| https://docs.litellm.ai/docs/proxy/model_management | Model Management | Model CRUD and config |
| https://docs.litellm.ai/docs/completion/input | Completion API | API reference |
| https://github.com/BerriAI/litellm/releases | GitHub Releases | Open-source releases |
Strategy
- Open-core model: Free OSS proxy captures developer mindshare; enterprise features (SSO, RBAC, UI) drive revenue
- Rapid release cadence: Multiple releases per day and 1,005+ contributors, with day-0 model support for GPT-5.4, Gemini 3.1, and Gemini 3.1 Flash Live Preview
- Self-host first: Positioning as the go-to choice for teams that need on-prem or private-cloud LLM routing
- OpenAI compatibility: Full OpenAI API compatibility including Responses API (now with prompt management) and WebSocket real-time API
- Platform expansion: Moving beyond pure routing into agent management, MCP tool orchestration (OpenAPI servers, non-admin submission, token auth, per-server health checks), built-in Chat UI with MCP + Responses API, Anthropic Files API, guardrails, and agent framework integrations (LangGraph, Pydantic AI, Manus) — becoming a full AI platform
- Enterprise upsell: Enterprise tiers ($250/mo - $30K/yr) target teams that outgrow the free tier and need compliance features
- Security & compliance: Adding guardrails (PIPEDA, CrowdStrike AIDR), Hashicorp Vault, tool policies — deepening enterprise appeal
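The spend-control story underpinning the enterprise upsell, multiple concurrent budget windows per key with Redis-backed enforcement, can be sketched with an in-memory counter. A plain dict stands in for the Redis counters a distributed proxy would use, and the window names and field names here are assumptions for illustration:

```python
import time

# Sketch of "multiple concurrent budget windows per key": a request is
# admitted only if it fits under EVERY active window (e.g. daily AND
# monthly caps). An in-memory dict stands in for distributed Redis
# counters; window names and caps below are illustrative assumptions.

WINDOWS = {"daily": (86_400, 10.0), "monthly": (2_592_000, 100.0)}  # (seconds, USD cap)
spend: dict[tuple, float] = {}  # (api_key, window, time_bucket) -> spend so far

def check_and_record(api_key: str, cost: float, now: float) -> bool:
    buckets = {w: (api_key, w, int(now // secs)) for w, (secs, _) in WINDOWS.items()}
    # Admit only if the request fits under every window's cap.
    for w, (_, cap) in WINDOWS.items():
        if spend.get(buckets[w], 0.0) + cost > cap:
            return False
    # Record the spend against every window at once.
    for w in WINDOWS:
        spend[buckets[w]] = spend.get(buckets[w], 0.0) + cost
    return True

t = time.time()
first = check_and_record("team-a", 9.0, t)    # fits both windows
second = check_and_record("team-a", 2.0, t)   # would exceed the 10.0 daily cap
```

Swapping the dict for Redis `INCRBYFLOAT` on keyed counters gives the distributed version: every proxy replica checks and increments the same shared windows, which is why a single key can be constrained by several budgets at once.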