LiteLLM

Overview

LiteLLM is an open-source Python library and proxy server that provides a unified OpenAI-compatible interface to 100+ LLM providers. Built by BerriAI, it translates API calls into each provider's native format, letting developers switch models without changing code. It offers both a Python SDK for direct integration and a standalone proxy server (LLM Gateway) for centralized management with virtual keys, rate limiting, and spend tracking.
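The unified interface can be sketched with a minimal example: the same OpenAI-style payload is sent to any provider through LiteLLM's `completion()` call. Model names here are illustrative, and the sketch assumes `litellm` is installed with provider API keys set in the environment.

```python
# Minimal sketch of LiteLLM's unified interface (model names illustrative).
from typing import Any


def build_request(model: str, prompt: str) -> dict[str, Any]:
    """OpenAI-style payload that LiteLLM accepts for every provider."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def ask(model: str, prompt: str) -> str:
    import litellm  # imported lazily so the payload helper stays dependency-free

    response = litellm.completion(**build_request(model, prompt))
    # Responses are normalized to the OpenAI schema regardless of provider.
    return response.choices[0].message.content


if __name__ == "__main__":
    # Same call shape whether the target is an OpenAI or Anthropic model:
    print(ask("gpt-4o", "Say hello."))
```

Switching providers is a one-string change to `model`, which is the core of the "switch models without changing code" claim.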

Formidability

Score: 7/10

LiteLLM is the most direct open-source competitor to OpenRouter's core routing functionality. Its broad coverage (100+ providers) and OpenAI-compatible API make it a credible alternative for teams willing to self-host. The March 2026 feature push significantly expanded its platform surface: a built-in Chat UI, agent management with RBAC and budgets, MCP integration with OAuth 2.1 PKCE, a Responses WebSocket API, and new guardrails (PIPEDA, CrowdStrike). It still requires significant DevOps investment to run in production (2-4 weeks of setup, 10-20 hrs/month of maintenance), and the open-core model means enterprise features require paid licenses. Development cadence is extraordinary: multiple releases per day, 1,005+ contributors, and day-0 model support for GPT-5.4 and Gemini 3.1. Build tooling migrated from Poetry to uv, and Python 3.9 support was dropped (minimum 3.10) in v1.83.10.

Markets

  • Primary: Engineering teams and platform teams who want to self-host an LLM gateway
  • Secondary: Enterprises needing on-prem/private cloud LLM routing with compliance controls
  • Geographic: Global (open-source, self-hosted)

Products

  • Python SDK — unified API to call any LLM provider with consistent request/response format
  • Proxy Server (LLM Gateway) — centralized gateway with auth, virtual keys, rate limiting, spend tracking (now with Redis-backed distributed budget enforcement and multiple concurrent budget windows per key/team), Responses WebSocket API, health-check-driven routing, order-based deployment fallback priorities
  • Chat UI — built-in ChatGPT-like web interface with MCP tools and streaming (new March 2026)
  • Agent Management — agent RBAC, health checks, budget/TPM/RPM limits per agent and session
  • MCP Integration — BYOK MCP servers with OAuth 2.1 PKCE, admin tool overrides, Google Search API, OpenAPI MCP servers, non-admin submission with review workflow, token auth, AWS SigV4 for Bedrock AgentCore, zero trust auth pattern, per-user OAuth token storage for interactive flows, per-server initialization instructions exposed from gateway
  • Enterprise — SSO, RBAC, admin UI, guardrails (PIPEDA, CrowdStrike AIDR, Prisma AIRS, DynamoAI, PromptGuard; RestrictedPython sandbox for custom guardrail code) with project-level scoping and system message skip option, audit logs + export, Prometheus metrics (optimized: 18-bucket latency histograms), Hashicorp Vault, tool policies, Datadog metrics tracing, Public AI Hub (shareable model/agent catalog), AWS KMS v2 key decryption (Beta), web crawler blocking, per-model rate limits for teams, AWS GovCloud mode, bulk team permissions API, safety_identifier compliance tracking
  • Token Counting — a public count_tokens() API with OpenAI-compatible token counting
  • Anthropic Files API — native support for Anthropic's file upload API
  • SageMaker Nova — Amazon Nova models via SageMaker endpoint (added March 2026)
  • Agent Framework Integrations — LangGraph, Pydantic AI Agents, Manus listed as supported agent frameworks; AgentCore A2A-native agent support (JSON-RPC envelope preservation); Anthropic advisor tool type support
  • Speech/Audio — ElevenLabs and Deepgram provider support
  • Image Generation/Editing — Black Forest Labs (FLUX), Stability AI, Recraft, Fal AI, RunwayML
  • Embeddings — Voyage AI, Jina AI dedicated embedding providers
  • EU Sovereign AI — Nscale provider for EU-sovereign deployments
  • Nvidia NIM — dedicated NVIDIA inference platform support
  • Docker Model Runner — local Docker-based model deployment support
  • Triton Inference Server — NVIDIA Triton self-hosted inference support
  • Milvus — vector store integration for RAG workflows
  • Oracle OCI — Oracle Cloud Infrastructure AI provider support
  • Additional Providers — Fal AI, RunwayML, GitHub Copilot, Morph, RAGFlow, Heroku, Snowflake, Codestral API [Mistral AI], Abliteration, Petals, and others (90+ in extended list)
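The routing, load-balancing, and fallback features above are exposed in the SDK through LiteLLM's `Router` class. The sketch below shows an ordered fallback chain; deployment aliases and API keys are placeholders, not real credentials.

```python
# Hedged sketch of LiteLLM's Router with an ordered fallback chain.
# Model aliases and API keys below are placeholders.

MODEL_LIST = [
    {
        "model_name": "chat",  # alias that callers use
        "litellm_params": {"model": "gpt-4o", "api_key": "sk-placeholder"},
    },
    {
        "model_name": "chat-backup",
        "litellm_params": {
            "model": "anthropic/claude-3-5-sonnet-20240620",
            "api_key": "sk-ant-placeholder",
        },
    },
]


def make_router():
    from litellm import Router  # lazy import: the config above is plain data

    return Router(
        model_list=MODEL_LIST,
        fallbacks=[{"chat": ["chat-backup"]}],  # try the backup if "chat" fails
    )


if __name__ == "__main__":
    router = make_router()
    resp = router.completion(
        model="chat",
        messages=[{"role": "user", "content": "hi"}],
    )
    print(resp.choices[0].message.content)
```

The same `model_list`/fallback structure is what the proxy server consumes from its YAML config, so the SDK sketch doubles as a mental model for the gateway's deployment-priority routing.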

Pricing

Tier | Cost | Key Features
Open Source | $0 | Routing, load balancing, basic logging, 100+ providers
Enterprise Basic | $250/mo | Prometheus metrics, guardrails, JWT auth, SSO, audit logs
Enterprise Premium | $30,000/yr | Full compliance, managed support, dedicated channels
Managed by LiteLLM | Custom | LiteLLM hosts and maintains the proxy for you

Also available on AWS Marketplace. Free 30-day enterprise trial available.

URLs to Monitor

URL | Label | Notes
https://litellm.ai | Homepage | Landing page (JS-rendered)
https://docs.litellm.ai | Docs Home | Documentation hub
https://docs.litellm.ai/docs/providers | Providers | Supported providers list
https://docs.litellm.ai/docs/proxy/configs | Proxy Config | Proxy configuration reference
https://docs.litellm.ai/docs/proxy/enterprise | Enterprise | Enterprise features
https://docs.litellm.ai/docs/proxy/model_management | Model Management | Model CRUD and config
https://docs.litellm.ai/docs/completion/input | Completion API | API reference
https://github.com/BerriAI/litellm/releases | GitHub Releases | Open-source releases

Strategy

  • Open-core model: Free OSS proxy captures developer mindshare; enterprise features (SSO, RBAC, UI) drive revenue
  • Rapid release cadence: Multiple releases per day and 1,005+ contributors; day-0 model support for GPT-5.4, Gemini 3.1, and Gemini 3.1 Flash Live Preview
  • Self-host first: Positioning as the go-to choice for teams that need on-prem or private-cloud LLM routing
  • OpenAI compatibility: Full OpenAI API compatibility including Responses API (now with prompt management) and WebSocket real-time API
  • Platform expansion: Moving beyond pure routing into agent management, MCP tool orchestration (OpenAPI servers, non-admin submission, token auth, per-server health checks), built-in Chat UI with MCP + Responses API, Anthropic Files API, guardrails, and agent framework integrations (LangGraph, Pydantic AI, Manus) — becoming a full AI platform
  • Enterprise upsell: Enterprise tiers ($250/mo - $30K/yr) target teams that outgrow the free tier and need compliance features
  • Security & compliance: Adding guardrails (PIPEDA, CrowdStrike AIDR), Hashicorp Vault, tool policies — deepening enterprise appeal
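The OpenAI-compatibility strategy means an unmodified OpenAI client can be pointed at a self-hosted LiteLLM proxy. The sketch below assumes the proxy is running on its default port with a virtual key minted by the gateway; the URL, key, and model alias are placeholders.

```python
# Sketch of the OpenAI-compatibility claim: the official OpenAI SDK,
# unchanged, talking to a self-hosted LiteLLM proxy.
# URL, virtual key, and model alias are placeholder assumptions.

PROXY_BASE_URL = "http://localhost:4000"  # default LiteLLM proxy port
VIRTUAL_KEY = "sk-litellm-placeholder"    # virtual key issued by the proxy


def make_client():
    from openai import OpenAI  # official OpenAI SDK, no LiteLLM import needed

    # Only the base_url and key change; all call sites stay as-is.
    return OpenAI(base_url=PROXY_BASE_URL, api_key=VIRTUAL_KEY)


if __name__ == "__main__":
    client = make_client()
    resp = client.chat.completions.create(
        model="chat",  # alias defined in the proxy's model list
        messages=[{"role": "user", "content": "hello"}],
    )
    print(resp.choices[0].message.content)
```

This is the lock-in angle of the strategy: because only `base_url` and the key change, migrating an existing OpenAI-based codebase onto the gateway (with its virtual keys, rate limits, and spend tracking) is a configuration change rather than a rewrite.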