Glama
Overview
Glama is an AI platform combining a unified LLM gateway with a leading MCP (Model Context Protocol) server directory and hosting platform. Positioned as the "#1 Platform for Discovering Every MCP Server," Glama offers an OpenAI-compatible API gateway to 100+ models alongside MCP server hosting with Firecracker VM isolation. Trusted by 50,000+ businesses including Databricks, Accenture, Shopify, and Cloudflare, serving ~10 billion tokens per day.
Markets
- AI developers needing unified multi-model API access (direct OpenRouter competitor)
- MCP ecosystem — server hosting, directory, tooling (12,610+ servers across 100+ categories)
- Enterprise AI teams requiring secure, auditable model access with analytics
- AI agent builders leveraging MCP for tool integration and automation
Products
- Gateway — OpenAI-compatible API at
gateway.glama.ai/v1to 100+ LLMs from OpenAI, Anthropic, Google, DeepSeek, Mistral, xAI, Cohere, Alibaba, and others - Chat — Web-based AI chat interface with file uploads, projects, memory, web search
- MCP Server Hosting — One-click deployment with Firecracker VM isolation, persistent storage
- MCP Directory — Largest MCP server discovery platform (12,610+ servers)
- MCP Inspector — Testing tool for remote and cloud-hosted MCP servers
- MCP Gateway — Reverse proxy between AI clients and MCP servers with OAuth 2.1, per-tool access control, JSON-RPC logging, SIEM-compatible audit exports
- Automations — Natural language workflow builder (Zapier/n8n alternative using AI)
Supported Models
| Provider | Models | Notes |
|---|---|---|
| OpenAI | GPT-5.4, GPT-5.1, GPT-5, GPT-5-mini, GPT-5-nano, GPT-4.5, GPT-4.1, GPT-4.1-mini, GPT-4.1-nano, GPT-4o, O3, O3-Mini High, O4-Mini | Full lineup |
| Anthropic | Claude Opus 4.6, Sonnet 4.6, Haiku 4.5 | Full lineup |
| Gemini 3.1 Pro, 3.1 Flash Lite, 3 Pro, 3 Flash, 2.5, 2.0 series | Full lineup including 3.1 generation | |
| DeepSeek | DeepSeek models | — |
| Mistral | Ministral, Devstral, Open-Mixtral | Budget to premium |
| xAI | Grok-4, Grok-4 fast variants (non-reasoning, reasoning, code), Grok-3 variants | $0.20–$5.00/M input |
| Cohere | Command series | $0.038–$2.50/M input |
| Alibaba | Qwen series | $0.25–$6.00/M input |
| Z.ai | GLM-5, GLM-4.7, GLM-4.7-flash, GLM-4.6, GLM-4.5 | $0.07–$1.00/M input |
| Moonshot | Kimi K2.5, K2 (0905/0711 preview), Kimi Latest | $0.20–$5.00/M input |
| Cloudflare | DeepSeek R1 Distill Qwen 32B | $0.50/M input |
| Deepinfra | Kimi K2.5 | $0.45/M input |
| Google AI Studio | Gemini 2.5 Flash Lite Preview, 2.5 Pro, 2.0 Flash | Separate from Vertex; Flash Lite at $0.25/$1.50/M |
| Perplexity | Sonar series | $1.00–$3.00/M input |
Last verified: 2026-04-17
Key Capabilities
| Capability | Status | Notes |
|---|---|---|
| OpenAI-compatible API | ✅ | Drop-in replacement at gateway.glama.ai/v1 using OpenAI SDKs |
| Multi-provider routing | ✅ | 100+ models across 13+ providers |
| Load balancing | ✅ | Built-in |
| Fallback configuration | ✅ | Built-in |
| Caching | ✅ | Built-in prompt caching |
| Streaming | ✅ | SSE support, improved smoothness (Mar 2026) |
| Analytics/cost tracking | ✅ | Real-time token consumption and cost visibility |
| Logging | ✅ | Complete interaction records, JSON export, 30-180 day retention |
| MCP server hosting | ✅ | Firecracker VM isolation, persistent volumes |
| Web search/fetch tools | ✅ | Built into chat interface |
| Reasoning support | ✅ | Three effort levels: low, medium, high |
| No rate limits | ✅ | Standard tier supports <1B tokens/day |
| End-to-end encryption | ✅ | — |
| MCP tool quality scoring | ✅ | TDQS — evaluates tool descriptions across 6 dimensions, open source |
| OAuth PKCE authentication | ✅ | Gateway supports OAuth PKCE for authentication |
| MCP Gateway (reverse proxy) | ✅ | Reverse proxy for MCP servers with OAuth 2.1 (auto token refresh), tool-level access control, SIEM-compatible audit logs, rate limiting, SLA protection, cost attribution labels |
Last verified: 2026-04-21
Pricing
| Tier | Price | Credits | MCP Servers | Logs |
|---|---|---|---|---|
| Starter | $9/mo | $9/mo | 3 fast ($4/additional) | 100k/mo, 30-day retention |
| Pro | $26/mo | $26/mo | 10 fast ($3/additional) | 100k/mo, 90-day retention |
| Business | $80/mo | $80/mo | 30 fast ($2/additional) | 100k/mo, 180-day retention |
- No hidden markups — pay provider prices for tokens
- Gateway API access included in all tiers
- Additional 100k logs: $9 (Starter), $6 (Pro), $3 (Business)
- No free tier — minimum $9/mo
Last verified: 2026-03-26
URLs to Monitor
| URL | Label | Notes |
|---|---|---|
https://glama.ai/gateway/models |
Models | Full model list with pricing |
https://glama.ai/pricing |
Pricing | Pricing tiers and details |
https://glama.ai/release-notes |
Release Notes | Product changelog |
https://glama.ai/gateway |
Gateway | Gateway product page and features |
https://glama.ai/blog |
Blog | Product announcements |
https://glama.ai/mcp/gateway |
MCP Gateway | MCP reverse proxy product page |
https://glama.ai/ai/models |
AI Models | New URL for model listing (replacing /gateway/models) |
https://glama.ai/ai/gateway |
AI Gateway | New URL for gateway page (replacing /gateway) |
https://glama.ai/sitemaps/pages.xml |
Sitemap | Page structure for new URL discovery |
Strategy
- MCP-first differentiation: Glama's biggest moat vs OpenRouter is its MCP ecosystem — largest directory, one-click VM-isolated hosting, inspector tooling. They're betting MCP becomes the standard for AI agent tooling.
- Vertical integration: Combining gateway + chat + MCP hosting + automations into a single platform, creating stickiness beyond just API routing.
- Enterprise push: Firecracker VM isolation, persistent storage, analytics, and tiered log retention signal enterprise ambitions.
- Rapid feature velocity: High cadence of product updates (multiple per week in Jan 2026), covering infrastructure, developer tools, and end-user features.
- No-markup pricing: Like OpenRouter, positioning on transparent pass-through pricing. The subscription tiers add platform value (MCP hosting, logs, tools) rather than marking up token costs.
Formidability
Score: 5/10
Glama is a credible competitor in the AI gateway space with real traction (50K+ businesses, 10B tokens/day). Their primary differentiation is the MCP ecosystem — largest directory, VM-isolated hosting, and now an MCP Gateway (reverse proxy with enterprise-grade observability and tool-level access control). The LLM gateway now includes OAuth PKCE authentication (previously missing). They still lack BYOK and have fewer models than OpenRouter. The mandatory $9/mo minimum limits grassroots adoption. The MCP Gateway product is a strategic move that deepens their enterprise positioning and creates stickiness beyond API routing. Direct threat to OpenRouter remains moderate — they compete on gateway but differentiate strongly on MCP.