Edgee is an edge-native AI gateway that optimizes prompts using intelligent token compression, removing redundancy while preserving meaning before forwarding requests to LLM providers. It enables users to tag requests with custom metadata for usage tracking and cost monitoring.
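For illustration, here is a minimal sketch of tagging a request through the gateway, assuming an OpenAI-compatible endpoint and a header-based tagging convention. The gateway URL, environment variable, and `X-Edgee-Tag-*` header names are placeholders, not Edgee's documented interface.

```python
# Sketch: tag a request with custom metadata for usage and cost tracking.
# Assumptions: the gateway exposes an OpenAI-compatible /v1 endpoint and
# accepts tags as request headers. URL and header names are hypothetical.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.example.com/v1",  # placeholder gateway URL
    api_key=os.environ["EDGEE_API_KEY"],        # placeholder key variable
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize our Q3 support tickets."}],
    # Hypothetical tagging headers for per-team cost attribution.
    extra_headers={"X-Edgee-Tag-Team": "support", "X-Edgee-Tag-Env": "prod"},
)
print(response.choices[0].message.content)
```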
Key features include token compression that cuts input tokens by up to 50% while preserving context and intent, universal compatibility with LLM providers such as OpenAI, Anthropic, Gemini, xAI, and Mistral, and cost governance with tagging and alerting. The platform also offers edge tools for invoking shared or private tools close to the request, observability for monitoring latency and usage, edge models for running small models locally, and deployment of private models.
Edgee sits between applications and LLM providers behind a single OpenAI-compatible API. At the edge it applies policies such as routing, privacy controls, and retries before forwarding requests to providers. It also normalizes responses across models, provides observability for debugging production AI traffic, and keeps costs in check through routing policies and caching.
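As a sketch of the drop-in pattern, the snippet below points the standard OpenAI Python SDK at the gateway's base URL and calls models from two different providers through the same client. The base URL is a placeholder assumption, and the model identifiers are illustrative; available models depend on the providers configured.

```python
# Sketch: one OpenAI-compatible client, multiple upstream providers.
# The base URL is a placeholder; model names depend on your configured providers.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.example.com/v1",  # placeholder gateway URL
    api_key=os.environ["EDGEE_API_KEY"],        # placeholder key variable
)

for model in ("gpt-4o-mini", "claude-3-5-haiku-latest"):
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Give one test idea for a rate limiter."}],
    )
    # Responses come back in the same normalized shape regardless of provider.
    print(model, "->", reply.choices[0].message.content)
```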
The primary benefit is lower LLM costs from compressing prompts at the edge, which pays off most for long contexts, RAG pipelines, and multi-turn agents. Cost controls and optimization let teams ship AI features faster and with confidence.
Target users are developers and organizations using multiple LLM providers who need cost optimization, governance, and observability for production AI applications. The platform supports TypeScript, Python, Go, Rust, curl, and OpenAI SDK integrations, with Bring Your Own Keys for billing control and custom models.
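As a rough sketch of the Bring Your Own Keys idea, the snippet below forwards a provider key alongside requests so provider billing stays on your own account. The header name is purely hypothetical, and in practice BYOK may be configured out of band (for example, in a dashboard) rather than per request.

```python
# Sketch only: Bring Your Own Keys (BYOK).
# The X-Provider-Api-Key header is hypothetical; consult the gateway
# documentation for how provider keys are actually supplied.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.example.com/v1",  # placeholder gateway URL
    api_key=os.environ["EDGEE_API_KEY"],        # placeholder key variable
    # Hypothetical: pass your own OpenAI key so usage is billed to your account.
    default_headers={"X-Provider-Api-Key": os.environ["OPENAI_API_KEY"]},
)

print(
    client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello from a BYOK request."}],
    ).choices[0].message.content
)
```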