KostAI

Cut LLM spend by up to 92 percent with governed routing

KostAI is an open‑source, local‑first toolkit that inserts a control layer into any LLM‑powered workflow. It measures each AI call, routes the request to an appropriately sized model, applies compression or caching, and enforces quality gates before any token cost is incurred. It is designed to install quickly, with no wizards or configuration files, so a team can run a pilot on a small workload and get a plain‑English report of the savings.
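The measure, route, cache, and gate steps described above can be illustrated with a minimal sketch. This is not KostAI's actual API: the tier names, prices, the `Router` class, the four-characters-per-token heuristic, and the per-call budget cap standing in for a quality gate are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field

# Hypothetical model tiers: (name, max estimated tokens, price per 1K tokens).
# The names and prices are illustrative assumptions, not real vendor pricing.
TIERS = [
    ("small-model", 500, 0.0005),
    ("medium-model", 4000, 0.003),
    ("large-model", float("inf"), 0.03),
]


def estimate_tokens(prompt: str) -> int:
    """Rough heuristic (assumption): about four characters per token."""
    return max(1, len(prompt) // 4)


def pick_model(prompt: str) -> tuple[str, float]:
    """Return the cheapest tier that fits the prompt, plus its estimated cost."""
    tokens = estimate_tokens(prompt)
    for name, max_tokens, price_per_1k in TIERS:
        if tokens <= max_tokens:
            return name, tokens * price_per_1k / 1000
    raise RuntimeError("unreachable: the last tier has no upper bound")


@dataclass
class Router:
    max_cost_per_call: float            # pre-call gate (assumed form of a gate)
    cache: dict = field(default_factory=dict)
    spend: float = 0.0                  # running total of estimated spend
    log: list = field(default_factory=list)  # per-call audit trail

    def call(self, prompt: str, backend) -> str:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self.cache:
            # Cache hit: no tokens are spent.
            self.log.append(("cache", 0.0))
            return self.cache[key]
        model, cost = pick_model(prompt)
        # The gate runs before the backend is called, so a rejected
        # request never incurs any cost.
        if cost > self.max_cost_per_call:
            raise ValueError(f"estimated cost {cost:.4f} exceeds the budget")
        answer = backend(model, prompt)
        self.cache[key] = answer
        self.spend += cost
        self.log.append((model, cost))
        return answer
```

Here `backend` is any callable that takes `(model, prompt)` and returns text, so the sketch can be tried with a stub before wiring in a real client. The `log` list is the kind of record a plain‑English savings report could be built from.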

It targets developers, operations teams, and product managers who need to manage token spend for AI agents, research assistants, or content‑generation pipelines. Its voluntary “skills” can be adopted per user or per team, so it avoids mandatory telemetry and centralized monitoring while still giving managers an audit trail.

What distinguishes the toolkit is its emphasis on measured pilots and proof before rollout. Its routing architecture reduced token costs by up to 92 percent in demonstrated tests. The code is MIT‑licensed, hosted publicly on GitHub, and designed to integrate with existing LLM stacks without requiring a separate dashboard.

Similar apps

AI Coding Agents

CodeRouter

Cut your AI coding bill 70% with automatic task routing

System Monitoring & Maintenance

AgenSights

Know exactly which AI agent is burning your budget.

AI Coding Agents

AI App Cost Savings Video Series

Practical patterns for reducing LLM costs in production apps

Budgeting & Personal Finance

Traeco

Cost Optimization for AI Agents

DevOps & Infrastructure

Manifest

Complete backend that fits into 1 YAML file.

AI Coding Agents

Korven

AI agents can act. But they have zero security