Opik
Evaluate, test, and ship LLM applications with a suite of observability tools to calibrate language model outputs across your dev and production lifecycle.
Opik is an open‑source platform that provides end‑to‑end observability for generative AI applications, from prototype to production. It records detailed traces of LLM calls, conversation flows, and agent activity, allowing developers to monitor performance, track costs, and visualize the execution of complex systems. The platform also includes evaluation tools such as prompt testing, LLM‑as‑a‑judge, and experiment management, enabling systematic assessment of model outputs.
Built for developers, Opik offers a suite of features that support software‑testing‑style workflows for AI agents. Users can define regression tests and assertions, run end‑to‑end agent playground sessions, and automatically generate fixes that are written back to the codebase with accompanying test cases. These capabilities aim to reduce guesswork and streamline the iteration cycle for LLM‑driven products.
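As a rough illustration of what a regression test with assertions might look like for an LLM application, the sketch below checks an agent's outputs against expected substrings. It is plain Python, not Opik's API: `Case`, `run_regression`, and `toy_agent` are hypothetical names, and the substring check is a simple placeholder for richer assertions such as LLM-as-a-judge scoring.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Case:
    """One regression case: a prompt and a substring the output must contain."""
    prompt: str
    must_contain: str


def run_regression(app: Callable[[str], str], cases: List[Case]) -> List[str]:
    """Run every case and return failure messages; an empty list means all passed."""
    failures = []
    for case in cases:
        output = app(case.prompt)
        # Case-insensitive substring assertion.
        if case.must_contain.lower() not in output.lower():
            failures.append(
                f"{case.prompt!r}: expected {case.must_contain!r} in {output!r}"
            )
    return failures


def toy_agent(prompt: str) -> str:
    # Hypothetical stand-in for an LLM-backed agent.
    if "refund" in prompt.lower():
        return "Refunds are processed within 5 days."
    return "I am not sure."


cases = [
    Case("How do refunds work?", "refund"),
    Case("What is your name?", "assistant"),
]
failures = run_regression(toy_agent, cases)
```

Running the same cases after every prompt or model change turns "does it still work?" into a repeatable check: here the first case passes and the second is reported as a failure.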
The project is self‑hostable, released under the Apache‑2.0 license, and offers a free tier that requires no subscription. It is positioned as a stable, production‑ready solution for teams that need deep tracing, online evaluation rules, and prompt optimization while keeping the transparency of open source.
Similar apps

Databases & Data Tools
Comet
Platform for tracking, visualizing, and managing machine learning experiments.

AI Coding Agents
Agenta
LLMOps platform for prompt management, LLM evaluation, and observability. Build, evaluate, and monitor production-grade LLM applications…

AI Coding Agents
Langfuse
LLM engineering platform for model tracing, prompt management, and application evaluation. Langfuse helps teams collaboratively debug…

AI Coding Agents
Arize Phoenix
Open-source platform for LLM tracing, evaluation, and optimization. Features automatic instrumentation, prompt playground, and real-time AI…

AI Coding Agents
Dify.ai
Build, test and deploy LLM applications.

AI Coding Agents
LangSmith
Observability platform for LLM applications, tracking prompts, latency, and costs.