Opik

Evaluate, test, and ship LLM applications with a suite of observability tools to calibrate language model outputs across your dev and production lifecycle.

Opik is an open‑source platform that provides end‑to‑end observability for generative AI applications, from prototype to production. It records detailed traces of LLM calls, conversation flows, and agent activity, allowing developers to monitor performance, track costs, and visualize the execution of complex systems. The platform also includes evaluation tools such as prompt testing, LLM‑as‑a‑judge, and experiment management, enabling systematic assessment of model outputs.
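To make the idea of recording traces concrete, here is a minimal sketch in plain Python. It does not use Opik's SDK: the `traced` decorator, the in-memory `TRACES` list, and the `fake_llm_call` stand-in are all hypothetical names chosen for illustration. It only shows the general pattern of capturing inputs, outputs, errors, and timing for each call.

```python
import functools
import time

# Hypothetical in-memory stand-in for a trace store (not Opik's storage).
TRACES = []


def traced(fn):
    """Record name, input, output or error, and duration for each call."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        span = {"name": fn.__name__, "input": {"args": args, "kwargs": kwargs}}
        try:
            result = fn(*args, **kwargs)
            span["output"] = result
            return result
        except Exception as exc:
            span["error"] = repr(exc)
            raise
        finally:
            # Runs on success and on failure, so every call leaves a span.
            span["duration_s"] = time.perf_counter() - start
            TRACES.append(span)
    return wrapper


@traced
def fake_llm_call(prompt):
    # Hypothetical placeholder for a real model call.
    return f"echo: {prompt}"


@traced
def answer(question):
    # A two-step "pipeline": the nested call is recorded as its own span.
    return fake_llm_call(f"Answer briefly: {question}")
```

Calling `answer("hi")` appends two spans to `TRACES`, the inner `fake_llm_call` first (it finishes first) and then `answer`. A real tracing platform would additionally link parent and child spans and persist them, which this sketch leaves out.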

Built for developers, Opik offers a suite of features that support software‑testing‑style workflows for AI agents. Users can define regression tests and assertions, run end‑to‑end agent playground sessions, and automatically generate fixes that are written back to the codebase with accompanying test cases. These capabilities aim to reduce guesswork and streamline the iteration cycle for LLM‑driven products.
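The regression-test workflow described above can be sketched without any particular library. Everything in this example is an assumption made for illustration: `fake_agent` returns canned answers, and `keyword_score` is a simple keyword check standing in for an LLM-as-a-judge scorer. It is not how Opik implements its test suites.

```python
# Hypothetical canned responses standing in for a real agent.
CANNED = {
    "What license is Opik released under?":
        "Opik is released under the Apache-2.0 license.",
    "Can Opik be self-hosted?":
        "Yes, Opik can be self-hosted.",
}


def fake_agent(question):
    return CANNED.get(question, "I don't know.")


def keyword_score(output, keywords):
    """Fraction of expected keywords found in the output (case-insensitive)."""
    text = output.lower()
    hits = sum(1 for keyword in keywords if keyword.lower() in text)
    return hits / len(keywords)


# Each case pairs an input with the assertion data used to score the output.
CASES = [
    {"input": "What license is Opik released under?",
     "keywords": ["Apache-2.0"]},
    {"input": "Can Opik be self-hosted?",
     "keywords": ["yes", "self-hosted"]},
]


def run_regression(agent, cases, threshold=1.0):
    """Return the cases whose score falls below the threshold."""
    failures = []
    for case in cases:
        output = agent(case["input"])
        score = keyword_score(output, case["keywords"])
        if score < threshold:
            failures.append(
                {"input": case["input"], "output": output, "score": score}
            )
    return failures
```

An empty list from `run_regression(fake_agent, CASES)` means every case passed. Swapping in a changed agent and rerunning the same cases is what turns this into a regression check.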

The project is self‑hostable, released under the Apache‑2.0 license, and provides a free tier without subscription requirements. It is positioned as a stable, production‑ready solution for teams that need deep tracing, online evaluation rules, and prompt optimization while maintaining open‑source transparency.

Similar apps

Databases & Data Tools

Comet

Platform for tracking, visualizing, and managing machine learning experiments.

AI Coding Agents

Agenta

LLMOps platform for prompt management, LLM evaluation, and observability. Build, evaluate, and monitor production-grade LLM applications…

AI Coding Agents

Langfuse

LLM engineering platform for model tracing, prompt management, and application evaluation. Langfuse helps teams collaboratively debug…

AI Coding Agents

Arize Phoenix

Open-source platform for LLM tracing, evaluation, and optimization. Features automatic instrumentation, prompt playground, and real-time AI…

AI Coding Agents

Dify.ai

Build, test and deploy LLM applications.

AI Coding Agents

LangSmith

Observability platform for LLM applications, tracking prompts, latency, and costs.