Opik

Opik is an open-source LLM evaluation platform developed by Comet. It helps developers debug, evaluate, and monitor LLM applications, RAG systems, and agentic workflows. The platform provides tracing, evaluation metrics, and dashboards suitable for production use, so teams can track and understand LLM performance throughout the development lifecycle.
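
To make the idea of tracing concrete, here is a minimal sketch in plain Python of a decorator that records each call's inputs, output, and latency. It is an illustration of the concept only, not Opik's API: the names `trace`, `TRACES`, and `answer` are hypothetical, and the in-memory list stands in for whatever backend a real tracing platform would use.

```python
import functools
import time

# Hypothetical in-memory trace store (assumption for illustration);
# a real tracing platform would ship these records to a backend instead.
TRACES = []


def trace(func):
    """Record the inputs, output, and latency of each successful call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        output = func(*args, **kwargs)
        TRACES.append({
            "name": func.__name__,
            "input": {"args": args, "kwargs": kwargs},
            "output": output,
            "latency_s": time.perf_counter() - start,
        })
        return output
    return wrapper


@trace
def answer(question: str) -> str:
    # Stand-in for an LLM call (hypothetical; no model is invoked here).
    return f"Echo: {question}"


answer("What is tracing?")
```

After the call, `TRACES` holds one record describing the `answer` invocation, which is the kind of data a tracing dashboard would display and aggregate.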

The platform also offers automated prompt and agent optimization through four optimizers: Few-shot Bayesian, MIPRO, evolutionary, and the LLM-powered MetaPrompt. Built-in guardrails for trust and safety let teams screen user inputs and LLM outputs to detect and block unwanted content, PII, and off-topic discussions. Integrations with popular frameworks such as LangChain, LlamaIndex, and OpenAI let teams establish reliable performance baselines and monitor production data while keeping full observability across their LLM systems.
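
As a rough illustration of what screening text with a guardrail involves, the sketch below checks a string for simple PII patterns and banned topics. It is a toy example under stated assumptions, not Opik's guardrail implementation: the function `screen`, the regular expressions, and the `banned_topics` parameter are all hypothetical, and production guardrails typically rely on far more robust detectors than keyword and regex matching.

```python
import re

# Deliberately simple patterns (assumptions for illustration only);
# they will miss many real-world PII formats.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def screen(text, banned_topics=("politics",)):
    """Return the reasons the text should be blocked; an empty list means it passes."""
    reasons = []
    if EMAIL.search(text) or PHONE.search(text):
        reasons.append("pii")
    lowered = text.lower()
    for topic in banned_topics:
        if topic in lowered:
            reasons.append(f"off_topic:{topic}")
    return reasons
```

The same check can be applied to a user's input before it reaches the model and to the model's output before it reaches the user, with the returned reasons logged for later review.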

About