Eval & Testing
LLM and AI-agent evaluation, prompt testing, and benchmarking.
17 vendors in this category
Agenta
Eval & Testing
Open-source LLM evaluation, prompt management, and observability platform
Germany
Arize AI
Eval & Testing
Agent observability, evaluation, and improvement platform with open-source Phoenix
USA
Arthur AI
Eval & Testing
ML monitoring, fairness, and LLM evaluation platform for enterprises
USA
Braintrust
Platforms & Products
AI observability platform for building quality AI products with evaluations, monitoring, and optimization tools
Confident AI (DeepEval)
Eval & Testing
Open-source LLM evaluation framework with hosted observability platform
USA
Deepchecks
Eval & Testing
Continuous validation and testing platform for ML models and LLM apps
Israel
Galileo
Analytics & Conversation Intelligence
AI observability and evaluation platform that turns offline evals into production guardrails for AI systems
Giskard
Eval & Testing
Open-source LLM testing and evaluation framework for quality and safety
France
LangWatch
Conversational & Voice QA
AI agent testing, LLM evaluation, and observability platform
Netherlands
Maxim AI
Eval & Testing
End-to-end evaluation and observability platform for AI agents
India/USA
Openlayer
Eval & Testing
Pre-deployment evaluation and stress-testing platform for AI agents
USA
Opik
Open Source Projects
Open-source LLM evaluation platform for debugging, evaluating, and monitoring LLM applications and RAG systems
Patronus AI
Eval & Testing
Automated LLM evaluation and security platform for regulated industries
USA
Promptfoo
Eval & Testing
Open-source AI security and testing platform for LLM vulnerabilities
USA
Ragas
Eval & Testing
Open-source framework for evaluating RAG pipelines and LLM applications
USA
TruLens / TruEra
Eval & Testing
Open-source LLM evaluation and tracing framework (TruEra acquired by Snowflake)
USA
Vellum
Eval & Testing
LLM development platform with evaluation and testing capabilities
USA