Quorum

Quorum is a RAG evaluation platform that routes each test case to the right evaluation strategy based on risk. Instead of paying for a full judge panel on every case, Quorum uses adaptive orchestration to balance quality, cost, and latency.

Why teams use Quorum

  • Catch silent RAG failures before they reach production
  • Run multi-judge evaluation only where it matters
  • Stream every evaluation milestone in real time over SSE
  • Inspect cost, verdict, and judge-level reasoning for each run
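
To give a feel for consuming the SSE stream, here is a minimal sketch of a Server-Sent Events parser using only the Python standard library. It follows the generic SSE wire format (`event:` and `data:` fields, blank line ends an event). The event name `judge_verdict` and the JSON payload shape are hypothetical placeholders, not Quorum's documented schema, and this is not the official SDK.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event from an iterable of text lines."""
    event, data = "message", []  # "message" is the SSE default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event; skip if no data was sent.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is stripped
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream: the event name and payload fields are assumptions.
sample = [
    "event: judge_verdict\n",
    'data: {"judge": "a", "verdict": "pass"}\n',
    "\n",
    ": keep-alive\n",
    "data: done\n",
    "\n",
]
events = list(parse_sse(sample))
first_payload = json.loads(events[0]["data"])
```

In practice the lines would come from a streaming HTTP response rather than a list; the parsing logic stays the same.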

Evaluation strategies

Strategy   When to use it      What runs
council    High-risk cases     OpenAI + Anthropic + Gemini judges, then an aggregator
hybrid     Medium-risk cases   Deterministic checks plus one LLM judge
single     Low-risk cases      One lightweight judge
auto       Default mode        Risk-based routing across all three
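
The routing behind auto mode can be pictured with a small sketch. It assumes each test case already carries a numeric risk score between 0 and 1; the score, the threshold values, and the `route` function are all hypothetical and chosen only for illustration, since how Quorum actually scores risk is not described here.

```python
from enum import Enum


class Strategy(str, Enum):
    COUNCIL = "council"  # multi-judge panel plus aggregator
    HYBRID = "hybrid"    # deterministic checks plus one LLM judge
    SINGLE = "single"    # one lightweight judge


# Hypothetical cut-offs; real thresholds would be tuned per workload.
HIGH_RISK = 0.7
MEDIUM_RISK = 0.3


def route(risk_score: float) -> Strategy:
    """Map an assumed risk score in [0, 1] to an evaluation strategy."""
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score >= HIGH_RISK:
        return Strategy.COUNCIL
    if risk_score >= MEDIUM_RISK:
        return Strategy.HYBRID
    return Strategy.SINGLE
```

With these assumed thresholds, `route(0.9)` selects council, `route(0.5)` selects hybrid, and `route(0.1)` selects single, so the expensive judge panel runs only on the cases scored as high risk.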

What the platform includes

  • A React frontend for uploads, live streaming, and history
  • An Express backend with orchestration, SSE, and persistence
  • JavaScript and Python SDKs
  • Public benchmark results, published on the benchmarks page

Next steps