This project is no longer maintained. Some features may be non-functional — see Funnel instead.

Consensus

Multi-LLM evaluation platform that queries 6+ models simultaneously, synthesizes a consensus response, and supports 11+ APIs through a unified routing layer.

React, Express.js, Node.js, Gemini API, Vite

Overview

Consensus was a multi-LLM evaluation platform designed to compare concurrent model outputs in real time. The system ran the same query against several Gemini models at once, then automatically synthesized their responses to resolve discrepancies and extract high-confidence insights.
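The fan-out step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: `ModelCaller`, `ModelResult`, `timedCall`, and `fanOut` are hypothetical names, and the real Gemini client is replaced by an injected function so the sketch makes no claims about any SDK's API.

```typescript
// Hypothetical signature: given a model name and a prompt, resolve to the reply text.
// In a real app this would wrap whatever client library is in use.
type ModelCaller = (model: string, prompt: string) => Promise<string>;

interface ModelResult {
  model: string;
  ok: boolean;
  text?: string;     // set when ok is true
  error?: string;    // set when ok is false
  elapsedMs: number; // wall-clock time for this model only
}

// Wrap one call so that it never rejects: failures become data.
async function timedCall(
  model: string,
  prompt: string,
  call: ModelCaller
): Promise<ModelResult> {
  const started = Date.now();
  try {
    const text = await call(model, prompt);
    return { model, ok: true, text, elapsedMs: Date.now() - started };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { model, ok: false, error, elapsedMs: Date.now() - started };
  }
}

// Start every request at once; results come back in the same order as `models`.
function fanOut(
  models: string[],
  prompt: string,
  call: ModelCaller
): Promise<ModelResult[]> {
  return Promise.all(models.map((model) => timedCall(model, prompt, call)));
}
```

Because each call is wrapped so that it cannot reject, one failing model does not discard the answers from the others, which matters when the goal is to compare whatever responses did arrive.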

Highlights

Streaming Performance:

Achieved 236 tokens/s and 2.5 s p95 latency using Gemini 2.0 Flash for streaming inference.
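For readers unfamiliar with these two metrics, here is a minimal sketch of how throughput and a p95 latency can be computed from recorded samples. The source does not say how its figures were measured, so everything here is an assumption: the `StreamSample` shape, the choice to aggregate throughput over all samples, and the nearest-rank percentile definition.

```typescript
// Hypothetical record of one streamed response.
interface StreamSample {
  tokens: number;     // tokens received in the stream
  durationMs: number; // time spent streaming, in milliseconds
}

// Aggregate throughput: total tokens divided by total streaming time.
function tokensPerSecond(samples: StreamSample[]): number {
  let tokens = 0;
  let ms = 0;
  for (const sample of samples) {
    tokens += sample.tokens;
    ms += sample.durationMs;
  }
  return ms === 0 ? 0 : tokens / (ms / 1000);
}

// Nearest-rank percentile: the smallest value such that at least
// p percent of the samples are less than or equal to it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("percentile needs at least one value");
  const sorted = values.slice().sort((a, b) => a - b); // copy, do not mutate input
  const rank = Math.ceil((p * sorted.length) / 100);
  const clamped = Math.min(sorted.length, Math.max(1, rank));
  return sorted[clamped - 1];
}
```

With 20 latency samples, `percentile(latencies, 95)` returns the 19th smallest value, so a single slow outlier does not set the reported figure.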

Multi-LLM Orchestration:

Built a web app that runs 6+ Gemini LLMs simultaneously and shows their results side by side, with LaTeX rendering and a collapsible UI.
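One way to keep several side-by-side panels consistent while their streams progress independently is a pure reducer keyed by model name. The sketch below is an assumption about how such state could be modeled, not the project's implementation; `PanelState`, `PanelEvent`, `withPanel`, and `reducePanels` are hypothetical names.

```typescript
// Each panel is in exactly one of these states.
type PanelState =
  | { status: "idle" }
  | { status: "streaming"; text: string }
  | { status: "done"; text: string }
  | { status: "error"; message: string };

// Events emitted as each model's stream progresses.
type PanelEvent =
  | { type: "start"; model: string }
  | { type: "chunk"; model: string; delta: string }
  | { type: "finish"; model: string }
  | { type: "fail"; model: string; message: string };

type Panels = Record<string, PanelState>;

// Return a copy of `panels` with one model's state replaced.
function withPanel(panels: Panels, model: string, next: PanelState): Panels {
  return { ...panels, [model]: next };
}

// Pure transition function: never mutates its input.
function reducePanels(panels: Panels, event: PanelEvent): Panels {
  const current: PanelState = panels[event.model] ?? { status: "idle" };
  switch (event.type) {
    case "start":
      return withPanel(panels, event.model, { status: "streaming", text: "" });
    case "chunk":
      // Ignore chunks that arrive for a panel that is not streaming.
      if (current.status !== "streaming") return panels;
      return withPanel(panels, event.model, {
        status: "streaming",
        text: current.text + event.delta,
      });
    case "finish":
      if (current.status !== "streaming") return panels;
      return withPanel(panels, event.model, { status: "done", text: current.text });
    case "fail":
      return withPanel(panels, event.model, { status: "error", message: event.message });
  }
}
```

A function of this shape can be passed to React's `useReducer`, so that events from different models update only their own panel and stray chunks after an error are dropped rather than appended.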

Aggregated Inference:

Developed a React and Express.js stack to aggregate and merge outputs from multiple lightweight models.
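The source does not describe how outputs were merged, so the following is only an illustrative stand-in: a simple majority vote over normalized answers, reporting how many models agreed and which ones dissented. `ModelOutput`, `Consensus`, `normalize`, and `majorityConsensus` are hypothetical names, and a real synthesis step (for example, asking another model to reconcile the answers) would be considerably more involved.

```typescript
interface ModelOutput {
  model: string;
  text: string;
}

interface Consensus {
  answer: string;       // the most common answer, as first seen
  agreement: number;    // fraction of models that gave that answer (0..1)
  dissenting: string[]; // models whose answer differed
}

// Treat answers as equal if they differ only by case or whitespace.
function normalize(text: string): string {
  return text.trim().toLowerCase().replace(/\s+/g, " ");
}

function majorityConsensus(outputs: ModelOutput[]): Consensus | null {
  if (outputs.length === 0) return null;

  // Group models by their normalized answer.
  const groups = new Map<string, { answer: string; models: string[] }>();
  for (const output of outputs) {
    const key = normalize(output.text);
    const group = groups.get(key);
    if (group) {
      group.models.push(output.model);
    } else {
      groups.set(key, { answer: output.text.trim(), models: [output.model] });
    }
  }

  // Pick the largest group; on a tie, the answer seen first wins.
  let winner: { answer: string; models: string[] } | undefined;
  groups.forEach((group) => {
    if (winner === undefined || group.models.length > winner.models.length) {
      winner = group;
    }
  });
  const chosen = winner as { answer: string; models: string[] };

  const dissenting = outputs
    .map((output) => output.model)
    .filter((model) => chosen.models.indexOf(model) === -1);

  return {
    answer: chosen.answer,
    agreement: chosen.models.length / outputs.length,
    dissenting,
  };
}
```

Exact-match voting only works for short, closed-form answers. For free-form text, the grouping key would need to be replaced by some semantic comparison, which this sketch does not attempt.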

Takeaways

The technical focus was on coordinating asynchronous API calls and managing state transitions for models running in parallel. These experiments in semantic aggregation directly informed the architectural shift toward extension-based orchestration in Funnel, which moves beyond a standalone dashboard toward a more integrated, browser-native LLM workflow.