Fallax
Status: active
Overview
Fallax (repo: reasonbench) is an LLM adversarial reasoning evaluation system. It generates adversarial reasoning prompts, evaluates LLM responses across multiple providers (Anthropic, OpenAI, Gemini, Ollama), and produces structured analysis with clustering, scoring, and root-cause diagnostics. It is designed to surface failure modes that single-turn benchmarks miss.
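The generate → evaluate → analyze flow described above can be sketched as three chained stages. This is a minimal illustration, not Fallax's actual API: the function names, the toy syllogistic template, and the counting-only analysis step are all assumptions.

```python
# Hypothetical sketch of the pipeline; names and template are assumed,
# not taken from the Fallax codebase.
def generate_prompts(n: int) -> list[str]:
    # Toy syllogistic template standing in for the real template library.
    template = "If all {a} are {b}, and some {b} are {c}, must some {a} be {c}?"
    return [template.format(a="cats", b="mammals", c="pets") for _ in range(n)]

def evaluate(prompts: list[str], model) -> list[dict]:
    # `model` is any callable mapping prompt text to response text,
    # standing in for a real provider client.
    return [{"prompt": p, "response": model(p)} for p in prompts]

def analyze(results: list[dict]) -> dict:
    # The real analysis scores, clusters, and diagnoses; here we only count.
    return {"evaluated": len(results)}

report = analyze(evaluate(generate_prompts(3), lambda p: "No, not necessarily."))
print(report["evaluated"])  # 3
```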
Key Features / Goals
- Multi-step evaluation tasks requiring chained reasoning
- Structured scoring across 6 dimensions (logical validity, premise accuracy, etc.)
- 25+ adversarial prompt templates covering syllogistic, temporal, modal, and other reasoning patterns
- Template evolution via LLM-driven rewriting for diversity
- Failure clustering via scikit-learn for pattern detection
- Root cause extraction and self-repair testing
- Multi-round experiment orchestration with per-round JSONL output
- Versioned benchmark datasets for reproducible cross-model comparison
- FastAPI dashboard for experiment visualization
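The six-dimension structured score above can be modeled as a typed record with a simple aggregate. A minimal sketch using dataclasses (Fallax itself uses Pydantic); only logical_validity and premise_accuracy are named in the source, and the other four dimension names are assumptions:

```python
from dataclasses import dataclass, asdict

@dataclass
class ReasoningScore:
    logical_validity: float   # from the source
    premise_accuracy: float   # from the source
    step_completeness: float  # assumed dimension name
    consistency: float        # assumed dimension name
    relevance: float          # assumed dimension name
    calibration: float        # assumed dimension name

    def aggregate(self) -> float:
        # Unweighted mean across all six dimensions.
        vals = asdict(self)
        return sum(vals.values()) / len(vals)

score = ReasoningScore(0.9, 0.8, 1.0, 0.7, 0.9, 0.6)
print(round(score.aggregate(), 3))  # 0.817
```

A typed model like this also gives each per-round JSONL record a stable, validatable schema.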
Technical Approach
Python 3.12+, managed with uv. Built on Pydantic data models, a multi-provider LLM client abstraction, and a modular pipeline architecture. CLI via python -m reasonbench; dashboard via FastAPI + Chart.js; tests via pytest (258+ tests). Website: https://fallax.online.
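A multi-provider client abstraction plus per-round JSONL output can be sketched with a Protocol interface. This is an assumed shape, not Fallax's real client API; the EchoClient stand-in exists only so the sketch runs offline.

```python
import json
import tempfile
from pathlib import Path
from typing import Protocol

class LLMClient(Protocol):
    """Hypothetical provider interface; the real abstraction may differ."""
    def complete(self, prompt: str) -> str: ...

class EchoClient:
    """Offline stand-in for a real provider (Anthropic, OpenAI, etc.)."""
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"

def run_round(client: LLMClient, prompts: list[str], out_path: Path) -> int:
    """Evaluate one round, appending one JSON record per prompt."""
    with out_path.open("a", encoding="utf-8") as f:
        for p in prompts:
            record = {"prompt": p, "response": client.complete(p)}
            f.write(json.dumps(record) + "\n")
    return len(prompts)

out = Path(tempfile.mkdtemp()) / "round_1.jsonl"
n = run_round(EchoClient(), ["All A are B; X is A. Is X B?"], out)
print(n)  # 1
```

Appending one JSON object per line keeps each round's file streamable and easy to diff across experiment runs.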