Fallax

project · active

Overview

Fallax (formerly reasonbench) is an LLM adversarial reasoning evaluation system. It generates adversarial reasoning prompts, evaluates LLM responses across multiple providers (Anthropic, OpenAI, Gemini, Ollama), and produces structured analysis with clustering, scoring, and root-cause diagnostics. It is designed to surface failure modes that single-turn benchmarks miss.

Hosted at alawein/fallax and deployed at fallax.online.

Key Features

  • Multi-step evaluation tasks requiring chained reasoning
  • Structured scoring across 6 dimensions (logical validity, premise accuracy, etc.)
  • 25+ adversarial prompt templates covering syllogistic, temporal, modal, and other reasoning patterns
  • Template evolution via LLM-driven rewriting for diversity
  • Failure clustering via scikit-learn for pattern detection
  • Root cause extraction and self-repair testing
  • Multi-round experiment orchestration with per-round JSONL output
  • Versioned benchmark datasets for reproducible cross-model comparison
  • FastAPI dashboard for experiment visualization
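
As an illustration of the per-round JSONL output mentioned in the list above, here is a minimal sketch using only the standard library. The record fields, the round_NNN.jsonl naming scheme, and the function names (write_round, read_round, run_experiment) are assumptions made for this example, not Fallax's actual schema or API; the model call is stubbed out.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class EvalRecord:
    """One evaluated response (hypothetical fields, for illustration only)."""
    round_index: int
    template_id: str
    model: str
    passed: bool


def write_round(out_dir: Path, round_index: int, records: list[EvalRecord]) -> Path:
    """Write one round's records as JSONL: one JSON object per line."""
    path = out_dir / f"round_{round_index:03d}.jsonl"  # assumed naming scheme
    with path.open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(asdict(record)) + "\n")
    return path


def read_round(path: Path) -> list[dict]:
    """Read a round file back into a list of dicts, skipping blank lines."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def run_experiment(out_dir: Path, rounds: int) -> list[Path]:
    """Run a stubbed multi-round experiment, writing one file per round."""
    paths = []
    for i in range(rounds):
        # Stub: a real round would generate prompts and query the models.
        records = [
            EvalRecord(i, "syllogistic-01", "stub-model", passed=(i % 2 == 0)),
            EvalRecord(i, "temporal-01", "stub-model", passed=True),
        ]
        paths.append(write_round(out_dir, i, records))
    return paths
```

Writing one file per round means a partially completed experiment still leaves readable output, and each line can be parsed independently of the rest of the file.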

Technical Approach

Python 3.12+, managed with uv. Pydantic data models, a multi-provider LLM client abstraction, and a modular pipeline architecture. CLI via python -m reasonbench. Dashboard via FastAPI + Chart.js. Tests via pytest (258+ tests).
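
A multi-provider client abstraction of this kind could look roughly like the sketch below. It is an assumption-laden illustration, not the reasonbench code: the names LLMClient, EchoClient, and evaluate_all are hypothetical, the clients make no network calls, and plain dataclasses stand in for the project's Pydantic models to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import Protocol


class LLMClient(Protocol):
    """Minimal interface each provider adapter would satisfy (hypothetical)."""

    name: str

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoClient:
    """Offline stand-in for a real provider adapter; makes no network calls."""

    name: str

    def complete(self, prompt: str) -> str:
        # A real adapter would call the provider's API here.
        return f"[{self.name}] {prompt}"


def evaluate_all(clients: list[LLMClient], prompt: str) -> dict[str, str]:
    """Send the same prompt to every provider and collect responses by name."""
    return {client.name: client.complete(prompt) for client in clients}


clients = [EchoClient("anthropic"), EchoClient("ollama")]
responses = evaluate_all(clients, "All A are B. Is every B an A?")
```

Because the pipeline depends only on the shared interface, adding a provider in this design means writing one adapter class rather than touching the evaluation code.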

Note: The Python module is still named reasonbench internally (intentional — see workspace CLAUDE.md). Imports and CLI invocation remain reasonbench; the repo, deployed brand, and slug are fallax.