Fallax


Overview

Fallax (repo: reasonbench) is an LLM adversarial reasoning evaluation system. It generates adversarial reasoning prompts, evaluates LLM responses across multiple providers (Anthropic, OpenAI, Gemini, Ollama), and produces structured analysis with clustering, scoring, and root-cause diagnostics. Designed to surface failure modes that single-turn benchmarks miss.

Key Features / Goals

  • Multi-step evaluation tasks requiring chained reasoning
  • Structured scoring across 6 dimensions (logical validity, premise accuracy, etc.; see the model sketch after this list)
  • 25+ adversarial prompt templates covering syllogistic, temporal, modal, and other reasoning patterns
  • Template evolution via LLM-driven rewriting for diversity
  • Failure clustering via scikit-learn for pattern detection (sketched below)
  • Root cause extraction and self-repair testing
  • Multi-round experiment orchestration with per-round JSONL output (see the JSONL sketch below)
  • Versioned benchmark datasets for reproducible cross-model comparison
  • FastAPI dashboard for experiment visualization
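
The six scoring dimensions map naturally onto a typed Pydantic model. A minimal sketch: only logical_validity and premise_accuracy appear in the list above; the other four field names are illustrative placeholders, not the project's actual rubric.

```python
from pydantic import BaseModel, Field

class ReasoningScore(BaseModel):
    """One response scored on six dimensions, each in [0, 1].

    Only logical_validity and premise_accuracy come from the feature
    list; the remaining four names are illustrative placeholders.
    """
    logical_validity: float = Field(ge=0.0, le=1.0)
    premise_accuracy: float = Field(ge=0.0, le=1.0)
    step_coherence: float = Field(ge=0.0, le=1.0)       # placeholder
    conclusion_support: float = Field(ge=0.0, le=1.0)   # placeholder
    fallacy_resistance: float = Field(ge=0.0, le=1.0)   # placeholder
    self_consistency: float = Field(ge=0.0, le=1.0)     # placeholder

    @property
    def overall(self) -> float:
        # Unweighted mean; the real aggregation may be weighted differently.
        values = self.model_dump().values()
        return sum(values) / len(values)

score = ReasoningScore(
    logical_validity=0.9, premise_accuracy=0.7, step_coherence=0.8,
    conclusion_support=0.85, fallacy_resistance=0.6, self_consistency=0.75,
)
print(round(score.overall, 3))
```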
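
Failure clustering with scikit-learn could be as simple as vectorizing free-text failure diagnoses and grouping them. A sketch of one plausible approach (TF-IDF + KMeans); the project's actual features and algorithm may differ.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Free-text diagnoses of failed responses (toy examples).
failure_notes = [
    "accepted a false premise about dates",
    "affirmed the consequent in a syllogism",
    "confused necessary and possible in a modal claim",
    "mixed up event ordering across timezones",
    "treated the converse as equivalent to the original claim",
    "asserted an impossible event must have occurred",
]

# Embed notes as TF-IDF vectors, then group them into k clusters.
vectors = TfidfVectorizer().fit_transform(failure_notes)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(vectors)

for note, label in zip(failure_notes, kmeans.labels_):
    print(label, note)
```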
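
Per-round JSONL output means one JSON object per line, one file per round. A minimal sketch; the record fields (prompt_id, provider, score) and the round_NNN.jsonl naming are hypothetical, not the project's actual schema.

```python
import json
from pathlib import Path

def write_round(out_dir: Path, round_num: int, records: list[dict]) -> Path:
    """Write one experiment round as JSONL; field names are illustrative."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"round_{round_num:03d}.jsonl"
    with path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
    return path

write_round(Path("runs/demo"), 1, [
    {"prompt_id": "syllogism-007", "provider": "anthropic", "score": 0.82},
])
```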

Technical Approach

Python 3.12+, managed with uv. Pydantic data models, a multi-provider LLM client abstraction (sketched below), and a modular pipeline architecture. CLI via python -m reasonbench; dashboard via FastAPI + Chart.js. Tested with pytest (258+ tests). Website: https://fallax.online.
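
The multi-provider client abstraction presumably reduces each provider to a common interface so the pipeline stays provider-agnostic. A minimal sketch using a typing.Protocol; the names (LLMClient, complete, OllamaClient) are assumptions, not the actual reasonbench API.

```python
from typing import Protocol

class LLMClient(Protocol):
    """Common interface each provider client implements (names are illustrative)."""
    def complete(self, prompt: str) -> str: ...

class OllamaClient:
    """One concrete provider; Anthropic/OpenAI/Gemini clients would mirror it."""
    def __init__(self, model: str, host: str = "http://localhost:11434"):
        self.model, self.host = model, host

    def complete(self, prompt: str) -> str:
        # A real implementation would call Ollama's HTTP API here.
        raise NotImplementedError

def evaluate(client: LLMClient, prompts: list[str]) -> list[str]:
    # The pipeline depends only on the protocol, never on a concrete provider.
    return [client.complete(p) for p in prompts]
```

Structural typing keeps provider clients decoupled: any object with a matching complete method satisfies LLMClient without inheriting from it.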