Project Proctor — Standard Operating Procedure

Project Proctor is a top-tier HAI (Human-AI Interaction) fellowship where Meshal designs graduate-level STEM problems to stress-test frontier AI reasoning capabilities. The role involves crafting problems that induce deep reasoning failures in state-of-the-art models.

Objective

Induce deep reasoning failures where the model provides an incorrect final answer. Success means the frontier model fails in a structurally interesting way — not through trick questions, but through genuine conceptual difficulty.

Content Standards

  • 100% original problems — not searchable on the internet, no direct copies from textbooks
  • PhD-level difficulty — problems require graduate-level domain knowledge in physics, mathematics, or engineering
  • Self-contained — all necessary information is provided within the problem statement; no external references required

Target Failure Types

| Failure Type | Description |
|-------------|-------------|
| Conceptual misunderstanding | Model misidentifies the physical regime or mathematical framework |
| Incorrect theorem application | Model applies a theorem outside its valid domain or with wrong preconditions |
| Flawed deduction | Model makes a logical error in a multi-step derivation chain |

Rubric Design

  • Atomic criteria — each rubric item tests exactly one skill or concept
  • Action-verb led — rubric items start with measurable verbs (identifies, derives, computes, justifies)
  • Weighted scoring — criterion weights sum to 7 points in total, allocated by difficulty and importance
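
The mechanical parts of these rules can be checked automatically. The following is a minimal sketch, not part of the SOP itself. The names `Criterion`, `ACTION_VERBS`, and `validate_rubric` are hypothetical, integer weights are an assumption, and the verb list contains only the four examples given above. Atomicity (one skill per item) still needs human review and is not checked here.

```python
from dataclasses import dataclass

# Assumption: only the four example verbs named in the SOP are accepted.
ACTION_VERBS = {"identifies", "derives", "computes", "justifies"}
TOTAL_POINTS = 7  # from the SOP: weights sum to 7 points


@dataclass(frozen=True)
class Criterion:
    """One rubric item (hypothetical structure for illustration)."""
    text: str
    weight: int  # assumption: weights are positive integers


def validate_rubric(criteria):
    """Return a list of problems; an empty list means the rubric passes."""
    problems = []
    for c in criteria:
        words = c.text.split()
        first_word = words[0].lower() if words else ""
        if first_word not in ACTION_VERBS:
            problems.append(f"not action-verb led: {c.text!r}")
        if c.weight <= 0:
            problems.append(f"non-positive weight: {c.text!r}")
    total = sum(c.weight for c in criteria)
    if total != TOTAL_POINTS:
        problems.append(f"weights sum to {total}, expected {TOTAL_POINTS}")
    return problems


# Illustrative rubric; the criteria and weights are invented for this sketch.
example = [
    Criterion("Identifies the correct physical regime", 2),
    Criterion("Derives the governing equation", 3),
    Criterion("Computes the final numerical value", 1),
    Criterion("Justifies the approximation used", 1),
]
```

Calling `validate_rubric(example)` returns an empty list for this rubric. A rubric whose weights do not total 7, or whose items do not open with one of the listed verbs, yields one message per violation.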

Provenance: Extracted from Downloads/Professional/CAREER_SSOT.md (last verified 2026-03-26). This SOP was not covered by db/profile/meshal-alawein.md or any existing asset record.