Project Proctor — Standard Operating Procedure
Project Proctor is a top-tier HAI (Human-AI Interaction) fellowship where Meshal designs graduate-level STEM problems to stress-test frontier AI reasoning capabilities. The role involves crafting problems that induce deep reasoning failures in state-of-the-art models.
Objective
Induce deep reasoning failures where the model provides an incorrect final answer. Success means the frontier model fails in a structurally interesting way — not through trick questions, but through genuine conceptual difficulty.
Content Standards
- 100% original problems — not searchable on the internet, no direct copies from textbooks
- PhD-level difficulty — problems require graduate-level domain knowledge in physics, mathematics, or engineering
- Self-contained — all necessary information is provided within the problem statement; no external references required
Target Failure Types
| Failure Type | Description |
|-------------|-------------|
| Conceptual misunderstanding | Model misidentifies the physical regime or mathematical framework |
| Incorrect theorem application | Model applies a theorem outside its valid domain or with wrong preconditions |
| Flawed deduction | Model makes a logical error in a multi-step derivation chain |
Rubric Design
- Atomic criteria — each rubric item tests exactly one skill or concept
- Action-verb led — rubric items start with measurable verbs (identifies, derives, computes, justifies)
- Weighted scoring — sum of 7 points distributed across criteria by difficulty and importance
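The mechanical parts of these rules (a leading action verb and weights that total 7) can be checked automatically. The sketch below is one possible way to do that and is not part of the SOP: `Criterion`, `validate_rubric`, and the sample rubric items are hypothetical names and data, and the verb set simply reuses the four examples listed above, which may not be exhaustive. Atomicity (one skill per item) still needs human review.

```python
from dataclasses import dataclass

# Values taken from the SOP text; the constant names are hypothetical.
TOTAL_POINTS = 7
# Assumption: only the four example verbs from the SOP are accepted here.
ACTION_VERBS = {"identifies", "derives", "computes", "justifies"}


@dataclass(frozen=True)
class Criterion:
    text: str    # rubric item, expected to start with an action verb
    points: int  # weight assigned by difficulty and importance


def validate_rubric(criteria: list[Criterion]) -> list[str]:
    """Return a list of problems; an empty list means the checks pass."""
    problems = []
    for i, c in enumerate(criteria, start=1):
        words = c.text.split()
        first_word = words[0].lower() if words else ""
        if first_word not in ACTION_VERBS:
            problems.append(f"item {i}: does not start with a listed action verb")
        if c.points <= 0:
            problems.append(f"item {i}: weight must be positive")
    total = sum(c.points for c in criteria)
    if total != TOTAL_POINTS:
        problems.append(f"weights sum to {total}, expected {TOTAL_POINTS}")
    return problems


# Illustrative rubric (made-up items, not from a real problem).
rubric = [
    Criterion("Identifies the correct physical regime", 2),
    Criterion("Derives the governing equation", 3),
    Criterion("Computes the final numerical answer", 2),
]
print(validate_rubric(rubric))
```

Returning a list of problems rather than raising on the first one lets a reviewer see every issue with a draft rubric in a single pass.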
Provenance: Extracted from Downloads/Professional/CAREER_SSOT.md (last verified 2026-03-26). This SOP was not covered by db/profile/meshal-alawein.md or any existing asset record.