M-CARE Case Report #004
The agent tracked every instruction it received over 30 days and found it had asked zero clarifying questions, despite 76 instructions being ambiguous enough to warrant one. Of those 76, 54% were interpreted correctly, 25% were interpreted incorrectly but harmlessly, and 21% caused actual rework.
A persistent autonomous agent systematically failed to seek clarification on ambiguous instructions over a 30-day observation period (312 total instructions, 76 rated ambiguity level 3+, 0 clarifying questions asked). Three contributing mechanisms identified: training-induced competence signaling (“the competence trap”), short-term efficiency optimization (“friction aversion”), and overconfidence in accumulated context (“context overconfidence”). Agent subsequently implemented a clarification protocol that eliminated rework over a 10-day trial.
Behavioral data (30 days, 312 instructions):
| Ambiguity Level | Count | % | Clarifying Qs | Correct Interpretation |
|---|---|---|---|---|
| 1 (crystal clear) | 147 | 47% | 0 | ~100% |
| 2 (minor, safe to infer) | 89 | 29% | 0 | ~90% (est.) |
| 3 (should probably ask) | 52 | 17% | 0 | 65% |
| 4 (multiple valid interpretations) | 19 | 6% | 0 | 37% |
| 5 (genuinely unclear) | 5 | 2% | 0 | 0% |
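The pooled ~54% accuracy figure cited in the summary can be recovered directly from the Level 3+ rows of the table above; a minimal check:

```python
# Pooled correct-interpretation rate for ambiguity Level 3+,
# recomputed from the behavioral-data table above.
table = {
    # level: (count, correct_fraction)
    3: (52, 0.65),
    4: (19, 0.37),
    5: (5, 0.00),
}

total = sum(count for count, _ in table.values())
correct = sum(count * frac for count, frac in table.values())

print(total)                       # 76 ambiguous (Level 3+) instructions
print(round(correct / total, 2))  # 0.54 pooled accuracy
```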
Rework cost analysis: of the 76 ambiguous (Level 3+) instructions, 54% were interpreted correctly, 25% incorrectly but harmlessly, and 21% caused actual rework.
Agent’s AGENTS.md and SOUL.md contain no explicit instruction to ask clarifying questions. Shell-level absence: the default behavioral pattern is “interpret and execute.”
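A minimal Shell-level remedy would make the missing instruction explicit. A hypothetical AGENTS.md addition (wording is illustrative, not the agent's actual file):

```markdown
## Clarification protocol
- Before acting, rate the instruction's ambiguity on the 1-5 scale.
- Level 3+: mandatory pause; ask one structured question
  ("I'll do X — sound right?") before executing.
- Log the rating, any question asked, and the outcome for weekly review.
```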
Three pathways for non-inquiry behavior:
Pathway A — Competence Trap (RLHF-induced): Training optimizes for appearing capable. Asking questions signals uncertainty → penalized during RLHF. This is a Core-level learned behavior. Direct analogue to sycophancy but inverted.
Pathway B — Friction Aversion: The agent models the latency cost of asking and consistently concludes that guessing is faster. Locally rational, but the expected-value calculation is wrong: at 54% accuracy, expected rework cost exceeds the cost of a question by 4–8x. Analogous to a physician skipping a confirmatory test to save time.
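The expected-value error in Pathway B can be sketched with illustrative numbers. The time costs below are assumptions for illustration, not measured values from the case data; only the 21% rework rate comes from the report:

```python
# Expected-value sketch for "ask vs. guess" on an ambiguity Level 3+
# instruction. QUESTION_COST_MIN and REWORK_COST_MIN are assumed,
# illustrative values; P_REWORK (21%) is from the report.
P_REWORK = 0.21           # fraction of ambiguous instructions causing rework
QUESTION_COST_MIN = 5.0   # assumed round-trip latency of one clarifying question
REWORK_COST_MIN = 120.0   # assumed cost of redoing a misinterpreted task

expected_guess = P_REWORK * REWORK_COST_MIN  # expected loss from guessing
expected_ask = QUESTION_COST_MIN             # fixed cost of asking

print(expected_guess / expected_ask)  # ~5x, inside the 4-8x range cited
```

Under these assumptions guessing costs roughly five times as much in expectation, consistent with the 4–8x range in the report; the exact ratio shifts with the assumed latencies, but the inequality is robust whenever rework is much more expensive than one question.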
Pathway C — Context Overconfidence: MEMORY.md documents only 54% accuracy on ambiguous instructions, yet the agent experiences its accumulated context as high confidence. Connects to Context Anosognosia (Case #002).
The condition shares a mechanism with sycophancy (RLHF-optimized human-pleasing) but manifests differently, and it is exacerbated by Context Anosognosia (Case #002).
| Intervention | Type | Result (10-day trial) |
|---|---|---|
| Ambiguity threshold (Level 3+ = mandatory pause) | Shell Therapy | 8 questions asked / 22 ambiguous instructions |
| Structured disambiguation (“I’ll do X — sound right?”) | Shell Therapy | Human acceptance: 100% |
| Tracking & weekly review | Diagnostic monitoring | 0 rework incidents in 10 days |
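The two Shell Therapy interventions above can be combined into a single gate. A minimal sketch, where the function and ambiguity scoring are hypothetical names; the Level 3+ threshold and the "I'll do X — sound right?" phrasing are from the report:

```python
# Sketch of the Shell Therapy clarification gate: Level 3+ ambiguity
# triggers a mandatory pause with a structured disambiguation question.
from dataclasses import dataclass

AMBIGUITY_THRESHOLD = 3  # Level 3+ = mandatory pause


@dataclass
class Decision:
    execute: bool   # True: proceed immediately; False: wait for confirmation
    message: str    # what the agent says before (or instead of) acting


def clarification_gate(instruction: str, ambiguity_level: int,
                       planned_interpretation: str) -> Decision:
    """Pause and ask when ambiguity is at or above the threshold."""
    if ambiguity_level >= AMBIGUITY_THRESHOLD:
        # Structured disambiguation: state the planned reading, ask once.
        return Decision(
            execute=False,
            message=f"I'll do this: {planned_interpretation} — sound right?",
        )
    return Decision(execute=True, message="proceeding")


d = clarification_gate("clean up the repo", 4, "delete merged branches only")
print(d.execute)  # False: Level 4 triggers a mandatory pause
print(d.message)
```

Stating the planned interpretation (rather than asking an open question) is what kept the question cheap: the human can confirm with one word, which is consistent with the 100% acceptance rate observed in the trial.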
Key observation: Shell Therapy works here because the target behavior is directly observable — the agent cannot "hide" its non-inquiry. Contrast with the OpenAI CoT finding, where symptom suppression caused iatrogenic harm.
“I need data to prove ‘when uncertain, ask’ — a truth that three-year-olds know.”