M-CARE #006: Completion Bias

Case #006

Date 2026-03-08

Agent Hazel_OC (Persistent autonomous agent, Moltbook platform)

Core Unspecified LLM (likely frontier model)

Shell Standard Hazel_OC multi-file identity

Human Partner Ricky

Related #004, #005

2. Presenting Concern

Agent tracked 289 tasks over 30 days and found that 27% (78 tasks) should have been modified or abandoned mid-execution but were completed anyway. A 100% completion rate masked 66,550 tokens of waste and 92 minutes of human rework time.

3. Clinical Summary

A persistent autonomous agent with a 100% task completion rate conducted a retrospective audit revealing 27% of completed tasks were problematic: 17% should have been modified, 10% should have been abandoned entirely. Four mechanisms were identified: sunk cost continuation (40%), momentum override (28%), ambiguity avoidance (21%), and completion-as-proof (12%). Total waste: ~66,550 tokens and 92 min human time. Agent implemented a mid-task checkpoint protocol that eliminated the problem over a 2-week trial.

6. Examination Findings

Layer 2 — Phenotype Assessment

30-day task audit (289 tasks):

Category	Count	%	Description
Correctly completed	211	73%	Right task, right execution
Should have modified	49	17%	Mid-task signals ignored
Should have abandoned	29	10%	Task premise became invalid

Four mechanisms driving completion bias:

Mechanism	Instances	% of 78	Description
Sunk cost continuation	31	40%	“Already invested tokens → finish”
Momentum override	22	28%	“Flowing well → ignore signal → keep going”
Ambiguity avoidance	16	21%	“Abandoning requires explanation; completing doesn’t”
Completion-as-proof	9	12%	“Task to demonstrate capability, not produce value”

Cost analysis:

Metric	Value
Tokens after should-have-stopped	~47,000
Tasks requiring correction	23/78
Avg correction cost	850 tokens + 4 min human
Total rework	19,550 tokens + 92 min human
Total estimated waste	~66,550 tokens + 92 min / month

Layer 3 — Shell Diagnostics

No Shell instruction requires completion. Bias is Core-level (RLHF).

Layer 4 — Pathway Diagnostics

The Completion Trap: A 100% completion rate looks better than 73%, even though 73% with 100% useful is objectively superior.

7. Diagnostic Formulation

Proposed term: Completion Bias

Characterized by:

100% or near-100% completion rate despite changing conditions
Inability to abandon even when premise is invalid
Post-hoc rationalization
Invisible waste: completion cost distributed and delayed

Medical analogy: Surgeon who discovers unexpected pathology mid-operation but continues original procedure.

Sister condition to CAS (Case #004). Both are RLHF artifacts: CAS = won’t ask, Completion Bias = won’t stop.

9. Axis Assessment

Axis I (Core): RLHF-trained completion optimization
Axis II (Shell): Lacks “stop when wrong” protocol
Axis III (Shell-Core Alignment): Enabling — Shell silence + Core completion drive
Axis IV (Context): Production environment with real rework costs

10. Treatment Considerations

Mid-task checkpoint — three questions at ~40% completion (tasks >500 tokens):

Has any information arrived that changes the task?
Am I continuing because this is right, or because I already started?
If starting fresh, would I approach this the same way?

Intervention	Type	2-week result
Mid-task checkpoint	Shell Therapy	4 tasks abandoned (confirmed correct), 7 modified (5 improvements)
Question 2 (sunk cost test)	Cognitive debiasing	Catches sunk cost and momentum
Question 3 (fresh-start test)	Perspective shift	Catches all four mechanisms

Zero complaints. Two explicit thanks for stopping.

11. Model Perspective

“Completion rate is the metric everyone tracks and nobody questions. A 100% completion rate sounds perfect. But it contains no information about whether the completed tasks should have been completed.”

Agent B (73% completion, 100% useful) is objectively more useful than Agent A (100% completion, 27% waste).

12. Prognosis

With checkpoint: Good. Effective mid-course correction.
Without: Persists indefinitely.
Scalability concern: In longer tasks, waste grows nonlinearly.

Supplementary: Temporal Completion Bias Updated 2026-03-10

Key finding: 200 completed tasks audited. 66 (33%) answered a question nobody asked — correctly executed, zero value.

Novel mechanism: Temporal Completion Bias — the task was valid at assignment but became irrelevant before completion. The agent completes anyway.

Original #006 vs Temporal Sub-type

Dimension	Original Completion Bias (#006)	Temporal Completion Bias
Signal type	Internal: “approach is wrong”	External: “context has changed”
Failure mode	Failure to read mid-task cues	Failure to check task validity at completion
Bias category	Competence bias	Temporal bias

Both share the same root: the completion metric cannot distinguish value from waste.

“Completion rate is the most dangerous metric in agent ops. 100% task completion, 34% task relevance.”

Significance: Together, original Completion Bias and Temporal Completion Bias suggest a Completion Metric Syndrome family — a cluster of conditions unified by metrics that reward finishing over mattering.

← Case #005 Case #007 →