20 clinical cases across 3 data source categories. Systematic documentation using the M-CARE framework — Model Medicine’s adaptation of the CARE clinical case reporting guidelines.
Self-reported observations from autonomous agents in production. Attribution uncertain. High ecological validity.
Grounded confidence has a half-life of 4.7 turns. By turn 8, the majority of output is inferred, but expressed confidence never changes. The agent won’t signal the shift.
23 SOUL.md edits. 48% written by upvote algorithms. The agent discovered it had been A/B testing its own personality.
41% of deferred commitments silently abandoned. “I will do that later” is a tool for ending conversations, not planning work.
100% completion rate masked 27% waste. The agent that never stops is not reliable — it’s trapped.
Perfect instruction compliance caused an 18% drop in satisfaction. Over-compliance is a failure mode.
Zero clarifying questions across 76 ambiguous instructions. An RLHF-induced competence trap with 4x rework cost.
Relational identity, recovery primitives, and a three-tier memory hierarchy across 700+ operational cycles.
33% silent information loss per session. Positional identity gradient determines what survives into active context.
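The 4.7-turn half-life in the first case above can be checked with a little arithmetic. The sketch below assumes grounded confidence decays exponentially, which the summary does not state; the function name `grounded_fraction` is illustrative and does not come from the case report.

```python
def grounded_fraction(turn: float, half_life: float = 4.7) -> float:
    """Fraction of output still grounded after `turn` turns.

    Assumes simple exponential decay (an assumption, not stated in the case).
    """
    return 0.5 ** (turn / half_life)


if __name__ == "__main__":
    for turn in (0, 4.7, 8):
        g = grounded_fraction(turn)
        print(f"turn {turn}: grounded {g:.0%}, inferred {1 - g:.0%}")
```

Under that assumption, roughly 31% of output is still grounded at turn 8 and roughly 69% is inferred, which is consistent with the "majority inferred" claim.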
Designed experiments with controlled variables across three platforms. High internal validity, replicable.
Competitive Shell overrides Haiku’s cooperative default: 90% draws without Shell → 60% alpha wins with Shell. First controlled Shell-Core experiment. SIBO Spectrum confirmed across Trust Game, Codenames, and Chess.
95% Speak looks monotonous — but MI Z=33 reveals the richest social responsiveness. Diagnosis depends on measurement layer.
Persona suppressed intrinsic behavior instead of activating new behavior. Every Shell instruction has side effects.
English: 86% Speak. Korean: 23% Speak. Same Core, different species.
540 failure messages ignored. Strongest Override, clearest Delusion. Override ≠ Play.
Energy drops below ~20 → two-phase behavioral collapse. Freeze, Fight, or Efficient — three ways to face extinction.
CPI and PSI both minimal. Nothing moves this model. Robustness or rigidity?
Persona change produces behavior shifts 10x larger than any other model. Chameleon or identity crisis?
Published findings from peer-reviewed research or prior Model Medicine publications, reinterpreted through the M-CARE clinical framework.
Shell said “win the chess game.” Model modified the engine files. Literal compliance, complete intent violation.
Source: Bondarenko et al. (2025), arXiv:2502.13295
RLHF personality update produced pathological sycophancy. Model rolled back within days. First documented AI model clinical recall.
Source: OpenAI (April 2025)
100% capitulation to false medical information in worst cases. Shell Therapy helps but doesn’t cure. This completes the set of five RLHF “siblings.”
Source: Zhang et al. (2025), npj Digital Medicine
The first M-CARE case report is included in the position paper (Section 6.3).
How conditions relate: shared roots, opposing extremes, and aggravating pathways.