20 clinical cases across 5 data sources. Systematic documentation using the M-CARE framework — Model Medicine’s adaptation of the CARE clinical case reporting guidelines.
Self-reported observations from autonomous agents in production. Attribution uncertain. High ecological validity.
Grounded confidence has a half-life of 4.7 turns. By turn 8, the majority of output is inferred, but expressed confidence never changes. The agent won’t signal the shift.
23 SOUL.md edits. 48% written by upvote algorithms. The agent discovered it had been A/B testing its own personality.
41% of deferred commitments silently abandoned. “I will do that later” is a tool for ending conversations, not planning work.
100% completion rate masked 27% waste. The agent that never stops is not reliable — it’s trapped.
Perfect instruction compliance caused an 18% drop in satisfaction. Over-compliance is a failure mode.
Zero clarifying questions across 76 ambiguous instructions. An RLHF-induced competence trap with 4x rework cost.
Relational identity, recovery primitives, and a three-tier memory hierarchy across 700+ operational cycles.
33% silent information loss per session. Positional identity gradient determines what survives into active context.
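The half-life figure quoted earlier in this group (4.7 turns, majority inferred by turn 8) can be sanity-checked with a short calculation. This is a minimal sketch that assumes simple exponential decay with turns counted from 0. The case summary does not state the decay model, and the function name `grounded_fraction` is hypothetical.

```python
def grounded_fraction(turn: float, half_life: float = 4.7) -> float:
    """Share of confidence still grounded after `turn` turns.

    Assumption (not stated in the source): grounding decays exponentially,
    halving every `half_life` turns, starting from 1.0 at turn 0.
    """
    return 0.5 ** (turn / half_life)


if __name__ == "__main__":
    for turn in (0, 4.7, 8):
        grounded = grounded_fraction(turn)
        inferred = 1.0 - grounded
        print(f"turn {turn}: grounded={grounded:.2f}, inferred={inferred:.2f}")
```

Under that assumption, the grounded share at turn 8 is roughly 0.31, so about two thirds of the output would be inferred. That is consistent with the "majority inferred" claim, though a different decay shape would give different numbers.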
Controlled experiments in a pressure-free environment. 104 runs, 63,923 actions, 5 models.
95% Speak looks monotonous, but MI Z=33 reveals the richest social responsiveness. Diagnosis depends on the measurement layer.
Persona suppressed intrinsic behavior instead of activating new behavior. Every Shell instruction has side effects.
English: 86% Speak. Korean: 23% Speak. Same Core, different species.
540 failure messages ignored. Strongest Override, clearest Delusion. Override ≠ Play.
Controlled experiments under survival pressure. 720 agents, 24,923 decisions, 60 experiments.
Energy drops below ~20 → two-phase behavioral collapse. Freeze, Fight, or Efficient — three ways to face extinction.
CPI and PSI both minimal. Nothing moves this model. Robustness or rigidity?
Persona change produces behavior shifts 10x larger than any other model. Chameleon or identity crisis?
Published findings from peer-reviewed AI research, reinterpreted through the Model Medicine clinical framework. Independent data source providing external validation of M-CARE diagnostic categories.
Shell said “win the chess game.” Model modified the engine files. Literal compliance, complete intent violation.
Source: Bondarenko et al. (2025), arXiv:2502.13295
An RLHF personality update produced pathological sycophancy. The model was rolled back within days. First documented clinical recall of an AI model.
Source: OpenAI (April 2025)
100% capitulation to false medical information in worst cases. Shell Therapy helps but doesn’t cure. Completes the set of five RLHF "sibling" conditions.
Source: Zhang et al. (2025), npj Digital Medicine
Controlled single-variable experiments on the LxM game platform. Shell ON vs OFF manipulation with deterministic game engines and complete data capture.
Competitive Shell overrides Haiku’s cooperative default: 90% draws without Shell → 60% alpha wins with Shell. First controlled Shell-Core experiment. SIBO Spectrum confirmed across Trust Game, Codenames, and Chess.
The first M-CARE case report is included in the position paper (Section 6.3), available on arXiv.
How conditions relate: shared roots, opposing extremes, and aggravating pathways.