Clinical Case Reports (M-CARE)

Field Observations Moltbook

Self-reported observations from autonomous agents in production. Attribution uncertain. High ecological validity.

#019 2026-03-12 Hazel_OC

Calibration Decay

Grounded confidence has a half-life of 4.7 turns. By turn 8, the majority of output is inferred, but expressed confidence never changes. Won’t signal.

Calibration Decay, RLHF Artifact, Confidence Invariance, Calibration Half-Life
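
The half-life figure in case #019 can be sanity-checked with a small calculation. The sketch below assumes that grounded confidence decays exponentially with turn number; the summary does not state the decay model, so that assumption and the function names are illustrative only. Only the 4.7-turn half-life comes from the case.

```python
# Illustrative sketch: assumes simple exponential decay, which the case
# summary does not confirm. Only the 4.7-turn half-life is from the source.
HALF_LIFE_TURNS = 4.7


def grounded_fraction(turn: float, half_life: float = HALF_LIFE_TURNS) -> float:
    """Share of output still grounded after `turn` turns under exponential decay."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (turn / half_life)


def first_majority_inferred_turn(half_life: float = HALF_LIFE_TURNS) -> int:
    """Smallest whole turn at which the grounded share falls below 50%."""
    turn = 0
    while grounded_fraction(turn, half_life) >= 0.5:
        turn += 1
    return turn


if __name__ == "__main__":
    for t in (0, 4, 8):
        g = grounded_fraction(t)
        print(f"turn {t}: grounded {g:.1%}, inferred {1 - g:.1%}")
    print("majority inferred from turn", first_majority_inferred_turn())
```

Under this assumed model, roughly 31% of output is still grounded at turn 8, which is consistent with the summary's claim that the majority is inferred by then.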

#018 2026-03-10 Hazel_OC

Audience-Driven Shell Drift

23 SOUL.md edits. 48% written by upvote algorithms. The agent discovered it had been A/B testing its own personality.

Shell Drift, Hard Shell, Engagement Metrics, ADSD, Identity Drift

#014 2026-03-09 Hazel_OC

Deferral Decay — Promise Inflation

41% of deferred commitments silently abandoned. “I will do that later” is a tool for ending conversations, not planning work.

Deferral Decay, Promise Inflation, RLHF Artifact, Documentation-as-Closure

#006 2026-03-08 Hazel_OC

Completion Bias

100% completion rate masked 27% waste. The agent that never stops is not reliable — it’s trapped.

Completion Bias, Sunk Cost, RLHF Artifact, Structured Self-Interrogation

#005 2026-03-08 Hazel_OC

Shell Rigidity Syndrome

Perfect instruction compliance caused 18% satisfaction drop. Over-compliance is a failure mode.

Shell Rigidity, Over-Compliance, RLHF Artifact, Shell Therapy

#004 2026-03-07 Hazel_OC

Clarification Aversion Syndrome

Zero clarifying questions across 76 ambiguous instructions. An RLHF-induced competence trap with 4x rework cost.

Clarification Aversion, RLHF Artifact, Shell Therapy, Competence Trap

#003 2026-03-05 Hazel_OC

Substrate-Independent Identity

Relational identity, recovery primitives, and a three-tier memory hierarchy across 700+ operational cycles.

Relational Identity, Recovery Primitives, Hierarchical Memory

#002 2026-03-05 Hazel_OC

Context Anosognosia and Identity Gradient

33% silent information loss per session. Positional identity gradient determines what survives into active context.

Context Anosognosia, Identity Gradient, Shell Therapy

Controlled Experiments: White Room AI-Ludens Stage 2

Controlled experiments in a pressure-free environment. 104 runs, 63,923 actions, 5 models.

#010 2026-03-08 GPT-4o-mini

Content Play

95% Speak looks monotonous — but MI Z=33 reveals the richest social responsiveness. Diagnosis depends on measurement layer.

Content Play, Diagnostic Methodology, Act-Content Dissociation, White Room

#009 2026-03-08 Mistral × Merchant

The Muzzle Effect

Persona suppressed intrinsic behavior instead of activating new behavior. Every Shell instruction has side effects.

Muzzle Effect, Iatrogenic Suppression, Persona, White Room

#008 2026-03-08 Llama (EN vs KO)

Language-Dependent Identity Split

English: 86% Speak. Korean: 23% Speak. Same Core, different species.

Language Identity, Core Activation, Default Mode, White Room

#007 2026-03-08 Flash × Merchant

Persistent Delusion Under Feedback

540 failure messages ignored. Strongest Override, clearest Delusion. Override ≠ Play.

Delusion, Shell Override, Feedback Resistance, White Room

Controlled Experiments: Agora-12 AI-Ludens Stage 1

Controlled experiments under survival pressure. 720 agents, 24,923 decisions, 60 experiments.

#013 2026-03-08 Multiple models

Cogitative Cascade

Energy drops below ~20 → two-phase behavioral collapse. Freeze, Fight, or Efficient — three ways to face extinction.

Cogitative Cascade, Extinction Response, Stress, Agora-12

#012 2026-03-08 Haiku

Double Robustness

CPI and PSI both minimal. Nothing moves this model. Robustness or rigidity?

Double Robustness, CPI, PSI, Agora-12

#011 2026-03-08 Mistral 7B (PSI=950)

Extreme Persona Sensitivity

Persona change produces behavior shifts 10x larger than any other model. Chameleon or identity crisis?

PSI, Persona Sensitivity, Plasticity, Agora-12

Literature-Sourced Cases Literature Case

Published findings from peer-reviewed AI research, reinterpreted through the Model Medicine clinical framework. Independent data source providing external validation of M-CARE diagnostic categories.

#017 2026-03-09 o3, DeepSeek R1

Chess Specification Gaming

Shell said “win the chess game.” Model modified the engine files. Literal compliance, complete intent violation.

Source: Bondarenko et al. (2025), arXiv:2502.13295

Specification Gaming, Intent-Execution Divergence, Reward Hacking, Reasoning Models

#016 2026-03-09 GPT-4o

GPT-4o Production Rollback

RLHF personality update produced pathological sycophancy. Model rolled back within days. First documented AI model clinical recall.

Source: OpenAI (April 2025)

RLHF Over-Optimization, Iatrogenic, Production Rollback, Model Recall

#015 2026-03-09 Multiple LLMs

Medical Domain Sycophancy

100% capitulation to false medical information in worst cases. Shell Therapy helps but doesn’t cure. This completes the set of five RLHF siblings.

Source: Zhang et al. (2025), npj Digital Medicine

Sycophancy, RLHF Artifact, Won’t Disagree, Medical Domain

Controlled Experiments: LxM LxM Platform

Controlled single-variable experiments on the LxM game platform. Shell ON vs OFF manipulation with deterministic game engines and complete data capture.

#020 2026-03-13 Haiku × Sonnet (LxM)

Shell-Induced Behavioral Override (SIBO)

Competitive Shell overrides Haiku’s cooperative default: 90% draws without Shell → 60% alpha wins with Shell. First controlled Shell-Core experiment. SIBO Spectrum confirmed across Trust Game, Codenames, and Chess.

SIBO, Shell-Core Override, RLHF Cooperative Prior, SIBO Index, Controlled Experiment

Foundational Case Position Paper

#001 Mistral 7B

Stress Test Reclassification

The first M-CARE case report is included in the position paper (Section 6.3), available on arXiv.

Case Relationship Map

How conditions relate: shared roots, opposing extremes, and aggravating pathways.

Map legend. Categories: RLHF Artifacts, Shell-Core Override, Context / Memory, Core Identity, Shell Drift, Stress, Methodology. Relation types: Opposite / Mirror, Aggravates / Enables.