ARXIV:2604.11867 · LLM TRAINING · SUBMITTED 15 APR · 16:49 UTC · FRESHNESS STALE

Verified Source: PDF linked · Verified PaperPack: citation fields available · Partial Proof: unverified proof status

Disposition Distillation at Small Scale: A Three-Arc Negative Result

Hari Sadasivan · arXiv

This paper investigates methods for training behavioral dispositions into small language models, ultimately reporting negative results across multiple experimental arcs.

Blocked on Code › Score 2.0 · Evidence unverified

Opportunity summary

Pain: This paper investigates methods for training behavioral dispositions into small language models, ultimately reporting negative results across multiple experimental arcs.

Evidence: 0 refs | 3 sources | 50% coverage

Blocker: Evidence unverified

PROBLEM

This paper investigates methods for training behavioral dispositions into small language models, ultimately reporting negative results across multiple experimental arcs. An internal draft reported +33.9-point MCAS and +15.3-point HumanEval gains on a Qwen3-0.6B student;…

METHOD

Full abstract

We set out to train behavioral dispositions (self-verification, uncertainty acknowledgment, feedback integration) into small language models (0.6B to 2.3B effective parameters) through a four-stage all-MIT distillation pipeline, with follow-on experiments on inference-time attention-head interventions and a frozen-base confidence-gated sidecar. An internal draft reported +33.9-point MCAS and +15.3-point HumanEval gains on a Qwen3-0.6B student; a second-pass sanity check falsified both numbers before publication. The HumanEval delta was a truncation artifact (n_predict=512) that inverted to -8.0 points at n_predict=1024; the MCAS gain disappeared under apples-to-apples scoring. That falsification triggered three subsequent arcs. Across (1) SFT/DPO LoRA on three model families and two domains, (2) inference-time attention-head tempering on o_proj, and (3) a training-free frozen-base sidecar reading the final-token hidden state h_last, we find no operator that moves judge-measured disposition without damaging content or collapsing into stylistic mimicry. The failure is consistent across five models (Qwen3-0.6B, Qwen3-1.7B, Qwen3.5-0.8B, Gemma 4 E2B, and SmolLM2-1.7B-Instruct). A within-distribution cross-validation pass (AUC=0.683) collapsed to chance on fresh prompts (AUC=0.516). We contribute a three-arc negative result with mechanism, a two-failure-mode taxonomy for linear h_last probes, and an honest falsification pipeline that converts the class of false positives we ourselves produced into publishable negatives. As an independent finding, Gemma 4 E2B exhibits near-complete confidence-correctness decoupling on the Chef domain (assertion asymmetry -0.009; the model asserts at 91% regardless of correctness).
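The gap between in-distribution and fresh-prompt AUC reported in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the function names `roc_auc` and `generalization_gap` are hypothetical, the toy labels and scores are made up, and the sketch does not reproduce the paper's probe or evaluation code, which is not linked in this record. Only the two AUC values (0.683 and 0.516) come from the abstract.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def generalization_gap(auc_in_dist: float, auc_fresh: float) -> float:
    """Drop in AUC when moving from held-out folds to fresh prompts."""
    return auc_in_dist - auc_fresh


# Toy example (made-up data): 3 of the 4 positive/negative pairs are ranked correctly.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(toy_labels, toy_scores))  # prints 0.75

# The two AUC values reported in the abstract.
gap = generalization_gap(0.683, 0.516)
print(f"{gap:.3f}")  # prints 0.167
```

An AUC of 0.5 corresponds to chance ranking, so a fresh-prompt AUC of 0.516 means the probe separates the two classes barely better than a coin flip, even though the within-distribution estimate looked informative.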

RESULT

ScienceToStartup currently rates this 2.0/10 on the public viability pass. We contribute a three-arc negative result with mechanism, a two-failure-mode taxonomy for linear h_last probes, and an honest falsification pipeline that converts the class…

WHY NOW

LLM Training moved forward this cycle; last verified April 2026. Public score 2.0/10.

Competitive landscape

Segment: LLM Training

Adoption evidence: No public code link in the paper record yet

Commercial read: 2.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified

arXiv: https://arxiv.org/abs/2604.11867 · Published: 2026-04-13T17:40:31.000Z
