ARXIV:2605.23901 · LLM TRAINING · SUBMITTED 25 MAY · 20:39 UTC · FRESHNESS STALE

Source: PDF linked (Verified) · PaperPack: citation fields available (Verified) · Proof: unverified proof status (Partial)

LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws

Xu Ouyang · Deyi Liu · Yuhang Cai · Jing Liu · Yuan Yang · Chen Zheng · +2 at arXiv

A theoretical framework models LLM training as information transmission over a noisy channel, explaining non-monotonic scaling phenomena.

Blocked on Code · Score 3.0 · Evidence unverified

Opportunity summary

Pain: A theoretical framework models LLM training as information transmission over a noisy channel, explaining non-monotonic scaling phenomena.

Evidence: 0 refs | 3 sources | 50% coverage

Blocker: Evidence unverified


PROBLEM

Existing scaling laws for LLMs, predominantly monotonic power laws, fail to explain non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute.

METHOD

Full abstract

Existing scaling laws for Large Language Models (LLMs), predominantly monotonic power laws, fail to explain emerging non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute. We propose the Shannon Scaling Law, a unified theoretical framework that models LLM training as information transmission over a noisy channel, grounded in the Shannon-Hartley theorem. By mapping model parameters to channel bandwidth and training tokens to signal power, our formulation explicitly captures the interaction between learning signal and intrinsic noise. This perspective reveals a fundamental Shannon capacity for LLMs: scaling model size or data without preserving a sufficient signal-to-noise ratio (SNR) inevitably amplifies noise, inducing a transition from monotonic improvement to U-shaped performance degradation. We validate our theory through experiments on Pythia and OLMo2 under perturbations, including Gaussian noise, quantization and supervised fine-tuning on math, QA and code tasks. The Shannon Scaling Law consistently outperforms classical scaling laws and recent perturbation-aware laws, achieving strong $R^2$ scores and accurately capturing loss basins missed by prior approaches. It also extrapolates: fitted on $\leq$6.9B Pythia models with $\leq$180B tokens, it predicts the unseen 12B model up to 307B tokens at pooled $R^2{=}0.847$, while monotonic baselines collapse.
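For reference, the Shannon-Hartley theorem that the abstract builds on gives the capacity of a band-limited channel with additive white Gaussian noise. The second display only restates the correspondence described in the abstract; the symbols $P$ and $D$ are notation assumed here, and the paper's actual functional form for the loss is not reproduced in this record.

```latex
% Shannon-Hartley theorem (standard form): capacity C in bits per second,
% bandwidth B in hertz, signal power S, noise power N.
\[
  C = B \log_2\!\left(1 + \frac{S}{N}\right)
\]
% Correspondence described in the abstract; P and D are assumed symbols.
\[
  B \;\longleftrightarrow\; P \ (\text{model parameters}), \qquad
  S \;\longleftrightarrow\; D \ (\text{training tokens}), \qquad
  \frac{S}{N} \;\longleftrightarrow\; \text{SNR of the learning signal}
\]
```

In the theorem, capacity grows linearly in $B$ at fixed SNR but only logarithmically in $S/N$, which fits the abstract's emphasis on preserving SNR as model size or data grows. How perturbations such as quantization enter the noise term is defined in the paper, not here.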

RESULT

ScienceToStartup currently rates this 3.0/10 on the public viability pass. The law also extrapolates: fitted on $\leq$6.9B Pythia models with $\leq$180B tokens, it predicts the unseen 12B model up to 307B tokens at pooled $R^2{=}0.847$, while monotonic baselines collapse.
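The $R^2$ figure quoted above is a coefficient of determination. A minimal sketch of how such a score could be computed is given below; it assumes that "pooled" means concatenating the held-out points from every run before scoring, which this record does not confirm. The function names and the loss values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if len(observed) < 2:
        raise ValueError("need at least two points")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0.0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def pooled_r_squared(
    runs: Sequence[Tuple[Sequence[float], Sequence[float]]]
) -> float:
    """Assumed meaning of 'pooled': concatenate all runs, then score once."""
    observed: List[float] = []
    predicted: List[float] = []
    for run_observed, run_predicted in runs:
        observed.extend(run_observed)
        predicted.extend(run_predicted)
    return r_squared(observed, predicted)


# Hypothetical losses with a U shape: they fall, then rise again.
U_SHAPED_LOSS = [3.0, 2.5, 2.4, 2.6, 3.1]
# Hypothetical monotonic prediction that keeps decreasing.
MONOTONIC_PRED = [3.0, 2.6, 2.3, 2.1, 2.0]

# Negative score: the monotonic curve fits worse than predicting the mean.
monotonic_score = r_squared(U_SHAPED_LOSS, MONOTONIC_PRED)
```

A negative score on the toy data illustrates one way a monotonic baseline can "collapse" on a loss basin: once the observed loss turns upward, a curve that keeps decreasing does worse than a constant prediction.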

WHY NOW

LLM Training moved forward this cycle; last verified May 2026. Public score 3.0/10.



Competitive landscape

Segment: LLM Training

Adoption evidence: No public code link in the paper record yet

Commercial read: 3.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified


Paper record

arXiv: 2605.23901 (https://arxiv.org/abs/2605.23901)

Published: 2026-05-22T17:59:38.000Z

Authors: Xu Ouyang, Deyi Liu, Yuhang Cai, Jing Liu, Yuan Yang, Chen Zheng, Thomas Hartvigsen, Yiyuan Ma

Page: https://sciencetostartup.com/paper/llms-as-noisy-channels-a-shannon-perspective-on-model-capacity-and-scaling-laws

Free to access: yes
