Rotated Robustness: A Training-Free Defense against Bit-Flip Attacks on Large Language Models | Signal Canvas | ScienceToStartup

Stale · 68d ago · Verification pending / evidence receipt incomplete

Use this Signal Canvas via API or MCP

Route this paper proof surface into REST, MCP, or developer workflows while preserving the same evidence receipt and related-resource context.

Page Freshness

Signal Canvas proof surface

Canonical route: /signal-canvas/rotated-robustness-a-training-free-defense-against-bit-flip-attacks-on-large-language-models

Proof freshness: stale
Proof status: unverified
Display score: 8/10
Last proof check: 2026-04-02
Score updated: 2026-04-02
Score fresh until: 2026-05-02
References: 0
Source count: 0
Coverage: 17%

This page is showing the last landed evidence receipt and score bundle because the latest proof data is outside the freshness window.

Agent Handoff

Rotated Robustness: A Training-Free Defense against Bit-Flip Attacks on Large Language Models

Canonical ID rotated-robustness-a-training-free-defense-against-bit-flip-attacks-on-large-language-models | Route /signal-canvas/rotated-robustness-a-training-free-defense-against-bit-flip-attacks-on-large-language-models

REST example

curl https://sciencetostartup.com/api/v1/agent-handoff/signal-canvas/rotated-robustness-a-training-free-defense-against-bit-flip-attacks-on-large-language-models
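The same request can be issued from a script. The sketch below is a minimal Python equivalent of the curl call, assuming the endpoint answers an unauthenticated GET with a JSON body. Neither the authentication requirements nor the response schema are stated on this page, and `handoff_url` and `fetch_handoff` are hypothetical helper names introduced for illustration.

```python
import json
import urllib.parse
import urllib.request

# Base path copied from the curl example; everything after it is the canonical ID.
BASE_URL = "https://sciencetostartup.com/api/v1/agent-handoff/signal-canvas"
PAPER_REF = (
    "rotated-robustness-a-training-free-defense-against-"
    "bit-flip-attacks-on-large-language-models"
)


def handoff_url(paper_ref: str) -> str:
    """Build the agent-handoff URL for a canonical paper ID."""
    return f"{BASE_URL}/{urllib.parse.quote(paper_ref, safe='')}"


def fetch_handoff(paper_ref: str, timeout: float = 10.0) -> dict:
    """GET the handoff document.

    Assumption: the response body is JSON and no API key is required.
    """
    request = urllib.request.Request(
        handoff_url(paper_ref),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_handoff(PAPER_REF)` performs the network request; `handoff_url(PAPER_REF)` can be checked offline and should reproduce the URL in the curl example.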

MCP example

{
  "tool": "search_signal_canvas",
  "arguments": {
    "mode": "paper",
    "paper_ref": "rotated-robustness-a-training-free-defense-against-bit-flip-attacks-on-large-language-models",
    "query_text": "Summarize Rotated Robustness: A Training-Free Defense against Bit-Flip Attacks on Large Language Models"
  }
}
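The tool call above can also be assembled programmatically. The following sketch builds the same payload from the paper title. It assumes that the `paper_ref` slug is the lowercased title with every run of non-alphanumeric characters replaced by a hyphen; that rule is inferred from the canonical ID shown on this page and is not documented here. `slugify` and `build_search_call` are hypothetical helper names.

```python
import json
import re


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumeric runs into hyphens.

    Assumption: this matches how the canonical ID on this page was derived.
    """
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def build_search_call(title: str) -> dict:
    """Return a tool-call payload shaped like the MCP example on this page."""
    return {
        "tool": "search_signal_canvas",
        "arguments": {
            "mode": "paper",
            "paper_ref": slugify(title),
            "query_text": f"Summarize {title}",
        },
    }


TITLE = (
    "Rotated Robustness: A Training-Free Defense against "
    "Bit-Flip Attacks on Large Language Models"
)
call = build_search_call(TITLE)
print(json.dumps(call, indent=2))
```

How the payload is delivered depends on the MCP client in use; only the payload construction is shown.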

source_context

{
  "surface": "signal_canvas",
  "mode": "paper",
  "query": "Rotated Robustness: A Training-Free Defense against Bit-Flip Attacks on Large Language Models",
  "normalized_query": "2603.16382",
  "route": "/signal-canvas/rotated-robustness-a-training-free-defense-against-bit-flip-attacks-on-large-language-models",
  "paper_ref": "rotated-robustness-a-training-free-defense-against-bit-flip-attacks-on-large-language-models",
  "topic_slug": null,
  "benchmark_ref": null,
  "dataset_ref": null
}

Paper mode · single-doc scope · scope: rotated-robustness-a-training-free-defense-against-bit-flip-attacks-on-large-language-models

GitHub Code Pulse

No public code linked for this paper yet.

Claim map

Strong 8 · Mixed 0 · Weak 0

Evidence (partial): Hardware faults, specifically bit-flips in quantized weights, pose a severe reliability threat to Large Language Models (LLMs), often triggering catastrophic model collapses.
Implication (partial): Directly stated in the abstract as the core problem being addressed
Verification: partial

Evidence (partial): We demonstrate that this vulnerability fundamentally stems from the spatial alignment between sensitive weight bits and extreme activation outliers
Implication (partial): Directly stated as the root cause explanation in the abstract
Verification: partial

Evidence (partial): Under random bit-flip attacks, RoR reduces the stochastic collapse rate from 3.15% to 0.00% on Qwen2.5-7B.
Implication (partial): Specific numeric result directly stated in the abstract
Verification: partial

Evidence (partial): under severe targeted attacks with 50 Progressive Bit Search flips, RoR sustains robust reasoning on Llama-2-7B, maintaining a 43.9% MMLU accuracy that nearly matches its 45.2% unattacked accuracy
Implication (partial): Specific numeric results directly stated in the abstract with clear comparison
Verification: partial

Evidence (partial): against the Single-Point Fault Attack (SPFA) -- the most aggressive targeted threat -- RoR exponentially inflates the attack complexity from a few bits to over 17,000 precise bit-flips.
Implication (partial): Specific numeric result directly stated in the abstract
Verification: partial

Evidence (partial): With a negligible storage overhead of 0.31% and a minimal inference latency increase of 9.1% on Llama-2-7B
Implication (partial): Specific numeric results directly stated in the abstract
Verification: partial

Evidence (partial): we propose Rotated Robustness (RoR), a training-free defense utilizing orthogonal Householder transformations. By applying an orthogonal rotation to the activation space
Implication (partial): Direct description of the method's core mechanism in the abstract
Verification: partial

Evidence (partial): mathematically guaranteeing original model accuracy
Implication (partial): Directly stated in the abstract as a key property of the method
Verification: partial
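The method claims in the map above (an orthogonal Householder rotation of the activation space that leaves model accuracy unchanged) rest on a standard linear-algebra identity: a Householder matrix H = I - 2vvᵀ/(vᵀv) is symmetric and orthogonal, so rotating both a weight row and the activation leaves their dot product unchanged. The toy sketch below checks that identity in pure Python and shows how one rotation spreads a single activation outlier across channels. The choice of `v`, the example vectors, and the unit-perturbation fault model are illustrative assumptions, not the paper's procedure.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def householder_apply(v, x):
    """Return H x for H = I - 2 v v^T / (v^T v), without materializing H."""
    scale = 2.0 * dot(v, x) / dot(v, v)
    return [xi - scale * vi for xi, vi in zip(x, v)]


# Toy activation with one extreme outlier channel (index 2).
x = [0.5, -1.0, 64.0, 0.25]
# One weight row of a linear layer; the layer output is dot(w, x).
w = [0.1, -0.2, 0.05, 0.3]

# Illustrative choice of v (an assumption): reflect the outlier axis onto the
# uniform direction so the outlier's energy is shared by all channels.
n = len(x)
e_outlier = [0.0, 0.0, 1.0, 0.0]
uniform = [1.0 / math.sqrt(n)] * n
v = [e - u for e, u in zip(e_outlier, uniform)]

# H is symmetric, so the rotated weight row w H equals H applied to w.
x_rot = householder_apply(v, x)
w_rot = householder_apply(v, w)

y_plain = dot(w, x)
y_rot = dot(w_rot, x_rot)  # equals y_plain up to rounding, since H^T H = I

# Toy fault model (assumption): a unit perturbation of one stored weight
# changes the output by the activation it multiplies; take the worst channel.
max_plain = max(abs(t) for t in x)
max_rot = max(abs(t) for t in x_rot)

print("output before/after rotation:", y_plain, y_rot)
print("largest activation before/after rotation:", max_plain, max_rot)
```

In this four-channel toy the largest activation roughly halves; for a single dominant outlier in d channels, this particular reflection shrinks it by about a factor of √d, which is one intuition for why a perturbed weight would do less damage after rotation.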
