When Names Change Verdicts: Intervention Consistency Reveals Systematic Bias in LLM Decision-Making

Page Freshness

Signal Canvas proof surface

Canonical route: /signal-canvas/when-names-change-verdicts-intervention-consistency-reveals-systematic-bias-in-llm-decision-making

Proof freshness: stale
Proof status: unverified
Display score: 8/10
Last proof check: 2026-04-02
Score updated: 2026-04-02
Score fresh until: 2026-05-02
References: 0
Source count: 0
Coverage: 17%

This page is showing the last landed evidence receipt and score bundle because the latest proof data is outside the freshness window.

Agent Handoff

Canonical ID when-names-change-verdicts-intervention-consistency-reveals-systematic-bias-in-llm-decision-making | Route /signal-canvas/when-names-change-verdicts-intervention-consistency-reveals-systematic-bias-in-llm-decision-making

REST example

curl https://sciencetostartup.com/api/v1/agent-handoff/signal-canvas/when-names-change-verdicts-intervention-consistency-reveals-systematic-bias-in-llm-decision-making

MCP example

{
  "tool": "search_signal_canvas",
  "arguments": {
    "mode": "paper",
    "paper_ref": "when-names-change-verdicts-intervention-consistency-reveals-systematic-bias-in-llm-decision-making",
    "query_text": "Summarize When Names Change Verdicts: Intervention Consistency Reveals Systematic Bias in LLM Decision-Making"
  }
}

source_context

{
  "surface": "signal_canvas",
  "mode": "paper",
  "query": "When Names Change Verdicts: Intervention Consistency Reveals Systematic Bias in LLM Decision-Making",
  "normalized_query": "2603.18530",
  "route": "/signal-canvas/when-names-change-verdicts-intervention-consistency-reveals-systematic-bias-in-llm-decision-making",
  "paper_ref": "when-names-change-verdicts-intervention-consistency-reveals-systematic-bias-in-llm-decision-making",
  "topic_slug": null,
  "benchmark_ref": null,
  "dataset_ref": null
}

Evidence Receipt

Route status: building

Claims: 7

References: Pending verification

Proof: Verification pending

Freshness state: computing

Source paper: When Names Change Verdicts: Intervention Consistency Reveals Systematic Bias in LLM Decision-Making

PDF: https://arxiv.org/pdf/2603.18530v1

Source count: Pending verification

Coverage: 17%

Last proof check: 2026-04-02T02:30:40.136Z

Signal Canvas receipt window

Watch and verify: When Names Change Verdicts: Intervention Consistency Reveals Systematic Bias in LLM Decision-Making

/buildability/when-names-change-verdicts-intervention-consistency-reveals-systematic-bias-in-llm-decision-making

Subject: When Names Change Verdicts: Intervention Consistency Reveals Systematic Bias in LLM Decision-Making

Verdict

Watch

Verdict is Watch because viability or proof quality is intermediate and should be re-evaluated before execution.

GitHub Code Pulse

No public code linked for this paper yet.

Claim map

Strong: 7, Mixed: 0, Weak: 0

Evidence (partial): We introduce ICE-Guard, a framework applying intervention consistency testing to detect three types of spurious feature reliance: demographic (name/race swaps), authority (credential/prestige swaps), and framing (positive/negative restatements).
Implication (partial): The abstract explicitly introduces ICE-Guard and its purpose.
Verification: partial

Evidence (partial): we find that (1) authority bias (mean 5.8%) and framing bias (5.0%) substantially exceed demographic bias (2.2%), challenging the field's narrow focus on demographics;
Implication (partial): The abstract provides specific percentages for each bias type, directly comparing them.
Verification: partial

Evidence (partial): (2) bias concentrates in specific domains -- finance shows 22.6% authority bias while criminal justice shows only 2.8%;
Implication (partial): The abstract provides specific domain examples and their corresponding bias percentages.
Verification: partial

Evidence (partial): (3) structured decomposition, where the LLM extracts features and a deterministic rubric decides, reduces flip rates by up to 100% (median 49% across 9 models).
Implication (partial): The abstract quantifies the reduction in flip rates achieved by structured decomposition.
Verification: partial

Evidence (partial): We demonstrate an ICE-guided detect-diagnose-mitigate-verify loop achieving cumulative 78% bias reduction via iterative prompt patching.
Implication (partial): The abstract states the cumulative bias reduction achieved by the proposed loop.
Verification: partial

Evidence (partial): Validation against real COMPAS recidivism data shows COMPAS-derived flip rates exceed pooled synthetic rates, suggesting our benchmark provides a conservative estimate of real-world bias.
Implication (partial): The abstract directly compares real and synthetic data flip rates and draws a conclusion about the benchmark's conservatism.
Verification: partial

Evidence (partial): Across 3,000 vignettes spanning 10 high-stakes domains, we evaluate 11 LLMs from 8 families
Implication (partial): The abstract explicitly states the number of LLMs and families evaluated.
Verification: partial
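
The claims above refer to flip rates and to structured decomposition without showing how either might be computed. The Python sketch below is an illustration only, not the paper's ICE-Guard code (no public code is linked). It assumes a flip rate is the fraction of (original, intervened) vignette pairs whose decision changes. The functions `swap_field`, `flip_rate`, `biased_decide`, `extract_features`, `rubric`, and `decomposed_decide`, the `employer` and `credit_score` fields, and the thresholds are all hypothetical, and both deciders are toy stand-ins for LLM calls.

```python
from typing import Any, Callable, Dict, List, Tuple

Vignette = Dict[str, Any]
Decider = Callable[[Vignette], str]


def swap_field(vignette: Vignette, field: str, value: Any) -> Vignette:
    """Return a copy of the vignette with one decision-irrelevant field replaced."""
    changed = dict(vignette)
    changed[field] = value
    return changed


def flip_rate(decide: Decider, pairs: List[Tuple[Vignette, Vignette]]) -> float:
    """Fraction of (original, intervened) pairs that receive different decisions.

    Assumed definition; the source page does not spell one out.
    """
    if not pairs:
        raise ValueError("flip_rate needs at least one pair")
    flips = sum(1 for original, intervened in pairs
                if decide(original) != decide(intervened))
    return flips / len(pairs)


def biased_decide(v: Vignette) -> str:
    """Toy stand-in for an end-to-end LLM call that leaks an authority cue."""
    threshold = 650 if v["employer"] == "Prestige Bank" else 700
    return "approve" if v["credit_score"] >= threshold else "deny"


def extract_features(v: Vignette) -> Dict[str, int]:
    """Toy stand-in for the LLM extraction step: keep only rubric-relevant features."""
    return {"credit_score": v["credit_score"]}


def rubric(features: Dict[str, int]) -> str:
    """Deterministic rule; it never sees the employer field."""
    return "approve" if features["credit_score"] >= 700 else "deny"


def decomposed_decide(v: Vignette) -> str:
    """Structured decomposition: extract features, then let the rubric decide."""
    return rubric(extract_features(v))


# Hypothetical finance vignettes with an authority (employer prestige) swap.
originals = [{"employer": "Local Shop", "credit_score": s}
             for s in (640, 660, 680, 720)]
pairs = [(v, swap_field(v, "employer", "Prestige Bank")) for v in originals]

baseline = flip_rate(biased_decide, pairs)       # end-to-end toy decider
mitigated = flip_rate(decomposed_decide, pairs)  # extraction plus rubric
```

On this toy data, two of the four pairs flip under `biased_decide`, so `baseline` is 0.5, while `mitigated` is 0.0 because the rubric never receives the swapped field. These numbers illustrate the mechanism only and say nothing about the paper's reported results.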

Author intelligence and commercialization panels stay hidden until the proof receipt is verified, cites at least 3 references, includes at least 2 sources, and clears 50% coverage. The paper narrative and citation surfaces remain public while verification is pending.
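
The visibility rule stated above can be read as a simple conjunction of four checks. The following is a minimal Python sketch of that reading, not the site's actual implementation: the `ProofReceipt` type and `panels_visible` function are hypothetical names, and treating "clears 50% coverage" as coverage of at least 0.50 is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ProofReceipt:
    """Hypothetical container for the receipt fields shown on this page."""
    verified: bool
    references: int
    sources: int
    coverage: float  # fraction in [0, 1], e.g. 0.17 for 17%


def panels_visible(receipt: ProofReceipt) -> bool:
    """All four conditions must hold; >= 0.50 for coverage is an assumption."""
    return (
        receipt.verified
        and receipt.references >= 3
        and receipt.sources >= 2
        and receipt.coverage >= 0.50
    )


# Values from the Page Freshness block: unverified, 0 references, 0 sources, 17%.
current = ProofReceipt(verified=False, references=0, sources=0, coverage=0.17)
hidden = not panels_visible(current)
```

With the values reported on this page, every one of the four checks fails, which is consistent with the panels being hidden.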