Selective Deficits in LLM Mental Self-Modeling in a Behavior-Based Test of Theory of Mind

Page Freshness

Signal Canvas proof surface

Canonical route: /signal-canvas/selective-deficits-in-llm-mental-self-modeling-in-a-behavior-based-test-of-theory-of-mind

Proof freshness: stale
Proof status: unverified
Display score: 7/10
Last proof check: 2026-03-30
Score updated: 2026-04-02
Score fresh until: 2026-05-02
References: 17
Source count: 3
Coverage: 50%

This page is showing the last landed evidence receipt and score bundle because the latest proof data is outside the freshness window.
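The score dates in the panel above imply a simple date comparison. The following is a minimal sketch in Python, assuming the score window runs from "Score updated" to "Score fresh until" and that the end date is inclusive. The function name `score_is_fresh` is hypothetical, and the proof freshness window, which the page does not state, is not modeled.

```python
from datetime import date

# Values copied from the Page Freshness panel.
SCORE_UPDATED = date(2026, 4, 2)
SCORE_FRESH_UNTIL = date(2026, 5, 2)


def score_is_fresh(today: date, fresh_until: date = SCORE_FRESH_UNTIL) -> bool:
    """Assumption: the score counts as fresh through the 'fresh until' date, inclusive."""
    return today <= fresh_until


# Length of the score window implied by the two dates on the page.
window_days = (SCORE_FRESH_UNTIL - SCORE_UPDATED).days
```

Under these assumptions the score window is 30 days long. The proof can still be stale while the score is fresh, which is the state this page reports.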

Agent Handoff

Canonical ID selective-deficits-in-llm-mental-self-modeling-in-a-behavior-based-test-of-theory-of-mind | Route /signal-canvas/selective-deficits-in-llm-mental-self-modeling-in-a-behavior-based-test-of-theory-of-mind

REST example

curl https://sciencetostartup.com/api/v1/agent-handoff/signal-canvas/selective-deficits-in-llm-mental-self-modeling-in-a-behavior-based-test-of-theory-of-mind
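The same request can be sketched in Python with only the standard library. The helper names `handoff_url` and `fetch_handoff` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page, which does not document the response.

```python
import json
import urllib.request

# Base path taken from the curl example; the slug is the page's canonical ID.
BASE_URL = "https://sciencetostartup.com/api/v1/agent-handoff/signal-canvas"
PAPER_REF = (
    "selective-deficits-in-llm-mental-self-modeling-"
    "in-a-behavior-based-test-of-theory-of-mind"
)


def handoff_url(paper_ref: str) -> str:
    """Build the agent-handoff URL for a paper slug."""
    return f"{BASE_URL}/{paper_ref}"


def fetch_handoff(paper_ref: str, timeout: float = 10.0):
    """GET the handoff resource.

    Assumption: the body is JSON. Network and decoding errors propagate
    to the caller.
    """
    with urllib.request.urlopen(handoff_url(paper_ref), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_handoff(PAPER_REF)` performs the network request; nothing is fetched when the module is imported.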

MCP example

{
  "tool": "search_signal_canvas",
  "arguments": {
    "mode": "paper",
    "paper_ref": "selective-deficits-in-llm-mental-self-modeling-in-a-behavior-based-test-of-theory-of-mind",
    "query_text": "Summarize Selective Deficits in LLM Mental Self-Modeling in a Behavior-Based Test of Theory of Mind"
  }
}
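The tool call above can be assembled programmatically. This is a minimal sketch that only builds and serializes the payload in the shape shown; the helper name `build_search_call` is hypothetical, and how the payload reaches an MCP server (client library, transport, message envelope) is not described on this page and is left out.

```python
import json


def build_search_call(paper_ref: str, title: str) -> dict:
    """Assemble a tool call in the shape of the MCP example."""
    return {
        "tool": "search_signal_canvas",
        "arguments": {
            "mode": "paper",
            "paper_ref": paper_ref,
            "query_text": f"Summarize {title}",
        },
    }


call = build_search_call(
    "selective-deficits-in-llm-mental-self-modeling-"
    "in-a-behavior-based-test-of-theory-of-mind",
    "Selective Deficits in LLM Mental Self-Modeling "
    "in a Behavior-Based Test of Theory of Mind",
)

# Serialized form, ready to hand to whichever client sends the call.
payload = json.dumps(call, indent=2)
```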

source_context

{
  "surface": "signal_canvas",
  "mode": "paper",
  "query": "Selective Deficits in LLM Mental Self-Modeling in a Behavior-Based Test of Theory of Mind",
  "normalized_query": "2603.26089",
  "route": "/signal-canvas/selective-deficits-in-llm-mental-self-modeling-in-a-behavior-based-test-of-theory-of-mind",
  "paper_ref": "selective-deficits-in-llm-mental-self-modeling-in-a-behavior-based-test-of-theory-of-mind",
  "topic_slug": null,
  "benchmark_ref": null,
  "dataset_ref": null
}

Evidence Receipt

Route status: building

Claims: 12

References: 17

Proof: Verification pending

Freshness state: computing

Source paper: Selective Deficits in LLM Mental Self-Modeling in a Behavior-Based Test of Theory of Mind

PDF: https://arxiv.org/pdf/2603.26089v1

Source count: 3

Coverage: 50%

Last proof check: 2026-03-30T21:55:06.832Z

Signal Canvas receipt window

Watch and verify: Selective Deficits in LLM Mental Self-Modeling in a Behavior-Based Test of Theory of Mind

/buildability/selective-deficits-in-llm-mental-self-modeling-in-a-behavior-based-test-of-theory-of-mind

Subject: Selective Deficits in LLM Mental Self-Modeling in a Behavior-Based Test of Theory of Mind

Verdict

Watch

The verdict is Watch because viability or proof quality is intermediate; re-evaluate before execution.

GitHub Code Pulse

No public code linked for this paper yet.

Claim map

Strong 12 | Mixed 0 | Weak 0

Evidence (partial): LLMs released before mid-2025 fail at all of our tasks
Implication (partial): This is explicitly stated in the abstract and supported by the findings presented in the figures and text.
Verification: partial

Evidence (partial): more recent LLMs achieve human-level performance on modeling the cognitive states of others
Implication (partial): This is explicitly stated in the abstract and supported by the text indicating an upward trend for 'other-modeling' tasks with recent LLMs.
Verification: partial

Evidence (partial): even frontier LLMs fail at our self-modeling task - unless afforded a scratchpad in the form of a reasoning trace
Implication (partial): This is explicitly stated in the abstract and supported by the text contrasting performance with and without a scratchpad.
Verification: partial

Evidence (partial): we further demonstrate cognitive load effects on other-modeling tasks, offering suggestive evidence that LLMs are using something akin to limited-capacity working memory to hold these mental representations in mind during a single forward pass
Implication (partial): The abstract suggests this based on observed cognitive load effects, indicating suggestive evidence rather than a definitive conclusion.
Verification: partial

Evidence (partial): we show that they readily engage in strategic deception
Implication (partial): The abstract states this as a finding from exploring the mechanisms by which reasoning models succeed.
Verification: partial

Evidence (partial): We therefore develop a novel experimental paradigm that requires that subjects form representations of the mental states of themselves and others and act on them strategically rather than merely describe them
Implication (partial): The abstract clearly describes the development of a new paradigm with specific requirements.
Verification: partial

Evidence (partial): We test a wide range of leading open and closed source LLMs released since 2024, as well as human subjects, on this paradigm
Implication (partial): The abstract explicitly mentions testing human subjects alongside LLMs.
Verification: partial

Evidence (partial): We find that 1) LLMs released before mid-2025 fail at all of our tasks
Implication (partial): This is explicitly stated in the abstract and supported by the findings presented in Figure 2 (nonthinking models).
Verification: partial

Evidence (partial): 2) more recent LLMs achieve human-level performance on modeling the cognitive states of others
Implication (partial): This is explicitly stated in the abstract and supported by the trend shown in Figure 2 (nonthinking models) for other-modeling tasks.
Verification: partial

Evidence (partial): 3) and even frontier LLMs fail at our self-modeling task - unless afforded a scratchpad in the form of a reasoning trace.
Implication (partial): This is explicitly stated in the abstract and supported by the comparison between 'nonthinking' and 'thinking' models in Figure 3.
Verification: partial

Evidence (partial): We further demonstrate cognitive load effects on other-modeling tasks, offering suggestive evidence that LLMs are using something akin to limited-capacity working memory to hold these mental representations in mind during a single forward pass.
Implication (partial): The abstract suggests this as 'suggestive evidence' based on observed cognitive load effects.
Verification: partial

Evidence (partial): Finally, we explore the mechanisms by which reasoning models succeed at the self- and other-modeling tasks, and show that they readily engage in strategic deception.
Implication (partial): This is stated in the abstract as a finding from exploring the mechanisms of successful self- and other-modeling.
Verification: partial

Author intelligence and commercialization panels stay hidden until the proof receipt is verified, cites at least 3 references, includes at least 2 sources, and clears 50% coverage. The paper narrative and citation surfaces remain public while verification is pending.
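The visibility rule above can be read as a conjunction of four checks. The following is a minimal sketch, not the site's implementation: the `ProofReceipt` type and `panels_visible` function are hypothetical, and "clears 50% coverage" is assumed to mean coverage of at least 0.5.

```python
from dataclasses import dataclass


@dataclass
class ProofReceipt:
    verified: bool
    references: int
    sources: int
    coverage: float  # fraction in [0, 1]


def panels_visible(receipt: ProofReceipt) -> bool:
    """Return True only when every stated condition holds.

    Assumption: "clears 50% coverage" is interpreted as coverage >= 0.5.
    """
    return (
        receipt.verified
        and receipt.references >= 3
        and receipt.sources >= 2
        and receipt.coverage >= 0.5
    )


# Values reported on this page: unverified, 17 references, 3 sources, 50% coverage.
current = ProofReceipt(verified=False, references=17, sources=3, coverage=0.50)
```

With the values reported on this page, `panels_visible(current)` is false because the proof is unverified, even though the reference and source counts meet their thresholds.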
