Buildability / Receipt
Removing Sandbagging in LLMs by Training with Weak Supervision
This public receipt window renders only fields present in the canonical receipt object, the deterministic fixture receipt, or the canonical evidence receipt. Missing compute, demo, hash, signature, approval, telemetry, and adoption fields are reported explicitly as missing.
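The rendering rule above can be sketched roughly as follows. This is an illustration under stated assumptions, not the page's actual implementation: the function name `render_fields`, the field names, and the use of "Insufficient data" as a universal placeholder are all hypothetical.

```python
from typing import Any, Mapping

# Assumed placeholder; this page shows "Insufficient data" for some absent values.
MISSING = "Insufficient data"

def render_fields(receipt: Mapping[str, Any], expected: list[str]) -> dict[str, str]:
    """Render only what the receipt contains; mark every absent field explicitly."""
    rendered = {}
    for name in expected:
        value = receipt.get(name)
        # None and empty strings count as missing; nothing is inferred or defaulted.
        rendered[name] = MISSING if value is None or value == "" else str(value)
    return rendered

# Illustrative receipt built from values shown on this page (field names are assumed).
receipt = {"verdict": "Watch", "arxiv_id": "2604.22082"}
view = render_fields(
    receipt,
    ["verdict", "arxiv_id", "time_to_first_demo", "compute_envelope"],
)
```

In this sketch the absent fields still appear in `view`, each carrying the placeholder, so a reader can tell "not recorded" apart from "not applicable".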
Watch and verify: Removing Sandbagging in LLMs by Training with Weak Supervision
Subject: Removing Sandbagging in LLMs by Training with Weak Supervision
Verdict
Watch
The verdict is Watch because viability or proof quality is intermediate. Re-evaluate before execution.
Time to first demo
Insufficient data
No first-demo timestamp, owner estimate, or elapsed demo receipt is attached to this surface.
Compute envelope
Structured compute envelope
Insufficient data
No data, compute, hardware, memory, latency, dependency, or serving requirement receipt is attached.
Evidence ids
Receipt path
/buildability/removing-sandbagging-in-llms-by-training-with-weak-supervision
Paper ref
removing-sandbagging-in-llms-by-training-with-weak-supervision
arXiv id
2604.22082
Freshness
Generated at
2026-04-27T20:16:11.802Z
Evidence freshness
fresh
Last verification
2026-04-27T20:16:11.802Z
Sources
3
References
0
Coverage
50%
Hash state
Lineage hash
3d1019c1ec4ffea5debc62f7a9d1ce14cf69b723d62cb290c11b6f6df163129c
Canonical opportunity-kernel lineage hash.
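A reader who copies the lineage hash elsewhere may want a quick well-formedness check. The sketch below assumes nothing about how the hash is computed, since the receipt does not name the algorithm; it only checks that the value is 64 lowercase hexadecimal characters, which is the length of a 256-bit digest. The function name `is_hex_digest_256` is hypothetical.

```python
import re

# 256 bits encoded as hex is 64 characters. The algorithm itself is not stated
# on the receipt, so this checks shape only, not authenticity.
HEX_256 = re.compile(r"[0-9a-f]{64}")

def is_hex_digest_256(value: str) -> bool:
    """Return True if value is exactly 64 lowercase hexadecimal characters."""
    return HEX_256.fullmatch(value) is not None

lineage_hash = "3d1019c1ec4ffea5debc62f7a9d1ce14cf69b723d62cb290c11b6f6df163129c"
well_formed = is_hex_digest_256(lineage_hash)
```

A shape check like this catches truncated or mistyped values; it says nothing about whether the hash matches any underlying content.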
Signature state
External signature
unsigned_external
No founder, registry, pilot, or production-adoption signature is attached to this receipt.
Verification
not_verified
Verification is blocked until an external signature is provided.
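The gate described above might look like the following minimal sketch. It assumes that `unsigned_external` is the marker for an absent signature, as shown on this receipt; the function name and the `pending_verification` state are hypothetical and do not appear on the page.

```python
from typing import Optional

def verification_state(external_signature: Optional[str]) -> str:
    """Return "not_verified" while no external signature is attached."""
    if not external_signature or external_signature == "unsigned_external":
        return "not_verified"
    # Hypothetical state: a present signature only unblocks verification,
    # it does not by itself verify anything.
    return "pending_verification"

state = verification_state("unsigned_external")
```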
Blockers
- Missing: repo_url
- Missing: references
- Missing: proof_status
- Unknown: proof verification has not been recorded yet
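The "Missing" entries in the list above could plausibly be derived by checking a set of required fields against the receipt. The sketch below is an assumption about that logic, not the page's code: `list_blockers`, the receipt shape, and the choice to treat an empty list as missing are all hypothetical.

```python
def list_blockers(receipt: dict, required: list[str]) -> list[str]:
    """Report each required field that is absent or empty as a blocker line."""
    # Falsy values (None, "", [], 0) are treated as missing in this sketch.
    return [f"Missing: {name}" for name in required if not receipt.get(name)]

# Illustrative receipt: references is present but empty, matching "References 0".
blockers = list_blockers(
    {"arxiv_id": "2604.22082", "references": []},
    ["repo_url", "references", "proof_status"],
)
```

Treating an empty `references` list as missing mirrors how this receipt reports zero references as a blocker; a different system might distinguish "empty" from "absent".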
Canonical opportunity-kernel evidence is available for this receipt window.
Truth Boundary
External gate remains unresolved for live deployment claims.
Buildability surfaces only report computed viability and proof receipts. They do not claim live production usage, pilot outcomes, founder sign-off, public Brier calibration, judge divergence, or external adoption unless explicitly sourced.