Buildability / Receipt

AI Agent Capabilities: How Good Are LLMs in Real-World Tasks?

This public receipt window renders only fields that are present in the canonical receipt object, the deterministic fixture receipt, or the canonical evidence receipt. Any compute, demo, hash, signature, approval, telemetry, or adoption field that is missing is shown explicitly as missing rather than omitted or filled in.
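The rendering rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the page's actual implementation: the function name `render_receipt`, the `"missing"` placeholder, and the first-source-wins precedence are all hypothetical. Only the seven field names come from the text.

```python
from typing import Any, Mapping

# Field names are taken from the text above.
TRACKED_FIELDS = (
    "compute", "demo", "hash", "signature",
    "approval", "telemetry", "adoption",
)

# Hypothetical placeholder; the page's real marker for an absent field is not stated.
MISSING = "missing"


def render_receipt(*sources: Mapping[str, Any]) -> dict[str, Any]:
    """Build the displayed receipt from the given sources.

    Sources are consulted in order (an assumed precedence), and a value is
    shown only if some source actually carries it. Nothing is inferred.
    """
    rendered: dict[str, Any] = {}
    for source in sources:
        for key, value in source.items():
            if value is not None:
                # The first source to provide a field wins.
                rendered.setdefault(key, value)
    # Tracked fields that no source provided are marked, not dropped.
    for name in TRACKED_FIELDS:
        rendered.setdefault(name, MISSING)
    return rendered
```

Called as `render_receipt(canonical, fixture, evidence)`, a field such as `hash` appears with its value only if one of the three receipts carries it. Otherwise it appears with the placeholder, so a reader can tell "not provided" apart from "not displayed".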


Public buildability page receipt window

Watch and verify: AI Agent Capabilities: How Good Are LLMs in Real-World Tasks?

/buildability/the-hierarchy-of-agentic-capabilities-evaluating-frontier-models-on-realistic-rl-environments