5 | Evaluation Cards: An Interpretive Layer for AI Evaluation Reporting | AI Evaluation | 1d | 35.8
6 | Multi-Turn Evaluation of Deep Research Agents Under Process-Level Feedback | Uncategorized | 1d | 34.79
7 | An 84-Format Numeric Catalog with Bit-Exact Conformance Vectors: A Vendor-Neutral Reference for FP8, BF16, MXFP4, and Microscaling Formats | Uncategorized | 1d | 34.02