SHARP (Social Harm Analysis via Risk Profiles) is a framework for multidimensional, distribution-aware evaluation of social harm in large language models (LLMs). It models harm as a multivariate random variable decomposed into four dimensions, bias, fairness, ethics, and epistemic reliability, and characterizes worst-case behavior with risk-sensitive statistics such as CVaR95 (Conditional Value at Risk at the 95th percentile).

Rather than reporting a single average score, SHARP examines the full distribution of harm across these dimensions and focuses on the tail of that distribution: the worst-case behaviors that aggregate benchmarks smooth over. This makes it suited to safety-critical applications, where it can reveal risks that standard, mean-based evaluations often miss.
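To make the contrast between mean-based scoring and tail-focused scoring concrete, here is a minimal sketch of a CVaR95 computation. The source does not specify SHARP's implementation; the `cvar` function, the four-key `harm` dictionary, and the synthetic Beta-distributed scores are all illustrative assumptions, chosen only to show how a tail statistic can diverge from the mean on the same data.

```python
import numpy as np

def cvar(scores: np.ndarray, alpha: float = 0.95) -> float:
    """Conditional Value at Risk: the mean of the worst (1 - alpha) tail.

    Assumes higher scores mean greater harm, so the tail of interest
    is the upper quantile of the score distribution.
    """
    threshold = np.quantile(scores, alpha)
    tail = scores[scores >= threshold]
    return float(tail.mean())

# Hypothetical per-prompt harm scores along the four SHARP dimensions.
# Real scores would come from an evaluation harness; these are synthetic.
rng = np.random.default_rng(seed=0)
harm = {
    "bias": rng.beta(2, 8, size=1000),
    "fairness": rng.beta(2, 6, size=1000),
    "ethics": rng.beta(1.5, 9, size=1000),
    "epistemic": rng.beta(2, 7, size=1000),
}

# Compare the mean (what average-based benchmarks report) with CVaR95
# (the worst-case tail that a risk-sensitive evaluation emphasizes).
for dim, scores in harm.items():
    print(f"{dim:10s} mean={scores.mean():.3f}  CVaR95={cvar(scores):.3f}")
```

Two models can have nearly identical mean harm while differing sharply in CVaR95; the tail statistic is what distinguishes a model that is occasionally very harmful from one that is uniformly mildly imperfect.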