Fisher Scopes is a variant within the Jacobian Scopes framework, a suite of gradient-based, token-level causal attribution methods for interpreting the predictions of large language models (LLMs). Technically, it quantifies the sensitivity of the *full predictive distribution* to each input token: it linearizes the relation between the model's final hidden state and its inputs, and measures how strongly each prior token causally influences the entire probability distribution over the next token, rather than the probability of a single candidate. Fisher Scopes matters because it addresses the challenge of understanding which parts of the input context most strongly shape an LLM's predictions, a question made difficult by the depth and complexity of modern architectures. By providing fine-grained, per-token causal attributions, it helps researchers and ML engineers diagnose model behavior, uncover biases, and shed light on mechanisms such as in-context learning. It is used primarily in LLM interpretability research, bias detection, and the analysis of complex model behaviors in applications such as instruction following, translation, and time-series forecasting.
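The idea of measuring sensitivity of the whole output distribution can be sketched with a toy example. The code below is an illustrative sketch only, not the actual Fisher Scopes implementation: it stands in a tiny linear "model" for the LLM (the pooling weights, embeddings, and unembedding matrix are all hypothetical), and scores each input token by the Fisher-weighted norm of the Jacobian of the output logits with respect to that token's embedding, which approximates how much perturbing the token moves the predictive distribution in KL divergence.

```python
import numpy as np

# Hypothetical toy setup standing in for an LLM's forward pass.
rng = np.random.default_rng(0)
T, d, V = 4, 8, 10                    # context length, embedding dim, vocab size
X = rng.normal(size=(T, d))           # input token embeddings (assumed)
a = np.array([0.1, 0.4, 0.3, 0.2])    # attention-like pooling weights (assumed)
W = rng.normal(size=(V, d))           # unembedding matrix (assumed)

h = a @ X                             # pooled "final hidden state", shape (d,)
z = W @ h                             # next-token logits, shape (V,)
p = np.exp(z - z.max()); p /= p.sum() # softmax -> predictive distribution

# Fisher information of the categorical output w.r.t. the logits:
# F = diag(p) - p p^T.  A small logit perturbation dz shifts the
# distribution by roughly (1/2) dz^T F dz in KL divergence.
F = np.diag(p) - np.outer(p, p)

# In this linear toy model the Jacobian of the logits w.r.t. token i's
# embedding is J_i = a_i * W, so the Fisher-weighted sensitivity of the
# full distribution to token i is trace(J_i^T F J_i).
scores = np.array([np.trace((a_i * W).T @ F @ (a_i * W)) for a_i in a])
print(scores)
```

Here the scores are nonnegative because the Fisher matrix is positive semidefinite, and tokens with larger pooling weight contribute proportionally more, mirroring how a real attribution method would rank context tokens by their influence on the whole next-token distribution.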
Fisher Scopes is a tool for understanding why large AI language models make certain predictions. It lets researchers see which specific words or parts of the input text most strongly influence the model's overall probability distribution over the next word, helping to uncover biases or explain complex behaviors.
Related terms: Jacobian Scopes (suite), Semantic Scopes (related variant), Temperature Scopes (related variant)