Linear transformers are a class of transformer models designed to reduce the quadratic computational complexity of the self-attention mechanism to linear with respect to sequence length. They typically achieve this by reformulating the attention operation, for example by replacing the softmax with a kernel feature map so the matrix products can be reassociated, making them more efficient for processing very long sequences.
Linear transformers are a more efficient version of the popular transformer AI models, designed to handle much longer data sequences like text or DNA. They achieve this by simplifying how the model pays attention to different parts of the input, making them faster and less demanding on computer memory than standard transformers.
Examples: Linformer (low-rank projection), Performer (random-feature kernel approximation), Linear Attention (kernelized attention). Related efficient-attention variants often grouped with them include Reformer (LSH attention, O(n log n) rather than strictly linear) and FNet (which replaces attention with Fourier mixing).
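A minimal sketch of the kernelized reformulation mentioned above, in the style of Katharopoulos et al.'s linear attention: replacing softmax(QKᵀ) with φ(Q)φ(K)ᵀ lets the multiplication be reassociated as φ(Q)(φ(K)ᵀV), so cost grows linearly with sequence length. The feature map φ(x) = elu(x) + 1 below is one common choice, not the only one.

```python
import numpy as np

def phi(x):
    # elu(x) + 1: a positive feature map commonly used in linear attention.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Kernelized attention computed in O(n) with respect to sequence length.

    Instead of forming the (n, n) matrix softmax(QK^T), compute
    phi(Q) @ (phi(K)^T @ V), which only builds (d, d_v)-sized intermediates.
    """
    Qp, Kp = phi(Q), phi(K)            # (n, d) feature-mapped queries/keys
    KV = Kp.T @ V                      # (d, d_v): independent of n^2
    Z = Qp @ Kp.sum(axis=0)            # (n,): per-row normalizer
    return (Qp @ KV) / Z[:, None]

rng = np.random.default_rng(0)
n, d = 6, 4                            # toy sequence length and head dim
Q, K, V = rng.normal(size=(3, n, d))
out = linear_attention(Q, K, V)
print(out.shape)                       # (6, 4)
```

Reassociating the product changes nothing mathematically: computing the full (n, n) weight matrix φ(Q)φ(K)ᵀ, normalizing its rows, and multiplying by V gives the same result, just at quadratic cost.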