In LLM-based systems, persistence refers to an attacker's ability to maintain control or influence over the system over time, often achieved through 'memory and retrieval poisoning.' It is a critical stage in the promptware kill chain, analogous to traditional malware campaigns.
Persistence in LLM security refers to an attacker's ability to maintain control over an AI system, similar to how malware stays on a computer. It's achieved by subtly altering the LLM's memory or data sources, allowing for ongoing malicious actions without needing to restart the attack.
LLM persistence, promptware persistence, memory poisoning, retrieval poisoning
Was this definition helpful?