AgenticRed is an automated pipeline that uses LLMs' in-context learning to iteratively design and refine red-teaming systems without human intervention. It frames red-teaming as a system design problem, evolving agentic systems through evolutionary selection to expose model vulnerabilities.
In plainer terms, AgenticRed is an AI system that automatically creates and improves other AI systems whose job is to find flaws and vulnerabilities in large language models. It treats this as a design challenge and applies an evolutionary process, reporting stronger results than human-designed methods.
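The evolutionary loop described above can be sketched in miniature. This is a hedged toy illustration, not AgenticRed's actual implementation: candidate red-teaming systems are reduced to strings, `fitness` is a placeholder for an attack-success metric measured against a target model, and `mutate` stands in for an LLM refining a system design via in-context learning.

```python
import random

def fitness(candidate: str) -> float:
    # Placeholder scorer: stands in for "fraction of attacks that
    # expose a vulnerability" in the real pipeline.
    return (sum(ord(c) for c in candidate) % 100) / 100

def mutate(candidate: str) -> str:
    # Placeholder refinement: stands in for an LLM proposing an
    # improved variant of the red-teaming system.
    i = random.randrange(len(candidate))
    return candidate[:i] + random.choice("abcdefgh") + candidate[i + 1:]

def evolve(population, generations=10, keep=2):
    for _ in range(generations):
        # Evolutionary selection: rank candidates and keep the strongest.
        survivors = sorted(population, key=fitness, reverse=True)[:keep]
        # Refill the population with mutated copies of survivors.
        population = survivors + [
            mutate(random.choice(survivors))
            for _ in range(len(population) - keep)
        ]
    return max(population, key=fitness)

random.seed(0)
best = evolve(["seed-attack-a", "seed-attack-b", "seed-attack-c"])
```

Because the top `keep` candidates always survive each generation, the best fitness seen never decreases; the real system replaces the toy scorer and mutator with automated evaluation and LLM-driven redesign.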