arxiv:2506.08837

Design Patterns for Securing LLM Agents against Prompt Injections

Published on Jun 10, 2025

Abstract

Principled design patterns are proposed to build AI agents with resistance to prompt injection attacks, balancing utility and security.

Generated by Qwen/Qwen2.5-Coder-32B-Instruct

As AI agents powered by Large Language Models (LLMs) become increasingly versatile and capable of addressing a broad spectrum of tasks, ensuring their security has become a critical challenge. Among the most pressing threats are prompt injection attacks, which exploit the agent's reliance on natural language inputs. The threat is especially dangerous when agents are granted tool access or handle sensitive information. In this work, we propose a set of principled design patterns for building AI agents with provable resistance to prompt injection. We systematically analyze these patterns, discuss their trade-offs in terms of utility and security, and illustrate their real-world applicability through a series of case studies.