"Red-Teaming LLM Applications: Building Test Suites for Jailbreaks and Abuse"
LLM security has outgrown one-off jailbreak demos. This book is for experienced engineers, security practitioners, and technical leaders who need a repeatable way to find—and keep finding—the failures that matter in production LLM applications. You'll learn to treat red-teaming as an engineering discipline: define the real system under test (models, orchestration, tools, data, policies), set measurable success criteria, and operate within clear rules of engagement.
The core of the book is a practical method for converting architecture and threat modeling into durable test suites. You'll map attack surfaces across user input, RAG pipelines, tool/function calls, memory, and multi-tenant state, then design coverage models, parameterized scenarios, and flake-resistant oracles (rules, classifiers, LLM-as-judge, and human review). Dedicated chapters cover jailbreak families, instruction-hierarchy failures, prompt injection (direct and indirect), RAG retrieval abuse and corpus poisoning, agentic escalation, and high-impact data exfiltration, each with evidence capture and minimal reproduction packs that drive remediation.
The differentiator is operationalization: CI/CD harness design with sandboxing and mocks, metrics and reporting aligned to OWASP LLM Top 10 and NIST AI RMF, verification testing for defense-in-depth controls, and monitoring/runbooks that turn incidents into regression tests. Readers should be comfortable with threat modeling, testing, and modern LLM application st