"Redis Vector Similarity: Real-Time Retrieval and Caching for AI Apps"
Modern AI systems live or die by retrieval quality and latency. This book is written for experienced engineers, platform architects, and senior application developers who need Redis to do more than act as a fast cache: they need it to serve as real-time retrieval infrastructure for semantic search, RAG pipelines, and inference-adjacent workloads. It frames Redis as an operational AI platform and shows where it fits, where it does not, and how to design around those boundaries.
Readers will learn how embeddings, distance metrics, and scoring semantics shape retrieval behavior; how to model vector-enabled data with HASH and JSON; how to design and optimize KNN, range, and hybrid filtered queries; and how to choose among FLAT, HNSW, and newer retrieval options such as vector sets. The book also covers semantic and key-based caching, update and reindexing flows, latency budgeting, memory planning, and production RAG architecture, with a strong emphasis on trade-offs, operational correctness, and version-aware design decisions.
Rather than treating vector search as an isolated feature, the book connects retrieval, indexing, and caching into a coherent serving strategy for AI applications. It assumes familiarity with Redis, distributed systems, and modern ML application patterns, and offers a practical, architecture-driven guide for readers who want to build fast, maintainable, and production-safe AI systems.