Data systems outlive applications, frameworks, and infrastructure.
They encode decisions that shape what a system can become, and what it can never safely change.
Designing Modern Data Systems is a deep, decision-driven guide to building data systems that are reliable, scalable, and adaptable over time. Rather than focusing on tools or trends, this book teaches how to reason about architecture itself: how guarantees are chosen, where authority lives, how failures manifest, and how systems evolve under real-world pressure.
Written for experienced engineers and architects, the book treats data systems as long-lived sociotechnical systems, not just databases or pipelines. It focuses on clarity of responsibility, explicit trade-offs, and preserving meaning as data moves, changes, and ages.
This book takes readers on a structured journey through modern data system design:
- How to define data systems as distinct from applications and infrastructure
- How non-functional requirements like reliability, availability, latency, and cost shape architecture long before technology choices
- How to design data models, storage engines, and indexing strategies that survive product evolution
- How to reason about replication, partitioning, coordination, and distributed transactions without accidental complexity
- How batch and stream processing fit into a unified view of data over time
- How logs, history, and derived data enable recovery, reprocessing, and safe change
- How to operate systems in production with observability, backpressure, and failure isolation
- How to design data systems that support machine learning and large language model platforms, including feature pipelines and embeddings
- How to migrate, evolve, and decommission systems without outages or loss of trust
Throughout the book, ideas are grounded in a single evolving reference system, allowing readers to see how architectural decisions accumulate and interact as requirements change.
What Makes This Book Different
- Decision-focused, not tool-driven
The book avoids product comparisons and instead teaches how to evaluate any technology within clear architectural constraints.
- Explicit trade-offs, not recipes
Every design choice is examined in terms of what it enables, what it forbids, and what it costs.
- Modern, without being trendy
AI and LLM systems are addressed where they introduce real architectural pressure, without hype or speculation.
- Written for longevity
The principles in this book are designed to remain relevant as tools, platforms, and organizational structures change.
This book is written for:
- Software engineers designing backend and platform systems
- Data engineers responsible for storage, processing, and pipelines
- Staff, principal, and senior engineers shaping architectural direction
- Architects and technical leaders responsible for long-term system evolution
- Practitioners preparing for system design interviews who want judgment, not templates
This is not:
- A beginner's introduction to databases
- A step-by-step tutorial for specific tools
- A catalog of technologies or patterns
Instead, it is a book about how to think clearly about data systems, and how to design them so they remain understandable, trustworthy, and changeable over time.
If you are responsible for making architectural decisions, and for living with their consequences, Designing Modern Data Systems is written for you.