A single dramatic software failure can cost a company millions of dollars - but can be avoided with simple changes to design and architecture. This new edition of the best-selling industry standard shows you how to create systems that run longer, with fewer failures, and recover better when bad things happen. New coverage includes DevOps, microservices, and cloud-native architecture. Stability antipatterns have grown to include systemic problems in large-scale systems. This is a must-have pragmatic guide to engineering for production systems.
If you're a software developer, and you don't want to get alerts every night for the rest of your life, help is here. With a combination of case studies about huge losses - lost revenue, lost reputation, lost time, lost opportunity - and practical, down-to-earth advice that was all gained through painful experience, this book helps you avoid the pitfalls that cost companies millions of dollars in downtime and reputation. Eighty percent of project life-cycle cost is in production, yet few books address this topic.
This updated edition covers today's production systems - larger, more complex, and heavily virtualized - and includes information on chaos engineering, the discipline of applying randomness and deliberate stress to reveal systemic problems. Build systems that survive the real world, avoid downtime, implement zero-downtime upgrades and continuous delivery, and make cloud-native applications resilient. Examine ways to architect, design, and build software - particularly distributed systems - that stands up to the typhoon winds of a flash mob, a Slashdotting, or a link on Reddit. Take a hard look at software that failed the test and find ways to make sure your software survives.
To skip the pain and get the experience...get this book.
About the Author
Michael Nygard has been a professional programmer and architect for over 15 years. He has delivered systems to the U.S. Government, the military, banking, finance, agriculture, and retail industries. Michael has written numerous articles and editorials, spoken at Comdex, and coauthored one of the earliest Java books.
Michael has designed, built, and engineered systems for B2B exchanges, retail commerce sites, travel and leisure sites, an information brokerage, and web applications for the intelligence community.
Among other exciting projects in his position as Director of Engineering for Totality Corporation, Michael led the operations team through the launch of a tier 1 retail site. His experience with the birth and infancy of this retail platform gives him a unique perspective on building software for high performance and high reliability in the face of an actively hostile environment.