Applying chaos engineering principles to community resilience
Chaos engineering doesn’t just look at the software; it considers the entire system: software, hardware, and people. Such a system may involve multiple programs running on many different servers, with input from people or from other programs. Periodically, user demand leads to the addition of new features, with the expectation that the system will remain reliable. And yet, even if each individual program is operating “correctly,” the system sometimes produces unreliable output. Faulty communication among the different parts of the distributed system is most often the root cause of these problems.