
Cloud consultant Asanka Nissanka looks at how to use AWS Fault Injection Simulator (FIS) for chaos testing.
One of the most common misconceptions about cloud-based systems is that they are inherently reliable and don’t require failure testing. Yet under the shared responsibility model, cloud vendors are responsible only for the reliability of the cloud, not of workloads in the cloud. This blog post looks at how to improve the reliability of workloads on Amazon Web Services (AWS) by designing and testing them with failure in mind.
Firstly, all workloads should be designed to withstand failures. Ten years after the launch of AWS, Amazon CTO Werner Vogels said a key lesson he’d learnt was that:
“Failures are a given and everything will eventually fail over time: from routers to hard disks, from operating systems to memory units corrupting TCP packets, from transient errors to permanent failures. This is a given, whether you are using the highest-quality hardware or lowest cost components.”
…