AWS EC2 Auto Recovery Using CloudWatch

Today I heard about a feature baked into CloudWatch that enables auto recovery of an EC2 instance if it ever fails its status check. The really cool thing here is that it relaunches the instance with the exact same configuration, preserving any auto-assigned public IPs and using the current instance volumes.

Each instance is monitored with two types of status checks, System and Instance, and both report as metrics to CloudWatch.

  • System status checks identify AWS infrastructure issues.
  • Instance status checks identify things like startup, networking, memory, file system and kernel issues.
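To make the two checks concrete, here is a minimal Python sketch that maps each check to its CloudWatch metric name and reports which ones are impaired. It assumes a status entry shaped like one item of the "InstanceStatuses" list returned by the EC2 DescribeInstanceStatus API; the helper name impaired_checks and the sample entry are made up for illustration.

```python
# Map the status fields of a DescribeInstanceStatus entry to the
# CloudWatch metric that reports the same check.
CHECK_METRICS = {
    "SystemStatus": "StatusCheckFailed_System",
    "InstanceStatus": "StatusCheckFailed_Instance",
}


def impaired_checks(status_entry):
    """Return the metric names of the checks reported as 'impaired'.

    Hypothetical helper: status_entry is assumed to look like one item
    of the 'InstanceStatuses' list from DescribeInstanceStatus.
    """
    return [
        metric
        for field, metric in CHECK_METRICS.items()
        if status_entry.get(field, {}).get("Status") == "impaired"
    ]


# Illustrative sample only, not real output from an AWS account.
sample_entry = {
    "InstanceId": "i-0123456789abcdef0",
    "SystemStatus": {"Status": "impaired"},
    "InstanceStatus": {"Status": "ok"},
}

print(impaired_checks(sample_entry))
```

With the sample above, only the system check is flagged, which is the case where the problem sits with the AWS infrastructure rather than with the instance itself.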

What this option really does is automate migration of the instance to a new host if it ever fails its system status check. The feature only applies to the "StatusCheckFailed_System" CloudWatch metric.
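As a rough sketch of how such an alarm could be set up from Python, the snippet below builds the parameters for CloudWatch's PutMetricAlarm call with the EC2 recover action attached. The function names, the alarm name, and the period, threshold and evaluation values are my own assumptions rather than required settings, and create_recovery_alarm assumes the boto3 package is installed and AWS credentials are configured.

```python
def build_recovery_alarm(instance_id, region, period_seconds=60, evaluation_periods=2):
    """Build keyword arguments for CloudWatch PutMetricAlarm.

    Hypothetical helper: the alarm name and thresholds are illustrative
    choices, not values mandated by AWS.
    """
    return {
        "AlarmName": f"auto-recover-{instance_id}",
        "AlarmDescription": "Recover the instance when the system status check fails",
        "Namespace": "AWS/EC2",
        # Recovery only applies to the system status check metric.
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": period_seconds,
        "EvaluationPeriods": evaluation_periods,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Built-in action ARN that asks EC2 to recover the instance.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


def create_recovery_alarm(instance_id, region):
    """Create the alarm in CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the builder works without it

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    cloudwatch.put_metric_alarm(**build_recovery_alarm(instance_id, region))
```

Calling create_recovery_alarm("i-0123456789abcdef0", "us-east-1") with a real instance ID would create the alarm. Keeping the parameter builder separate makes it easy to inspect or adjust the settings before anything is sent to AWS.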

It looks like this feature requires EBS-backed instances running in a VPC with shared tenancy, and it supports only a limited set of instance types, though that set seems to cover the main current-generation options. It also looks like it is available in most regions.

See: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-recover.html

NOTICE: All thoughts/statements in this article are mine alone and do not represent those of Amazon or Amazon Web Services. All referenced AWS services and service names are the property of AWS. Although I have made every effort to ensure that the information in this article was correct at the time of writing, I do not assume and hereby disclaim any liability to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from negligence, accident, or any other cause.