Tuesday, November 20, 2012

How to Avoid Application Failures in the Cloud: Part 4

This is the fourth in a series of five blog posts that examine how you can build cloud applications that are secure, scalable, and resilient to failures, whether the failures are in the application components or in the underlying cloud infrastructure itself. In this post we will look at application monitoring.

Monitoring


A key component of any successful application deployment, whether in the cloud or on premises, is the ability to know what is happening with your application at all times. This means monitoring the health of the application and being alerted when something goes wrong, preferably before it becomes noticeable to the application's users. For on-premises applications, a wealth of solutions is available, such as HP’s Application Performance Management and Business Availability Center products. Most cloud infrastructure providers offer similar capabilities for your applications in the cloud. On Amazon EC2, application monitoring is provided by CloudWatch.

CloudWatch gives you visibility into the state of your application running in the Amazon cloud, along with the tools to correct problems quickly and, in many cases, automatically. Corrective actions include launching new application instances, so that component failures are handled gracefully with minimal disruption to users.

CloudWatch allows you to monitor your application instances using predefined and user-defined metrics and alarms. If an alarm threshold is breached for a specified period of time (such as three consecutive monitoring periods), CloudWatch triggers the alarm. The alarm can send a notification, such as an email message or an SMS text message to a system administrator, or it can automatically trigger an action to try to rectify the problem. For example, the alarm might trigger the EC2 Auto Scaling feature to start new application instances, or run a script that changes some configuration settings (e.g. remapping an Elastic IP address to another application instance).
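
To make the "threshold breached for several monitoring periods" idea concrete, here is a minimal Python sketch of that evaluation rule. It is a simplified model for illustration only, not CloudWatch's actual implementation and not a call to its API. The function name evaluate_alarm, its parameters, and the sample CPU figures are all assumptions made up for this example. The state names mirror the three alarm states that CloudWatch reports.

```python
from typing import Sequence


def evaluate_alarm(datapoints: Sequence[float],
                   threshold: float,
                   evaluation_periods: int) -> str:
    """Return a simplified alarm state for a series of metric datapoints.

    Hypothetical helper: one datapoint per monitoring period, oldest first.
    The alarm fires only when every one of the most recent
    `evaluation_periods` datapoints is above `threshold`.
    """
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")

    # Not enough history yet to judge the full evaluation window.
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"

    recent = datapoints[-evaluation_periods:]
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"


# Made-up CPU utilisation samples (percent), one per monitoring period.
cpu_percent = [42.0, 55.0, 83.0, 91.0, 88.0]
print(evaluate_alarm(cpu_percent, threshold=80.0, evaluation_periods=3))
```

Requiring several consecutive breaches, rather than reacting to a single datapoint, keeps a brief spike from paging an administrator or launching instances that are not needed.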


In the final post we'll look at a real-life example of how all of the features that I've described over the first four posts in the series are used to create a secure, scalable, and resilient service offering.
