Wednesday, March 13, 2013

How to Avoid Application Failures in the Cloud: Part 5

In this final post of the five-part series, we'll look at a real-life example of how Axway has used the features described in the first four posts to create a secure, scalable and resilient service offering.

Apologies for taking so long to post this final part of the 'Cloud Failure' series, but I was waiting for my employer (Axway) to prepare and publish a whitepaper version of these posts on the Axway website. You can download the whitepaper from here - it does require (very brief) registration.

Putting It All Together — A Real-Life Example


We’ll finish this series of five blog posts with a look at a real-life example of how the features we have discussed are used to provide a highly secure and reliable service built on the Amazon EC2 cloud infrastructure — the Axway Cloud.

The Axway Cloud provides a set of business integration capabilities including B2B and managed file transfer (MFT) interactions. To illustrate how the previously described Amazon EC2 features can be combined to deliver an enterprise-scale service, we’ll look at the Axway Cloud MFT Service.

The Axway Cloud MFT Service provides a complete platform for secure, auditable and managed transfer of critical business files between two parties, whether they are in the same organization or in different organizations. This comprehensive cloud-based solution dramatically simplifies deployment and management by providing a fully configured system running on a highly scalable and flexible cloud infrastructure. The deployment architecture of the Axway Cloud MFT Service is illustrated in Figure 4.

Figure 4 - Axway Cloud MFT Service Deployment Architecture

Security groups: The Axway Cloud MFT Service is a classic multi-tiered application. Each tier is isolated and protected using EC2 security groups.
  • The outward-facing Edge security group opens the appropriate ports for the protocols selected by the user (e.g. port 21 for FTP, port 22 for SFTP, port 80 for HTTP, port 443 for HTTPS, and so on). These ports are open to any source, so there is no IP source filtering.
  • The ST security group restricts inbound traffic to port 4455 and only from Edge Service instances in the Edge security group. All other inbound traffic to the ST Service instances is blocked by the security group filters.
  • The DB security group opens port 1521 to the ST Service instances in the ST security group. All other inbound traffic is blocked.
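To make the tiering concrete, here is a minimal sketch in Python that models the three security groups as plain data and checks whether a connection would be let through. It is only an illustration of the rule chain described above, not Axway's actual configuration: the Rule class, the group names "edge", "st" and "db", the "internet" source label and the is_allowed helper are all made up for this example, and I've assumed the user selected only SFTP and HTTPS on the Edge tier.

```python
from dataclasses import dataclass

ANYWHERE = "0.0.0.0/0"  # no IP source filtering


@dataclass(frozen=True)
class Rule:
    port: int
    source: str  # ANYWHERE, or the name of the security group allowed to connect


# Inbound rules per security group (hypothetical names; ports taken from the text).
# Assumption: the user selected SFTP (22) and HTTPS (443) on the Edge tier.
INBOUND = {
    "edge": [Rule(22, ANYWHERE), Rule(443, ANYWHERE)],
    "st": [Rule(4455, "edge")],   # only Edge Service instances may reach ST
    "db": [Rule(1521, "st")],     # only ST Service instances may reach the DB
}


def is_allowed(source: str, target_group: str, port: int) -> bool:
    """Return True if traffic from `source` to `target_group` on `port` matches a rule.

    `source` is a security group name, or any other label (e.g. "internet")
    for traffic that does not originate from one of the groups.
    Anything that matches no rule is blocked, as with a real security group.
    """
    for rule in INBOUND.get(target_group, []):
        if rule.port == port and rule.source in (ANYWHERE, source):
            return True
    return False


if __name__ == "__main__":
    print(is_allowed("internet", "edge", 443))  # public HTTPS into the Edge tier
    print(is_allowed("internet", "st", 4455))   # direct access to ST from outside
    print(is_allowed("edge", "st", 4455))       # Edge to ST
    print(is_allowed("edge", "db", 1521))       # Edge trying to skip a tier
```

The point of the chain is that each tier only accepts traffic from the tier directly in front of it, so a compromised Edge instance still has no route to the database port.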

Elastic Load Balancing (ELB) and auto scaling: ELB instances are used to share the application load over the Edge Service instances and also over the ST Service instances. In each case, the minimum number of instances running is two, and auto scaling is used to automatically increase the number of instances as the load on the application increases.
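The scaling behaviour boils down to a simple piece of arithmetic, sketched below in Python. Only the minimum of two instances comes from the description above; the maximum, the per-instance capacity figure and the desired_instances function are assumptions I've picked for illustration. In a real deployment this decision is made by the auto scaling service from its configured policies, not by application code.

```python
import math

MIN_INSTANCES = 2                  # from the text: never fewer than two per tier
MAX_INSTANCES = 8                  # assumption: an upper bound to cap cost
TARGET_LOAD_PER_INSTANCE = 100.0   # assumption: load units one instance handles comfortably


def desired_instances(current_load: float) -> int:
    """Return how many instances a tier should run for the given load.

    The count grows with load but is clamped to [MIN_INSTANCES, MAX_INSTANCES].
    """
    needed = math.ceil(current_load / TARGET_LOAD_PER_INSTANCE)
    return max(MIN_INSTANCES, min(MAX_INSTANCES, needed))


if __name__ == "__main__":
    for load in (0, 150, 450, 5000):
        print(load, desired_instances(load))
```

Keeping the floor at two means the load balancer always has a second instance to route to if one fails, even when the load alone would justify only one.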

Elastic Block Store (EBS): The ST Service instances store configuration information, log files and file transfer restart data in a local Elastic Block Store (EBS) volume. The EBS volume is shared with all service instances. Periodic snapshots of the EBS volume (which are effectively incremental backups of the volume) are replicated to a standby copy of the MFT Service residing in a separate Amazon Availability Zone. This standby copy of the MFT Service acts as a disaster recovery instance in case of a complete failure of the primary, or production, copy of the service.

Similarly, the Oracle DB Server uses another EBS volume for a database instance. A snapshot of this volume is copied to the disaster recovery instance of the MFT Service to replicate the database information in case of a failure in the primary Availability Zone.

Disaster recovery: Axway’s Cloud Operations team has also created a set of scripts that use the Amazon Web Services APIs to verify that the MFT Service instances in the disaster recovery Availability Zone are available and ready to start up if the primary Availability Zone fails. These scripts run on a regular and frequent basis, giving Axway additional confidence that, in the case of a complete Availability Zone failure, the disaster recovery mechanism will switch over to the remote MFT Service instances and that these instances will provide continuous service to Axway’s customers.
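I can't share the actual scripts, but the kind of check they perform can be sketched roughly as follows. Everything here is hypothetical: the StandbyInstance record, its fields, the tier names, the snapshot age limit and the dr_problems function are stand-ins of my own, and the status values would in practice be gathered through the AWS APIs before a check like this runs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StandbyInstance:
    name: str
    tier: str                   # "edge", "st" or "db"
    available: bool             # hypothetical: result of an availability query
    snapshot_age_minutes: int   # hypothetical: age of the last replicated snapshot


REQUIRED_TIERS = {"edge", "st", "db"}
MAX_SNAPSHOT_AGE_MINUTES = 60   # assumption: tolerated staleness of replicated data


def dr_problems(instances: List[StandbyInstance]) -> List[str]:
    """Return a list of reasons the standby site is not ready (empty means ready)."""
    problems = []
    ready_tiers = set()
    for inst in instances:
        if not inst.available:
            problems.append(f"{inst.name}: not available")
        elif inst.snapshot_age_minutes > MAX_SNAPSHOT_AGE_MINUTES:
            problems.append(f"{inst.name}: replicated snapshot is stale")
        else:
            ready_tiers.add(inst.tier)
    # Every tier needs at least one ready instance for a switch-over to work.
    for tier in sorted(REQUIRED_TIERS - ready_tiers):
        problems.append(f"tier '{tier}': no ready standby instance")
    return problems


if __name__ == "__main__":
    standby = [
        StandbyInstance("dr-edge-1", "edge", True, 10),
        StandbyInstance("dr-st-1", "st", True, 10),
        StandbyInstance("dr-db-1", "db", False, 10),
    ]
    for problem in dr_problems(standby):
        print(problem)
```

Running a check like this frequently is what turns a standby site from something you hope will work into something you know was ready a few minutes ago.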

Summary


The Axway Cloud MFT Service uses Amazon EC2 features to provide a cloud-based service that is not only highly secure, but is also designed to be resilient in the event of individual component failures or a complete failure of a data center, such as those suffered by Amazon data centers in Virginia and Ireland. The result is an enterprise-scale service that enables organizations to exchange files containing confidential and business-critical information securely and reliably in the cloud.

Conclusion


Some people believe that applications running in the cloud never fail and are always available. In reality, a component failure or a wider problem at a data center can cause outages, just as it can for on-premises applications. To help ensure availability and performance for cloud applications like the Axway Cloud MFT Service, cloud infrastructure providers including Amazon have delivered a set of infrastructure features that allow you to design and build secure, scalable and resilient applications that will meet the needs of your organization.