Saturday, November 10, 2012

How to Avoid Application Failures in the Cloud: Part 1

This is the first in a series of five blog posts that examine how you can build cloud applications that are secure, scalable, and resilient to failures, whether those failures occur in the application components or in the underlying cloud infrastructure itself.

When people think of “the cloud,” they tend to imagine an amorphous thing that is always there, always on. However, the truth is that the cloud (or, rather, applications running in the cloud) can suffer from failures just like those running on your on-premise systems. This became painfully clear in June 2012, when an electrical storm in the mid-Atlantic region of the United States knocked out power to an Amazon data center in Virginia, resulting in temporary outages to services such as Netflix and Instagram. Similarly, in 2011, a transformer failure in Dublin, Ireland, affected Amazon and Microsoft data centers, bringing down some cloud services for up to two days. And, as recently as October 2012, a problem with the storage component of the Amazon EC2 infrastructure caused disruptions for sites including Pinterest, reddit, TMZ, and Heroku.

As these examples show, the cloud itself is not immune to failures. But there are things you can do to protect your applications running in the cloud. In this series of blog posts, we will discuss some of the ways you can make your cloud applications more reliable and less prone to failures.

When looking at improving the resilience and reliability of your applications, you need to consider the following four factors:
  1. Security: Is your application protected against intrusion?
  2. Scalability and Availability: How can you make your application respond effectively to changing demand and, at the same time, protect against component failures?
  3. Disaster Recovery: What happens if, as in the examples above, an entire data center fails?
  4. Monitoring: How do you know when you have problems? And how can you respond quickly enough to prevent outages?

We will look at each of these factors in the context of an application running in the Amazon EC2 cloud infrastructure, as this is the environment in which Axway has the most experience. (Other cloud providers, such as Rackspace, provide similar capabilities.)

Security


Obviously, application security is very important to every organization. Preventing unwanted and unauthorized access to applications and data is critical because the consequences of a security breach, including potential data loss and exposure of confidential information, can be extremely costly in both financial and business terms.

When you are running applications in your own on-premise data center, your IT department can configure and manage security using well-tested methods such as firewalls, DMZs, routers, and secure proxy servers. They can create multi-layered security zones to protect internal applications, with each layer becoming more restrictive in terms of how and by whom it can be accessed. For example, the outer layer might allow access via certain standard ports (e.g., port 80 for HTTP traffic, port 22 for SFTP traffic, port 443 for secure HTTP traffic (SSL), and so on). The next layer might restrict inbound access to certain secure ports and only from servers in the adjacent layer. So, if you have a highly secure inner layer containing your database(s), you can allow access only via port 1521 (the standard port used by Oracle database servers) and only from servers in the application layer.
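The layered zones described above can be sketched as a simple allow-list per layer. This is only an illustration of the idea, not a firewall configuration: the subnet ranges, the layer names, the application port 8080, and the `is_allowed` helper are all assumptions made for the example. Only ports 80, 443, and 1521 come from the text.

```python
from ipaddress import ip_address, ip_network

# Hypothetical subnets for each zone; real ranges depend on your network.
ANYWHERE = ip_network("0.0.0.0/0")
WEB_ZONE = ip_network("10.0.1.0/24")
APP_ZONE = ip_network("10.0.2.0/24")

# Each layer lists (allowed source network, allowed destination port) pairs.
# Anything that is not listed is denied.
LAYER_RULES = {
    "web": [(ANYWHERE, 80), (ANYWHERE, 443)],
    "app": [(WEB_ZONE, 8080)],  # 8080 is an assumed application port
    "db": [(APP_ZONE, 1521)],   # Oracle port, reachable from the app layer only
}


def is_allowed(layer: str, source_ip: str, port: int) -> bool:
    """Return True if traffic from source_ip to the given port may enter the layer."""
    source = ip_address(source_ip)
    return any(
        source in network and port == allowed_port
        for network, allowed_port in LAYER_RULES[layer]
    )


print(is_allowed("web", "203.0.113.9", 443))  # public HTTPS request
print(is_allowed("db", "10.0.2.15", 1521))    # app server to database
print(is_allowed("db", "10.0.1.7", 1521))     # web server to database
```

Each inner layer names the adjacent layer's subnet as its only permitted source, which is what makes the zones progressively more restrictive.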

When you move to the cloud, however, you are relying on others (the cloud infrastructure providers) to provide these security capabilities on your behalf. But even though you are outsourcing some of these security functions, you are not powerless when it comes to making your applications more secure and less susceptible to security breaches.

Amazon EC2 Security Groups


Amazon EC2 provides a feature called “security groups” that allows you to recreate the same type of security zone protection and isolation you can achieve with on-premise systems. You can use Amazon EC2 security groups to create a DMZ/firewall-like configuration, even though you don’t have access to, or control of, the physical routers within the EC2 cloud. This allows you to isolate the different layers of your application stack and protect them against unauthorized access and data loss. A security group acts as a firewall for a specific set of Amazon EC2 instances: based on rules you define to control traffic, it provides different levels of protection and isolation within a multi-tier application. (See Figure 1.)

 
Figure 1 - Amazon EC2 Security Groups

In this example, three different security groups are used to isolate and protect the three tiers of the cloud application: the web server tier, the application server tier, and the database server tier.
  • Web server security group: All of the instances of the web server are assigned to the WebServerSG security group, which allows inbound traffic on ports 80 (HTTP) and 443 (HTTPS) only — but from anywhere on the Internet. This makes the web server instances open to anyone who knows their URL, but access is restricted to the standard ports for HTTP and HTTPS traffic. This is typical practice for anyone configuring an on-premise web server. By defining security groups, you can have the same type of configuration in the Amazon EC2 cloud.
  • Application server security group: The AppServerSG security group restricts inbound application server access to those instances in the previously defined WebServerSG security group or to developers using SSH (port 22) from the corporate network. This illustrates a couple of important capabilities of security groups:
    1. You can specify other security groups as a valid source of inbound traffic.
    2. You can restrict inbound access by IP address.
    Specifying other defined security groups as a valid source of inbound traffic means that you can dynamically scale the web server group to meet demand by launching new web server instances — without having to update the application server security group configuration. All instances in the web server security group are automatically allowed access to the application servers based on the application server security group rule. Being able to restrict inbound access by IP address means that you can open ports within the security group, but only allow access by known (and presumably friendly) sources. In our example, we allow access to the application servers via SSH (for updates, etc.) only to developers connecting from the corporate network.
  • Database server security group: The DBServerSG security group is used to control access to the database server instances. Because this tier of the application contains the data, access is more tightly restricted than in the other tiers. In our example, only the application server instances in the AppServerSG security group can access the database servers. All other access is denied by the security group filters. In addition to limiting access to the instances in the AppServerSG security group, you can also limit access to certain ports. In our case, the application servers can reach the database servers only on port 1521, the standard Oracle port.
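The three security groups in this example can be modeled in a few lines of Python to show how group-based and IP-based sources behave. This is a toy model written for illustration, not the Amazon EC2 API: the `Rule` and `SecurityGroup` classes, the `allows` function, the instance addresses, the corporate network range, and port 8080 between the web and application tiers are all assumptions. The group names and ports 80, 443, 22, and 1521 come from the example above.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network
from typing import Optional

CORPORATE_NET = "198.51.100.0/24"  # assumed corporate network range


@dataclass
class Rule:
    port: int
    source_group: Optional[str] = None  # another security group allowed as source
    source_cidr: Optional[str] = None   # or an IP address range allowed as source


@dataclass
class SecurityGroup:
    rules: list = field(default_factory=list)
    members: set = field(default_factory=set)  # IPs of instances in the group


groups = {
    "WebServerSG": SecurityGroup(rules=[
        Rule(80, source_cidr="0.0.0.0/0"),
        Rule(443, source_cidr="0.0.0.0/0"),
    ]),
    "AppServerSG": SecurityGroup(rules=[
        Rule(8080, source_group="WebServerSG"),  # 8080 is an assumed port
        Rule(22, source_cidr=CORPORATE_NET),
    ]),
    "DBServerSG": SecurityGroup(rules=[
        Rule(1521, source_group="AppServerSG"),
    ]),
}


def allows(target: str, source_ip: str, port: int) -> bool:
    """Return True if any inbound rule of the target group admits this traffic."""
    for rule in groups[target].rules:
        if rule.port != port:
            continue
        if rule.source_group and source_ip in groups[rule.source_group].members:
            return True
        if rule.source_cidr and ip_address(source_ip) in ip_network(rule.source_cidr):
            return True
    return False  # default deny


groups["WebServerSG"].members.add("10.0.1.10")
groups["AppServerSG"].members.add("10.0.2.20")

# Scaling out: a new web server joins the group; no AppServerSG rule changes.
groups["WebServerSG"].members.add("10.0.1.11")
print(allows("AppServerSG", "10.0.1.11", 8080))
```

Because the AppServerSG rule names WebServerSG as its source rather than a list of addresses, the newly launched web instance is admitted as soon as it joins the group, while a web server still cannot reach port 1521 on the database tier.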

In the next blog post in this series, we'll look at scalability and availability.
