Saturday, November 10, 2012

How to Avoid Application Failures in the Cloud: Part 1

This is the first of a series of five blog posts that examine how you can build cloud applications that are secure, scalable, and resilient to failures — whether the failures are in the application components or in the underlying cloud infrastructure itself.

When people think of “the cloud,” they tend to imagine an amorphous thing that is always there, always on. However, the truth is that the cloud — or, rather, applications running in the cloud — can suffer from failures just like those running on your on-premise systems. This became painfully clear in June 2012, when an electrical storm in the mid-Atlantic region of the United States knocked out power to an Amazon data center in Virginia, resulting in temporary outages to services such as Netflix and Instagram. Similarly, in 2011, a transformer failure in Dublin, Ireland, affected Amazon and Microsoft data centers, bringing down some cloud services for up to two days. And, as recently as October of 2012, a problem with the Elastic Block Store (EBS) storage component of the Amazon EC2 infrastructure caused disruptions for sites including Pinterest, reddit, TMZ, and Heroku.

As these examples show, the cloud itself is not immune to failures. But there are things you can do to protect your applications running in the cloud. In this series of blog posts, we will discuss some of the ways you can make your cloud applications more reliable and less prone to failures.

When looking at improving the resilience and reliability of your applications, you need to consider the following four factors:
  1. Security: Is your application protected against intrusion?
  2. Scalability and Availability: How can you make your application respond effectively to changing demand and, at the same time, protect against component failures?
  3. Disaster Recovery: What happens if, as in the examples above, an entire data center fails?
  4. Monitoring: How do you know when you have problems? And how can you respond quickly enough to prevent outages?
We will look at each of these factors in the context of an application running in the Amazon EC2 cloud infrastructure, as this is the environment in which Axway has the most experience. (Other cloud providers, such as Rackspace, provide similar capabilities.)

Security


Obviously, application security is very important to every organization. Preventing unwanted and unauthorized access to applications and data is critical because the consequences of a security breach, including potential data loss and exposure of confidential information, can be extremely costly in both financial and business terms.

When you are running applications in your own on-premise data center, your IT department can configure and manage security using well-tested methods such as firewalls, DMZs, routers, and secure proxy servers. They can create multi-layered security zones to protect internal applications, with each layer becoming more restrictive in terms of how and by whom it can be accessed. For example, the outer layer might allow access via certain standard ports (e.g., port 80 for HTTP traffic, port 443 for secure HTTP traffic over SSL, port 22 for SFTP traffic, and so on). The next layer might restrict inbound access to certain secure ports and only from servers in the adjacent layer — so, if you have a highly secure inner layer containing your database(s), you can allow access only via port 1521 (the standard port used by Oracle database servers) and only from servers in the application layer.

When you move to the cloud, however, you are relying on others (the cloud infrastructure providers) to provide these security capabilities on your behalf. But even though you are outsourcing some of these security functions, you are not powerless when it comes to making your applications more secure and less susceptible to security breaches.

Amazon EC2 Security Groups


Amazon EC2 provides a feature called “security groups” that allows you to recreate the same type of security zone protection and isolation you can achieve with on-premise systems. You can use Amazon EC2 security groups to create a DMZ/firewall-like configuration, even though you don’t have access to or control over the physical routers within the EC2 cloud. This allows you to isolate the different layers of your application stack and protect them against unauthorized access and data loss. Acting as a firewall for a specific set of Amazon EC2 instances, and based on rules you define to control traffic, security groups provide different levels of protection and isolation within a multi-tier application. (See Figure 1.)

 
Figure 1 - Amazon EC2 Security Groups

In this example, three different security groups are used to isolate and protect the three tiers of the cloud application: the web server tier, the application server tier, and the database server tier.
  • Web server security group: All of the instances of the web server are assigned to the WebServerSG security group, which allows inbound traffic on ports 80 (HTTP) and 443 (HTTPS) only — but from anywhere on the Internet. This makes the web server instances open to anyone who knows their URL, but access is restricted to the standard ports for HTTP and HTTPS traffic. This is typical practice for anyone configuring an on-premise web server. By defining security groups, you can have the same type of configuration in the Amazon EC2 cloud.
  • Application server security group: The AppServerSG security group restricts inbound application server access to those instances in the previously defined WebServerSG security group or to developers using SSH (port 22) from the corporate network. This illustrates a couple of important capabilities of security groups:
    1. You can specify other security groups as a valid source of inbound traffic.
    2. You can restrict inbound access by IP address.
    Specifying other defined security groups as a valid source of inbound traffic means that you can dynamically scale the web server group to meet demand by launching new web server instances — without having to update the application server security group configuration. All instances in the web server security group are automatically allowed access to the application servers based on the application server security group rule. Being able to restrict inbound access by IP address means that you can open ports within the security group, but only allow access by known (and presumably friendly) sources. In our example, we allow access to the application servers via SSH (for updates, etc.) only to developers connecting from the corporate network.
  • Database server security group: The DBServerSG security group is used to control access to the database server instances. Because this tier of the application contains the data, access is more restricted than the other layers. In our example, only the application server instances in the AppServerSG security group can access the database servers. All other access is denied by the security group filters. In addition to restricting access to the instances in the AppServerSG security group, you can also restrict the access to certain ports.  In our case, we’ve restricted access from the application servers so they can use only port 1521, the standard Oracle port.
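The rule sets above can be made concrete with a small sketch. This is plain Python, not AWS tooling: it mimics how each group's ingress rules admit or deny an inbound connection. The group names come from the example; the application-tier port (8080), the corporate network range, and the simplified literal matching of IP ranges are illustrative assumptions.

```python
# Sketch of the Figure 1 rule sets as data, plus the "default deny" check
# a security group applies to inbound traffic. Not real AWS code.

WEB_SG = "WebServerSG"
APP_SG = "AppServerSG"
DB_SG = "DBServerSG"

ANYWHERE = "0.0.0.0/0"    # open to the whole Internet
CORP_NET = "10.0.0.0/8"   # assumed corporate network range

# Each rule pairs an allowed source (another group, or an IP range)
# with an allowed destination port.
RULES = {
    WEB_SG: [(ANYWHERE, 80), (ANYWHERE, 443)],  # HTTP/HTTPS from anywhere
    APP_SG: [(WEB_SG, 8080), (CORP_NET, 22)],   # web tier, plus corporate SSH
    DB_SG:  [(APP_SG, 1521)],                   # app tier only, Oracle port only
}

def allowed(dest_group, source, port):
    """Return True if dest_group's rules admit a connection from `source`
    (a group name or IP range, matched literally here for simplicity)
    to `port`. Anything not explicitly allowed is denied."""
    for rule_source, rule_port in RULES[dest_group]:
        if port == rule_port and rule_source in (ANYWHERE, source):
            return True
    return False
```

With these rules, `allowed(DB_SG, APP_SG, 1521)` holds while `allowed(DB_SG, WEB_SG, 1521)` does not, which is the isolation the database server bullet describes; and because the web tier is referenced by group name rather than by address, newly launched web server instances are covered automatically.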

In the next blog post in this series, we'll look at scalability and availability.

Thursday, October 11, 2012

Engaging the Hybrid Cloud (Complete Post)

What is a “hybrid cloud”?

Is it 1) an environment where applications and processes exist both in the public and private cloud and on premise? Or is it 2) a combination public/private cloud without an on-premise component?

For the sake of this discussion, we’ll adopt definition 1. Clarifying this concept is important because the vast majority of cloud-adopting organizations — which is to say the vast majority of organizations, period — are about to become hybrid-cloud-adopting organizations, and for good reason: they’re not ready to simply switch off their existing on-premise systems — legacy systems that already have significant business and operational value — and re-invent them in the cloud.

Let’s solidify this hybrid notion with a simple example of a business process nearly all organizations are familiar with: the HR onboarding process.
  1. Onboarding begins. A cloud-based recruiting system like Taleo is used to identify a candidate. When the candidate is hired, the business process moves from the cloud-based recruiting system to the on-premise HR system.
  2. Onboarding continues. The candidate is given systems access, login credentials, and an e-mail account. IT is cued to furnish the candidate with a laptop and other equipment. The office manager assigns the candidate an office space.
  3. Onboarding concludes. HR moves the business process back to the cloud by using a cloud-based performance-management system like SumTotal, where new-hire details are updated.
Cloud. On-premise. Cloud again.

This isn’t some supposed future scenario. This hybridized process is happening now, throughout most organizations, and in many other departments besides HR. To ensure the success of those departments in a hybrid cloud environment, organizations should address three key issues: security, service level agreements (SLAs), and application integration.

Security

The move to the cloud does mean that security and data privacy — something that was previously your IT department’s concern — is now your cloud provider’s concern. Yet it doesn’t mean your organization is absolved from ensuring that the cloud provider is doing its part. You need to demand that the cloud provider be clear about how it secures and protects your customers’, partners’, and employees’ data — both when it’s stored in the cloud and when it’s transferred to and from your on-premise systems.

A cloud-based application in isolation is reason enough for insisting on a clear understanding of how your cloud provider stores your data. Imagine, then, how imperative a clear understanding becomes when that cloud-based application is no longer isolated but integrated into a hybrid cloud environment. It’s now transferring data out into the world — perhaps from an Amazon data center in Europe or the Pacific Northwest to your offices on the other side of the globe. Or perhaps it’s transferring data to your trading partner’s systems, where you have much less control over security and protection.

This spawns several questions you should ask your cloud provider:
  • Is the data encrypted both when it’s in motion and at rest?
  • If cloud-application access is via an application programming interface (API), is the security token secured and encrypted when it’s used in the API call?
  • What’s the security token’s lifetime? Is it per-session or permanent?
  • How easily could this security token be hijacked and reused?
  • Is the security token tied to IP addresses?
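A concrete way to read these questions is to sketch what good answers imply: a token that is short-lived, per-session, and bound to the caller's IP address. This is a hypothetical illustration; the field names and the 30-minute lifetime are assumptions, not any particular provider's API.

```python
import time

SESSION_LIFETIME = 30 * 60  # assumed per-session lifetime, in seconds

def issue_token(client_ip, now=None):
    """Issue a session token bound to the requesting IP address."""
    now = time.time() if now is None else now
    return {"ip": client_ip, "expires": now + SESSION_LIFETIME}

def token_valid(token, client_ip, now=None):
    """Reject a token presented from a different IP or after expiry.
    Both checks limit how useful a hijacked token would be."""
    now = time.time() if now is None else now
    return token["ip"] == client_ip and now < token["expires"]
```

A permanent token with no IP binding offers neither protection: once stolen, it could be replayed from anywhere, indefinitely.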

Getting solid answers to important questions like these will ensure that the cloud part of your hybrid environment is always serving your business and never compromising the strength of its security profile.

SLAs

What is your cloud-based application’s availability and reliability? When an application is hosted on-premise, availability and reliability is your responsibility, and if it’s critical to business operations, you put a lot of effort into maintaining it.

Again, with the move to the cloud, this becomes the cloud provider’s concern, but you still need to keep in mind the application’s role in the bigger picture. How well would the business tolerate moments of application unavailability and unreliability?

For example, if a cloud-based HR application wasn’t available for a day or two, it probably wouldn’t impact a supermarket’s business process.

However, if a cloud-based supply-chain application wasn’t available for even an hour or two, it would wreak havoc on a supermarket’s business process. The lack of availability would mean a lack of deliveries, empty shelves, and loss of revenue.

A thorough SLA will communicate to your cloud provider in no uncertain terms which applications your business counts on the most, and what the consequences will be should those applications fail.

Application integration

In order to reap the benefits and realize the full potential of your new cloud applications, you must embrace the term “hybrid” by fully integrating them with your existing, on-premise applications and business processes.

Questions to ask include:
  • How are you going to get data into or out of the cloud application and into your on-premise systems?
  • Does the cloud application have an API and/or support on-demand exchange of data?
  • Does the cloud application have a scheduled exchange (e.g., daily updates instead of on demand)?
  • Does the cloud application support standards like Web services, XML, etc.?
Further, how will integrating cloud applications affect your existing business processes?

For example, if you move from an old, back-end integration to an on-demand, real-time integration, will this have a knock-on effect (i.e., a secondary effect) with other applications, especially your on-premise applications? How will the applications accommodate this effect (particularly in light of the fact that you actually have less flexibility when integrating applications in the cloud, as you have to work with the integration points provided by the cloud application itself, not the on-premise points you’ve provided)?
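The on-demand versus scheduled distinction can be sketched as two small sync loops. The fetch and store callables below stand in for the cloud application's API and your on-premise system; every name here is an illustrative assumption, not a real product interface.

```python
def sync_on_demand(fetch_changes, store_record, since):
    """Real-time style: pull only records changed since the last sync
    and push each one into the on-premise system as it arrives."""
    count = 0
    for record in fetch_changes(since):
        store_record(record)
        count += 1
    return count

def sync_scheduled(fetch_all, store_batch):
    """Batch style: pull a full export on a schedule (e.g., nightly)
    and load it into the on-premise system in one pass."""
    batch = list(fetch_all())
    store_batch(batch)
    return len(batch)
```

The knock-on effect shows up here: moving from the scheduled style to the on-demand style means downstream on-premise applications start seeing updates continuously rather than once a day, and they must be able to absorb that.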

By considering the above three key issues and answering the questions surrounding them, the daunting implications of our initial question, “What is a ‘hybrid cloud’?” will diminish. Organizations that aren’t ready to simply switch off their existing on-premise systems and re-invent them in the cloud can rest assured that they aren’t losing anything from holding onto a legacy system. Instead, they can benefit from a new approach — one that draws on the incomparable agility of the public/private cloud and the time-tested security profile of on-premise systems — and enjoy enhanced business operations using a hybridized whole that’s truly greater than the sum of its parts.

(This post was first published at http://blogs.axway.com)


Wednesday, August 29, 2012

An Example of User Innovation

In an earlier post, I talked about user innovation and how to harness this inventive power for your products. I want to give an example of user innovation that I came across while working for my previous company, SirsiDynix.

SirsiDynix builds software for libraries. The software products run many aspects of a library's operations, including the library's web presence. The web site allows library users to search for library materials online, check their availability, and, if desired, reserve them (to be picked up later). This functionality is termed the OPAC (Online Public Access Catalog) and is a basic component of almost all library management systems. SirsiDynix also provides a feature-rich Web Services API for its library management system (called Symphony), which allows developers to access the data and functionality of the system. The Web Services API provides an interface to the Symphony system and is intended to allow developers to enhance and extend the base product.

The Role-Playing Game — How Enterprise IT Should Prepare for Cloud Adoption

I’m often asked, “What’s the biggest thing standing in the way of enterprise IT getting on board with cloud adoption?”

My response is always the same: “The biggest thing standing in the way of enterprise IT cloud adoption is IT’s unwillingness to accept that business units (BUs) are already adopting the cloud.”

BUs are eager to flout IT authority and circumvent IT constraints in order to solve problems now, rather than see their requests languish in IT’s backlog of special projects, hostage to unreasonable wait times.
Those days are over. IT now has two options: Get on board or get left behind.

I’m seeing this exact scenario in our customer organizations as well. Customer BUs go out seeking solutions without ever involving IT in the preliminary decision-making process. They prefer instead to drag IT in at the very end and inform them of what is going to happen, rather than consult them about what may happen.

IT’s role has changed, whether they choose to recognize it or not. Their long-standing position as “policy police” — arbiter of good taste in applications, judge of whether an application must meet IT policy and corporate security standards — is coming to an end.

IT must face the fact that BUs are increasingly adopting the cloud, and support that move by:
• Becoming more aligned with the BUs and their goals;
• Providing security in the cloud;
• Managing service level agreements with cloud providers;
• Following escalation procedures; and
• Advising the BUs on how — not whether — to adopt the cloud.

Don’t wait for cloud adoption to begin before getting your house in order. If IT stays in reactive mode as BUs make cloud decisions, they’ll end up with “integration” minus “strategy” — applications will be integrated on an ad hoc, project-by-project basis, creating a proliferation of point-to-point connections that repeats the fragile “spaghetti” integrations of the past.

IT must act now to get ahead of the curve — meaning ahead of BU demand — defining a solid integration strategy before the cloud apps start building out (or as early in that process as possible).

Does moving to the cloud mean that IT will lose some control? Yes. But I challenge them to be big-minded about it: Support BU adoption of the cloud, embrace your new role, shed your service-manager chrysalis and spread your trusted-adviser wings.

(This post was first published at http://blogs.axway.com)