Warehouse Computing and the Evolution of the Datacenter: A Layman’s Guide

You may not have noticed, but we’re in the midst of another massive platform shift in enterprise computing. We can debate chicken or egg, but I believe this most recent transformation is being driven primarily by the requirements placed on modern applications; requirements that are the result of the on-demand, always-on computing paradigm brought about by cloud and mobile. Simply put, applications need to be scalable, available and performant enough to reach millions, if not billions, of connected devices and end-users. Infrastructure must mirror these requirements in kind.

Historically, systems design has ebbed and flowed between periods of aggregation (centralized) and disaggregation (distributed) of compute resources. The most recent evolution, from client/server to virtualized cloud infrastructure, was driven largely by a desire to contain costs and consolidate IT around standards (the x86 instruction set, Windows and Linux), form factors (first blade servers, then VMs) and physical locations (the emergence of sprawling datacenters and giant cloud vendors). Now we’re seeing the pendulum swing back. Why?

A strong first principle is the notion that infrastructure is beholden to the application. Today, many applications are being built as large-scale distributed systems, composed of dozens (or even thousands) of services running across many physical and virtual machines, and often across multiple datacenters. In this paradigm, virtualization – which really dealt with the problem of low physical server utilization – doesn’t make much sense. In a highly distributed, service-oriented world, VMs carry too much overhead. Instead of slicing and dicing compute, network and storage, the better solution is to aggregate all machines and present them to the application as a pool of programmable resources, with hardware-agnostic software that manages isolation, resource allocation, scheduling, orchestration, etc. In this world, the datacenter becomes one giant warehouse computer controlled by a software brain.

However, the fact of the matter is that building, deploying and maintaining distributed applications is a highly technical feat. It requires rethinking the way applications interact with other applications, databases, storage and the network. Moreover, it requires a new toolkit for solving the coordination and orchestration challenges of running systems that span multiple machines, datacenters and time zones. To help understand what’s taking place, let’s deconstruct this new stack and, along the way, define some other key terms. Note that this is in no way a static, absolute taxonomy, but rather a simplified way to understand the layers that make up today’s application stack.

Layer 1: Physical Infrastructure – The actual servers, switches, routers and storage arrays that occupy the datacenter. This area was dominated by legacy OEMs (EMC, Cisco, HP, IBM, Dell), who are now giving way to low-cost ‘whitebox’ ODMs.

Layer 2: Virtualized Infrastructure – Emulated physical compute, network and storage resources that are the basis for cloud-based architectures. The enabling technology here is the hypervisor, which sits on top of bare-metal infrastructure and creates virtual clones of the server (or switch or storage array), each complete with a full OS, memory management, device drivers, daemons, etc.

Layer 3: Operating System – The host or guest OS that sits atop a virtual or physical host. The rise of Linux has been a key catalyst for the commoditization of the OS and physical infrastructure, decoupling applications from hardware. Microsoft, with Windows Server, is still a dominant player in the traditional enterprise.

Layer 4: Container Engine – This is where it starts to get interesting, so let’s spend a little more time here. Linux containers offer a form of operating system-level virtualization, in which the kernel allows for multiple isolated user-space instances. More simply, if hypervisor-based virtualization abstracted physical resources to create multiple server clones, each with its own OS, memory, etc., the virtualization enabled by containers is a higher-level abstraction of the OS. This provides the degree of isolation and resource utilization needed to run multiple applications on a single kernel.
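
To make “multiple user-space instances on one kernel” concrete, here is a minimal, Linux-only sketch in Go of the kernel primitive (namespaces) that container engines build on. It is illustrative only, not a container runtime: there are no images, no cgroups and no network isolation, and it typically needs root to run.

```go
// A minimal illustration of OS-level virtualization on Linux: run a shell
// in its own UTS, PID and mount namespaces. This is the kernel primitive
// container engines build on; it is NOT a full container runtime.
package main

import (
	"os"
	"os/exec"
	"syscall"
)

func main() {
	// Launch /bin/sh wired to our terminal. Typically requires root.
	cmd := exec.Command("/bin/sh")
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr

	// Ask the kernel for new namespaces when cloning the child process.
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUTS | // its own hostname
			syscall.CLONE_NEWPID | // its own PID numbering (sh sees itself as PID 1)
			syscall.CLONE_NEWNS, // its own mount table
	}

	if err := cmd.Run(); err != nil {
		panic(err)
	}
}
```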

The beauty of containers lies in the idea of “code once, run anywhere.” A container holds the application logic and all of its dependencies, running as an isolated process. It ultimately doesn’t matter what’s inside the container (files, frameworks, dependencies); it will execute the same way in any environment – from laptop, to testing, to production, across any cloud – at least theoretically. This enables application portability, which, in turn, commoditizes cloud infrastructure altogether.

Docker has become synonymous with containerization by making Linux Containers (LXC) user-friendly. The important thing to note is that container technology is made up of two fundamental components: the runtime and the container image format. The runtime is effectively a high-level API that runs processes and manages isolation. The image format is a specification for a standard, composable unit for containers. In recent months we’ve seen several container runtimes and specs come to market, which has caused a stir; I’m sure we’ll continue to see more.
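
One way to picture that split is the purely hypothetical Go sketch below. It is not the Docker or OCI API; it just shows the division of labor: the image is the static, composable artifact, and the runtime is the thin API that turns it into an isolated, running process.

```go
package container

// Hypothetical types for illustration only – not an actual runtime or image spec.

// Image is the static, composable unit: filesystem layers plus the metadata
// needed to start the process inside the container.
type Image struct {
	Layers     []string          // filesystem layers, e.g. content digests
	Entrypoint []string          // command to run, e.g. {"/app/server"}
	Env        map[string]string // environment for the process
}

// Runtime is the thin API that turns an Image into an isolated process
// and manages its lifecycle.
type Runtime interface {
	Create(id string, img Image) error // set up the root filesystem and namespaces
	Start(id string) error             // launch the entrypoint as an isolated process
	Stop(id string) error              // signal and reap the process
	Delete(id string) error            // tear down all container state
}
```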

Layer 5: Scheduling & Service Discovery – Tools that solve and automate the coordination challenges of breaking up and running applications across multiple nodes and datacenters. Schedulers interface with the cluster’s resources and are responsible for providing a consistent way to intelligently place tasks based on those resources. Service discovery tools manage how processes and services in a cluster find and talk to one another. This area is largely greenfield, but the ecosystem has coalesced around a few well-known projects like Mesos, etcd and ZooKeeper.
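
To give a feel for what a scheduler actually does, here is a self-contained, deliberately naive Go sketch (not Mesos or any real scheduler) that places tasks on whichever node in the pool still has enough free CPU and memory:

```go
package main

import (
	"errors"
	"fmt"
)

// Node is one machine's slice of the cluster-wide resource pool.
type Node struct {
	Name      string
	FreeCPU   float64 // cores still available
	FreeMemMB int     // memory still available, in MB
}

// Task is a unit of work with a resource request attached.
type Task struct {
	Name  string
	CPU   float64
	MemMB int
}

// schedule picks the first node with enough headroom (first-fit placement)
// and reserves the task's resources on it. Real schedulers add constraints,
// priorities, smarter bin-packing, preemption and failure handling.
func schedule(nodes []Node, t Task) (string, error) {
	for i := range nodes {
		if nodes[i].FreeCPU >= t.CPU && nodes[i].FreeMemMB >= t.MemMB {
			nodes[i].FreeCPU -= t.CPU
			nodes[i].FreeMemMB -= t.MemMB
			return nodes[i].Name, nil
		}
	}
	return "", errors.New("no node has enough free resources for " + t.Name)
}

func main() {
	pool := []Node{
		{Name: "node-1", FreeCPU: 4, FreeMemMB: 8192},
		{Name: "node-2", FreeCPU: 16, FreeMemMB: 32768},
	}
	tasks := []Task{
		{Name: "api", CPU: 2, MemMB: 4096},
		{Name: "worker", CPU: 8, MemMB: 16384},
	}
	for _, t := range tasks {
		node, err := schedule(pool, t)
		if err != nil {
			fmt.Println("unschedulable:", err)
			continue
		}
		fmt.Printf("placed %s on %s\n", t.Name, node)
	}
}
```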

Layer 6: Orchestration & Management – Tools that automate the deployment, scaling and management of applications and infrastructure. This is what some refer to as the management plane. These tools enable devs, DevOps and sysadmins to maintain applications across clusters. This area is also greenfield, as new solutions are being adapted to support containers running across distributed environments; those who win here will reap the greatest rewards. In addition to purpose-built products, a number of companies are creating application lifecycle management platforms optimized for containers, including Deis, Flynn, Terminal, Tutum and many others.
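
The core idea behind most orchestration tools is a reconciliation loop: compare the declared desired state with what is actually running, and converge the two. Here is a toy, self-contained Go sketch of that loop, not tied to any particular product, with in-memory maps standing in for the cluster:

```go
package main

import (
	"fmt"
	"time"
)

// Desired state: how many replicas of each service should be running.
var desired = map[string]int{"web": 3, "api": 2}

// Actual state: what the (simulated) cluster is currently running.
var actual = map[string]int{"web": 1, "api": 4}

// reconcile starts or stops replicas until actual matches desired.
func reconcile() {
	for svc, want := range desired {
		have := actual[svc]
		switch {
		case have < want:
			fmt.Printf("%s: starting %d replica(s)\n", svc, want-have)
			actual[svc] = want // stand-in for launching containers
		case have > want:
			fmt.Printf("%s: stopping %d replica(s)\n", svc, have-want)
			actual[svc] = want // stand-in for terminating containers
		default:
			fmt.Printf("%s: in sync (%d replicas)\n", svc, have)
		}
	}
}

func main() {
	// Real orchestrators run this loop continuously, reacting to failures,
	// deploys and scaling events; here we just run a couple of iterations.
	for i := 0; i < 2; i++ {
		reconcile()
		time.Sleep(100 * time.Millisecond)
	}
}
```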

A few other helpful definitions:

Distributed System – A computing system consisting of a collection of autonomous nodes connected through a network, plus the software/middleware that enables those nodes to coordinate tasks and share the resources of the entire system. The principle of distributed computing has been around for decades, but it has only recently entered mainstream IT as traditional software architecture has been pushed to its limits at web scale. Perhaps the best-known example is Apache Hadoop, an open-source data storage and processing framework in which jobs are split and run across multiple commodity servers.
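
The Hadoop-style pattern of splitting a job into pieces, running them on many workers and combining the results can be sketched in a few lines of Go. This is only an analogy: goroutines stand in for separate machines and a channel stands in for the network.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// countWords is the per-worker "map" step: count words in one shard of input.
func countWords(shard string) map[string]int {
	counts := map[string]int{}
	for _, w := range strings.Fields(shard) {
		counts[strings.ToLower(w)]++
	}
	return counts
}

func main() {
	// Each shard stands in for a block of data stored on a different node.
	shards := []string{
		"the datacenter is one warehouse computer",
		"the warehouse computer is controlled by software",
	}

	results := make(chan map[string]int, len(shards))
	var wg sync.WaitGroup

	// Fan out: one goroutine per shard, standing in for one worker per node.
	for _, s := range shards {
		wg.Add(1)
		go func(shard string) {
			defer wg.Done()
			results <- countWords(shard)
		}(s)
	}
	wg.Wait()
	close(results)

	// Fan in: the "reduce" step merges the partial counts.
	total := map[string]int{}
	for partial := range results {
		for w, n := range partial {
			total[w] += n
		}
	}
	fmt.Println(total)
}
```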

Microservices – Microservice architecture is a way of designing software applications as sets of modular, self-contained, independently deployable services. Whereas historically an application would be split into client-side, server-side/logic and database tiers, the idea with microservices is to develop each application as a suite of smaller, modular services, each running in its own process with a minimal amount of centralized management. Microservices architecture is appealing because it enables greater agility (entire applications don’t need to be taken down during change cycles), faster speed-to-market and better code manageability.

(Source: http://martinfowler.com)
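
To make the idea concrete, here is a minimal, hypothetical microservice in Go: one small, self-contained process exposing one narrow piece of functionality over HTTP, deployable and scalable independently of the rest of the application. The “pricing” responsibility and the /price endpoint are invented for the example.

```go
// A minimal "pricing" microservice: one process, one narrow responsibility,
// spoken to over the network by other services. Illustrative only.
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

type quote struct {
	SKU   string  `json:"sku"`
	Price float64 `json:"price"`
}

func main() {
	http.HandleFunc("/price", func(w http.ResponseWriter, r *http.Request) {
		sku := r.URL.Query().Get("sku")
		// A real service would consult its own datastore; the price is
		// hard-coded here to keep the sketch self-contained.
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(quote{SKU: sku, Price: 9.99})
	})
	log.Println("pricing service listening on :8080")
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```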

The application stack is an ever-evolving, dynamic organism. Ultimately, whether it’s microservices, distributed systems or containers, the changes we’re seeing at both the code and infrastructure level are about one thing: delivering better, more scalable software faster and cheaper. As a result, today we find ourselves at the outset of what I’ll call the warehouse computing era, defined by cheap, commodity infrastructure presented to the application as a pool of dynamically programmable resources, with intelligent, hardware-agnostic software as the control plane operating at n-scale.

To me, though, this is still an incremental step. The world of IT I envision is one where code is written once and executed anywhere, automatically and at scale, irrespective of the underlying cloud, OS, container engine, orchestrator, scheduler, etc. In this world, ops ceases to be an IT function and becomes a product within, or even a feature of, the underlying stack. That world is several years away, and before we get there I can promise the tech stack is going to get a lot more convoluted before it radically simplifies. Either way, it’ll be fun to watch.