An Open Web

Tiers of the Cloud

Cloud computing uses shared computer resources distributed throughout the Internet to deliver services and storage. A number of leading software and software service firms, including Amazon, Google and Microsoft, now offer individuals access to the powerful computing resources of their massive ‘clouds’. However, this easy access to high-performance computing comes at a terrible cost: the centralization of control in a single service provider.

Distributed computing has been practiced since the first local-area networks were established to allow computers to communicate and interact. The primary advantage of distributing a workload among two or more devices is that their computational power can be combined even when the devices are remote from one another.

The most basic type of distributed computing is the client-server architecture, which partitions computational workloads between a centralized node (the server) and the edge nodes (the clients) with which it shares resources and data. In a more complex arrangement, the computations of a single application are partitioned into separate but interconnected functional tiers; for example, a traditional 3-tier architecture separates the user interface (presentation logic) from data storage (data access logic), with an information exchange layer (business logic) connecting the two. The 3-tier architecture is the primary model of distributed computing on the web.
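The separation of tiers described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `register` operation and the in-memory dictionary standing in for a database are all hypothetical, and in a real deployment each tier would typically run on its own machine and talk to its neighbour over the network.

```python
class DataTier:
    """Data access logic: the only tier that touches storage."""

    def __init__(self):
        self._rows = {}  # hypothetical in-memory stand-in for a database

    def save(self, key, value):
        self._rows[key] = value

    def load(self, key):
        return self._rows.get(key)


class LogicTier:
    """Business logic: validates requests and mediates between the tiers."""

    def __init__(self, data):
        self._data = data

    def register(self, username):
        name = username.strip().lower()
        if not name:
            raise ValueError("username must not be empty")
        if self._data.load(name) is not None:
            return False  # already registered
        self._data.save(name, {"name": name})
        return True


class PresentationTier:
    """Presentation logic: turns results into text for the user."""

    def __init__(self, logic):
        self._logic = logic

    def handle(self, username):
        try:
            created = self._logic.register(username)
        except ValueError as err:
            return f"Error: {err}"
        return "Welcome!" if created else "Already registered."


def build_app():
    # Each tier only knows about the tier directly beneath it.
    return PresentationTier(LogicTier(DataTier()))
```

Because the presentation tier never reaches into storage directly, any one tier could in principle be replaced or moved to another machine without rewriting the other two.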

More powerful results can be achieved by what is known as a ‘cluster’: a large set of machines coupled into a powerful and robust unit. A clustered architecture is essential to modern high-performance scientific computing. By contrast, a peer-to-peer architecture divides computational responsibility equally between a large number of loosely coupled computers. Peer-to-peer file sharing networks like BitTorrent, and anonymity networks like Tor, both work on this principle.

In all of these architectures, the computations are distributed in more than one sense: they can be both separated in physical space and dissected into separate, autonomous but interacting processes that communicate via message passing.
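Message passing between autonomous workers can be illustrated with a minimal sketch. It assumes that threads on one machine stand in for processes on separate machines, and that in-memory queues stand in for the network; the function names `worker` and `distributed_sum` are made up for this example. The point is only that the workers share no state and cooperate purely by exchanging messages.

```python
import queue
import threading


def worker(inbox, outbox):
    """Autonomous worker: shares no state, only reads and writes messages."""
    while True:
        message = inbox.get()
        if message is None:  # sentinel message: no more work
            break
        job_id, numbers = message
        outbox.put((job_id, sum(numbers)))


def distributed_sum(chunks, n_workers=2):
    """Send each chunk to the workers and combine their partial sums."""
    inbox, outbox = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    for job_id, chunk in enumerate(chunks):
        inbox.put((job_id, chunk))
    for _ in workers:
        inbox.put(None)  # one sentinel per worker
    for w in workers:
        w.join()
    # Exactly one reply was produced per chunk.
    total = 0
    for _ in chunks:
        _job_id, partial = outbox.get()
        total += partial
    return total
```

Calling `distributed_sum([[1, 2, 3], [4, 5], [6]])` splits three chunks across two workers and adds up the partial results they send back.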

With the right technical implementation, distributed computing has three primary advantages for fast and stable web services: the increased efficiency in terms of both lower cost and higher performance gained by clustering a set of low-end computational units based on commodity hardware; the increased reliability that is gained by avoiding a single point of failure in the system; and the relative ease of scaling the network up or down by bringing additional nodes online or offline.
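The last two advantages, avoiding a single point of failure and scaling by bringing nodes online or offline, can be sketched with a toy pool of nodes. Everything here is an assumption made for illustration: the `Node` and `Pool` classes, the round-robin dispatch and the use of `ConnectionError` to signal a dead node are hypothetical, and a production load balancer would add health checks, timeouts and much else.

```python
class Node:
    """A hypothetical commodity machine that may be up or down."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class Pool:
    """Round-robin dispatcher that skips failed nodes."""

    def __init__(self):
        self._nodes = []
        self._next = 0

    def add(self, node):
        # Scaling up: bring an additional node online.
        self._nodes.append(node)

    def remove(self, name):
        # Scaling down: take a node offline.
        self._nodes = [n for n in self._nodes if n.name != name]

    def dispatch(self, request):
        """Try each node at most once, starting from the round-robin cursor."""
        for _ in range(len(self._nodes)):
            node = self._nodes[self._next % len(self._nodes)]
            self._next += 1
            try:
                return node.handle(request)
            except ConnectionError:
                continue  # no single point of failure: try the next node
        raise RuntimeError("no healthy nodes available")
```

As long as at least one node in the pool is healthy, a request is still served; the service fails only when every node is down or the pool is empty.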

Enterprises whose business depends upon ownership of capital-intensive data centers have begun to offer on-demand rental access to these computational resources to individuals and to small and medium-sized companies. These services treat computation as a pure utility, insofar as the details of the where, the what and the how are abstracted away from their users. In this way, cloud computing provides the power of high-performance and dynamically scalable resources to users, with lower barriers to entry and minimal capital expenditure.

At the same time, the very innovations that free consumers from needing expertise in the underlying infrastructure of these computing platforms in the last analysis rob those consumers of control over the resources. Cloud computing, as the pure exemplar of distributed computing technology, is also the pinnacle of centralized control over computing resources.

Online file storage and back-up services such as Dropbox (http://dropbox.com) have made it easy for individuals to move their home folders into the “cloud” and sync personal files across all computing devices, whether laptop, phone or tablet. Website developers are likewise able to deploy and manage web applications in the “cloud” that can effectively scale from dozens up to millions of users, by availing themselves of services such as Engine Yard (http://engineyard.com) or Heroku (http://heroku.com).

But there is a price to be paid for this convenience. Dropbox, Engine Yard and Heroku are not themselves in the business of cloud computing. Each of them, like hundreds of other services, is merely a clever interface to Amazon Elastic Compute Cloud (http://aws.amazon.com/ec2/). While having your data and online accounts backed by Amazon’s data centers may sound like your best guarantee of stability, it also means surrendering control of these data to a single company. This threat became real enough for one organization when Amazon stopped hosting the WikiLeaks website after succumbing to government coercion.1

  1. http://www.guardian.co.uk/media/2010/dec/01/wikileaks-website-cables-servers-amazon