My presentation on cloud hosting providers, "Compare and Contrast", at Boston Barcamp attracted a good crowd, including some guys with expertise in Google App Engine and Microsoft Azure. I divided the cloud hosting world into two main categories. Category one is hosts that give you individual servers, like Amazon EC2 and Rackspace. Category two is hosts that give you a scalable cluster of services, like Google App Engine or MS Azure. When should you commit your business to one of these models?
First, I will review how servers came to be concentrated in big datacenters, where they are now in the process of becoming virtual or "cloud" servers. In the beginning (1994/5 for me), Internet businesses would set up servers in their own offices. You had to beg, beg, beg your monopoly phone provider to pull a wire into your office. If that worked, you had the problem that power would go out on hot summer days and your servers would go down.
Anybody serious started moving servers to datacenters where they could get "co-location". The datacenter brought in multiple high-speed network fibers, and provided backup power generators, and a rack. You brought in the machinery, and stuck it in a rack. This was OK if you were willing to wait for vendors to deliver your equipment, and if you lived close enough to the datacenter to visit and crawl into the rack. There are still a lot of people that do this. It seems slow and painful, but it gets you exactly the network topology and equipment that you want.
Here, you already see one of the main tendencies of the hosting business. There is an advantage to going with bigger providers, who can build huge datacenters with big power systems and big network pipes, run multiple datacenters for emergencies, and absorb or deflect DDoS attacks.
In the next phase, you could call the hosting vendor and rent a server that the vendor would own and stick in the rack, on as little as one hour's notice. That saves a lot of time and energy. I would guess that most people who are moving to "cloud" or virtual hosting are moving from this model. The rental model can be more economical if you need a fixed set of servers. However, it has problems with reliability. If a server goes down, it takes some time to fix it or replace it, and you are relying on the inconsistent expertise of the host admins. Essentially, you need to buy two of everything to get hot swaps. We've seen problems ranging from "your ethernet cable pulled out" to "Our power room exploded and 9,000 servers are down."
Get a Server: Virtual hosts
When Xen server virtualization became available, you could order a virtual server. The menu of servers from these first-generation providers looks very similar to the menu for the non-virtual rentals. You can select from a fixed set of operating system configurations and "slice" sizes. However, you can get the servers a lot faster, you can get smaller slices, and in most cases you can save images and swap instances on the fly. These hosts can be considered "cloud" hosts because they will deliver a server on demand.
Get Your Server: Virtual hosts with custom images
In the following generation, you can build your own server images and save them in a catalog. Obviously, this gives you a bigger menu of configurations, which makes it easier to get started. The most important impact is that it speeds up the process of replacing servers or expanding your server cluster. You no longer need to have two of everything. You also get an API for starting and stopping server instances, so you can scale your server count or automate hot swaps.
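To make the "API plus saved images" idea concrete, here is a minimal sketch of an automated hot swap and a scale-up. Everything in it is an assumption for illustration: `FakeCloud`, `start_instance`, `stop_instance`, and the image name are hypothetical in-memory stand-ins, not any vendor's real API.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeCloud:
    """In-memory stand-in for a host's instance API (hypothetical, not a real SDK)."""
    instances: dict = field(default_factory=dict)  # instance id -> {"image", "healthy"}
    _ids: count = field(default_factory=lambda: count(1))

    def start_instance(self, image_id):
        instance_id = f"i-{next(self._ids)}"
        self.instances[instance_id] = {"image": image_id, "healthy": True}
        return instance_id

    def stop_instance(self, instance_id):
        del self.instances[instance_id]


def hot_swap(cloud, image_id):
    """Replace every unhealthy instance with a fresh one booted from the saved image."""
    replaced = []
    # Iterate over a snapshot, because the loop adds and removes instances.
    for instance_id, info in list(cloud.instances.items()):
        if not info["healthy"]:
            new_id = cloud.start_instance(image_id)  # boot the replacement first
            cloud.stop_instance(instance_id)         # then retire the failed server
            replaced.append((instance_id, new_id))
    return replaced


def scale_to(cloud, image_id, target):
    """Start or stop instances until exactly `target` are running."""
    while len(cloud.instances) < target:
        cloud.start_instance(image_id)
    while len(cloud.instances) > target:
        cloud.stop_instance(next(iter(cloud.instances)))
```

The point of the sketch is that the spare server no longer has to sit idle in a rack. The replacement only exists from the moment the API call is made.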
Amazon was the first vendor to provide this level of handling for virtual server images, including a catalog, and they dominate this business now. At Assembla, we intend to use custom server images to provide workspaces for popular development platforms that include instant-on staging servers with pre-configured continuous integration. I have talked to a number of other virtual hosting providers, and most of them do not support a catalog of customized server images. It's an important capability, but there seem to be some obstacles to implementing it. In any case, I think all virtual hosting vendors will eventually offer a catalog of images.
Most production systems involve a cluster of servers with different images connected in a particular topology. This ranges from a simple app+db cluster to the 6 server types we need to run Assembla.com. There are vendors that sell management software to manage these clusters on server-based virtual hosting, and Sun is showing a prototype of their service which includes a graphical editor for building and deploying clusters.
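As a rough illustration of what "a cluster of servers with different images connected in a particular topology" can look like as data, here is a small sketch. The role names, image names, and the `boot_order` helper are all hypothetical. Only the app+db pairing comes from the example above, and the "proxy" role is added just to make the ordering visible.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical cluster description: role -> image to boot and roles it depends on.
CLUSTER = {
    "db":    {"image": "db-image",    "needs": []},
    "app":   {"image": "app-image",   "needs": ["db"]},
    "proxy": {"image": "proxy-image", "needs": ["app"]},
}


def boot_order(cluster):
    """Return roles ordered so each one starts after the roles it depends on."""
    graph = {role: spec["needs"] for role, spec in cluster.items()}
    return list(TopologicalSorter(graph).static_order())


for role in boot_order(CLUSTER):
    print(f"start {role} from {CLUSTER[role]['image']}")
```

A management tool or graphical editor is, in effect, maintaining a description like this and turning it into start and stop calls.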
Get a scalable cluster of services: Google App Engine, MS Azure, Appforce, Morph, AWS
All of the above hosts will sell you a server. You are responsible for deciding how many servers you want, and configuring the servers. The next emerging category of cloud services doesn't sell you a server. They provide you with services that run on a cluster of servers, which they maintain and configure.
The upside of this approach is scalability. As long as you follow their rules, they can run your application or service request on any of their servers, and they have a lot of servers running. For example, with Google App Engine you build a stateless application that can respond to a web request by starting fresh and looking in Google's "datastore". Google can run this type of stateless application anywhere on their vast network of servers, and you get an app that can handle any load.
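Here is a minimal sketch of what "stateless" means in practice. It is not Google's API: the `Datastore` class and `handle_request` function are hypothetical stand-ins, and a real datastore would also need transactions to make the counter safe under concurrent requests.

```python
class Datastore:
    """Hypothetical shared store, standing in for the host's managed datastore."""

    def __init__(self):
        self._rows = {}

    def get(self, key, default=None):
        return self._rows.get(key, default)

    def put(self, key, value):
        self._rows[key] = value


def handle_request(datastore, user):
    """Stateless handler: start fresh, read state from the datastore, write it back."""
    visits = datastore.get(user, 0) + 1
    datastore.put(user, visits)
    return f"Hello {user}, visit {visits}"


# Simulate three different servers taking turns on the same user's requests.
# No server keeps anything in memory between requests, so it does not matter
# which one answers.
shared = Datastore()
responses = [handle_request(shared, "alice") for _server in range(3)]
```

Because the handler keeps nothing between calls, the host is free to run it on whichever machine has capacity.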
Amazon Web Services (AWS) is also a big player in the services business. They offer scalable services for data storage, computation, queuing, etc. There are also third-party services like Morph and Heroku that build on the Amazon platform by organizing servers into services. Salesforce.com has a relatively mature offering that they call the Force.com platform. Microsoft is launching the Azure Services Platform, which gives you servers and services that are almost, but crucially not quite, like your installed Windows servers.
The downside of this approach is that you have to follow their rules. This category of services is emerging, not emerged. The rules can be quite strict, and the services are incomplete. For instance, some services allow you to build apps that use database data but cannot save files. Google apps have to be stateless, and written in Python or, more recently, a subset of Java. These restrictions are necessary if you want scalability, because the servers in the cluster are all the same. They aren't configured with your special stuff.
Clearly, these services will become more complete and ever more compelling. That is going to force buyers to make some decisions. You can move a server to a new host, but you typically can't move a service. How much do you trust these vendors? How much are you willing to invest in a product that can only run if one of these vendors supports it? Is Google always not evil? Does Microsoft have a stable business model? Will Salesforce.com jack up their already high prices, or just compete directly with a successful service?
Entrepreneurs are also going to have to ask themselves which services are going to be the domain of the big players that run the datacenters (Amazon, Google, Microsoft, Sun/Oracle) and which services will run as value-added applications suitable for startups.
I think that as the "service" providers get more mature, there will be some portable standards and open source solutions, which will make it easier to move not only servers but also services, and that will drive adoption. For example, Amazon's compute service is based on Hadoop, an open source framework that you can run on other hosts.
Please let me know if I missed something in your area of expertise. Thanks.