March 17th, 2009

Cloud Computing: Defending the Undefinable

My notes from the 2009 Cloud Computing panel at SxSW Interactive. My own research seems to dominate the comments below... oh well, it's useful stuff!

The brave new world of cloud computing is radically changing how we build web applications. What is a platform, what is a service, and how will the future of web applications be built? More importantly, how do these various clouds compare, and what do the differences mean? Are they ready for your world-rockin’ startup? In this panel, we’ll get nerdy with technical details, you’ll yell at us, and we’ll argue why your app should already be in the cloud.



Kevin Gibbs Tech Lead & Mgr, Google App Engine, Google



Yousef Khalidi Distinguished Engineer, Microsoft



Werner Vogels CTO, Amazon.com



Google App Engine is Python-only right now. The implication is that it’s a very high-level abstraction, not a mobile Linux instance with root access, or similar, as in the EC2 world.



That means Amazon is the only realistic choice for a PHP shop right now.



An early EC2 management console in PHP:



http://developer.amazonwebservices.com/connect/entry.jspa?externalID=442



“But now you don’t host Amazon on AWS.”



“Yes we do.”



“But when AWS goes down, Amazon doesn’t…”



“... We launch new instances on the fly, etc. It is absolutely hosted on Amazon Web Services.”



http://aws.amazon.com/what-is-aws/



Gibbs, Google:



“Make your front end code stateless and your back end code stateful.”



“Our database system does not support joins. That’s something we removed in order to help you scale… joins break scalability.”



Tom: on Amazon you can run your own MySQL instances if you feel like it, but then you run into these same scaling issues.



Why are joins evil for scalability? Because when you select against a single table, it’s easy to say “okay let’s sling this query against all four database servers and bang we get an aggregate result back and we’re done” or, even better, “we know the ID being asked for is on a single db server so all of the work can be completed there.” But joins complicate this because they require data that’s scattered between servers in a complex relationship.
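To make that concrete, here is a minimal sketch in Python of four pretend database servers. Everything in it is an assumption made up for illustration (the modulo sharding rule, the users and orders tables, the function names); it is not how App Engine or any real datastore is implemented. It shows the two easy cases (a lookup that lands on one server, and a single-table query fanned out to all servers) and then counts how often a join has to reach across servers.

```python
# Toy model: four "database servers", each holding a slice of two tables.
# All names here (NUM_SHARDS, shard_for, users, orders) are hypothetical.
NUM_SHARDS = 4

def shard_for(key):
    """Pick the server that owns a given integer ID (simple modulo rule)."""
    return key % NUM_SHARDS

# Each shard is a dict holding its own slice of each table.
shards = [{"users": {}, "orders": {}} for _ in range(NUM_SHARDS)]

def put(table, key, row):
    shards[shard_for(key)][table][key] = row

for user_id, name in [(1, "ann"), (2, "bob"), (3, "cat"), (4, "dan")]:
    put("users", user_id, {"id": user_id, "name": name})

# Orders are sharded by order ID, so a user's orders land on arbitrary servers.
for order_id, user_id, total in [(10, 1, 5), (11, 1, 7), (12, 2, 3), (13, 3, 9)]:
    put("orders", order_id, {"id": order_id, "user_id": user_id, "total": total})

def get_user(user_id):
    """Single-key lookup: exactly one server does all the work."""
    return shards[shard_for(user_id)]["users"].get(user_id)

def count_orders_over(minimum):
    """Single-table query: ask every server, then add up the partial answers."""
    return sum(
        sum(1 for order in shard["orders"].values() if order["total"] > minimum)
        for shard in shards
    )

def orders_with_user_names():
    """A join: each order needs a user row that may live on another server."""
    cross_server_lookups = 0
    rows = []
    for shard_index, shard in enumerate(shards):
        for order in shard["orders"].values():
            if shard_for(order["user_id"]) != shard_index:
                cross_server_lookups += 1  # this order's user lives elsewhere
            user = get_user(order["user_id"])
            rows.append((order["id"], user["name"]))
    return sorted(rows), cross_server_lookups
```

In this tiny data set every order sits on a different server from its user, so the join pays a cross-server hop for every row, while the first two queries never need one server to talk to another.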



The alternative is a simple datastore with put and get operations, which forces programmers to think in those terms; when you add serialization and unserialization of objects on top of that, you get an “object database,” but that’s just a glorification of this simple-store concept.
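As a rough sketch of that layering, assuming an in-memory dict stands in for the real storage service: SimpleStore below only knows put and get on bytes, and ObjectStore adds JSON serialization on top of it. Both class names and the Talk record are hypothetical, and JSON is just one convenient choice of serialization format.

```python
import json
from dataclasses import dataclass, asdict

class SimpleStore:
    """A key-value store that only knows put and get on byte strings."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        if not isinstance(value, bytes):
            raise TypeError("the store only holds bytes")
        self._data[key] = value

    def get(self, key):
        # Returns None when the key has never been stored.
        return self._data.get(key)

@dataclass
class Talk:
    """Hypothetical record type used only for this example."""
    title: str
    year: int

class ObjectStore:
    """The 'object database': serialization wrapped around put and get."""

    def __init__(self, store):
        self._store = store

    def save(self, key, obj):
        # Serialize the dataclass to JSON bytes before handing it to the store.
        self._store.put(key, json.dumps(asdict(obj)).encode("utf-8"))

    def load(self, key, cls):
        raw = self._store.get(key)
        if raw is None:
            return None
        # Unserialize: rebuild the object from its stored fields.
        return cls(**json.loads(raw.decode("utf-8")))

objects = ObjectStore(SimpleStore())
objects.save("talk:1", Talk(title="Cloud Computing", year=2009))
restored = objects.load("talk:1", Talk)
```

Note that there is no query language anywhere in this: if you want to find talks by year, you have to maintain your own index under another key, which is exactly the kind of thinking this style of datastore forces on you.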



A great discussion of this can be found here:



http://www.10gen.com/blog/2008/7/databases-and-the-cloud



See also Amazon SimpleDB:



http://aws.amazon.com/simpledb/



And Google BigTable:



http://labs.google.com/papers/bigtable.html



You might think this stuff looks a lot like a filesystem, and you’d be right. You could absolutely build this on top of Amazon S3 alone. However, Amazon SimpleDB is optimized for little stuff with fancy indexing, while Amazon S3 is optimized for bigger blobs (i.e. what we think of as “files”):



http://aws.amazon.com/simpledb/faqs/#When_should_I_use_Amazon_S3_vs_Amazon_SimpleDB






Khalidi, Microsoft:



“We have a long tail of little applications within Microsoft… it’s not cost effective to maintain servers for them manually.”



Microsoft might start running private clouds on behalf of particular clients at some point.



Amazon offers “Elastic IP Addresses” which can be mapped to any instance on the fly. This means that you can do failover among DNS servers, not just web servers.
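The remapping idea can be sketched with a toy model. This is not the AWS API: ElasticIpTable, fail_over, the instance IDs and the health check are all hypothetical stand-ins, and the address comes from a documentation-only range. The point is only that the public address stays fixed while the instance behind it changes.

```python
class ElasticIpTable:
    """Toy stand-in for a provider's address mapping; not the real AWS API."""

    def __init__(self):
        self._mapping = {}  # public address -> instance id

    def associate(self, address, instance_id):
        self._mapping[address] = instance_id

    def resolve(self, address):
        return self._mapping.get(address)

def fail_over(table, address, candidates, is_healthy):
    """Keep the current target if it is healthy; otherwise remap the
    address to the first healthy candidate."""
    current = table.resolve(address)
    if current is not None and is_healthy(current):
        return current
    for instance_id in candidates:
        if is_healthy(instance_id):
            table.associate(address, instance_id)
            return instance_id
    raise RuntimeError("no healthy instance available")

table = ElasticIpTable()
table.associate("203.0.113.10", "i-primary")

# Pretend the primary has just died and only the standby answers health checks.
healthy_instances = {"i-standby"}
now_serving = fail_over(
    table,
    "203.0.113.10",
    ["i-primary", "i-standby"],
    lambda instance_id: instance_id in healthy_instances,
)
```

Because clients keep using the same address, this works for any kind of server sitting behind it, which is why it applies to DNS servers as much as web servers.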



Microsoft says they will offer multiple stacks, including LAMP and “legacy stacks.”



Amazon still seems like the only “serious” cloud provider right now.



Question: are enterprise clients resistant to cloud computing?



Amazon: most enterprises are getting comfortable with at least moving software services into the cloud… reduced capital costs are a big deal right now.



“Is Amazon Web Services profitable?”



“We don’t give out those numbers.”