Amazon removes the database scaling wall
Saturday, December 15, 2007 by Dave Winer.
When Amazon introduced S3 in March 2006 I knew I would use it and I was sure a lot of other developers would. I saw it as a solution to a problem we all have -- storage that scales up when needed, and scales down when not. Otherwise we all have to buy as much bandwidth as we need in peak periods. With S3, you pay for what you use. It makes storage for Internet services more rational. Later they did the same for processors and queuing. And a couple of days ago they announced a lightweight scalable database, using the same on-demand philosophy and simple architecture and API. It's going to be a huge hit and forever change the way apps are developed for the Internet.
When I developed Frontier in the late 80s and early 90s my target platform was a modern desktop computer, a few megabytes of RAM, a half-gig of disk, a few megahertz CPU. A system capable of running Quark XPress, Hypercard or Filemaker. It would be used to develop apps that would drive desktop publishing. Later, it was used to generate static websites, then a demonstration of democracy (a multi-author ultra-simple CMS), then news sites, which became weblogs, then blogs, and editthispage.com, Manila, weblogs.com, and that's when scaling became an issue. (Later we side-stepped the scaling issue by moving most of the processing to the desktop with Radio 8.)
Same with weblogs.com. It worked great when there were a few thousand blogs. Once we hit 50K or so, we had to come up with a new design. Eventually we were tracking a couple million, and Frontier was hopelessy outclassed by the size of the problem.
If only Amazon's database had been there, both Manila and weblogs.com could have been redesigned to keep up. It would have been a huge programming task for Manila, but it would have made it economically possible.
Today, when a company raises VC, it's probably because their app has achieved a certain amount of success and to get to the next level of users they need to spend serious money on infrastruture. There's a serious economic and human wall here. You need to buy hardware and find the people who know how to make a database scale. The latter is the hard problem, the people are scarce and the big companies are bidding up the price for their time. Now Amazon is willing to sell you that, to turn this scarce thing into a commodity, at what likely is a very reasonable price. (Haven't had time to analyze this yet, but the other services are.) Key point, the wall is gone, replaced with a ramp. If you coded your database in Amazon to begin with you will never see the wall. As you need more capacity you have to do nothing, other than pay your bill.
Further, the design of Amazon's database is remarkably like the internal data structures of modern programming languages. Very much like a hash or a dictionary (what Perl and Python call these structures) or Frontier's tables, but unlike them, you can have multiple values with the same name. In this way it's like XML. I imagine all languages have had to accomodate this feature of XML (we did in Frontier), so they should all map pretty well on Amazon's structure. This was gutsy, and I think smart.
They're going down a road we went down with XML-RPC and then SOAP. There may be some bumps along the way but there are no dead-ends, no deal-stoppers. All major environments can be adapted to work with this data structure, unless I'm missing something (standard disclaimers apply).
Their move makes many things possible. As I said earlier, if it existed when we had to scale weblogs.com, we would certainly have used it. One could build an open identity system on it, probably in an afternoon, it would be perfect for that. A Twitter-like messaging system, again, would be easy. It's amazing that Microsoft and Google are sitting by and letting Amazon take all this ground in developer-land without even a hint of a response. It seems likely they have something in the works. Let's hope there's some compatibility.