
Preserving Manila sites
By Dave Winer on Saturday, November 19, 2011 at 9:44 AM.

When you've been blogging as long as I have, you leave behind a trail of sites that need to be kept accessible. The longer you blog, the more maintenance work you accumulate. Non-technical people don't seem to realize that the sites don't just take care of themselves. The environments they run in change, in so many ways.

A bunch of these sites are Manila sites, which has meant that I've had to keep a Manila server running. Long-term this isn't viable, and there are problems even in the immediate term. This last week the server that's hosting the Manila sites was getting some abusive traffic. This meant that the legacy sites were mostly inaccessible. And I found out how much developers and writers depend on them. I want them to be able to depend on them, even after I die. So this week I decided to do something about it.

Migrating Manila sites to static sites has been something I've wanted to do for a long time. I've even had some starts at it, but investing in the past is always less interesting than investing in the future. So I never quite completed the projects.

I picked up on one of those starts, a tool called staticManilaSites.root, and got it to process the XML-RPC site. I had used this tool previously for some simpler sites, like bloggercon.org and thetwowayweb.com. But the XML-RPC site was one of the first Manila sites, and it used some features in one-off ways that later versions of the software made impossible. But I got through that.

I built another tool called redirector.root, that I pointed xmlrpc.com at. It in turn redirects to the new S3 static site, xmlrpc.scripting.com. I had to point through an intermediate host for a couple of reasons: 1. There's no way to point a root domain at an S3 bucket, since the pointer must be a CNAME. 2. There is no equivalent of an .htaccess file for redirecting on Amazon, and the URLs had subtle changes. For example, spec changed to spec.html. That's why I needed a separate redirector app, and it couldn't be a static site.
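To make the URL changes concrete, here is a minimal Python sketch of the kind of path mapping a redirector like this might do. It is not the code in redirector.root. The only rule taken from the post is that spec becomes spec.html; the function name `map_old_path` and the handling of query strings and trailing slashes are assumptions for illustration.

```python
from posixpath import splitext

# Target host named in the post. The plain-http scheme is an assumption.
NEW_BASE = "http://xmlrpc.scripting.com"


def map_old_path(path: str) -> str:
    """Map an old dynamic-site path to an assumed static-site URL.

    Hypothetical rule: a path with no file extension gets ".html" appended
    (so /spec becomes /spec.html); everything else passes through unchanged.
    """
    # Drop any query string; a static site cannot act on it.
    path = path.partition("?")[0]
    if path in ("", "/"):
        return NEW_BASE + "/"
    trimmed = path.rstrip("/")
    _, ext = splitext(trimmed)
    if not ext:
        trimmed += ".html"
    return NEW_BASE + trimmed
```

With this rule, `map_old_path("/spec")` gives `http://xmlrpc.scripting.com/spec.html`, while a path that already has an extension, such as an image, is left alone.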

At first, I used a temporary redirect in case I made mistakes, or wanted to change strategies. When I was comfortable that everything was working, I changed it to a permanent redirect. Hopefully the search engines will adapt, and stop pointing to the old pages, and eventually the redirector will only be needed for the various incoming pointers from sites that build on XML-RPC. There are a lot of them, and they generate a substantial amount of traffic. I know people think XML-RPC is "dead" but you'd be amazed how much development goes on in this supposedly dead protocol (which is one of the reasons I argue whenever people use that awful word). I mapped the four versions of the domain to the new S3 bucket: xml-rpc.com, xmlrpc.com, xml-rpc.org and xmlrpc.org. Eventually I'll probably give up all but one of these domains.
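The temporary-then-permanent switch corresponds to HTTP status 302 versus 301. The sketch below shows one way that could look using Python's standard `http.server`. It is an illustration under stated assumptions, not the actual redirector: the `PERMANENT` flag, the helper names, the `www.` handling and the port are all hypothetical, and the Location header here reuses the request path without any rewriting.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical switch: leave False while testing, set True once the new URLs check out.
PERMANENT = False

# The four hostnames listed in the post, all sent to the same assumed target.
OLD_HOSTS = {"xml-rpc.com", "xmlrpc.com", "xml-rpc.org", "xmlrpc.org"}
NEW_BASE = "http://xmlrpc.scripting.com"


def redirect_status(permanent: bool) -> int:
    """301 tells clients and crawlers the move is permanent; 302 says it may change."""
    return 301 if permanent else 302


def normalize_host(host_header: str) -> str:
    """Lowercase the Host header, then strip any port and a leading 'www.'."""
    host = host_header.split(":")[0].lower()
    return host[4:] if host.startswith("www.") else host


class Redirector(BaseHTTPRequestHandler):
    def do_GET(self):
        if normalize_host(self.headers.get("Host", "")) not in OLD_HOSTS:
            self.send_error(404)
            return
        self.send_response(redirect_status(PERMANENT))
        self.send_header("Location", NEW_BASE + self.path)
        self.end_headers()

    do_HEAD = do_GET  # answer HEAD requests the same way


def serve(port: int = 8080) -> None:
    """Run the redirector until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("", port), Redirector).serve_forever()
```

Calling `serve()` starts the listener. Starting with 302 keeps browsers and search engines from caching a mapping that might still be wrong.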

I also moved the domains, as I worked through them, to hover.com. I feel better about working with them than I do with GoDaddy. They're very anxious to please, and I feel like I'm one of their first customers, but I've also known them for a long time (they're part of Tucows).

I also wrote a script to scrape all the URLs out of the site into a table, where I manually review them, and eliminate all the pointers that aren't images or downloads. Then I wrote a script that moves the assets from their old locations, which are pretty scattered, into a sub-folder of the bucket that contains the site. This reduces the probability of images or downloads being lost. Luckily none of them were gone; they were all still there. Which is surprising for a site that's over ten years old, like xmlrpc.com.
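The scrape-and-filter step could be sketched in Python along these lines. This is not the script described above, which relied on manual review. The extension list, the `assets/` sub-folder name and the function names are assumptions, and the extension filter is only a rough first pass before a human looks at the table.

```python
from html.parser import HTMLParser
from posixpath import splitext
from urllib.parse import urljoin, urlparse

# Assumed set of extensions that count as images or downloads.
ASSET_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png", ".zip", ".sit", ".hqx", ".pdf"}


class LinkCollector(HTMLParser):
    """Collect every href and src value on a page as an absolute URL."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.urls.append(urljoin(self.base_url, value))


def collect_urls(html: str, base_url: str) -> list:
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.urls


def looks_like_asset(url: str) -> bool:
    """True when the URL path ends in one of the assumed asset extensions."""
    return splitext(urlparse(url).path)[1].lower() in ASSET_EXTENSIONS


def asset_key(url: str) -> str:
    """Hypothetical destination key: one sub-folder, grouped by original host."""
    parts = urlparse(url)
    return "assets/" + parts.netloc + parts.path
```

Running `collect_urls` over each rendered page and keeping the URLs where `looks_like_asset` is true yields the table to review; `asset_key` then gives each scattered file a predictable home next to the site.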

Then I converted www.opml.org to dev.opml.org, which is also an S3 site. And now I'm almost finished with outliners.com, which is moving to outliners.scripting.com. Safing-up the outliners site is particularly satisfying, because it was a project I did in 2000 to safe-up work that was done in the 80s. That proves something important. Unfortunately we are nowhere near a stable situation, though. It may feel right now that S3 is a very safe place to store static content, but it's likely that 10 or 20 years from now it won't be. I remember keeping stuff on 5-inch Apple diskettes, thinking that we'd always be able to read those. Heh. In 2011 you can tell how silly an idea that was. But in 1982, it would have seemed quite rational.

I feel at this time I have a process, albeit a manual one, for converting a Manila site to a static S3 site. I have to do the same for my father's and uncle's sites. And a bunch of other Manila sites. My goal is to be able to turn off the Manila server altogether, soon.






© Copyright 1997-2011 Dave Winer. Last update: Saturday, November 19, 2011 at 10:36 AM Eastern. Last build: 12/12/2011; 1:21:54 PM. "It's even worse than it appears."
