
Twitter clients could help with backup

Thursday, June 04, 2009 by Dave Winer.

I don't know to what extent Twitter archives my posts. For example, here's a blog post from January of this year. It links to a tweet, which is still there. But I'm not sure whether Twitter keeps older stuff around, or how I would browse it if I wanted to see what I had written. Twitter's search command stops at a certain point -- exactly where, I don't think anyone knows.

Uncertainty about what's backed up is a sure sign of a problem with backups.

Because I want a record of whatever I post to Twitter, I wrote an app that archives all my posts and those of people I follow. It's an easy bit of code to write: Twitter has an API call that returns the recent tweets of everyone I follow. Every client has to make this call regularly, so it has to be efficient on both ends, and it is. I've shared the code; anyone can download it for free. It runs in the OPML Editor. But I think more developers should add this to their Twitter clients as a service to their users, as a competitive advantage, and as a way of making the work we do safer.
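In outline, the archiving side of that loop amounts to very little code. Here's a sketch in Python rather than the OPML Editor's scripting language; the field names (`id`, `user`, `text`) are assumptions about the shape of a timeline API response, not Winer's actual code:

```python
# Each poll of the friends timeline returns the recent tweets of
# everyone you follow, overlapping with the previous poll. Keep only
# the tweets not already seen, filed per user.

def archive_new(tweets, archive, seen_ids):
    """tweets: list of dicts with 'id', 'user', 'text' keys (assumed
    API payload shape); archive: dict mapping user -> list of tweets;
    seen_ids: set of tweet ids already stored."""
    for tweet in tweets:
        if tweet["id"] in seen_ids:
            continue  # overlap with an earlier poll; already archived
        seen_ids.add(tweet["id"])
        archive.setdefault(tweet["user"], []).append(tweet)
    return archive
```

Because each call returns everything recent from everyone you follow, one poll covers the whole social graph around you -- that's why the call has to be efficient on both ends.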

Right now we're all running without a safety net. Or more accurately, only Twitter knows how much of a safety net we have. And as we saw in the financial meltdown, it's not wise to assume that people we depend on to understand how complex systems work actually understand how they work.

In software, it's always a good idea to back up your work. And the people who make the popular Twitter clients could do a lot to help us there.

Discussion:

1. Some environments allow apps to write to local disks, and others don't. I don't know if AIR, the platform many of the Twitter clients run on, allows this. If so, then I recommend that the clients simply maintain a calendar-structured folder of XML files containing each day's tweets, one file for each user. If not, then the backup has to be maintained in the cloud.
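The calendar-structured folder could be as simple as this -- a sketch where the exact layout and file naming are my assumptions, not a fixed convention:

```python
import os
from datetime import date

def tweet_file_path(root, user, day):
    # One XML file per user per day, laid out by calendar:
    # e.g. archive/2009/06/04/dave.xml
    return os.path.join(root, "%04d" % day.year, "%02d" % day.month,
                        "%02d" % day.day, "%s.xml" % user)
```

A layout like this makes browsing by date trivial with any file manager, no special software needed.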

2. The size of these files is negligible in the age of MP3 and AVI. Text files are tiny and disks are relatively huge. Size isn't an issue.

3. Neither is performance. The file systems of today's computers are incredibly good at saving small text files.

4. It might add a little complexity to the Prefs user interface. At least it would require a panel that allows the user to choose a folder, and to enable or disable the feature. I would have it enabled by default.

5. You might want to allow the user to save his or her backup in Amazon S3 or to use FTP to upload to another server. Again, the overhead is negligible. I have the software running on my desktop system in the background. It's just an ordinary iMac. I don't notice any delays. Honestly.

6. What format to use? The simplest choice would be to use the XML-based format that Twitter itself uses. Other choices include RSS, Atom, OPML, or something of your own invention. I think RSS is the most rational choice, but I used OPML. I'm beginning to think that was a mistake, though I had good reasons for that choice at the time.
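As a sketch of the RSS option, one tweet maps naturally onto an RSS 2.0 item. The field mapping here (text as description, permalink as link, timestamp as pubDate) is my own assumption, not a prescribed format:

```python
import xml.etree.ElementTree as ET

def tweet_to_rss_item(text, link, pub_date):
    # Build a minimal RSS 2.0 <item> for one tweet.
    item = ET.Element("item")
    ET.SubElement(item, "description").text = text
    ET.SubElement(item, "link").text = link
    ET.SubElement(item, "pubDate").text = pub_date
    return ET.tostring(item, encoding="unicode")
```

A day's worth of items for one user would then go into a single `<channel>` in that user's file.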

7. I also dereference short URLs and store both the long and short version. Wouldn't want to go to all the trouble of backing up the tweets only to find out the URLs broke because tinyurl (or whatever) went away.
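Dereferencing is just following the redirect chain until it stops. A sketch, where `fetch_location` stands in for an HTTP request that returns the Location header of a redirect response, or None once the URL no longer redirects:

```python
def resolve_short_url(url, fetch_location, max_hops=10):
    # Follow the redirect chain from a short URL (tinyurl, etc.)
    # to its long form; max_hops guards against redirect loops.
    for _ in range(max_hops):
        location = fetch_location(url)
        if location is None:
            return url  # no more redirects: this is the long URL
        url = location
    return url
```

Storing both the short and long form means the archive stays useful even if the shortening service disappears.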

8. The most basic reason to do this is backup, and that was the original motivation in my suggestion, in the summer of 2008. I suggested to the client vendors I could reach that they support RSS-based backup. That way, when Twitter went down -- as it was doing regularly then -- their users would not go down. But then Twitter started becoming more reliable, so the urgency of this decreased.

9. However, storing user backups, first on the desktop and then in the cloud, is the first step toward an open, low-tech, simple form of federation that doesn't depend on a central node. If for no other reason, we as a community should start down that road, asap. Murphy's Law says that at some point we will wish we had.

10. I'm sure there are other considerations -- please post comments if you think of any, and I'll add to this list.




 
Dave Winer, 54, pioneered the development of weblogs, syndication (RSS), podcasting, outlining, and web content management software; former contributing editor at Wired Magazine, research fellow at Harvard Law School, entrepreneur, and investor in web media companies. A native New Yorker, he received a Master's in Computer Science from the University of Wisconsin, a Bachelor's in Mathematics from Tulane University and currently lives in Berkeley, California.

"The protoblogger." - NY Times.

"The father of modern-day content distribution." - PC World.

One of BusinessWeek's 25 Most Influential People on the Web.

"Helped popularize blogging, podcasting and RSS." - Time.

"The father of blogging and RSS." - BBC.

"RSS was born in 1997 out of the confluence of Dave Winer's 'Really Simple Syndication' technology, used to push out blog updates, and Netscape's 'Rich Site Summary', which allowed users to create custom Netscape home pages with regularly updated data flows." - Tim O'Reilly.

http://twitter.com/davewiner




© Copyright 1994-2009 Dave Winer.

Last update: 6/4/2009; 8:29:55 AM Pacific. "It's even worse than it appears."
