Home >  Archive >  2010 >  November >  3

About the author

Dave Winer, 56, is a visiting scholar at NYU's Arthur L. Carter Journalism Institute and editor of the Scripting News weblog. He pioneered the development of weblogs, syndication (RSS), podcasting, outlining, and web content management software; former contributing editor at Wired Magazine, research fellow at Harvard Law School, entrepreneur, and investor in web media companies. A native New Yorker, he received a Master's in Computer Science from the University of Wisconsin, a Bachelor's in Mathematics from Tulane University and currently lives in New York City.

"The protoblogger." - NY Times.

"The father of modern-day content distribution." - PC World.

"Dave was in a hurry. He had big ideas." -- Harvard.

"Dave Winer is one of the most important figures in the evolution of online media." -- Nieman Journalism Lab.

10 inventors of Internet technologies you may not have heard of. -- Royal Pingdom.

One of BusinessWeek's 25 Most Influential People on the Web.

"Helped popularize blogging, podcasting and RSS." - Time.

"The father of blogging and RSS." - BBC.

"RSS was born in 1997 out of the confluence of Dave Winer's 'Really Simple Syndication' technology, used to push out blog updates, and Netscape's 'Rich Site Summary', which allowed users to create custom Netscape home pages with regularly updated data flows." - Tim O'Reilly.

Dave Winer's weblog, started in April 1997, bootstrapped the blogging revolution.

Meeting at Library of Congress

I'm on my way to Washington for the second time this week. Last time it was to have a discussion about rebooting the news at the Online News Association. This time I'm going down for a two-day meeting at the Library of Congress to talk about creating an archive of what they refer to as citizen journalism. I think that term doesn't capture what's going on; it's anachronistic, as "horseless buggy" seems today. Who would think you could hitch a horse to a car to get around, but people used to do that. In the future, we'll get our news from the sources, so the question is how to create a record of what the sources are saying.

I'm one of the people who will kick off the meeting by saying how I came to be interested in this subject.

My story will be about various formats that seemed so pervasive that they were safe choices for archiving content. There was a time when the 5.25-inch Apple II floppy disk was ubiquitous. You didn't have to carry a computer with you, because you could be sure there would be an Apple II when you got where you were going. Today, such a disk would be useless. CP/M 8-inch floppies seemed the same way, as did the hard-shell 3.5-inch disks used by the first Mac. Yet none of them have held up over time, and most of the stuff that was written on computers in the 80s is gone now, unless it was printed out. Printing turns out to be a pretty good way to back up digital content. Or it was. Today we create far too much material to rely on printing as a backup. We're going to have to come up with something else.

I came up against this after I left Berkman, when the RSS 2.0 spec, which was stored on one of their servers, became inaccessible. I was using a CMS I had written, and somehow the app had stopped running. The sysadmin of the Berkman site didn't know how to keep it going. That was a big lesson. If you want content to stick around, you have to take deliberate steps to make sure it survives. And there are some best practices. When I focused on this problem, I was able to arrive at a way to store the spec on a Harvard server such that now, six years later, it's still accessible. Whether it will be available next year is anyone's guess.

Academics have always had this problem. A university employs a scholar, sometimes for a lifetime. He or she creates a body of work that then must be made available to future generations. That's why we have libraries at universities. But lately, as with all kinds of intellectual work, scholarship is being done on computers. So when a professor retires or dies, we are left with an array of electronic files and folders in a variety of formats. What use will they be in the future if the apps that can read them aren't maintained?

This blog and its related sites are another good example. As much as I don't like thinking about it, someday I am going to die. And when that happens, unless someone pays the ISPs, and someone relaunches the servers when they crash, and cleans out the databases when they fill up -- poof -- there goes Dave's online presence.

But we can do a lot better than we are doing. We just have to have the will to do it.

I've written about this many times. I call the topic future-safe archives.

A few bullet points:

1. I want my content to be just like most of the rest of the content on the net. That way any tools created to preserve other people's stuff will apply to mine.

2. We need long-lived organizations to take part in a system we create to allow people to future-safe their content. Examples include major universities, the US government, insurance companies. The last place we should turn is the tech industry, where entities are decidedly not long-lived. This is probably not a domain for entrepreneurship.

3. If you can afford to pay to future-safe your content, you should. The result is an endowment, which generates annuities that keep the archive running.

4. Rather than converting content, it would be better if it was initially created in future-safe form. That way the professor's archive would already be preserved, from the moment he or she presses Save.

5. The format must be factored for simplicity. Our descendants are going to have to understand it. Let's not embarrass ourselves, or cause them to give up.

6. The format should probably be static HTML.

7. ??
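
To make points 4 through 6 concrete, here is a minimal sketch in Python of what "created in future-safe form from the moment of Save" might look like. It is an illustration under my own assumptions, not a description of any existing tool: the function names, the `slug` parameter and the folder layout are all hypothetical. Each save writes one plain HTML file with no scripts, no database and no external dependencies, so any ordinary web server (or a person with a text editor) can read it later.

```python
from html import escape
from pathlib import Path


def render_static_page(title: str, paragraphs: list[str]) -> str:
    """Return a complete HTML document as a string.

    Hypothetical helper: the page uses only basic tags and no
    scripts or external files, so it stays readable on its own.
    """
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{escape(title)}</title>\n"
        "</head>\n"
        "<body>\n"
        f"<h1>{escape(title)}</h1>\n"
        f"{body}\n"
        "</body>\n"
        "</html>\n"
    )


def save_static_page(folder: Path, slug: str, title: str,
                     paragraphs: list[str]) -> Path:
    """Write the page to folder/slug.html and return the path.

    The file on disk is the archive copy; nothing has to keep
    running for it to remain readable.
    """
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{slug}.html"
    path.write_text(render_static_page(title, paragraphs), encoding="utf-8")
    return path
```

Calling `save_static_page(Path("archive/2010/11/03"), "libraryOfCongress", "Meeting at Library of Congress", ["I'm on my way to Washington."])` would leave a single readable file behind. The choice to escape all text and avoid templates or styling is deliberate: the fewer moving parts, the less our descendants have to figure out.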



© Copyright 1997-2011 Dave Winer. Last build: 12/12/2011; 1:37:52 PM. "It's even worse than it appears."
