

Scripting News, the weblog started in 1997 that bootstrapped the blogging revolution.
NY Times metadata

If you do a View Source on a NY Times story, you'll see that there's lots of metadata in the HTML, including keywords for most of the stories.
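Here's a rough sketch of how you might pull those keywords out with a script instead of View Source. It assumes the keywords live in a standard `<meta name="keywords" content="...">` tag as a comma-separated list; I haven't confirmed that's exactly how the Times lays out its pages, and the sample HTML and keyword values below are made up for illustration.

```python
from html.parser import HTMLParser


class KeywordMetaParser(HTMLParser):
    """Collects the contents of <meta name="keywords"> tags."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Assumption: keywords are in a meta tag named "keywords".
        if (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            # Assumption: the list is comma-separated.
            for keyword in content.split(","):
                keyword = keyword.strip()
                if keyword:
                    self.keywords.append(keyword)


def extract_keywords(html):
    """Return the list of keywords found in an HTML page."""
    parser = KeywordMetaParser()
    parser.feed(html)
    return parser.keywords


# Made-up sample page, not a real Times story.
sample = """<html><head>
<meta name="keywords" content="Newspapers, Computers and the Internet, Libraries and Librarians">
</head><body></body></html>"""

print(extract_keywords(sample))
```

One caveat: splitting on commas would break a keyword that itself contains a comma, such as a name written last-name-first, so a real script would need to know the actual delimiter.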
Behind the keywords is a taxonomy that I haven't seen, but would like to. I asked them to make this public, both at my meeting there last Thursday and in a phone talk this morning. I think there could be a lot of value in the Times taxonomy; it might even set a standard.
In the meantime, I wrote a script last night that tracks the keywords in NY Times stories as they flow through the nytimesriver application. Here's a report that's updated once per hour.
http://nytimesriver.com/keywords.html
Obviously it would be interesting to be able to click on the keywords to see what articles reference each of the keywords. And it would also be nice to have a cumulative list and a daily list. Right now all we have is the cumulative version.
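To make the idea concrete, here's a minimal sketch of a structure that would support all three views: a cumulative count, a per-day count, and a lookup from keyword to the articles that use it. This is not the script behind the report; the class name, method names, and example.com URLs are all hypothetical.

```python
from collections import Counter, defaultdict
from datetime import date


class KeywordIndex:
    """Hypothetical index of keywords seen in stories."""

    def __init__(self):
        self.cumulative = Counter()          # keyword -> count over all time
        self.daily = defaultdict(Counter)    # day -> keyword -> count
        self.articles = defaultdict(set)     # keyword -> set of story URLs

    def add_story(self, url, keywords, day):
        """Record a story's keywords; a repeated URL is counted only once."""
        for keyword in set(keywords):
            if url in self.articles[keyword]:
                continue  # the same story may appear in the river more than once
            self.articles[keyword].add(url)
            self.cumulative[keyword] += 1
            self.daily[day][keyword] += 1

    def stories_for(self, keyword):
        """Return the URLs that reference a keyword, sorted."""
        return sorted(self.articles.get(keyword, ()))


# Made-up stories for illustration.
index = KeywordIndex()
index.add_story("http://example.com/a", ["Politics", "Newspapers"], date(2007, 10, 1))
index.add_story("http://example.com/b", ["Politics"], date(2007, 10, 2))

print(index.cumulative.most_common())
print(index.daily[date(2007, 10, 2)].most_common())
print(index.stories_for("Politics"))
```

The `stories_for` lookup is what would sit behind a clickable keyword, and `daily` gives the per-day list alongside the cumulative one.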
But it's still pretty interesting, bordering on fascinating, to think of the possibilities if they provide the framework behind these keywords.
When the pros try to figure out how what they do will continue to make sense after the Internet achieves all its promise, this may be an example. The metadata is generated by librarians, and we don't as yet have our own librarians in the blogosphere (though some might disagree). And it's possible that after a release of the taxonomy something like Wikipedia may happen, with the public taking over maintenance of the taxonomy. No one knows what will happen, but one thing seems clear: there can be value in a news organization beyond the reporting and editing it does.



     

Last update: Thursday, June 3, 2010; 4:01:50 PM



~About the Author~

Dave Winer, 55, is a visiting scholar at NYU's Arthur L. Carter Journalism Institute. He pioneered the development of weblogs, syndication (RSS), podcasting, outlining, and web content management software, and is a former contributing editor at Wired Magazine, research fellow at Harvard Law School, entrepreneur, and investor in web media companies. A native New Yorker, he received a Master's in Computer Science from the University of Wisconsin and a Bachelor's in Mathematics from Tulane University, and currently lives in New York City.

"The protoblogger." - NY Times.

"The father of modern-day content distribution." - PC World.

One of BusinessWeek's 25 Most Influential People on the Web.

"Helped popularize blogging, podcasting and RSS." - Time.

"The father of blogging and RSS." - BBC.

"RSS was born in 1997 out of the confluence of Dave Winer's 'Really Simple Syndication' technology, used to push out blog updates, and Netscape's 'Rich Site Summary', which allowed users to create custom Netscape home pages with regularly updated data flows." - Tim O'Reilly.

Mail: scriptingnews1mail at gmail dot com.

October 2007





© Copyright 1997-2010 Dave Winer. Last build: 6/4/10; 7:39:35 AM. "It's even worse than it appears."

