

Scripting News, the weblog started in 1997 that bootstrapped the blogging revolution.
NY Times topics in OPML, the mother lode?

Amyloo was digging around the NY Times code weblog and found this OPML file. Weighing in at a monstrous 3.3MB, it contains some mysterious but rich data about the NY Times, and a guide to using the Times to cover special topics that I don't think anyone outside the Times knew existed. But there it is, in a public folder, so let's have a look.
1. There are 10522 top-level headlines. There's no structure to the OPML; it's absolutely flat.
Here's an HTML rendering of the list: timestopics.html.
2. It's a subscription list. Each item has four attributes: type, title, htmlUrl and xmlUrl.
3. The htmlUrl for each element points to a page of stories for the topic. For example, here's a page of stories about table tennis. On that page is a link to an RSS 2.0 feed containing the same information.
4. The xmlUrl links for at least some of the elements are broken. The error appears to be very simple: if you replace the ampersand with a question mark, it works.
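Here's a rough Python sketch of points 1 through 4: count the top-level outlines, read the four attributes, and patch the xmlUrl. The sample OPML and its example.com URLs are made up for illustration; they are not taken from the Times file. The exact shape of the broken links isn't shown here, so the repair rule (swap the first ampersand for a question mark when the URL has no question mark yet) is an assumption, as are the function names.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the shape described above: a flat list of outlines,
# each with type, title, htmlUrl and xmlUrl. The URLs are placeholders.
SAMPLE_OPML = """<?xml version="1.0"?>
<opml version="1.1">
  <head><title>Sample topics</title></head>
  <body>
    <outline type="rss" title="Table Tennis"
             htmlUrl="http://example.com/topics/table-tennis"
             xmlUrl="http://example.com/feed&amp;topic=table-tennis"/>
    <outline type="rss" title="Tea"
             htmlUrl="http://example.com/topics/tea"
             xmlUrl="http://example.com/feed?topic=tea"/>
  </body>
</opml>"""


def repair_xml_url(url):
    """Swap the first '&' for '?' when the URL has no '?' yet (assumed rule)."""
    if "?" not in url and "&" in url:
        return url.replace("&", "?", 1)
    return url


def read_subscriptions(opml_text):
    """Return one dict per top-level outline, holding the four attributes."""
    body = ET.fromstring(opml_text).find("body")
    subs = []
    for outline in body.findall("outline"):  # direct children of body only
        subs.append({
            "type": outline.get("type"),
            "title": outline.get("title"),
            "htmlUrl": outline.get("htmlUrl"),
            "xmlUrl": repair_xml_url(outline.get("xmlUrl", "")),
        })
    return subs


subscriptions = read_subscriptions(SAMPLE_OPML)
print(len(subscriptions), "top-level headlines")
for sub in subscriptions:
    print(sub["title"], "->", sub["xmlUrl"])
```

To try it on the real file, read the downloaded OPML into a string and pass that to read_subscriptions instead of the sample.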
If you look around at the topics you'll see it's an incredibly rich set of data. Here are just some of the topics that begin with the letter T: Tableware, Taste, Tattoos, Tax Credits, Tax Evasion, Taxation, Taxicabs and Taxicab Drivers, Tea, Teachers and School Employees, TED Conference News, Teflon, Telephones and Telecommunications, Television, Television Sets, Table Tennis, Terra Cotta, Terrorism, Tests and Testing, Textbooks, Thanksgiving Day.
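A list like that is easy to pull out once you have the titles. A small sketch, assuming the titles are already in a Python list; the helper name is hypothetical, and "Archery" and "Zoos" are filler entries I made up to show the filtering, not topics confirmed to be in the file.

```python
def titles_starting_with(titles, letter):
    """Return the titles that begin with the given letter, sorted, ignoring case."""
    prefix = letter.casefold()
    return sorted(t for t in titles if t.casefold().startswith(prefix))


# Three titles from the list above plus two made-up filler entries.
sample_titles = ["Tea", "Table Tennis", "Archery", "Taxation", "Zoos"]
t_topics = titles_starting_with(sample_titles, "T")
print(t_topics)
```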



     

Last update: Thursday, June 3, 2010; 4:01:50 PM



~About the Author~

Dave Winer, 55, is a visiting scholar at NYU's Arthur L. Carter Journalism Institute. He pioneered the development of weblogs, syndication (RSS), podcasting, outlining, and web content management software, and is a former contributing editor at Wired Magazine, research fellow at Harvard Law School, entrepreneur, and investor in web media companies. A native New Yorker, he received a Master's in Computer Science from the University of Wisconsin and a Bachelor's in Mathematics from Tulane University, and currently lives in New York City.

"The protoblogger." - NY Times.

"The father of modern-day content distribution." - PC World.

One of BusinessWeek's 25 Most Influential People on the Web.

"Helped popularize blogging, podcasting and RSS." - Time.

"The father of blogging and RSS." - BBC.

"RSS was born in 1997 out of the confluence of Dave Winer's 'Really Simple Syndication' technology, used to push out blog updates, and Netscape's 'Rich Site Summary', which allowed users to create custom Netscape home pages with regularly updated data flows." - Tim O'Reilly.

Mail: scriptingnews1mail at gmail dot com.






© Copyright 1997-2010 Dave Winer. Last build: 6/4/10; 7:39:38 AM. "It's even worse than it appears."

