NY Times topics -- an embarrassment of riches

Wednesday, October 17, 2007 by Dave Winer.

Amyloo was digging around the NY Times open weblog and found this OPML file, weighing in at a monstrous 3.3MB. It contains some very mysterious but very rich data about the NY Times and a guide to using the Times to cover special topics that I don't think anyone outside the Times knew existed. But there it is, in a public folder, so let's have a look.

Some facts.

1. There are xxx top-level headlines. There's no structure to the OPML, it's absolutely flat.

2. It's a subscription list. Each item has four attributes: type, title, htmlUrl and xmlUrl.

3. The htmlUrl for each element points to a page of stories for the topic. For example, here's a page of stories about table tennis. On that page is a link to an RSS 2.0 feed containing the same information.

4. The xmlUrl links for at least some of the elements are broken. The error appears to be very simple: if you replace the ampersand with a question mark, it works. I'll produce an HTML rendering of the list in a few minutes, and in that rendering I'll fix the links.

If you look around at the topics you'll see it's an incredibly rich set of data. Here are just some of the topics that begin with the letter T: Tableware, Taste, Tattoos, Tax Credits, Tax Evasion, Taxation, Taxicabs and Taxicab Drivers, Tea, Teachers and School Employees, TED Conference News, Teflon, Telephones and Telecommunications, Television, Television Sets, Table Tennis, Terra Cotta, Terrorism, Tests and Testing, Textbooks, Thanksgiving Day.
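For anyone who wants to poke at a file like this, here is a minimal Python sketch of reading a flat OPML subscription list and applying the ampersand fix. It is not the code behind the HTML rendering. The sample data, the example.com URLs, and the names `SAMPLE_OPML`, `fix_xml_url` and `read_subscriptions` are all made up for illustration. The sketch also assumes a broken link is one that has an ampersand but no question mark, and that only the first ampersand needs replacing; the real file may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape described above: a flat list of
# outline elements with type, title, htmlUrl and xmlUrl attributes.
# The URLs are invented; they are not taken from the Times file.
SAMPLE_OPML = """<opml version="1.1">
  <head><title>Sample topics</title></head>
  <body>
    <outline type="rss" title="Table Tennis"
             htmlUrl="http://example.com/topics/table-tennis.html"
             xmlUrl="http://example.com/topics/table-tennis.html&amp;rss=1"/>
    <outline type="rss" title="Tea"
             htmlUrl="http://example.com/topics/tea.html"
             xmlUrl="http://example.com/topics/tea.html?rss=1"/>
  </body>
</opml>
"""


def fix_xml_url(url):
    """Replace the first '&' with '?' when the URL has no '?' at all.

    Assumption: that is what a broken link looks like. URLs that
    already contain a '?' are returned unchanged.
    """
    if "?" not in url and "&" in url:
        return url.replace("&", "?", 1)
    return url


def read_subscriptions(opml_text):
    """Return one dict per top-level outline element, with xmlUrl repaired."""
    root = ET.fromstring(opml_text)
    body = root.find("body")
    items = []
    if body is None:
        return items
    # findall("outline") only matches direct children of <body>,
    # which is all there is in a flat list.
    for outline in body.findall("outline"):
        items.append({
            "type": outline.get("type", ""),
            "title": outline.get("title", ""),
            "htmlUrl": outline.get("htmlUrl", ""),
            "xmlUrl": fix_xml_url(outline.get("xmlUrl", "")),
        })
    return items


if __name__ == "__main__":
    subscriptions = read_subscriptions(SAMPLE_OPML)
    print(len(subscriptions), "top-level items")
    for item in subscriptions:
        print(item["title"], item["xmlUrl"])
```

Counting the returned list is also one way to get the number of top-level headlines for a given file.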