Saturday, July 26, 1997 by Dave Winer.
Following up on yesterday's piece, Steve Wozniak and the Garage, which also included an analysis of the Netscape-Marimba deal, I got an email from Mike McCue, firstname.lastname@example.org, a lead developer at Netscape, promising that the Castanet protocol would be opened.
He said: "For the record, we are working very hard to make the Marimba protocol an open standard asap. It will usually be the case with fringe features that they are initially non-standard. Then over time the open standards circle will widen to encompass these features."
Thanks Mike. As soon as we get a look at the spec, we'll have a report, of course. Should be interesting!
In the last piece I talked about courage and commended Netscape and Marimba for having it. All of us who don't work at Microsoft have a stake in their success, I think, because if we can point to an example of a recent standard that happened without Microsoft driving the process, it would be safe for other people to try new ideas out too. That would be a good thing.
The often-cited VHS-Betamax battle had two winners. Let's not buy into black and white. Don't cry for Netscape and Marimba yet. They may have tapped into a big idea, especially if they have the courage to take the next step and open it up.
Software offers so many rewards! It's a business, an artform and a culture. When you try to understand why software moves one way or another, I think you have to look at all three sides.
The culture side, to me, is represented by friendships. I go to Windows for business reasons but also to reconnect with old friends who make Windows software. I want to take my Mac friends with me, but I know I can't. I hope the culture we've developed will live beyond and outside the operating system we met on.
I also do software for art. To me this is about breaking rules and mixing functionality from other lifetimes in new ways, and it's also about working with great people. The best art is unpredictable. Add two free people to the mix for an unpredictable result every time.
Software is about movement. You don't stop after version 1.0. You keep going places, learning from people who use the software and developers who work with you, but also people who compete with you.
Part of the fun in software is the other side of the empty feeling in my competitor's belly when he sees the feature list for the next version of my software. It's the same excitement I feel when I do the same.
Sometimes when I ship something, the word Bonk! comes out of my mouth. I imagine my competitor being hit over the head! It's an immature idea, but it's also proof that my inner child is alive and well, and I dig that.
So, who listens to you the best? Good competitors. They can teach you so much. Being open is being open to competition. It forces us and our software to be better, more creative, more subtle and more aggressive.
It offers us new ways to be friends.
NewsTracker is the just-in-time search engine (JIT-SE) from Excite. It's a very useful tool for following news stories on the web. I last wrote about it in Two Crazy Days, 7/10/97. From there, follow the links back. The first piece about JIT-SEs was Floating Ideas, 9/7/96.
I'd still like to play with NewBot from Wired, but there have been technical problems. So for now, NewsTracker is the testbed I use to explore the ideas around timely web-based search engines.
NewsTracker has already had a positive impact on the Scripting News page. I was able to follow the Trellix story and read other takes on the product before I wrote my own piece. Same with the Marimba-Netscape deal. NewsTracker found an early Bloomberg News report on the New York Times Syndicate website. It was a good piece, and helped put the deal in perspective for me.
In the software world I like it when I can stand on someone's shoulders, and I also like it when someone stands on mine. The web, for writers, offers the same opportunity. We can all reinvent the wheel, or we can learn from each other. NewsTracker makes learning easier.
NewsTracker is a great start, but I want it to keep moving. There's room for improvement, as I see it, in three areas: user interface, better filtering and better coordination with the sites they index.
First the user interface issue. When you go to the NewsTracker home page it can be hard to spot the just-in-time part. The solution -- a special entry page:
You can bookmark this page. I'll leave it right where it is. If Excite puts up a simple entry page for their JIT-SE I'll redirect to it from this page.
Yesterday when I did a search for Marimba, the list of hits included articles on McAfee, new music formats and CMP's public offering, articles that have nothing to do with Marimba. Why? Because news sites often use a template with links to all their hot stories on each page. The word 'Marimba' appears on these pages, but only in links to articles about Marimba.
This problem ultimately will be solved by better separation of content from form, a new architecture for the web. In the meantime, NewsTracker does an incredible job of building summaries of the articles, so maybe they can guess what's structure and what's not. This is not in my area of expertise.
An interesting sidebar: within a few days the filtering issue goes away, as the sites rebuild their pages from new templates and NewsTracker picks up the change in its next traversal.
It's not clear how much scanning NewsTracker does, how deep its traversal goes, or how tuned in it is to changes that happen on my site. Do they just scan the home page and follow links that stay on my server? Do they visit all the back issues of DaveNet in every scan? (This would be wasteful; back issues change very rarely.)
Back in April I was thinking about this issue and came up with an automated way to help direct search engines to the stuff that changed. This page has been maintained by a background script since April 29.
It's a tab-indented file with one top-level line for each day. Under each day, each second-level line is the URL of a page that changed during the day, and the third-level line beneath it is the exact time of the last change to that page.
A JIT-SE could read this file once a day, index each of the pages that changed, and then go on. It would be efficient, would allow the engine to index more sites, or more important, index sites more frequently.
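Here's a rough sketch, in Python, of how a search engine might read a file like this. It's only an illustration, not the Frontier suite or anything Excite runs. The function names, the example.com URLs and the date and time strings in the sample are all made up; the only things taken from the description are the three levels of tab indentation. Days and times are kept as plain strings, since the exact formats aren't spelled out here.

```python
# A minimal sketch: parse a three-level, tab-indented changes file.
# All names and sample values are hypothetical; only the structure
# (day / URL / time of last change) comes from the description.

SAMPLE = (
    "7/25/97\n"
    "\thttp://www.example.com/index.html\n"
    "\t\t7/25/97; 4:12:03 PM\n"
    "\thttp://www.example.com/news/today.html\n"
    "\t\t7/25/97; 9:30:00 AM\n"
    "7/26/97\n"
    "\thttp://www.example.com/index.html\n"
    "\t\t7/26/97; 8:01:45 AM\n"
)


def parse_site_changes(text):
    """Return {day: {url: time_of_last_change}} from tab-indented text."""
    changes = {}
    day = None
    url = None
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        depth = len(raw) - len(raw.lstrip("\t"))  # count leading tabs
        value = raw.strip()
        if depth == 0:
            # Top level: a new day.
            day, url = value, None
            changes[day] = {}
        elif depth == 1 and day is not None:
            # Second level: a page that changed on that day.
            url = value
            changes[day][url] = None
        elif depth == 2 and day is not None and url is not None:
            # Third level: when that page last changed.
            changes[day][url] = value
    return changes


def pages_to_index(changes, day):
    """List the URLs a crawler would need to revisit for one day."""
    return sorted(changes.get(day, {}))


if __name__ == "__main__":
    for page in pages_to_index(parse_site_changes(SAMPLE), "7/26/97"):
        print(page)
```

The point of the sketch is the economy: the engine reads one small file, then fetches only the pages listed under the day it cares about instead of walking the whole site.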
We could generate this file every hour or every ten minutes, as could any webmaster who can run a script. We've posted the full source to the Frontier suite that generates this file, and have no problem if people port it to Perl, Visual Basic, Tcl or AppleScript, or modify the Frontier suite for their own purposes.
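For anyone thinking about a port, the generating side might look something like this in Python. Again, this is a sketch under assumptions, not a translation of the Frontier suite: the function name is invented, the ISO-style date and time strings are a stand-in for whatever format the real file uses, and listing the newest day first is just a guess at a sensible order.

```python
# A minimal sketch of generating a three-level, tab-indented changes file.
# Hypothetical names and formats; not a port of the Frontier suite.
from collections import defaultdict
from datetime import datetime


def format_site_changes(changes):
    """Build the file text from (url, datetime_of_last_change) pairs."""
    by_day = defaultdict(dict)
    for url, changed_at in changes:
        day = changed_at.date()
        previous = by_day[day].get(url)
        # Keep only the latest change for each page on each day.
        if previous is None or changed_at > previous:
            by_day[day][url] = changed_at

    lines = []
    for day in sorted(by_day, reverse=True):  # newest day first (assumed)
        lines.append(day.isoformat())
        for url in sorted(by_day[day]):
            lines.append("\t" + url)
            lines.append("\t\t" + by_day[day][url].isoformat())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_site_changes([
        ("http://www.example.com/a.html", datetime(1997, 7, 26, 8, 1, 45)),
        ("http://www.example.com/b.html", datetime(1997, 7, 25, 16, 12, 3)),
    ]), end="")
```

A background script could call something like this on whatever schedule suits the site and write the result to a fixed URL.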
Microsoft thinks that CDF is the answer here, but I don't see their stake in this area, since most leading news sites are not run by Microsoft or built using their tools, and they don't run a search engine.
More important, there's a basic philosophical difference between the SiteChanges idea and CDF. We don't point to pages that could change, we point to pages that did change. This is lower level, simpler, easier to generate, more low tech and, most important, we don't have to wait for Microsoft.
That said, we'd be happy to change the format, change the frequency and perhaps generate more information, but only in coordination with one or more search engines. If Microsoft wants to show us how to express this in CDF, we'll definitely take a look.
I think we've got a good starting place. Let's see if anyone wants to play.
A final note. After writing about NewsTracker I got email asking me to check out my.yahoo.com, which people said does what NewsTracker does.
I did check it out, and yes, it has a JIT-SE, but it's not as broad. For example, I only found one Trellix story on Yahoo, but NewsTracker found eight.
Further, since Yahoo doesn't index Scripting News, I can't learn how it works from the content side of the connection.
Of course, from my point of view, the more the merrier; competition keeps the quality of listening high.