Nutch, an open source search engine

Wednesday, August 13, 2003 by Dave Winer.

Two horses, please

Every time Google gets competition, I hope this is the one that sticks, the one that makes search a two horse race. As a heavy user of search, I know this is not a good situation, one Silicon Valley company with so much power. When one of them takes hold it's as if we have a new royal family, people who breathe air that's finer than ours. They "get" things we don't. They think outside the box, we're stuck inside.

Every time around the loop I wonder if this time it might be different, that instead of getting a royal family, we might get some Minutemen, destined to win the revolution, people ready for the long haul, anxious to work with others on equal terms, who want to win so bad they see the brilliance that's not on their payroll. A company that wants to be built into the infrastructure the way leading technology companies are, not alone and singular like so many Silicon Valley phenoms.

So when Teoma showed signs of life, we hoped. We hoped when Yahoo bought Overture, with each rumor about Longhorn, and now with Nutch, an open source search engine that people seem very excited about.

What I know about Nutch

I first heard about Nutch at a lunch in Cambridge last week with John Battelle, a visiting professor at UC Berkeley, and former editor of the high-flying dotcom journal The Industry Standard. He has been watching Nutch in his role as a columnist for Business 2.0. After the lunch I did a Google search for Nutch, found it; confirming some but not all of what Battelle told me. So I linked to it on Scripting News and waited to see what would come next.

On Monday, Gary Price summarized, on his weblog, a Battelle article in Business 2.0 with lots of details about Nutch. Everything Battelle told me, and more, was on the record. So I linked to Price's summary saying that Nutch is "an open source search engine that aims to dethrone Google."
Last night I got an email from Nutch board member Tim O'Reilly. "Actually, Nutch has no ambitions to dethrone Google. It's just trying to provide an open source reference implementation of search to help keep Google and other search engines honest, by letting people compare the results of an engine whose algorithms and methodologies are transparent and accessible. It also aims to give a platform for people outside of the search heavyweights to research new search algorithms."

Excellent. Keeping Google honest is good; I'd also like them to be hungry and on their toes. Lots of developers, myself included, would love to be able to tweak the ranking algorithms of a search engine. Let's hope that Nutch makes it easy to do that.

Open source might be just the right thing for search, right now. We'll be watching this one, carefully.

Dave Winer