
Scripting News, the weblog started in 1997 that bootstrapped the blogging revolution.
Friday, October 06, 2000

DaveNet: Dinner with Doug Engelbart.

I want to work with search engine vendors on XML formats that say which pages changed, so I started an eGroups mail list for this purpose. It's for CMS vendors. Users can join too, but the discussion will be focused on common needs between the vendors. Please pass the pointer to the list to people who do content management software or search engines. Thanks.

Word of the day: Lagniappe.

Bryan Bell: "You might think that this theme is simple and stupid, and I can understand that because it is meant to look like a Frontpage site. However it has a feature that I haven't released on a theme before."

Brent Schlender: "A funny thing has happened since the turn of the millennium. The IT business has in essence become the Internet business. And suddenly Bill and Andy don't seem to matter as much anymore."

Wired: Patent battle takes TV turn.

Salon reviews Stephen King's On Writing.

More mail. Great stuff, it keeps comin!

How I do it: mailPages.root.

NY Times: Tendency to Embellish Fact Snags Gore.

SJ Merc: "In the past year, as the buzz around Loudcloud has gotten louder, a number of companies have adopted new names that sound suspiciously similar."

Inside.Com: "The site intended to explain Zope to 'newbies' runs not on Zope, but on a rival program Manila. A public acknowledgment that one product doesn't solve all problems for all companies? Now that's revolutionary." I agree.

BTW, the site they're referring to is ZopeNewbies, hosted with pride by UserLand.

Reuters: Salon's Talbot calls for Web Marketing Co-op.

Oy, Doc's back went out. Get well soooon.

More on integrity 

More thoughts after writing the piece about integrity on the Web earlier this week.

Hopefully not too far down the line, we'll have a concise statement, not unlike the GPL or the US Constitution, that says what integrity means in online journalism.

First, I can hear the print journalists saying that it means the same thing as it does in print, but I'm afraid it doesn't. There are differences. Here's an example.

If you write for the Web, as I do, you'll get challenges to your ethics or integrity every day. How do you deal with those? Can you ignore them? What if there's some substance to one of the challenges? How much arguing do you have to do? Is this something we can help each other with? How does the substance get dealt with, without leaving you completely paranoid about expressing opinion, for fear of being dragged into the ditches defending your integrity?

How and where do you disclose conflicts of interest? What's considered fair notice? The Web offers more bandwidth for writing than print. A website can contain many more paragraphs than a print magazine. Now, it seems, there's no excuse for not fully disclosing conflicts.

On the other side, there's no longer an excuse for not disclosing qualifications. If you're writing about technical subjects, what's your background? Education? Experience? Do you come from PR or engineering? Who's writing this piece? In general we know so little about the people who have so much power to form opinion. Again, without the space limitation of print, what should the reader be allowed to know about the person who's writing? (This is also good for the authors: they can promote their books, or other writing they do.)

Further, I may have missed this, but is there already a standard guideline doc for integrity in journalism? Is it just common sense or is it written down somewhere? Is it on the Web? I'm embarrassed to say that I don't know if there's a definitive document for integrity of all kinds of journalism.

Another question. If you run a publication, your reporters may have impeccable integrity, but what about the people they quote? If you know they're on the vendor's payroll, is your integrity challenged if you quote them as if they didn't have a conflict of interest? I suspect this is the question the pubs don't want to be asked. (They'll have to do more work, get real quotes from people without conflicts. But of course the value of their work will go up substantially, which is another way of saying there's negative value in an apparently objective source who is highly conflicted.)

Problems with crawlers 

Web crawlers are creating serious problems for us. Tens of thousands of hits a day on our servers. We have a theory that they don't know about virtual domains. When they decide to go back to a server, they should use the IP address, not the domain name. Even a static server might have trouble keeping up with the amount of traffic they generate for us.

To be clear, the search engine crawler should figure out when two domain names map to the same IP address, and should queue up all requests by IP address, so as not to pound any individual server.
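Here's a rough sketch of what queueing by IP address could look like, in Python. It's not how any particular crawler works. The function names (group_by_ip, round_robin) and the pluggable resolve parameter are made up for illustration, and a real crawler would presumably also want a delay between requests to the same IP.

```python
import socket
from collections import defaultdict, deque
from urllib.parse import urlsplit


def group_by_ip(urls, resolve=socket.gethostbyname):
    """Group URLs into one FIFO queue per resolved IP address.

    `resolve` is a hypothetical hook: it defaults to a real DNS lookup,
    but can be swapped for a lookup table when testing.
    """
    queues = defaultdict(deque)
    cache = {}  # hostname -> IP, so each virtual domain is resolved once
    for url in urls:
        host = urlsplit(url).hostname
        if host not in cache:
            cache[host] = resolve(host)
        queues[cache[host]].append(url)
    return queues


def round_robin(queues):
    """Yield (ip, url) pairs, taking one URL from each IP's queue in turn.

    This spreads consecutive requests across servers when more than one
    server has work pending. Note that it drains the queues passed in.
    """
    pending = deque(queues.items())
    while pending:
        ip, queue = pending.popleft()
        yield ip, queue.popleft()
        if queue:
            pending.append((ip, queue))
```

The point of the design is that two virtual domains on one machine land in the same queue, so the crawler sees them as one server to be polite to, not two.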

We've had to turn off the crawlers using the robots.txt convention. If any search engine company fixes this, let us know and we'll let your crawler through to our sites.
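For the curious, here's a small sketch of how the robots.txt convention can shut out all crawlers while letting one through. The robots.txt text is an illustration, not a copy of our actual file, and "GoodBot" and "OtherBot" are made-up crawler names. The check uses Python's standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is allowed everywhere,
# every other crawler is turned away from the whole site.
ROBOTS_TXT = """\
User-agent: GoodBot
Disallow:

User-agent: *
Disallow: /
"""


def allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/index.html"
    print(allowed(ROBOTS_TXT, "GoodBot", page))
    print(allowed(ROBOTS_TXT, "OtherBot", page))
```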

Even better would be to work out an XML-based format for us to tell you which pages changed, so you can avoid requesting pages that haven't. It's not just a good idea anymore. Without some coordination I don't think we can support search engines. Let's get over this scaling wall by working together.
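To make the idea concrete, here's one guess at what such a format might look like, with Python on both ends. No format has been agreed on, so everything here is an assumption: the changes and page element names, the site, url and lastModified attributes, and the two function names are all invented for the sketch.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_changes(site_url, changes):
    """Publisher side: serialize (url, datetime) pairs as a changes document.

    The element and attribute names are hypothetical, not a standard.
    """
    root = ET.Element("changes", {"site": site_url})
    for url, when in changes:
        ET.SubElement(root, "page", {"url": url, "lastModified": when.isoformat()})
    return ET.tostring(root, encoding="unicode")


def changed_since(xml_text, cutoff):
    """Crawler side: list the URLs modified after `cutoff`."""
    root = ET.fromstring(xml_text)
    return [
        page.get("url")
        for page in root.iter("page")
        if datetime.fromisoformat(page.get("lastModified")) > cutoff
    ]


if __name__ == "__main__":
    doc = build_changes(
        "http://www.example.com/",
        [
            ("http://www.example.com/a", datetime(2000, 10, 5, tzinfo=timezone.utc)),
            ("http://www.example.com/b", datetime(2000, 10, 6, 12, tzinfo=timezone.utc)),
        ],
    )
    last_crawl = datetime(2000, 10, 6, tzinfo=timezone.utc)
    print(changed_since(doc, last_crawl))
```

With something like this, a crawler fetches one small file per site and then requests only the pages listed as changed since its last visit.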


Last update: Friday, October 06, 2000 at 8:08 PM Eastern.

Dave Winer


© Copyright 1997-2005 Dave Winer.