News and commentary from the cross-platform scripting community.
Mail Starting 4/29/98

From: rmcassid@uci.edu (Robert Cassidy);
Sent at Thu, 30 Apr 1998 11:39:42 -0700;
A specific XML project

I have a new little project cooking up this summer and all this XML stuff got me thinking. My project is just to put up a job posting and online resume service (well, a better one than I have now) for our school. I got to thinking about how hard it is to get job postings and resumes all in one place and in a manageable form. Perhaps a job position DTD and a resume DTD should be in place. We could then mark up resumes and job postings, make them accessible to others, build tools to match them, and still allow people to format them in an attractive manner.

So if you get wind of a DTD for either of these (I'll keep checking your database - excellent idea, thanks) in the next two months, I'll likely adopt it. If not, I'll make my own and share.
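
To make the matching idea concrete, here is a minimal sketch. The element names (resume, skill, jobPosting, requiredSkill) are invented for illustration and do not come from any published DTD; the point is only that once both kinds of document share agreed-upon markup, a matching tool can be very small.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these element names are made up for this sketch,
# not taken from any existing resume or job-posting DTD.
RESUME_XML = """
<resume>
  <name>Pat Example</name>
  <skill>Perl</skill>
  <skill>SQL</skill>
</resume>
"""

JOB_XML = """
<jobPosting>
  <title>Web Developer</title>
  <requiredSkill>Perl</requiredSkill>
  <requiredSkill>HTML</requiredSkill>
</jobPosting>
"""


def skills(xml_text, tag):
    """Return the set of text values of every <tag> element in the document."""
    root = ET.fromstring(xml_text)
    return {el.text.strip() for el in root.iter(tag) if el.text}


def matched_skills(resume_xml, job_xml):
    """Skills that the posting requires and the resume also lists."""
    return skills(resume_xml, "skill") & skills(job_xml, "requiredSkill")


print(sorted(matched_skills(RESUME_XML, JOB_XML)))
```

Presentation stays separate: the same marked-up resume could still be formatted however its owner likes.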

From: michael@memra.com (Michael Dillon);
Sent at Thu, 30 Apr 1998 11:28:31 -0700 (PDT);
Why the Internet seems fragile

I'm not sure if this is your specific problem, but many backbone network operators will block traffic to and from other providers for a couple of reasons. One is if the blocked provider has misconfigured mail servers that are used as staging points by spammers. These providers are listed in the Realtime Blackhole List which you can learn about here:


And now a couple of providers have started publishing blacklists of networks that have misconfigured routers that can be used as SMURF attack amplifiers. I wrote about this in my weekly Internet World column last week.

In general, backbone providers are private companies operating private networks and reserve the right to decide how they interconnect with other providers. This does result in spotty connectivity between some sites on the Internet, just like that old Vermont saying, "Well, you just can't get there from here".

If anyone is operating an important WWW server site I think they really need to look at one of these options:

  1. multihome your network to several backbone providers (this is technically quite difficult to do)

  2. colocate the servers with a company like above.net in San Jose who are located at a major Internet exchange point and are interconnected with several major Internet backbones through peering and transit arrangements.

  3. mirror the site on several servers located on different backbone provider networks. The servers could all be in the same city or even the same building. It is the network topology that matters.


From: mark@mail.ncee.org;
Sent at Thu, 30 Apr 1998 13:25:13 -0400;
Re:A Fragile Internet

I am a Digex customer in Washington, DC. Right now access to scripting.com is working fine. The only time recently I have seen a problem was when you mentioned an outage (last week?).

A couple months ago there were problems where Digex wasn't routing to Conxion at all (or any of the Microsoft download sites that Conxion hosts). Conxion was very responsive to me and I complained to Digex (loudly) and service was restored.

If Digex fails to provide routing to Conxion's customers I'll make sure that they know they will lose this customer.

From: gnu@toad.com (John Gilmore);
Sent at Thu, 30 Apr 1998 10:24:31 -0700;
Re:A Fragile Internet

Routing in the Internet is an Achilles' heel; it does not scale. As the net grows exponentially, the problem of routing packets to their destinations also grows exponentially. The net needs research to find ways of managing routing that need only linear growth in resources to handle exponential growth of the network. We have been trying to manage the problem administratively (by assigning IP addresses to computers in such a way that the routing tables remain relatively small), and by throwing bodies at the problem (network administrators who watch over and tune and fix the routing day-by-day).

Only after we grew the Internet out of the NSFnet -- once there was no single "backbone" -- was it generally realized that the way in which you assign IP addresses has a dramatic impact on the scale of the routing problem. IPv6 works to automate a lot of this address assignment, and increases the size of the addresses to make it easy to assign them in ways that minimize routing complexity.
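
A small sketch of why address assignment matters, using Python's standard ipaddress module. The prefixes are arbitrary private-range examples chosen for illustration. Networks assigned contiguously from one block can be advertised as a single route, while the same number of scattered networks each need their own routing table entry.

```python
import ipaddress

# Four customer networks handed out contiguously from one provider block.
contiguous = [ipaddress.ip_network(n) for n in
              ("10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24")]

# Four networks of the same size scattered across unrelated blocks.
scattered = [ipaddress.ip_network(n) for n in
             ("10.1.0.0/24", "10.7.5.0/24", "172.16.9.0/24", "192.168.40.0/24")]


def routes_needed(networks):
    """Collapse adjacent networks into the fewest prefixes that cover them."""
    return list(ipaddress.collapse_addresses(networks))


print(routes_needed(contiguous))      # one aggregate prefix
print(len(routes_needed(scattered)))  # no aggregation possible
```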

When two networks kiss, even at a single point, there are many ways to route the packets among them. Over that wire each network can SEND any packet it wants. But the routing information inside the other guy's network determines what packets will be RECEIVED over that wire. And the routing info which that network offers to ITS neighbors further affects the picture. The process of sending packets away from you is vastly different from the process of channeling packets toward you for receipt. Business considerations also complicate the picture. The different business relationships include "guest" versus "customer" versus "peer"; and "transit" versus "my network only". Multi-homing (connecting for transit to several networks, for performance or reliability) adds more complexity.

Typically an ISP for a paying customer offers "transit" service: they'll send your packets to any other network, paying for any further transit themselves; and they'll advertise your address to the entire Internet so you can receive packets from anyone.

Sometimes ISPs offer each other "peering" service: they'll accept packets destined for their own customers, and will send you packets destined for your own customers. This offloads the networks of both ISPs: they send traffic into the other guy's network as close to its source as possible. But they won't advertise your routes to the whole Internet, just to their network; you still have to pay someone else for transit if you want to receive packets from anybody. This may be what's happening between Digex and Conxion.

(This is all vastly simplified, I'm just trying to hit the high points.)
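
The transit versus peering distinction can be sketched as a rule about who gets told about a route. This is a toy model under simplifying assumptions, not how any particular ISP or router is configured; the function and the neighbor names are hypothetical.

```python
def reannounce_to(learned_from, customers, peers, upstreams):
    """Neighbors to whom an ISP passes on a route, in a simplified policy.

    A route learned from a paying (transit) customer is announced to every
    other neighbor, so the whole Internet can reach that customer.
    A route learned from a peer or an upstream is announced only to the
    ISP's own customers, never onward to other peers or upstreams.
    """
    if learned_from in customers:
        return (customers | peers | upstreams) - {learned_from}
    return customers - {learned_from}


# Hypothetical neighbors of one ISP.
customers = {"cust_a", "cust_b"}
peers = {"peer_x"}
upstreams = {"backbone_1"}

# A customer's route goes everywhere: that is what transit buys.
print(sorted(reannounce_to("cust_a", customers, peers, upstreams)))

# A peer's route stays inside this ISP's customer base.
print(sorted(reannounce_to("peer_x", customers, peers, upstreams)))
```

In this model a network that only peers, and buys transit from nobody, is reachable from its peers' customers and from no one else.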

From: nbornstein@plr.com (Niel M. Bornstein);
Sent at Wed, 29 Apr 1998 18:04:32 -0700;
Re:Calling All DTDs!

If all this is true, I understand why we disconnected on DTDs, because our database, the one that Frontier is built around, doesn't have rigid rules about what can appear where. There really isn't a need for DTDs in our world, but they will be showing up, along with the XML-based information we hope to store and manage and send out, perhaps to software that is more rigid than ours, so we're going to have to embrace DTDs, at some level.

Here's an idea: why not allow a DTD to define an object, or at least filter the input into a table? Right now in Frontier, you can create anything anywhere. But what if there was a DTD associated with a table (say, with a #dtd directive); then you would be restricted to creating only the types of data allowed by the DTD. Sometimes restrictions make you more powerful.
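
As a rough illustration of the idea (in Python rather than Frontier's own scripting language, and with an invented schema format standing in for a real DTD), a table that filters its input might look like this:

```python
class RestrictedTable(dict):
    """A table that accepts only the keys and value types its schema names.

    The schema is a plain mapping of allowed key -> required Python type.
    It is a stand-in for a DTD, not a DTD parser. Only item assignment
    (table[key] = value) is checked in this sketch.
    """

    def __init__(self, schema):
        super().__init__()
        self.schema = schema

    def __setitem__(self, key, value):
        if key not in self.schema:
            raise KeyError(f"{key!r} is not allowed in this table")
        expected = self.schema[key]
        if not isinstance(value, expected):
            raise TypeError(f"{key!r} must be of type {expected.__name__}")
        super().__setitem__(key, value)


# Hypothetical schema for a job posting table.
posting = RestrictedTable({"title": str, "salary": int})
posting["title"] = "Web Developer"   # accepted
posting["salary"] = 40000            # accepted

try:
    posting["favoriteColor"] = "red"  # not in the schema
except KeyError as err:
    print("rejected:", err)
```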

From: bryce@Colorado.EDU (Bryce);
Sent at Wed, 29 Apr 1998 18:04:09 -0700;
Re:Calling All DTDs!

Your questions about the purpose of DTDs remind me strongly of the long-running contest between fans of "strong typing" and fans of "weak typing" in programming languages. You may be familiar with some of this. The weak-typing (sometimes called "dynamic-typing") folks were into LISP and Smalltalk. The strong-typing folks were into Ada.

The strong-typers championed "bug prevention" and "safety". The weak-typers championed "elegance" and "ease of use". Also, they each claimed each other's advantages for their own. :-)

Needless to say they both had good points. As time went on the strong-typing folks were able to improve their tech in order to gain some of the advantages of dynamic-typing whilst retaining the advantages of strong-typing. C++ and Java are the most influential representatives of that line, although sadly (to me) Java has so far declined to adopt some of the best typing features. If you're curious, the most advanced "strong but also useful" type systems out there nowadays are to be found in Eiffel, Haskell, and an extension of Java called "Pizza".

Anyway, all of this is mere historical perspective. Hopefully it amused you.

My opinion on DTDs, based solely on what i've understood about them from your article, is that they serve a purpose far beyond mere "fitting into a dumb database". The idea, i guess, is the exact same idea as types in a programming language: to distinguish some kinds of things from others, to identify things that aren't actually the kind of thing that they are labelled as being, and even further interesting applications (in object-oriented programming).

From: gnu@toad.com (John Gilmore);
Sent at Wed, 29 Apr 1998 17:49:41 -0700;
Re:Calling All DTDs!

About open source, I want to add that I like the idea. It's how I learned how to program, back in the 70s as a grad student at UW-Madison. I had the source to the original Unix. I was delighted with the airy open-ness of the code, how straightforward it was. It taught me how real software is built in ways that professors can't. I was the total beneficiary of open source as a young person.

Hi Dave. Unix wasn't open source. Unix had source code, and I'm glad you got to read it. (I did too, at Sun.) But it was proprietary, licensed software. You couldn't share that source code with your friends, unless they were Unix licensees too, or build binaries from it and do whatever you wanted with them.

Reading Unix source code was eye-opening. It was great to see how simple and easy-to-write all the little Unix utilities were. That experience gave me the confidence to write my own Unix utilities. Reading any kind of source code will teach you something. Just don't confuse it with open source, which also has other properties.

Source code is a great thing. I'm leading a lawsuit to verify that the First Amendment protects our right to publish source code, or communicate it privately with others. (The US Government claims it's "functional" and that they can put any kind of restrictions they want on whether we can publish it. We claim it's expressive, a work of authorship, and a way of communicating among scientists and engineers.) It's all documented at:


IBM mainframe OSes had something much closer to open source. They used to publish the source code on tapes and microfilm, and give it to anyone who ordered it. The idea was that it was useless to you unless you had an IBM mainframe. That idea went by the wayside when Amdahl, Fujitsu, and others started building IBM 370 clones, and they locked up the software.

Also, in those days, remember that the copyright status of computer software was not clear. AT&T didn't copyright Unix because they thought that would require them to publish it, and it might not be copyrightable anyway. They protected it as a trade secret, though the "secret" got out to hundreds of thousands of people, mostly students, before they started copyrighting later versions.

From: bhughes@choicemicro.com (Bruce Hughes);
Sent at Wed, 29 Apr 1998 17:49:28 -0700;
Re:Calling All DTDs!

Please count me as a vote for the touchy-feely stuff. I cried when I read your piece about Programmers, because it reminded me how much love I have for the work I do and the people who do it with integrity and passion. Keep it up.


From: jschoon@clemson.edu (Jack Schoon);
Sent at Wed, 29 Apr 1998 17:49:14 -0700;
Re:Calling All DTDs!

I felt compelled to comment on your last mailing mainly because you asked for it and I've worked with XML recently and think it's cool.

You mentioned that a DTD isn't required for Frontier because it has a flexible database system. That's true because it isn't picky about what the data looks like. In other words, you could miss a field and Frontier, as a database, won't choke.

However it's likely that in many applications, the 'logic' is going to be looking for certain fields to be present even if the 'storage' doesn't care. For example, it doesn't make sense to process addresses without a state.

So at some point, in the software system, some sort of structural validity checking is going to have to occur. The design question is where will you check for validity.

You could check validity from the scripts or logic that implement the functionality of the system. But this makes for more complicated code. And it isn't very safe to change code if the validity requirements change for some reason.

The DTD provides a means to push this validity checking forward and into a realm where it is easy to change and understand. It's easy to exchange ideas at the DTD level so if you are collaborating with less technical people, they can understand the DTD and point out errors while they are still easy to fix.
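
A minimal sketch of that separation: the structural rules live in one declarative place, and the processing logic can assume they have already been checked. The field names and the REQUIRED_FIELDS format are assumptions made up for this example; a real system would derive the rules from the DTD itself.

```python
# Hypothetical declaration of which fields each record type must carry.
# Changing a validity requirement means editing this table, not the logic.
REQUIRED_FIELDS = {
    "address": ["street", "city", "state", "zip"],
}


def missing_fields(record_type, record):
    """Return the required fields that are absent or empty in the record."""
    return [f for f in REQUIRED_FIELDS[record_type] if not record.get(f)]


def format_label(record):
    """Processing logic: assumes the record has already passed validation."""
    return f"{record['street']}\n{record['city']}, {record['state']} {record['zip']}"


record = {"street": "1 Main St", "city": "Clemson", "zip": "29631"}

problems = missing_fields("address", record)
if problems:
    print("invalid address, missing:", problems)
else:
    print(format_label(record))
```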

Another great thing about a DTD is that you, as a tool maker, only have to write one parser! If you can parse a DTD file, you can get enough information about the new language to parse it too.

I think the use of a DTD to define the parser is more robust than just working with the data file, trying to find where fields start and stop. It's kind of a lex for the rest of us.

One last point about the DTD is that it can be used to describe markup languages that are not well-formed (I think). A well-formed language, like your example computer language, has a start and a stop tag for every unit of data. Some markup languages like HTML are not well formed because you can have stand-alone tags like the <P> or <BR>. How is your parser going to know not to look for a </P> and a </BR>? The DTD tells it about these cases.
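
Here is a toy sketch of a checker learning about stand-alone tags from DTD declarations. It only recognizes declarations of the form <!ELEMENT name EMPTY> with a regular expression, which is a small slice of what a DTD can say, and the two-line DTD is invented for the example. It is not a real DTD or HTML parser.

```python
import re

# Invented two-element DTD fragment for illustration only.
DTD_TEXT = """
<!ELEMENT P  (#PCDATA | BR)*>
<!ELEMENT BR EMPTY>
"""


def empty_elements(dtd_text):
    """Names of elements the DTD declares as EMPTY (no content, no end tag)."""
    names = re.findall(r"<!ELEMENT\s+(\w+)\s+EMPTY\s*>", dtd_text)
    return {name.upper() for name in names}


def unclosed_tags(markup, empty):
    """Tags left open after a simple stack-based scan of the markup.

    Elements listed in `empty` are skipped, so the scan never waits for
    their end tags. Mismatched end tags are ignored in this sketch.
    """
    stack = []
    for closing, name in re.findall(r"<(/?)(\w+)[^>]*>", markup):
        name = name.upper()
        if name in empty:
            continue
        if closing:
            if stack and stack[-1] == name:
                stack.pop()
        else:
            stack.append(name)
    return stack


markup = "<P>one<BR>two</P>"
print(unclosed_tags(markup, empty_elements(DTD_TEXT)))  # DTD consulted
print(unclosed_tags(markup, set()))                     # DTD ignored
```

With the DTD consulted, nothing is reported as unclosed; without it, the scan keeps waiting for a </BR> that will never arrive.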

Anyway, good luck on your future work. Hopefully I've given you something to think about. I can certainly say that working with Frontier has given me some new software development perspectives.

From: joshua@jmcmichael.com (Joshua McMichael);
Sent at Wed, 29 Apr 1998 17:48:58 -0700;
Re:Calling All DTDs!

I love ya Dave!

I'm learning Frontier, and keeping up-to-date on all the new Internet tools coming down the pipe - but that's not the main reason for my feelings. It's because you validate my method of existence and learning! "Rolling around in examples until a light bulb goes on" is exactly how I learn. Throwing strawman arguments to friends just for discussion and input is another preferred method. You're an example of how that can work (and you're even making money at it).

OK, now that my methods are validated... On to more exciting stuff!

This page was last built on Thursday, April 30, 1998 at 11:46:33 AM, with Frontier version 5.0.1. Mail to: dave@scripting.com. © copyright 1997-98 UserLand Software.