Files must be 100% valid XML. We're trying to move towards a more standard format, and to this end we have included several tags from the popular <scriptingNews> format. We have also ensured that this version is 100% valid XML. We did this by requiring that a DOCTYPE tag be included, and validating each RSS document against that DTD. This means that it is not enough for an RSS document to be "well-formed". It must also be "valid" with respect to its DTD.
No mixed content tags. We are specifically not including any tags that contain mixed content in RSS 0.91. This means that each tag either contains sub-tags only, or text only, not a combination. This is both because we want to keep the format simple, and because our current validation system is not able to handle this type of tag. We also are not allowing any HTML markup beyond the commonly used entities such as &quot;. A full list of these is defined in the RSS 0.91 DTD.
New tags for syndication community. Our validator will now allow several new tags through the system, though most of them will not actually be used by Netcenter. However, these may work when syndicating content to other sites. These tags are noted explicitly in the spec as "ignored."
RDF references removed. RSS was originally conceived as a metadata format providing a summary of a website. Two things have become clear: the first is that providers want more of a syndication format than a metadata format. The structure of an RDF file is very precise and must conform to the RDF data model in order to be valid. This is not easily human-understandable and can make it difficult to create useful RDF files. The second is that few tools are available for RDF generation, validation and processing. For these reasons, we have decided to go with a standard XML approach.
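The "no mixed content" rule above can be checked mechanically once a document has been parsed. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `find_mixed_content` and the sample fragments are illustrative assumptions, not part of the Netcenter validator.

```python
import xml.etree.ElementTree as ET


def find_mixed_content(root):
    """Return the tag names of elements holding both sub-tags and text."""
    offenders = []
    for elem in root.iter():
        children = list(elem)
        if not children:
            continue  # text-only (or empty) element: allowed
        # Text placed directly inside the element, before its first child.
        has_text = bool(elem.text and elem.text.strip())
        # Text placed after a child; ElementTree stores it in child.tail.
        has_tail = any(child.tail and child.tail.strip() for child in children)
        if has_text or has_tail:
            offenders.append(elem.tag)
    return offenders


# Hypothetical fragments, used only for illustration.
clean = ET.fromstring(
    "<item><title>News</title><link>http://example.com/</link></item>"
)
mixed = ET.fromstring("<item>Read <title>News</title> today</item>")

print(find_mixed_content(clean))
print(find_mixed_content(mixed))
```

Whitespace between sub-tags is ignored here, on the assumption that indentation should not count as text content.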
Tags in alphabetical order.
information about a particular channel. Everything pertaining to an individual channel is contained within this tag.
Currently displayed on "My Netscape". May use in other locations in the future.
none
copyright string
ignored
none
none
The day of the week, spelled out in English.
ignored
none
none
a plain text description of an item, channel, image, or textinput.
displayed as appropriate depending on context.
none
none
This tag should contain a URL that references a description of the channel.
ignored
none
none
Document Type Identifier. This is an XML tag that identifies where to find the definition for this format. It should follow the xml tag. The full DTD is here.
required to ensure document validity
none
Specifies the height of an image. Should be an integer value.
The value must be between 1 and 400. If omitted, the default value is 31.
none
none
Specifies an hour of the day. Should be an integer value between 0 and 23. See skipHours.
ignored
none
none
Specifies an image associated with a channel.
Optionally (user preference) display an image along with the channel content.
none
An item that is associated with a channel. The item should represent a web-page, or subsection within a web page. It should have a unique URL associated with it. Each item must contain a title and a link. A description is optional.
generates a list of links. The description, if supplied, may optionally be viewed by the user as plain text beneath the link. Also, a maximum of 15 items per channel is enforced at this time.
none
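The item rules above (a title and a link are required, and at most 15 items per channel) could be checked along these lines. This is a sketch only; `check_items` and the sample channel are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET

MAX_ITEMS = 15  # per-channel limit stated in the item description


def check_items(channel):
    """Return a list of problems found in the <item>s of a <channel>."""
    errors = []
    items = channel.findall("item")
    if len(items) > MAX_ITEMS:
        errors.append(f"{len(items)} items, maximum is {MAX_ITEMS}")
    for index, item in enumerate(items, start=1):
        for required in ("title", "link"):
            child = item.find(required)
            # A missing tag and an empty tag are both treated as absent.
            if child is None or not (child.text or "").strip():
                errors.append(f"item {index}: missing <{required}>")
    return errors


# Hypothetical channel: the second item has no link.
sample = ET.fromstring(
    "<channel>"
    "<item><title>First</title><link>http://example.com/1</link></item>"
    "<item><title>Second</title></item>"
    "</channel>"
)
print(check_items(sample))
```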
Specifies the language of a channel. See the supported language codes.
used to assist user with determining correct page encoding
none
none
The last time the channel was modified.
ignored
none
none
This is a url that a user is expected to click on, as opposed to a <url> that is for loading a resource, such as an image.
must start with either "http://" or "ftp://". All other urls are considered invalid.
none
none
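The prefix rule for link values is simple enough to express directly. A minimal sketch, assuming the comparison is literal and case-sensitive (the text does not say otherwise); `is_accepted_url` is a hypothetical helper name.

```python
ACCEPTED_PREFIXES = ("http://", "ftp://")  # the two prefixes named in the text


def is_accepted_url(value):
    """True if the value starts with one of the accepted prefixes."""
    return value.startswith(ACCEPTED_PREFIXES)


print(is_accepted_url("http://example.com/story.html"))
print(is_accepted_url("mailto:editor@example.com"))
```

Note that under the rule as written, any other scheme, including "https://", is considered invalid.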
The email address of the managing editor of the site, the person to contact for editorial inquiries.
ignored
none
none
The name of an object, corresponding to the "name" attribute of an HTML <INPUT> element. Currently, this only applies to textinput.
generates "name" attribute in html form
none
none
Date when channel was published.
ignored
none
none
ignored. May use in the future to dynamically decide page rating.
none
none
Identifies begin and end of rss content.
identifies content type
A list of <day>s of the week, in English, indicating the days of the week when your channel will not be updated. For example, if you know your channel will never be updated on Saturday or Sunday, list those two days here.
ignored
none
A list of <hour>s indicating the hours in the day, GMT, when the channel is unlikely to be updated. If this sub-item is omitted, the channel is assumed to be updated hourly.
ignored
none
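Netcenter ignores skipDays and skipHours, but a site that consumes syndicated channels might use them to decide when not to fetch. The sketch below shows one possible reading, assuming GMT can be treated as UTC and that day names are capitalized English words; `should_skip` is a hypothetical function, not part of any existing tool.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Index matches datetime.weekday(): Monday is 0.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def should_skip(channel, when):
    """True if `when` (an aware datetime) falls in a skipped hour or day."""
    when = when.astimezone(timezone.utc)
    skipped_hours = {int(hour.text) for hour in channel.findall("skipHours/hour")}
    skipped_days = {day.text.strip() for day in channel.findall("skipDays/day")}
    return when.hour in skipped_hours or DAY_NAMES[when.weekday()] in skipped_days


# Hypothetical channel: no updates at night or on weekends.
sample = ET.fromstring(
    "<channel>"
    "<skipHours><hour>0</hour><hour>1</hour></skipHours>"
    "<skipDays><day>Saturday</day><day>Sunday</day></skipDays>"
    "</channel>"
)
# 3 January 2024 was a Wednesday; 6 January 2024 was a Saturday.
print(should_skip(sample, datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)))
print(should_skip(sample, datetime(2024, 1, 6, 12, 0, tzinfo=timezone.utc)))
```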
An input field for the purpose of allowing users to submit queries back to the publisher's site. This element should have a title, a link (to a cgi or other processor), a description containing some instructions, and a name, to be used as the name in the HTML tag <input type=text name="[name]">
Displays form for submission back to publisher.
none
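To make the mapping from textinput to an HTML form concrete, here is a minimal sketch of how a consumer might render it. The form layout, the GET method, and the use of the title as the submit button label are assumptions; only the `<input type=text name="[name]">` element comes from the description above.

```python
import xml.etree.ElementTree as ET
from html import escape


def render_textinput(textinput):
    """Build an HTML form string from a <textinput> element."""
    title = textinput.findtext("title", default="").strip()
    description = textinput.findtext("description", default="").strip()
    name = textinput.findtext("name", default="").strip()
    link = textinput.findtext("link", default="").strip()
    # escape() also escapes quotes, so values are safe inside attributes.
    return (
        f'<form action="{escape(link)}" method="get">\n'
        f'  <p>{escape(description)}</p>\n'
        f'  <input type="text" name="{escape(name)}">\n'
        f'  <input type="submit" value="{escape(title)}">\n'
        f'</form>'
    )


# Hypothetical textinput, used only for illustration.
sample = ET.fromstring(
    "<textinput>"
    "<title>Search</title>"
    "<description>Search the archive</description>"
    "<name>query</name>"
    "<link>http://example.com/search.cgi</link>"
    "</textinput>"
)
form_html = render_textinput(sample)
print(form_html)
```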
An identifying string for a resource. When used in an item, this is the name of the item's link. When used in an image, this is the Alt text for the image. When used in a channel, this is the channel's title. When used in a textinput, this is the textinput's title.
displayed as appropriate depending on context.
none
none
Location to load a resource from. Note that this is slightly different from the link tag, which specifies where a user should be re-directed to if a resource is selected.
must start with either "http://" or "ftp://". All other urls are considered invalid.
none
none
The email address of the webmaster for the site, the person to contact if there are technical problems with the channel.
ignored
none
none
Specifies the width of an image. Should be an integer value.
The value must be between 1 and 144. If omitted, the default value is 88.
none
none
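The width and height rules (width 1 to 144, default 88; height 1 to 400, default 31) can be applied with a small lookup. This is a sketch under the assumption that an out-of-range or non-integer value should be rejected rather than clamped; `image_dimension` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# tag -> (minimum, maximum, default), as stated in the width and height entries.
LIMITS = {"width": (1, 144, 88), "height": (1, 400, 31)}


def image_dimension(image, tag):
    """Return the width or height of an <image>, applying the default."""
    low, high, default = LIMITS[tag]
    text = image.findtext(tag)
    if text is None:
        return default  # tag omitted
    value = int(text)  # raises ValueError if the text is not an integer
    if not low <= value <= high:
        raise ValueError(f"<{tag}> must be between {low} and {high}, got {value}")
    return value


# Hypothetical image with a width but no height.
sample = ET.fromstring("<image><width>100</width></image>")
print(image_dimension(sample, "width"), image_dimension(sample, "height"))
```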
Identifies this as an XML document and specifies encoding (see w3c). Note that this must be on the first line of the document.
required for XML compliance.
none
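The ordering requirements (xml declaration on the first line, DOCTYPE following it) can be checked on the raw text before any parsing. The sketch below is one way to do so; `check_prolog` is a hypothetical function, and the DOCTYPE in the sample uses a placeholder local file name rather than the real DTD location.

```python
import re


def check_prolog(document):
    """Return a list of problems with the xml declaration and DOCTYPE order."""
    errors = []
    match = re.search(r"<\?xml\s.*?\?>", document, flags=re.S)
    if match is None:
        errors.append("missing xml declaration")
        rest = document
    else:
        if match.start() != 0:
            errors.append("xml declaration must start the first line")
        rest = document[match.end():]
    # Only whitespace is tolerated between the declaration and the DOCTYPE.
    if not rest.lstrip().startswith("<!DOCTYPE"):
        errors.append("DOCTYPE should follow the xml declaration")
    return errors


# Placeholder document: "rss-0.91.dtd" is an assumed local file name.
GOOD = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    '<!DOCTYPE rss SYSTEM "rss-0.91.dtd">\n'
    '<rss version="0.91"><channel/></rss>\n'
)
print(check_prolog(GOOD))
print(check_prolog("\n" + GOOD))
```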
These are the language codes that are accepted by Netcenter. Other language codes may be available as specified by the w3c, but these are guaranteed to work with most browsers. Netcenter will currently reject other language codes, however other sites may accept them.
af # Afrikaans
sq # Albanian
eu # Basque
be # Belarusian
bg # Bulgarian
ca # Catalan
zh-cn # Chinese (Simplified)
zh-tw # Chinese (Traditional)
hr # Croatian
cs # Czech
da # Danish
nl # Dutch
nl-be # Dutch (Belgium)
nl-nl # Dutch (Netherlands)
en # English
en-au # English (Australia)
en-bz # English (Belize)
en-ca # English (Canada)
en-ie # English (Ireland)
en-jm # English (Jamaica)
en-nz # English (New Zealand)
en-ph # English (Philippines)
en-za # English (South Africa)
en-tt # English (Trinidad)
en-gb # English (United Kingdom)
en-us # English (United States)
en-zw # English (Zimbabwe)
fo # Faeroese
fi # Finnish
fr # French
fr-be # French (Belgium)
fr-ca # French (Canada)
fr-fr # French (France)
fr-lu # French (Luxembourg)
fr-mc # French (Monaco)
fr-ch # French (Switzerland)
gl # Galician
gd # Gaelic
de # German
de-at # German (Austria)
de-de # German (Germany)
de-li # German (Liechtenstein)
de-lu # German (Luxembourg)
de-ch # German (Switzerland)
el # Greek
hu # Hungarian
is # Icelandic
id # Indonesian
ga # Irish
it # Italian
it-it # Italian (Italy)
it-ch # Italian (Switzerland)
ja # Japanese
ko # Korean
mk # Macedonian
no # Norwegian
pl # Polish
pt # Portuguese
pt-br # Portuguese (Brazil)
pt-pt # Portuguese (Portugal)
ro # Romanian
ro-mo # Romanian (Moldova)
ro-ro # Romanian (Romania)
ru # Russian
ru-mo # Russian (Moldova)
ru-ru # Russian (Russia)
sr # Serbian
sk # Slovak
sl # Slovenian
es # Spanish
es-ar # Spanish (Argentina)
es-bo # Spanish (Bolivia)
es-cl # Spanish (Chile)
es-co # Spanish (Colombia)
es-cr # Spanish (Costa Rica)
es-do # Spanish (Dominican Republic)
es-ec # Spanish (Ecuador)
es-sv # Spanish (El Salvador)
es-gt # Spanish (Guatemala)
es-hn # Spanish (Honduras)
es-mx # Spanish (Mexico)
es-ni # Spanish (Nicaragua)
es-pa # Spanish (Panama)
es-py # Spanish (Paraguay)
es-pe # Spanish (Peru)
es-pr # Spanish (Puerto Rico)
es-es # Spanish (Spain)
es-uy # Spanish (Uruguay)
es-ve # Spanish (Venezuela)
sv # Swedish
sv-fi # Swedish (Finland)
sv-se # Swedish (Sweden)
tr # Turkish
uk # Ukrainian
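A publisher could check a channel's language value against the list above before submitting. The following sketch copies only a handful of codes for brevity, so it is not the full accepted set; `ACCEPTED_EXCERPT` and `is_accepted_language` are hypothetical names, and the case-insensitive comparison is an assumption the text does not confirm.

```python
# Excerpt only: a real check would hold every code from the list above.
ACCEPTED_EXCERPT = {
    "en", "en-gb", "en-us",
    "fr", "fr-ca", "fr-fr",
    "de", "de-de",
    "es", "es-mx",
    "ja", "zh-cn", "zh-tw",
}


def is_accepted_language(code, accepted=ACCEPTED_EXCERPT):
    """True if the code is in the accepted set (compared in lower case)."""
    return code.strip().lower() in accepted


print(is_accepted_language("en-us"))
print(is_accepted_language("tlh"))
```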
IANA standard name | MIME preferred name (if different from IANA) |
ANSI_X3.4-1968 | US-ASCII |
ISO_8859-1:1987 | ISO-8859-1 |
ISO_8859-2:1987 | ISO-8859-2 |
ISO_8859-5:1988 | ISO-8859-5 |
ISO_8859-7:1987 | ISO-8859-7 |
ISO_8859-9:1989 | ISO-8859-9 |
Shift_JIS | |
Extended_UNIX_Code_Packed_Format_for_Japanese | EUC-JP |
GB2312 | |
EUC-KR | |
Big5 | |
windows-1250 | |
windows-1251 | |
UTF-8 | |
x-mac-roman | |
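The table above pairs each IANA name with a MIME preferred name where the two differ. A lookup that returns the preferred form could be sketched as follows; `preferred_name` is a hypothetical helper, and `None` marks rows whose second column is empty.

```python
# Rows copied from the table above; None means "same as the IANA name".
MIME_PREFERRED = {
    "ANSI_X3.4-1968": "US-ASCII",
    "ISO_8859-1:1987": "ISO-8859-1",
    "ISO_8859-2:1987": "ISO-8859-2",
    "ISO_8859-5:1988": "ISO-8859-5",
    "ISO_8859-7:1987": "ISO-8859-7",
    "ISO_8859-9:1989": "ISO-8859-9",
    "Shift_JIS": None,
    "Extended_UNIX_Code_Packed_Format_for_Japanese": "EUC-JP",
    "GB2312": None,
    "EUC-KR": None,
    "Big5": None,
    "windows-1250": None,
    "windows-1251": None,
    "UTF-8": None,
    "x-mac-roman": None,
}


def preferred_name(iana_name):
    """Return the MIME preferred name for a listed IANA charset name."""
    if iana_name not in MIME_PREFERRED:
        raise KeyError(f"{iana_name!r} is not in the table")
    return MIME_PREFERRED[iana_name] or iana_name


print(preferred_name("ISO_8859-1:1987"))
print(preferred_name("UTF-8"))
```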
XML currently provides a limited amount of validation via DTDs. However, DTDs do not provide any support for common validation requirements, such as data types, length of strings, number of sub-elements, or pattern matching.
A standard has been proposed to solve this problem. XML Schemas looks like it will do all of this and more. Unfortunately, there are few, if any, parsers available today that understand them.
As a proprietary, interim only solution, we have developed a very simplistic schema format that performs a second level of validation after the parser has read the XML document into memory. We are listing the schema used to validate RSS 0.91 files, so that there will be no ambiguity when validation fails.
Here are the basic rules:
Here is the schema for RSS 0.91.
Here is the DTD for the schema format.