In the server mess I had to take the FeedHose server off the air so I could reorganize things. Now I'm ready to turn it on again.
This server is for interop testing only. Please don't build serious systems on it.
The default hose on this server is Hacker News.
The main entrypoint is the default page. It's meant to be called from an application. It's a long-poll, meaning it returns when something new is available from the indicated hose, or if it times out. In either case you're expected to loop back around and call it again.
Here's an example call that times out after three seconds and returns JSON-formatted news from Hacker News. (If no new stories came in during the three seconds, which is almost a certainty, you'll just get metadata.)
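Something like this, where the host name is a hypothetical stand-in for the actual server address:

```
http://feedhose.example.com/?timeout=3&format=json
```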
There are four optional parameters, all of which have defaults:
1. name, the name of the hose you're inquiring about. If it's not specified, the information returned is about the default hose (on this server, Hacker News).
2. timeout, the number of seconds before the call times out. If it's not specified, the default value is 120 seconds.
3. format, whether the return is in XML or JSON. The default is XML.
4. seed, a string returned from a previous call to the hose, passed back to the next call. If any new items came in while you were processing the previous call, or if your client rebooted or otherwise went away for a long time, passing back the seed assures that you don't miss any news. If you don't mind possibly missing some news, you don't have to specify the seed.
None of the parameters are case-sensitive. In other words, JSON is the same as json is the same as JsOn.
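To make the loop concrete, here's a minimal Python sketch of a long-poll client. The host name is hypothetical, and the shape of the JSON response (a seed plus a list of items) is my assumption based on the description above:

```python
import requests  # third-party HTTP client library

HOSE_URL = "http://feedhose.example.com/"  # hypothetical stand-in for the real server

def watch_hose(name="hackerNews", timeout=120):
    seed = None
    while True:
        params = {"name": name, "timeout": timeout, "format": "json"}
        if seed is not None:
            params["seed"] = seed  # pass the seed back so no items are missed
        r = requests.get(HOSE_URL, params=params, timeout=timeout + 10)
        data = r.json()
        seed = data.get("seed")  # assumed field name; save it for the next call
        for item in data.get("items", []):  # assumed field name; empty on a timeout
            print(item["title"], item["link"])

watch_hose()
```

The HTTP timeout is set a bit longer than the hose timeout, so the client doesn't give up while the server is still legitimately holding the connection open.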
The values returned are straight from RSS 2.0: elements like title, link, description, pubDate, etc. There are three additional values that are passed back: the feedUrl, feedTitle and feedLink of the feed the item came from. Hoses can be made up of many feeds, and it's sometimes useful to know which feed an item came from.
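As an illustration (the exact envelope isn't documented here, so treat the field layout as an assumption), a single item in the JSON format might look like this:

```json
{
  "title": "Show HN: A realtime feed reader",
  "link": "http://example.com/story",
  "description": "A short summary of the story.",
  "pubDate": "Thu, 09 Jun 2011 15:30:00 GMT",
  "feedUrl": "http://news.ycombinator.com/rss",
  "feedTitle": "Hacker News",
  "feedLink": "http://news.ycombinator.com/"
}
```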
There are two other entry-points:
1. recent, which just returns the latest stories. It's useful for seeing what you get back from the feedhose while you're developing, or for showing someone roughly what the hose looks like at a technical level. It accepts the optional name and format parameters described above, which determine which hose is queried and what format the returned text is in. There's an additional optional parameter, count. If not specified it's 3. The maximum count is 15.
2. seed, which returns the current seed without waiting and without returning any news. You can specify name and format as optional parameters, as described above.
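As with the main entrypoint, here are sketches of what calls to these entry-points might look like; the host name and the path-style addressing are assumptions:

```
http://feedhose.example.com/recent?format=json&count=5
http://feedhose.example.com/seed?format=json
```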
As part of my server reorg, I now maintain a mirror of the static content of scripting.com on S3. The address of this archive is s3.scripting.com. If you go there you'll get an error message, so there's no need to report it.
This backup includes the .htaccess files and sitemaps, and may include other metadata. If you have any suggestions let me know.
Uploading the huge folder was a messy operation.
So here's a feature request for the Amazon team: Allow us to upload a zip archive containing the initial contents of a bucket. That way I can arrange it exactly as I want it, and have the upload take a relatively short period of time. All the tools I use are relatively slow at uploading tens of thousands of files in a single batch.
Even better, of course, would be if Amazon could understand the .htaccess files, and even understand a subset of the httpd.conf file. Then we'd really be cooking with gas!
As part of my server reorganization, I'm once again maintaining sitemaps on scripting.com.
There's a single sitemapindex file that links to all the sitemaps.
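A sitemapindex follows the standard sitemaps.org format; the individual sitemap file names below are hypothetical:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://scripting.com/sitemap1.xml</loc>
    <lastmod>2011-06-09</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://scripting.com/sitemap2.xml</loc>
  </sitemap>
</sitemapindex>
```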
I also released the code that generates the sitemaps, running in the OPML Editor. Here's a source listing.
Update: When the nightly backup code runs it also produces a JSON file listing the URLs of files that changed since the last time it backed up. It's stored in a calendar-structured folder.
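A calendar-structured folder means one file per day, nested year/month/day. As a hypothetical sketch, a path like 2011/06/09.json might contain a list of the changed URLs:

```json
["http://scripting.com/stories/2011/06/09/somePost.html",
 "http://scripting.com/images/someChangedImage.jpg"]
```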