I've now had a chance to study the problem reported with River5 a few days ago.
The first part of solving it was writing down concisely what the problem was. Carsten Senger did a great job, but he isn't responsible for the fix; I am. I wrote the code, and I'm familiar with how it's organized and how it got to be how it is.
There are two kinds of rivers: ones associated with a list, and ones associated with a feed. The problem applies to both kinds of rivers, but is more likely to show up in the feed-based ones.
When a new item comes in, it is added to the rivers of all the lists it's in, and to the river for the feed it came from. The rivers are stored in files on disk. We cache them in memory. When we want to add an item to a river, we first check if it's in memory. If it is, we add the item and we're done. If it's not in memory, we read it from disk, and then add the item to the river. This is where we run into trouble.
The trouble is that there might be two or more new items from one feed for one river. The first item starts a read of the river file, and it gets added okay once that read finishes. But when we try to add the second item, the first read is usually still in progress, because reading the file takes so long. We find the river isn't in the cache, so we start a second read. We add our item to the copy that second read loaded, but the first item probably isn't in that copy. It would be an amazing coincidence if it was. So no matter what, we just lost one of the new items from the river. If N new items arrive while the river is still not in the cache, we will lose N-1 of them.
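Here's a minimal sketch of that failure, not the actual River5 code. The names `cache` and `naiveAddItem` are made up for illustration, and the disk read is passed in as a function so the timing can be controlled.

```javascript
// Sketch only: cache and naiveAddItem are hypothetical names, not River5's.
const cache = {}; // filename -> river object held in memory

// read(fname, done) stands in for a slow, asynchronous disk read.
function naiveAddItem(fname, item, read, done) {
  if (cache[fname] !== undefined) { // river already in memory
    cache[fname].items.push(item);
    done();
    return;
  }
  // Cache miss: every caller that gets here starts its own read.
  read(fname, function (river) {
    river.items.push(item);
    cache[fname] = river; // a later read replaces the earlier copy
    done();
  });
}
```

If two calls arrive before the first read completes, both miss the cache and both start a read. Each one adds its item to its own copy of the river, and whichever read finishes last replaces the other in the cache, so only one of the two items survives.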
The best solution is this: create a queue for each river when the first read is initiated. Add its callback as the first, and at that time only, item in the queue. If a new read comes in while we're still reading the first one, add its callback to the queue. Once the file is read, call all the callbacks in the queue, one after another, and delete the queue for that file.
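The queue idea can be sketched roughly like this. It's an illustration under assumptions, not the code that shipped: `riverCache`, `readQueues`, `getRiver` and the `readFromDisk` parameter are all hypothetical names.

```javascript
// Sketch only: all names here are hypothetical, not River5's.
const riverCache = {}; // filename -> river object held in memory
const readQueues = {}; // filename -> callbacks waiting on a read in progress

// readFromDisk(fname, done) stands in for the asynchronous file read.
function getRiver(fname, readFromDisk, callback) {
  if (riverCache[fname] !== undefined) { // already in memory
    callback(riverCache[fname]);
    return;
  }
  if (readQueues[fname] !== undefined) { // a read is in progress, so wait for it
    readQueues[fname].push(callback);
    return;
  }
  readQueues[fname] = [callback]; // first request creates the queue and starts the read
  readFromDisk(fname, function (river) {
    riverCache[fname] = river;
    const queue = readQueues[fname];
    delete readQueues[fname]; // the queue only lives as long as the read
    queue.forEach(function (cb) {
      cb(river); // every waiting caller gets the same river object
    });
  });
}
```

Each caller adds its item inside its callback, for example `getRiver(fname, readFromDisk, function (river) { river.items.push(item); })`. Because every queued callback receives the same object, all the new items end up in the one copy that stays in the cache.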
I also considered doing it brute force, simply reading all the rivers at startup before doing any feed reads. But I wanted to write the code. And when I did, I was glad. It's really interesting how well JavaScript handles this kind of gymnastics. I laughed out loud a few times while putting it together. Code that makes you laugh is worth writing imho. 💥