More from The Universe of Discourse
Last month I did a fairly complex piece of systems programming that worked surprisingly well. But it had one big bug that took me a day to track down. One reason I find the bug interesting is that it exemplifies the sort of challenges that come up in systems programming. The essence of systems programming is that your program is dealing with the state of a complex world, with many independent agents it can't control, all changing things around. Often one can write a program that puts down a wrench and then picks it up again without looking. In systems programming, the program may have to be prepared for the possibility that someone else has come along and moved the wrench. The other reason the bug is interesting is that although it was a big bug, fixing it required only a tiny change. I often struggle to communicate to nonprogrammers just how finicky and fussy programming is. Nonprogrammers, even people who have taken a programming class or two, are used to being harassed by crappy UIs (or by the compiler) about missing punctuation marks and trivially malformed inputs, and they think they understand how fussy programming is. But they usually do not. The issue is much deeper, and I think this is a great example that will help communicate the point. The job of my program, called sync-spam, was to move several weeks of accumulated email from system S to system T. Each message was probably spam, but its owner had not confirmed that yet, and the message was not yet old enough to be thrown away without confirmation. The probably-spam messages were stored on system S in a directory hierarchy with paths like this: /spam/2024-10-18/… where 2024-10-18 was the date the message had been received. Every message system S had received on October 18 was somewhere under /spam/2024-10-18. One directory, the one for the current date, was "active", and new messages were constantly being written to it by some other programs not directly related to mine. The directories for the older dates never changed. Once sync-spam had dealt with the backlog of old messages, it would continue to run, checking periodically for new messages in the active directory. The sync-spam program had a database that recorded, for each message, whether it had successfully sent that message from S to T, so that it wouldn't try to send the same message again. The program worked like this: Repeat forever: Scan the top-level spam directory for the available dates For each date D: Scan the directory for D and find the messages in it. Add to the database any messages not already recorded there. Query the database for the list of messages for date D that have not yet been sent to T For each such message: Attempt to send the message If the attempt was successful, record that in the database Wait some appropriate amount of time and continue. Okay, very good. The program would first attempt to deal with all the accumulated messages in roughly chronological order, processing the large backlog. Let's say that on November 1 it got around to scanning the active 2024-11-01 directory for the first time. There are many messages, and scanning takes several minutes, so by the time it finishes scanning, some new messages will be in the active directory that it hasn't seen. That's okay. The program will attempt to send the messages that it has seen. The next time it comes around to 2024-11-01 it will re-scan the directory and find the new messages that have appeared since the last time around. But scanning a date directory takes several minutes, so we would prefer not to do it if we don't have to. Since only the active directory ever changes, if the program is running on November 1, it can be sure that none of the directories from October will ever change again, so there is no point in its rescanning them. In fact, once we have located the messages in a date directory and recorded them in the database, there is no point in scanning it again unless it is the active directory, the one for today's date. So sync-spam had an elaboration that made it much more efficient. It was able to put a mark on a date directory that meant "I have completely scanned this directory and I know it will not change again". The algorithm was just as I said above, except with these elaborations. Repeat forever: Scan the top-level spam directory for the available dates For each date D: If the directory for D is marked as having already been scanned, we already know exactly what messages are in it, since they are already recorded in the database. Otherwise: Scan the directory for D and find the messages in it. Add to the database any messages not already recorded there. If D is not today's date, mark the directory for D as having been scanned completely, because we need not scan it again. Query the database for the list of messages for date D that have not yet been sent to T For each such message: Attempt to send the message If the attempt was successful, record that in the database Wait some appropriate amount of time and continue. It's important to not mark the active directory as having been completely scanned, because new messages are continually being deposited into it until the end of the day. I implemented this, we started it up, and it looked good. For several days it processed the backlog of unsent messages from September and October, and it successfully sent most of them. It eventually caught up to the active directory for the current date, 2024-11-01, scanned it, and sent most of the messages. Then it went back and started over again with the earliest date, attempting to send any messages that it hadn't sent the first time. But a couple of days later, we noticed that something was wrong. Directories 2024-11-02 and 2024-11-03 had been created and were well-stocked with the messages that had been received on those dates. The program had found the directories for those dates and had marked them as having been scanned, but there were no messages from those dates in its database. Now why do you suppose that is? (Spoilers will follow the horizontal line.) I investigate this in two ways. First, I made sync-spam's logging more detailed and looked at the results. While I was waiting for more logs to accumulate, I built a little tool that would generate a small, simulated spam directory on my local machine, and then I ran sync-spam against the simulated messages, to make sure it was doing what I expected. In the end, though, neither of these led directly to my solving the problem; I just had a sudden inspiration. This is very unusual for me. Still, I probably wouldn't have had the sudden inspiration if the information from the logging and the debugging hadn't been percolating around my head. Fortune favors the prepared mind. The problem was this: some other agent was creating the 2024-11-02 directory a bit prematurely, say at 11:55 PM on November 1. Then sync-spam came along in the last minutes of November 1 and started its main loop. It scanned the spam directory for available dates, and found 2024-11-02. It processed the unsent messages from the directories for earlier dates, then looked at 2024-11-02 for the first time. And then, at around 11:58, as per above it would: Scan the directory for 2024-11-02 and find the messages in it. Add to the database any messages not already recorded there. There weren't any yet, because it was still 11:58 on November 1. If 2024-11-02 is not today's date, mark the directory as having been scanned completely, because we need not scan it again. Since the 2024-11-02 directory was not the one for today's date — it was still 11:58 on November 1 — sync-spam recorded that it had scanned that directory completely and need not scan it again. Five minutes later, at 00:03 on November 2, there would be new messages in the 2024-11-02, which was now the active directory, but sync-spam wouldn't look for them, because it had already marked 2024-11-02 as having been scanned completely. This complex problem in this large program was completely fixed by changing: if ($date ne $self->current_date) { $self->mark_this_date_fully_scanned($date_dir); } to: if ($date lt $self->current_date) { $self->mark_this_date_fully_scanned($date_dir); } (ne and lt are Perl-speak for "not equal to" and "less than".) Many organizations have their own version of a certain legend, which tells how a famous person from the past was once called out of retirement to solve a technical problem that nobody else could understand. I first heard the General Electric version of the legend, in which Charles Proteus Steinmetz was called out of retirement to figure out why a large complex of electrical equipment was not working. In the story, Steinmetz walked around the room, looking briefly at each of the large complicated machines. Then, without a word, he took a piece of chalk from his pocket, marked one of the panels, and departed. When the puzzled engineers removed that panel, they found a failed component, and when that component was replaced, the problem was solved. Steinmetz's consulting bill for $10,000 arrived the following week. Shocked, the bean-counters replied that $10,000 seemed an exorbitant fee for making a single chalk mark, and, hoping to embarrass him into reducing the fee, asked him to itemize the bill. Steinmetz returned the itemized bill: One chalk mark $1.00 Knowing where to put it $9,999.00 TOTAL $10,000.00 This felt like one of those times. Any day when I can feel a connection with Charles Proteus Steinmetz is a good day. This episode also makes me think of the following variation on an old joke: A: Ask me what is the most difficult thing about systems programming. B: Okay, what is the most difficult thing ab— A: TIMING!
[ Previously: [1] [2] [3] ] A couple of years back I wrote: I live in southeastern Pennsylvania, so the Pennsylvania-New Jersey-Delaware triple point must be somewhere nearby. I sat up and got my phone so I could look at the map, and felt foolish. As you can see, the triple point is in the middle of the Delaware River, as of course it must be; the entire border between Pennsylvania and New Jersey, all the hundreds of miles from its northernmost point (near Port Jervis) to its southernmost (shown above), runs right down the middle of the Delaware. I briefly considered making a trip to get as close as possible, and photographing the point from land. That would not be too inconvenient. Nearby Marcus Hook is served by commuter rail. But Marcus Hook is not very attractive as a destination. Having been to Marcus Hook, it is hard for me to work up much enthusiasm for a return visit. I was recently passing by Marcus Hook on the way back from Annapolis, so I thought what the heck, I'd stop in and see if I could get a look in the direction of the tripoint. As you can see from this screencap, I was at least standing in the right place, pointed in the right direction. I didn't quite see the tripoint itself because this buoyancy-operated aquatic transport was in the way. I don't mind, it was more interesting to look at than open water would have been. Thanks to the Wonders of the Internet, I have learned that this is an LPG tanker. Hydrocarbons from hundreds of miles away are delivered to the refinery in Marcus Hook via rail, road, and pipeline, and then shipped out on vessels like this one. Infrastructure fans should check it out. I was pleased to find that Marcus Hook wasn't as dismal as I remembered, it's just a typical industrial small town. I thought maybe I should go back and look around some more. If you hoped I might have something more interesting or even profound to say here, sorry. Oh, I know. Here, I took this picture in Annapolis: Perhaps he who is worthy of honor does not die. But fame is fleeting. Even if he who is worthy of honor does get a plinth, the grateful populace may not want to shell out for a statue.
Marnanel Thurman reported the following item that they found in an 1875 book titled How to Entertain a Social Party: To Make a Loaf of Bread Dance on the Table. — Having a quill filled with quicksilver and stopped close, you secretly thrust it into a hot roll or loaf, which will put it in motion. (Bottom of page 46.) No further explanation is given. This may remind you of an episode from Huckleberry Finn: Well, then I happened to think how they always put quicksilver in loaves of bread and float them off, because they always go right to the drownded carcass and stop there. (Chapter 8.) When I first read this I assumed it was a local Southern superstition, characteristic of that place and time. But it seems not! According to this article by Dan Rolph of the Historical Society of Pennsylvania, the belief was longstanding and widespread, lasting from at least 1767 to 1872, and appearing also in London and in Pennsylvania. Details of the dancing bread trick are lacking. I guess the quicksilver stays inside the stopped-up quill. (Otherwise, there would be no need to “stop it close”.) Then perhaps on being heated by the bread, the quicksilver expands lengthwise as in a thermometer, and then… my imagination fails me. The procedure for making drowned-body-finding bread is quite different. Rolph's sources all agree: you poke in your finger and scoop out a bit of the inside, pour the quicksilver into the cavity, and then plug up the hole. So there's no quill; the quicksilver is just sloshing around loose in there. Huckleberry Finn agrees: I took out the plug and shook out the little dab of quicksilver… Does anyone have more information about this? Does hot bread filled with mercury really dance on the table, and if so why? Is the supersition about bread finding drowned bodies related to this, or is it a coincidence? Also, what song did the sirens sing, and by what name was Achilles called when he hid among women?
If you're an annoying know-it-all like me, I suggest that you try playing the following game when you attend a conference or a user group meetup or even a work meeting. The game is: If someone asks you a question, and you say “I don't know”, you score a point. That's it. That's the game. “I don't know” doesn't have to be perfectly truthful, only approximately truthful. I forgot, there is one other rule: If you follow up with something like “But if I had to guess…” you lose your point again.
More in comics
Click here to go see the bonus panel! Hovertext: If everyone gets killed because a neural network can't analyze itself, you owe me five bucks. Today's News:
Click here to go see the bonus panel! Hovertext: Also will pay you but still not consider it REAL job. Today's News:
Click here to go see the bonus panel! Hovertext: Have you noticed how effective San Francisco is at producing ways to drop out of reality through technology? Today's News: