Benjamin Franklin wrote and published Poor Richard's Almanack annually from 1732 to 1758. Paper was expensive and printing difficult and time-consuming. The type would be inked, the sheet of paper laid on the press, the apprentices would press the sheet, by turning a big screw. Then the sheet was removed and hung up to dry. Then you can do another printing of the same page. Do this ten thousand times and you have ten thousand prints of a sheet. Do it ten thousand more to print a second sheet. Then print the second side of the first sheet ten thousand times and print the second side of the second sheet ten thousand times. Fold 20,000 sheets into eighths, cut and bind them into 10,000 thirty-two page pamphlets and you have your Almanacks. As a youth, Franklin was apprenticed to his brother James, also a printer, in Boston. Franklin liked the work, but James drank and beat him, so he ran away to Philadelphia. When James died, Benjamin sent his widowed sister-in-law Ann five...
4 months ago

More from The Universe of Discourse

A complex bug with a ⸢simple⸣ fix

Last month I did a fairly complex piece of systems programming that worked surprisingly well. But it had one big bug that took me a day to track down.

One reason I find the bug interesting is that it exemplifies the sort of challenges that come up in systems programming. The essence of systems programming is that your program is dealing with the state of a complex world, with many independent agents it can't control, all changing things around. Often one can write a program that puts down a wrench and then picks it up again without looking. In systems programming, the program may have to be prepared for the possibility that someone else has come along and moved the wrench.

The other reason the bug is interesting is that although it was a big bug, fixing it required only a tiny change. I often struggle to communicate to nonprogrammers just how finicky and fussy programming is. Nonprogrammers, even people who have taken a programming class or two, are used to being harassed by crappy UIs (or by the compiler) about missing punctuation marks and trivially malformed inputs, and they think they understand how fussy programming is. But they usually do not. The issue is much deeper, and I think this is a great example that will help communicate the point.

The job of my program, called sync-spam, was to move several weeks of accumulated email from system S to system T. Each message was probably spam, but its owner had not confirmed that yet, and the message was not yet old enough to be thrown away without confirmation.

The probably-spam messages were stored on system S in a directory hierarchy with paths like this:

    /spam/2024-10-18/…

where 2024-10-18 was the date the message had been received. Every message system S had received on October 18 was somewhere under /spam/2024-10-18.

One directory, the one for the current date, was "active", and new messages were constantly being written to it by some other programs not directly related to mine.
The directories for the older dates never changed.

Once sync-spam had dealt with the backlog of old messages, it would continue to run, checking periodically for new messages in the active directory.

The sync-spam program had a database that recorded, for each message, whether it had successfully sent that message from S to T, so that it wouldn't try to send the same message again.

The program worked like this:

    Repeat forever:
        Scan the top-level spam directory for the available dates
        For each date D:
            Scan the directory for D and find the messages in it. Add to the database any messages not already recorded there.
            Query the database for the list of messages for date D that have not yet been sent to T
            For each such message:
                Attempt to send the message
                If the attempt was successful, record that in the database
        Wait some appropriate amount of time and continue.

Okay, very good. The program would first attempt to deal with all the accumulated messages in roughly chronological order, processing the large backlog. Let's say that on November 1 it got around to scanning the active 2024-11-01 directory for the first time. There are many messages, and scanning takes several minutes, so by the time it finishes scanning, some new messages will be in the active directory that it hasn't seen. That's okay. The program will attempt to send the messages that it has seen. The next time it comes around to 2024-11-01 it will re-scan the directory and find the new messages that have appeared since the last time around.

But scanning a date directory takes several minutes, so we would prefer not to do it if we don't have to. Since only the active directory ever changes, if the program is running on November 1, it can be sure that none of the directories from October will ever change again, so there is no point in its rescanning them.
In fact, once we have located the messages in a date directory and recorded them in the database, there is no point in scanning it again unless it is the active directory, the one for today's date.

So sync-spam had an elaboration that made it much more efficient. It was able to put a mark on a date directory that meant "I have completely scanned this directory and I know it will not change again". The algorithm was just as I said above, except with these elaborations.

    Repeat forever:
        Scan the top-level spam directory for the available dates
        For each date D:
            If the directory for D is marked as having already been scanned, we already know exactly what messages are in it, since they are already recorded in the database.
            Otherwise:
                Scan the directory for D and find the messages in it. Add to the database any messages not already recorded there.
                If D is not today's date, mark the directory for D as having been scanned completely, because we need not scan it again.
            Query the database for the list of messages for date D that have not yet been sent to T
            For each such message:
                Attempt to send the message
                If the attempt was successful, record that in the database
        Wait some appropriate amount of time and continue.

It's important to not mark the active directory as having been completely scanned, because new messages are continually being deposited into it until the end of the day.

I implemented this, we started it up, and it looked good. For several days it processed the backlog of unsent messages from September and October, and it successfully sent most of them. It eventually caught up to the active directory for the current date, 2024-11-01, scanned it, and sent most of the messages. Then it went back and started over again with the earliest date, attempting to send any messages that it hadn't sent the first time.

But a couple of days later, we noticed that something was wrong.
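The marked-directory version of the loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: it is written in Python rather than the Perl that sync-spam used, a plain dictionary stands in for the spam directory tree, and the names `SyncState`, `run_one_pass`, `spam_tree`, and `send` are hypothetical, not taken from the real program. It follows the steps exactly as they are worded above.

```python
from dataclasses import dataclass, field


@dataclass
class SyncState:
    """Hypothetical in-memory stand-in for sync-spam's database and directory marks."""
    known: dict = field(default_factory=dict)         # date -> set of message ids seen
    sent: set = field(default_factory=set)            # (date, message id) pairs sent to T
    fully_scanned: set = field(default_factory=set)   # dates marked "will not change again"


def run_one_pass(spam_tree, today, state, send):
    """One iteration of the 'repeat forever' loop, following the steps as worded above.

    spam_tree maps a date string such as '2024-10-18' to the message ids
    currently in that date directory; send(date, msg) returns True on success.
    """
    for date in sorted(spam_tree):
        if date not in state.fully_scanned:
            # Scan the directory and record any messages not already known.
            state.known.setdefault(date, set()).update(spam_tree[date])
            # The rule as stated in the text: "if D is not today's date".
            if date != today:
                state.fully_scanned.add(date)
        # Query for this date's unsent messages and try to send each one.
        for msg in sorted(state.known.get(date, set())):
            if (date, msg) not in state.sent and send(date, msg):
                state.sent.add((date, msg))


# Simulated run on November 1 with two old directories and the active one.
tree = {"2024-10-30": {"m1", "m2"}, "2024-10-31": {"m3"}, "2024-11-01": {"m4"}}
state = SyncState()
run_one_pass(tree, today="2024-11-01", state=state, send=lambda date, msg: True)
print(sorted(state.fully_scanned))  # only the October dates get marked
```

A second call to `run_one_pass` rescans only the unmarked 2024-11-01 entry, which is the saving the marks were meant to provide.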
Directories 2024-11-02 and 2024-11-03 had been created and were well-stocked with the messages that had been received on those dates. The program had found the directories for those dates and had marked them as having been scanned, but there were no messages from those dates in its database.

Now why do you suppose that is?

(Spoilers will follow the horizontal line.)

I investigated this in two ways. First, I made sync-spam's logging more detailed and looked at the results. While I was waiting for more logs to accumulate, I built a little tool that would generate a small, simulated spam directory on my local machine, and then I ran sync-spam against the simulated messages, to make sure it was doing what I expected.

In the end, though, neither of these led directly to my solving the problem; I just had a sudden inspiration. This is very unusual for me. Still, I probably wouldn't have had the sudden inspiration if the information from the logging and the debugging hadn't been percolating around my head. Fortune favors the prepared mind.

----------------------------------------

The problem was this: some other agent was creating the 2024-11-02 directory a bit prematurely, say at 11:55 PM on November 1.

Then sync-spam came along in the last minutes of November 1 and started its main loop. It scanned the spam directory for available dates, and found 2024-11-02. It processed the unsent messages from the directories for earlier dates, then looked at 2024-11-02 for the first time. And then, at around 11:58, as per above it would:

    Scan the directory for 2024-11-02 and find the messages in it. Add to the database any messages not already recorded there. There weren't any yet, because it was still 11:58 on November 1.
    If 2024-11-02 is not today's date, mark the directory as having been scanned completely, because we need not scan it again.
Since the 2024-11-02 directory was not the one for today's date (it was still 11:58 on November 1), sync-spam recorded that it had scanned that directory completely and need not scan it again.

Five minutes later, at 00:03 on November 2, there would be new messages in 2024-11-02, which was now the active directory, but sync-spam wouldn't look for them, because it had already marked 2024-11-02 as having been scanned completely.

This complex problem in this large program was completely fixed by changing:

    if ($date ne $self->current_date) {
      $self->mark_this_date_fully_scanned($date_dir);
    }

to:

    if ($date lt $self->current_date) {
      $self->mark_this_date_fully_scanned($date_dir);
    }

(ne and lt are Perl-speak for "not equal to" and "less than".)

Many organizations have their own version of a certain legend, which tells how a famous person from the past was once called out of retirement to solve a technical problem that nobody else could understand. I first heard the General Electric version of the legend, in which Charles Proteus Steinmetz was called out of retirement to figure out why a large complex of electrical equipment was not working.

In the story, Steinmetz walked around the room, looking briefly at each of the large complicated machines. Then, without a word, he took a piece of chalk from his pocket, marked one of the panels, and departed. When the puzzled engineers removed that panel, they found a failed component, and when that component was replaced, the problem was solved.

Steinmetz's consulting bill for $10,000 arrived the following week. Shocked, the bean-counters replied that $10,000 seemed an exorbitant fee for making a single chalk mark, and, hoping to embarrass him into reducing the fee, asked him to itemize the bill.

Steinmetz returned the itemized bill:

    One chalk mark                $1.00
    Knowing where to put it   $9,999.00
    TOTAL                    $10,000.00

This felt like one of those times. Any day when I can feel a connection with Charles Proteus Steinmetz is a good day.
This episode also makes me think of the following variation on an old joke: A: Ask me what is the most difficult thing about systems programming. B: Okay, what is the most difficult thing ab— A: TIMING!
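
For readers who want to replay the one-token change, here is a minimal sketch of the two checks side by side. It is in Python rather than the original Perl, and the function names `should_mark_buggy` and `should_mark_fixed` are made up for illustration. It relies on the dates being written as YYYY-MM-DD strings, as in the directory names above, so that string order matches calendar order.

```python
def should_mark_buggy(date_dir, current_date):
    # Original check (Perl `ne`): mark any directory that is not today's.
    return date_dir != current_date


def should_mark_fixed(date_dir, current_date):
    # Fixed check (Perl `lt`): mark only directories strictly before today.
    # YYYY-MM-DD strings compare in chronological order as plain strings.
    return date_dir < current_date


current_date = "2024-11-01"   # it is still 11:58 PM on November 1
premature_dir = "2024-11-02"  # already created by some other agent

print(should_mark_buggy(premature_dir, current_date))  # marked too early
print(should_mark_fixed(premature_dir, current_date))  # left unmarked
```

For any directory dated today or earlier the two checks agree; they differ only for a directory dated in the future, which is exactly the case nobody expected to exist.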

4 months ago 22 votes
Another corner of Pennsylvania

A couple of years back I wrote:

    I live in southeastern Pennsylvania, so the Pennsylvania-New Jersey-Delaware triple point must be somewhere nearby. I sat up and got my phone so I could look at the map, and felt foolish.

    As you can see, the triple point is in the middle of the Delaware River, as of course it must be; the entire border between Pennsylvania and New Jersey, all the hundreds of miles from its northernmost point (near Port Jervis) to its southernmost (shown above), runs right down the middle of the Delaware.

    I briefly considered making a trip to get as close as possible, and photographing the point from land. That would not be too inconvenient. Nearby Marcus Hook is served by commuter rail. But Marcus Hook is not very attractive as a destination. Having been to Marcus Hook, it is hard for me to work up much enthusiasm for a return visit.

I was recently passing by Marcus Hook on the way back from Annapolis, so I thought what the heck, I'd stop in and see if I could get a look in the direction of the tripoint. As you can see from this screencap, I was at least standing in the right place, pointed in the right direction.

I didn't quite see the tripoint itself because this buoyancy-operated aquatic transport was in the way. I don't mind; it was more interesting to look at than open water would have been.

Thanks to the Wonders of the Internet, I have learned that this is an LPG tanker. Hydrocarbons from hundreds of miles away are delivered to the refinery in Marcus Hook via rail, road, and pipeline, and then shipped out on vessels like this one. Infrastructure fans should check it out.

I was pleased to find that Marcus Hook wasn't as dismal as I remembered; it's just a typical industrial small town. I thought maybe I should go back and look around some more.

If you hoped I might have something more interesting or even profound to say here, sorry.

Oh, I know.
Here, I took this picture in Annapolis: Perhaps he who is worthy of honor does not die. But fame is fleeting. Even if he who is worthy of honor does get a plinth, the grateful populace may not want to shell out for a statue.

4 months ago 12 votes
Dancing bread

Marnanel Thurman reported the following item that they found in an 1875 book titled How to Entertain a Social Party:

    To Make a Loaf of Bread Dance on the Table. — Having a quill filled with quicksilver and stopped close, you secretly thrust it into a hot roll or loaf, which will put it in motion.

(Bottom of page 46.) No further explanation is given.

This may remind you of an episode from Huckleberry Finn:

    Well, then I happened to think how they always put quicksilver in loaves of bread and float them off, because they always go right to the drownded carcass and stop there.

(Chapter 8.)

When I first read this I assumed it was a local Southern superstition, characteristic of that place and time. But it seems not! According to this article by Dan Rolph of the Historical Society of Pennsylvania, the belief was longstanding and widespread, lasting from at least 1767 to 1872, and appearing also in London and in Pennsylvania.

Details of the dancing bread trick are lacking. I guess the quicksilver stays inside the stopped-up quill. (Otherwise, there would be no need to “stop it close”.) Then perhaps on being heated by the bread, the quicksilver expands lengthwise as in a thermometer, and then… my imagination fails me.

The procedure for making drowned-body-finding bread is quite different. Rolph's sources all agree: you poke in your finger and scoop out a bit of the inside, pour the quicksilver into the cavity, and then plug up the hole. So there's no quill; the quicksilver is just sloshing around loose in there. Huckleberry Finn agrees:

    I took out the plug and shook out the little dab of quicksilver…

Does anyone have more information about this? Does hot bread filled with mercury really dance on the table, and if so why? Is the superstition about bread finding drowned bodies related to this, or is it a coincidence?

Also, what song did the sirens sing, and by what name was Achilles called when he hid among women?

4 months ago 11 votes
XKCD game theory question

(Source: XKCD “Exam numbers”.)

This post is about the bottom center panel, “Game Theory final exam”. I don't know much about game theory and I haven't seen any other discussion of this question. But I have a strategy I think is plausible and I'm somewhat pleased with.

(I assume that answers to the exam question must be real numbers, not $\infty$, and that “average” here is short for 'arithmetic mean'.)

First, I believe the other players and I must find a way to agree on what the average will be, or else we are all doomed. We can't communicate, so we should choose a Schelling point and hope that everyone else chooses the same one. Fortunately, there is only one distinguished choice: zero.

So I will try to make the average zero and I will hope that others are trying to do the same. If we succeed in doing this, any winning entry will therefore be $10$. Not all $n$ players can win because the average must be $0$. But $n-1$ can win, if the one other player writes $-10(n-1)$. So my job is to decide whether I will be the loser. I should select a random integer between $0$ and $n-1$. If it is zero, I have drawn a short straw, and will write $-10(n-1)$. Otherwise I write $10$.

(The straw-drawing analogy is perhaps misleading. Normally, exactly one straw is short. Here, any or all of the straws might be short.)

If everyone follows this strategy, then I will win if exactly one person draws a short straw and if that one person isn't me. The former has a probability that rapidly approaches $\frac1e$ as $n$ increases, and the latter is $\frac{n-1}n$. In an $n$-person class, the probability of my winning is $$\left(\frac{n-1}n\right)^n$$ which is already better than $\frac14$ when $n = 3$, and it increases slowly toward $\frac1e$ after that.

Some miscellaneous thoughts:

The whole thing depends on my idea that everyone will agree on $0$ as a Schelling point. Is that even how Schelling points work? Maybe I don't understand Schelling points.

I like that the probability $\frac1e$ appears. It's surprising how often this comes up, often when multiple agents try to coordinate without communicating.
For example, in ALOHAnet a number of ground stations independently try to send packets to a single satellite transceiver, but if more than one tries to send a packet at a particular time, the packets are garbled and must be retransmitted. At most $\frac1e$ of the available bandwidth can be used, the rest being lost to packet collisions.

The first strategy I thought of was plausible but worse: flip a coin, and write down $10$ if it is heads and $-10$ if it is tails. With this strategy I win if exactly $\frac n2$ of the class flips heads and if I do too. The probability of this happening is only $$\frac{n\choose n/2}{2^n}\cdot \frac12 \approx \frac1{\sqrt{2\pi n}}.$$ Unlike the other strategy, this decreases to zero as $n$ increases, and in no case is it better than the first strategy. It also fails badly if the class contains an odd number of people.

Thanks to Brian Lee for figuring out the asymptotic value of that probability so I didn't have to.

Just because this was the best strategy I could think of in no way means that it is the best there is. There might have been something much smarter that I did not think of, and if there is then my strategy will sabotage everyone else.

Game theorists do think of all sorts of weird strategies that you wouldn't expect could exist. I wrote an article about one a few years back.

Going in the other direction, even if $n-1$ of the smartest people all agree on the smartest possible strategy, if the $n$th person is Leeroy Jenkins, he is going to ruin it for everyone.

If I were grading this exam, I might give full marks to anyone who wrote down either $10$ or $-10(n-1)$, even if the average came out to something else.

For a similar and also interesting but less slippery question, see Wikipedia's article on Guess ⅔ of the average. Much of the discussion there is directly relevant. For example, “For Nash equilibrium to be played, players would need to assume both that everyone else is rational and that there is common knowledge of rationality.
However, this is a strong assumption.”

LEEROY JENKINS!!

[…] players (including Vidkun) win if exactly one of them rolls zero. Vidkun's chance of winning increases. Intuitively, the other players' chances of winning ought to decrease. But by how much? I think I keep messing up the calculation because I keep getting zero. If this were actually correct, it would be a fascinating paradox!
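
The two winning probabilities given earlier in this post are easy to tabulate exactly, which gives a quick check on the claim that the coin-flip strategy is never better. The sketch below is only that, a sketch: the function names `short_straw_win` and `coin_flip_win` are invented here, and it assumes that in the short-straw strategy each of the $n$ players independently draws the short straw with probability $\frac1n$, which is what the formula $\left(\frac{n-1}n\right)^n$ implies.

```python
from fractions import Fraction
from math import comb, e, pi, sqrt


def short_straw_win(n):
    """P(I win) when each of n players independently draws short with probability 1/n."""
    return Fraction(n - 1, n) ** n


def coin_flip_win(n):
    """P(I win) when everyone flips a fair coin; zero for odd n, since n/2 heads is impossible."""
    if n % 2:
        return Fraction(0)
    return Fraction(comb(n, n // 2), 2 ** n) / 2


# Columns: n, short-straw probability, coin-flip probability, 1/sqrt(2*pi*n).
for n in (2, 4, 10, 30, 100):
    print(n, float(short_straw_win(n)), float(coin_flip_win(n)), 1 / sqrt(2 * pi * n))
print("1/e =", 1 / e)
```

The two strategies tie at exactly $\frac14$ when $n = 2$; from there the short-straw figure climbs toward $\frac1e$ while the coin-flip figure falls off like $\frac1{\sqrt{2\pi n}}$.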

4 months ago 10 votes
I DON'T KNOW

If you're an annoying know-it-all like me, I suggest that you try playing the following game when you attend a conference or a user group meetup or even a work meeting. The game is: If someone asks you a question, and you say “I don't know”, you score a point. That's it. That's the game. “I don't know” doesn't have to be perfectly truthful, only approximately truthful. I forgot, there is one other rule: If you follow up with something like “But if I had to guess…” you lose your point again.

4 months ago 9 votes

More in comics

Saturday Morning Breakfast Cereal - Total

Hovertext: If everyone gets killed because a neural network can't analyze itself, you owe me five bucks.

9 hours ago 1 vote
AlphaMove
2 days ago 2 votes
Saturday Morning Breakfast Cereal - Origin Story

Hovertext: Also will pay you but still not consider it REAL job.

3 days ago 2 votes
Humidifier Review
4 days ago 3 votes
Saturday Morning Breakfast Cereal - Axial

Hovertext: Have you noticed how effective San Francisco is at producing ways to drop out of reality through technology?

5 days ago 3 votes