More from Computer Things
I'm making a more focused effort to juggle this year. Mostly boxes, but also classic balls too.1 I've gotten to the point where I can almost consistently do a five-ball cascade, which I thought was the cutoff to being a "good juggler". "Thought" because I now know a "good juggler" is one who can do the five-ball cascade with outside throws. I know this because I can't do the outside five-ball cascade... yet. But it's something I can see myself eventually mastering, unlike the slightly more difficult trick of the five-ball mess, which is impossible for mere mortals like me.

In theory there is a spectrum of trick difficulties and skill levels. I could place myself on the axis like this:

In practice, there are three tiers:

1. Toddlers
2. Good jugglers who practice hard
3. Genetic freaks and actual wizards

And the graph always, always looks like this:

This is the juggler's curse, and it's a three-parter:

1. The threshold between you and "good" is the next trick you cannot do.
2. Everything below that level is trivial. Once you've gotten a trick down, you can never go back to not knowing it, to appreciating how difficult it was to learn in the first place.2
3. Everything above that level is just "impossible". You don't have the knowledge needed to recognize the different tiers.3

So as you get better, the stuff that was impossible becomes differentiable, and you can see that some of it is possible. And everything you learned becomes trivial. So you're never a good juggler until you learn "just one more hard trick". The more you know, the more you know you don't know and the less you know you know.

This is supposed to be a software newsletter

A monad is a monoid in the category of endofunctors, what's the problem? (src)

I think this applies to any difficult topic? Most fields don't have the same stark spectral lines as juggling, but there are still tiers of difficulty to techniques, which get compressed the further in either direction they are from your current level.

Like, I'm not good at formal methods. I've written two books on it but I've never mastered a dependently-typed language or a theorem prover. Those are equally hard. And I'm not good at modeling concurrent systems because I don't understand the formal definition of bisimulation and haven't implemented a Raft. Those are also equally hard, in fact exactly as hard as mastering a theorem prover. At the same time, the skills I've already developed are easy: properly using refinement is exactly as easy as writing a wrapped counter. Then I get surprised when I try to explain strong fairness to someone and they just don't get how □◇(ENABLED〈A〉ᵥ) is obviously different from ◇□(ENABLED〈A〉ᵥ). Juggler's curse!

Now I don't know if this is actually how everybody experiences expertise or if it's just my particular personality: I was a juggler long before I was a software developer. Then again, I'd argue that lots of people talk about one consequence of the juggler's curse: imposter syndrome. If you constantly think what you know is "trivial" and what you don't know is "impossible", then yeah, you'd start feeling like an imposter at work real quick.

I wonder if part of the cause is that a lot of skills you have to learn are invisible. One of my favorite blog posts ever is In Defense of Blub Studies, which argues that software expertise comes through understanding "boring" topics like "what all of the error messages mean" and "how to use a debugger well". Blub is a critical part of expertise and takes a lot of hard work to learn, but it feels like trivia.
So looking back on a skill I mastered, I might think it was "easy" because I'm not including all of the blub that I had to learn, too.

The takeaway, of course, is that the outside five-ball cascade is objectively the cutoff between good jugglers and toddlers.

1. Rant time: I love cigar box juggling. It's fun, it's creative, it's totally unlike any other kind of juggling. And it's so niche I straight up cannot find anybody in Chicago to practice with. I once went to a juggling convention and was the only person with a cigar box set there. ↩
2. This particular part of the juggler's curse is also called the curse of knowledge or "expert blindness". ↩
3. This isn't Dunning-Kruger, because DK says that people think they are better than they actually are, and also may not actually be real. ↩
First of all, I just released version 0.6 of Logic for Programmers! You can get it here. Release notes in the footnote.1

I've been thinking about my next project after the book's done. One idea is to do a survey of new formal specification languages. There's been a lot of new ones in the past few years (P, Quint, etc), plus some old ones I haven't critically examined (SPIN, mCRL2). I'm thinking of a brief overview of each, what's interesting about it, and some examples of the corresponding models. For this I'd want a set of "Rosetta" examples. Rosetta Code is a collection of programming tasks done in different languages. For example, "99 bottles of beer on the wall" in over 300 languages. If I wanted to make a Rosetta Code for specifications of concurrent systems, what examples would I use?

What makes a good Rosetta example?

A good Rosetta example would be simple enough to understand and implement but also showcase the differences between the languages. A good example of a Rosetta example is leftpad for code verification. Proving leftpad correct is short in whatever verification language you use. But the proofs themselves are different enough that you can compare what it's like to use code contracts vs dependent types, etc.

A bad Rosetta example is "hello world". While it's good for showing how to run a language, it doesn't clearly differentiate languages. Haskell's "hello world" is almost identical to BASIC's "hello world".

Rosetta examples don't have to be flashy, but I want mine to be flashy. Formal specification is niche enough that regardless of my medium, most of my audience hasn't used it and may be skeptical. I always have to be selling. This biases me away from using things like dining philosophers or two-phase commit.

So with that in mind, three ideas:

1. Wrapped Counter

A counter that starts at 1 and counts to N, after which it wraps around to 1 again.

Why it's good

This is a good introductory formal specification: it's a minimal stateful system without concurrency or nondeterminism. You can use it to talk about the basic structure of a spec, how a verifier works, etc. It's also a good way of introducing "boring" semantics, like conditionals and arithmetic, and checking if the language does anything unusual with them. Alloy, for example, defaults to 4-bit signed integers, so you run into problems if you set N too high.2

At the same time, wrapped counters are a common building block of complex systems. Lots of things can be represented this way: N=2 is a flag or blinker, N=3 is a traffic light, N=24 is a clock, etc. The next example is better for showing basic safety and liveness properties, but this will do in a pinch.

2. Threads

A counter starts at 0. N threads each simultaneously try to update the counter. They do this nonatomically: first they read the value of the counter and store that in a thread-local tmp, then they increment tmp, then they set the counter to tmp. The expected behavior is that the final value of the counter will be N.

Why it's good

The system as described is bugged. If two threads interleave their reads and writes, one thread's update can "clobber" the other's and the counter can go backwards. To my surprise, most people do not see this error. So it's a good showcase of how the language actually finds real bugs, and how it can verify fixes. As to actual language topics: the spec covers concurrency and tracks process-local state. A good spec language should make it possible to adjust N without having to add any new variables.
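As a rough sketch (not a canonical spec), the threads example might look something like this in TLA+, folding the increment into the write step:

```
---- MODULE Threads ----
EXTENDS Integers
CONSTANT N
Threads == 1..N

\* counter is shared; tmp and pc are per-thread.
VARIABLES counter, tmp, pc
vars == <<counter, tmp, pc>>

Init ==
  /\ counter = 0
  /\ tmp = [t \in Threads |-> 0]
  /\ pc = [t \in Threads |-> "read"]

\* Read the shared counter into the thread-local tmp.
Read(t) ==
  /\ pc[t] = "read"
  /\ tmp' = [tmp EXCEPT ![t] = counter]
  /\ pc' = [pc EXCEPT ![t] = "write"]
  /\ UNCHANGED counter

\* Write tmp + 1 back to the shared counter.
Write(t) ==
  /\ pc[t] = "write"
  /\ counter' = tmp[t] + 1
  /\ pc' = [pc EXCEPT ![t] = "done"]
  /\ UNCHANGED tmp

Next == \E t \in Threads: Read(t) \/ Write(t)
Spec == Init /\ [][Next]_vars

\* The (wrong) expectation: once every thread is done, counter = N.
Correct == (\A t \in Threads: pc[t] = "done") => counter = N
====
```

Checking Correct as an invariant with N = 2 surfaces the clobber immediately: both threads read 0, then both write 1. (You'd run TLC with deadlock checking off, since every behavior terminates.)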
And it "naturally" introduces safety, liveness, and action properties. Finally, the thread spec is endlessly adaptable. I've used variations of it to teach refinement, resource starvation, fairness, livelocks, and hyperproperties. Tweak it a bit and you get dining philosophers. 3. Bounded buffer We have a bounded buffer with maximum length X. We have R reader and W writer processes. Before writing, writers first check if the buffer is full. If full, the writer goes to sleep. Otherwise, the writer wakes up a random sleeping process, then pushes an arbitrary value. Readers work the same way, except they pop from the buffer (and go to sleep if the buffer is empty). The only way for a sleeping process to wake up is if another process successfully performs a read or write. Why it's good This shows process-local nondeterminism (in choosing which sleeping process to wake up), different behavior for different types of processes, and deadlocks: it's possible for every reader and writer to be asleep at the same time. The beautiful thing about this example: the spec can only deadlock if X . This is the kind of bug you'd struggle to debug in real code. An in fact, people did struggle: even when presented with a minimal code sample and told there was a bug, many testing experts couldn't find it. Whereas a formal model of the same code finds the bug in seconds. If a spec language can model the bounded buffer, then it's good enough for production systems. On top of that, the bug happens regardless of what writers actually put in the buffer, so you can abstract that all away. This example can demonstrate that you can leave implementation details out of a spec and still find critical errors. Caveat This is all with a heavy TLA+ bias. I've modeled all of these systems in TLA+ and it works pretty well for them. That is to say, none of these do things TLA+ is bad at: reachability, subtyping, transitive closures, unbound spaces, etc. I imagine that as I cover more specification languages I'll find new Rosettas. Exercises are more compact, answers now show name of exercise in title "Conditionals" chapter has new section on nested conditionals "Crash course" chapter significantly rewritten Starting migrating to use consistently use == for equality and = for definition. Not everything is migrated yet "Beyond Logic" appendix does a slightly better job of covering HOL and constructive logic Addressed various reader feedback Two new exercises ↩ You can change the int size in a model run, so this is more "surprising footgun and inconvenience" than "fundamental limit of the specification language." Something still good to know! ↩
Happy new year everyone! I released the first Logic for Programmers alpha six months ago. There have been four new versions since then, with the November release putting us in beta. Between work and holidays I didn't make much progress in December, but there will be a 0.6 release in the next week or two.

People have asked me if the book will ever be available in print, and my answer to that is "when it's done". To keep "when it's done" from being "never", I'm committing myself to have the book finished by July. That means roughly six more releases between now and the official First Edition. Then I will start looking for a way to get it printed.

The Current State and What Needs to be Done

Right now the book is 26,000 words. For the most part, the structure is set: I don't plan to reorganize the chapters much. But I still need to fix shortcomings identified by the reader feedback. In particular, a few topics need more on real world applications, and the Alloy chapter is pretty weak. There's also a bunch of notes and todos and "fix this"s I need to go over. I also need to rewrite the introduction and predicate logic chapters. Those haven't changed much since 0.1 and I need to go over them very carefully.

After that comes copyediting.

Ugh, Copyediting

Copyediting means going through the entire book to make word- and sentence-level changes to the flow. An example would be changing

From: I said predicates are just "boolean functions". That isn't quite true.

To: It's easy to think of predicates as just "boolean" functions, but there is a subtle and important difference.

It's a tiny difference but it reads slightly better to me and makes the book slightly better. Now repeat that for all 3000-odd sentences in the book and I'm done with copyediting! For the first pass, anyway.

Copyediting is miserable. Some of the changes I need to make come from reader feedback, but most will come from going through it line-by-line with a copyeditor. Someone's kindly offered to do some of this for free, but I want to find a professional too. If you know anybody, let me know.

Formatting

The book, if I'm being honest, looks ugly. I'm using the default sphinx/latex combination for layout and typesetting. My thinking is it's not worth making the book pretty until it's worth reading. But I also want the book, when it's eventually printed, to look nice. At the very least it shouldn't have "self-published" vibes. I've found someone who's been giving me excellent advice on layout and I'm slowly mastering the LaTeX formatting arcana. It's gonna take a few iterations to get things right.

Front cover

Currently the front cover is this:

It works but gives "programmer spent ten minutes in Inkscape" vibes. I have a vision in my head for what would be nicer. A few people have recommended using Fiverr. So far the results haven't been that good.

Fixing Epub

Ugh. I thought making an epub version would be kinder for phone reading, but it's such a painful format to develop for. Did you know that epub backlinks work totally differently on kindle vs other ereaders? Did you know the only way to test if you got em working right is to load them up in a virtual kindle? The feedback loops are miserable. So I've been treating epub as a second-class citizen for now and only fixing the worst errors (like math not rendering properly), but that'll have to change as the book finalizes.

What comes next?

After 1.0, I get my book an ISBN and figure out how to make print copies.
The margin on print is way lower than on ebooks, especially if it's on-demand: the net royalties for Amazon direct publishing would be 7 dollars on a 20-dollar book (as opposed to Leanpub's 16 dollars). Would having a print version double the sales? I hope so! Either way, a lot of people have been asking about a print version so I want to make that possible. (I also want to figure out how to give people who already have the ebook a discount on print, but I don't know if that's feasible.)

Then, I dunno, maybe make a talk or a workshop I can pitch to conferences. Once I have that I think I can call LfP complete... at least until the second edition.

Anyway none of that is actually technical so here's a quick fun thing. I spent a good chunk of my break reading the mCRL2 book. mCRL2 defines an "algebra" for "communicating processes". As a very broad explanation, that's defining what it means to "add" and "multiply" two processes. What's interesting is that according to their definition, the algebra follows the distributive law, but only if you multiply on the right, eg

```
// VALID
(a+b)*c = a*c + b*c

// INVALID
a*(b+c) = a*b + a*c
```

(The intuition: in a*(b+c) the choice between b and c happens after a runs, while in a*b + a*c it happens before, and process algebras care about when choices are made.) This is the first time I've ever seen this in practice! Jury's still out on the rest of the language.

Videos and Stuff

My DDD Europe talk is now out! What We Know We Don't Know is about empirical software engineering in general, and software engineering research on Domain Driven Design in particular.

I was interviewed in the last video on Craft vs Cruft's "Year of Formal Methods". Check it out!
Channukah's next week and that means my favorite pastime, complaining about how Dreidel is a bad game. Last year I formally modeled it in PRISM to prove the game's not fun. But because I limited the model to only a small case, I couldn't prove the game was truly bad. It's time to finish the job.

The Story so far

You can read last year's newsletter here but here are the high-level notes.

The Game of Dreidel

Every player starts with N pieces (usually chocolate coins). This is usually 10-15 pieces per player. At the beginning of the game, and whenever the pot is empty, every player antes one coin into the pot. Turns consist of spinning the dreidel. Outcomes are:

- נ (Nun): nothing happens.
- ה (He): player takes half the pot, rounded up.
- ג (Gimmel): player takes the whole pot, everybody antes.
- ש (Shin): player adds one of their coins to the pot.

If a player ever has zero coins, they are eliminated. Play continues until only one player remains. If you don't have a dreidel, you can instead use a four-sided die, but for the authentic experience you should wait eight seconds before looking at your roll.

PRISM

PRISM is a probabilistic modeling language, meaning you can encode a system with random chances of doing things and it can answer questions like "on average, how many spins does it take before one player loses" (64, for 4 players/10 coins) and "what's more likely to knock the first player out, shin or ante" (ante is 2.4x more likely). You can see last year's model here. The problem with PRISM is that it is absurdly inexpressive: it's a thin abstraction for writing giant stochastic matrices and lacks basic affordances like lists or functions. I had to hardcode every possible roll for every player.

This meant last year's model had two limits. First, it only handles four players, and I would have to write a new model for three or five players. Second, I made the game end as soon as one player lost:

```
formula done = (p1=0) | (p2=0) | (p3=0) | (p4=0);
```

To fix both of these things, I thought I'd have to treat PRISM as a compilation target, writing a program that took a player count and output the corresponding model. But then December got super busy and I ran out of time to write a program. Instead, I stuck with four hardcoded players and extended the old model to run until victory.

The new model

These are all changes to last year's model.

First, instead of running until one player is out of money, we run until three players are out of money.

```
- formula done = (p1=0) | (p2=0) | (p3=0) | (p4=0);
+ formula done =
+   ((p1=0) & (p2=0) & (p3=0)) |
+   ((p1=0) & (p2=0) & (p4=0)) |
+   ((p1=0) & (p3=0) & (p4=0)) |
+   ((p2=0) & (p3=0) & (p4=0));
```

Next, we change the ante formula. Instead of adding four coins to the pot and subtracting a coin from each player, we add one coin for each player left. min(p1, 1) is 1 if player 1 is still in the game, and 0 otherwise.

```
+ formula ante_left = min(p1, 1) + min(p2, 1) + min(p3, 1) + min(p4, 1);
```

We also have to make sure anteing doesn't end a player with negative money.

```
- [ante] (pot = 0) & !done -> (pot'=pot+4) & (p1' = p1-1) & (p2' = p2-1) & (p3' = p3-1) & (p4' = p4-1);
+ [ante] (pot = 0) & !done -> (pot'=pot+ante_left) & (p1' = max(p1-1, 0)) & (p2' = max(p2-1, 0)) & (p3' = max(p3-1, 0)) & (p4' = max(p4-1, 0));
```

Finally, we have to add logic for a player being "out". Instead of moving to the next player after each turn, we move to the next player still in the game.
Also, if someone starts their turn without any coins (f.ex if they just anted their last coin), we just skip their turn.

```
+ formula p1n = (p2 > 0 ? 2 : p3 > 0 ? 3 : 4);
+ [lost] ((pot != 0) & !done & (turn = 1) & (p1 = 0)) -> (turn' = p1n);

- [spin] ((pot != 0) & !done & (turn = 1)) ->
+ [spin] ((pot != 0) & !done & (turn = 1) & (p1 != 0)) ->
    0.25: (p1' = p1-1)
        & (pot' = min(pot+1, maxval))
-       & (turn' = 2) //shin
+       & (turn' = p1n) //shin
```

We make similar changes for all of the other players. You can see the final model here.

Querying the model

So now we have a full game of Dreidel that runs until the game ends. And now, finally, we can see the average number of spins a 4 player game will last.

```
./prism dreidel.prism -const M=10 -pf 'R=? [F done]'
```

In English: each player starts with ten coins. R=? means "expected value of the 'reward'", where 'reward' in this case means number of spins. [F done] weights the reward over all behaviors that reach ("Finally") the done state.

```
Result: 760.5607582661091
Time for model checking: 384.17 seconds.
```

So there's the number: 760 spins.1 At 8 seconds a spin, that's almost two hours for one game.

…Jesus, look at that runtime. Six minutes to test one query. PRISM has over a hundred settings that affect model checking, with descriptions like "Pareto curve threshold" and "Use Backwards Pseudo SOR". After looking through them all, I found this perfect combination of configurations that gets the runtime to a more manageable level:

```
  ./prism dreidel.prism -const M=10 -pf 'R=? [F done]'
+   -heuristic speed
```

```
Result: 760.816255997373
Time for model checking: 13.44 seconds.
```

Yes, that's a literal "make it faster" flag.

Anyway, that's only the "average" number of spins, weighted across all games. Dreidel has a very long tail. To find that out, we'll use a variation on our query:

```
const C0;
P=? [F<=C0 done]
```

P=? is the Probability something happens. F<=C0 means we Finally reach the done state in at most C0 steps. By passing in different values of C0 we can get a sense of how long a game takes. Since "steps" includes passes and antes, this will overestimate the length of the game. But antes take time too and it should only "pass" on a player once per player, so this should still be a good metric for game length.

```
./prism dreidel.prism -const M=10 -const C0=1000:1000:5000 -pf 'const C0; P=? [F<=C0 done]'
```

A full 10% of games don't finish in 2000 steps, and 2% pass the 3000 step barrier. At 8 seconds a roll/ante, 3000 steps is over six hours.

Dreidel is a bad game.

More fun properties

As a sanity check, let's confirm last year's result, that it takes an average of 64ish spins before one player is out. In that model, we just needed to get the total reward. Now we instead want to get the reward until the first state where any of the players have zero coins.2

```
./prism dreidel.prism -const M=10 -pf 'R=? [F (p1=0 | p2=0 | p3=0 | p4=0)]' -heuristic speed

Result: 63.71310116083396
Time for model checking: 2.017 seconds.
```

Yep, looks good. With our new model we can also get the average point where two players are out and two players are left. PRISM's lack of abstraction makes expressing the condition directly a little painful, but we can cheat and look for the first state where ante_left <= 2.3

```
./prism dreidel.prism -const M=10 -pf 'R=? [F (ante_left <= 2)]'
```

It takes twice as long to eliminate the second player as it takes to eliminate the first, and the remaining two players have to go for another 600 spins.

Dreidel is a bad game.

The future

There's two things I want to do next with this model.
The first is script up something that can generate the PRISM model for me, so I can easily adjust the number of players to 3 or 5. The second is that PRISM has a filter-query feature I don't understand but I think it could be used for things like "if a player gets 75% of the pot, what's the probability they lose anyway". Otherwise you have to write wonky queries like (P=? [F p1 = 30 & (F p1 = 0)]) / (P=? [F p1 = 0]).4 But I'm out of time again, so this saga will have to conclude next year. I'm also faced with the terrible revelation that I might be the biggest non-academic user of PRISM.

Logic for Programmers Khanukah Sale

Still going on! You can get LFP for 40% off here from now until the end of Xannukkah (Jan 2).5

I'm in the Raku Advent Calendar!

My piece is called counting up concurrencies. It's about using Raku to do some combinatorics! Read the rest of the blog too, it's great.

1. This is different from the original anti-Dreidel article: Ben got 860 spins. That's the average spins if you round down on He, not up. Rounding up on He leads to a shorter game because it means He can empty the pot, which means more antes, and antes are what knocks most players out. ↩
2. PRISM calls this "co-safe LTL reward" and does not explain what that means, nor do most of the papers I found referencing "co-safe LTL". Eventually I found one that defined it as "any property that only uses X, U, F". ↩
3. Here's the exact point where I realize I could have defined done as ante_left = 1. Also checking for F (ante_left = 2) gives an expected number of spins as "infinity". I have no idea why. ↩
4. 10% chances at 4 players / 10 coins. And it takes a minute even with fast mode enabled. ↩
5. This joke was funnier before I made the whole newsletter about Chanukahh. ↩
More in programming
AGI is coming—whether we’re ready or not. I’ve been convinced of this trajectory since GPT-3’s release, but recent developments have significantly accelerated my timelines. The first major shift was OpenAI’s breakthrough in test-time compute and its newly demonstrated scaling law.
AI Updates There is a lot of chatter about 2025 being the year of agentic frameworks. To me, this means a system in which a subset can allow AI models to take independent actions based on their environment, typically interacting with external APIs or interfaces. The terminology around this concept is still evolving, and definitions […]
This blog post is another one in the 'writing things down to structure my thinking on where I want my career to go' series. I will get back to writing technical and automation blog posts soon, but I need to finish my contract testing course first.

One of the things I like to do most in life is traveling and seeing new places. Well, seeing new places, mostly, as the novelty of waiting, flying and staying in hotel rooms has definitely worn off by now. I am in the privileged position (really, that is what it is: I'm privileged, and I fully realize that) that I get to scratch this travel itch professionally on a regular basis these days.

Over the last few years, I have been invited to contribute to meetups and conferences abroad, and I also get to run in-house training sessions with companies outside the Netherlands a couple of times per year. Most of this traveling takes place within Europe, but for the last three years, I have been able to travel outside of Europe once every year (South Africa in 2022, Canada in 2023 and the United States in 2024), and needless to say I have enjoyed those opportunities very much.

To give you an idea of the amount of traveling I do: for 2025, I now have four work-related trips abroad scheduled, and I am pretty sure at least a few more will be added to that before the year ends (it's only just February…). That might not be much travel by some people's standards, but for me, it is. And it seems the number of opportunities I get for traveling increases year over year, to the point where I have to say 'no' to several of these opportunities.

Say no? Why? I thought you just said you loved to travel?

Yes, that's true. I do love to travel. But I also love spending time at home with my family, and that comes first. Always. Now, my sons are getting older, and being away from home for a few days doesn't put as much pressure on them and on my wife as it did a few years ago. Still, I always need to find a balance between spending time with them and spending time at work.

I am away from home for work not just when I'm abroad. I run evening training sessions with clients here in the Netherlands on a regular basis, too, as well as training sessions in my evenings for clients in different time zones, mainly US-based clients. And all that adds up. I try to only be away from home one night per week, but often, it's two. When I travel abroad, it's even more than that. Again, I'm not complaining. Not at all. It is an absolute privilege to get to travel for work and get paid to do that, but I cannot do that indefinitely, and that's why I have made a decision:

With a few exceptions (more on those below), I am going to say 'no' to conferences abroad from now on.

This is a tough decision for me to make, but sometimes that's exactly what you need to do. Tough, because I have very fond memories of all the conferences and meetups abroad I have contributed to. My first one, Romanian Testing Conference in 2017. My first keynote abroad, UKStar in 2019. My first one outside of Europe, Targeting Quality in 2023. They were all amazing, because of the travel and sightseeing (when time allowed), but also because of all the people I have met at these conferences.

Yet, I can meet at least some of these people at conferences here in the Netherlands, too. Test Automation Days, the TestNet events, the Dutch Testing Day and TestMass all provide a great opportunity for me to catch up with my network. Sometimes, international conferences come to the Netherlands, too, like AutomationSTAR this year.
And then there are plenty of smaller meetups here in the Netherlands (and Belgium) where I can meet and catch up with people as well.

Plus, the money. I am not going to be a hypocrite and say that money doesn't play into this. For the reasons mentioned above, I have a limited number of opportunities to travel every year, and I prefer to spend those on in-house training sessions with clients abroad, simply because the pay is much better. Even when a conference compensates flights and hotel (as they should) and offers a speaker or workshop facilitator fee (a nice bonus), it will be significantly less of a payday than when I run a training session with a client. That's not the fault of those conferences, not at all, especially when they're compensating their speakers fairly, but this is simply a matter of numbers and budgets.

At the moment, I have one, maybe two contributions to conferences abroad coming up, and I gave them my word, so I'll be there. That's the SAST 30-year anniversary conference in October, plus one other conference that I'm talking to but haven't received a 'yes' or 'no' from yet. Other than that, if conferences reach out to me, it's likely to be a 'no' from now on, unless:

- the event pays a fee comparable to my rate for in-house training
- I can combine the event with paid in-house training (for example with a sponsor)
- it is a country or region I really, really want to visit, either for personal reasons or because I want to grow my professional network there

I don't see the first one happening soon, and the list of destinations for the third one is very short (Norway, Canada, New Zealand, that's pretty much it), so unless we can arrange paid in-house training alongside the conference, the answer will be a 'no' from me.

Will this reduce the number of travel opportunities for me? Maybe. Maybe not. Again, I see the number of requests I get for in-house training abroad growing, too, and if that dies down, it'll be a sign for me that I'll have to work harder to create those opportunities. For 2025, things are looking pretty good, with trips for training to Romania, North Macedonia and Denmark already scheduled, and several leads for more in the pipeline. And if the number of opportunities does go down, that's fine, too. I'm happy to spend that time with family, working on other things, or riding my bike. And I'm sure there will be a few opportunities to speak at online meetups, events and webinars, too.
This post covers why companies are considering reincorporating from Delaware to Nevada & Texas
A few weeks ago I ran a terminal survey (you can read the results here) and at the end I asked:

What's the most frustrating thing about using the terminal for you?

1600 people answered, and I decided to spend a few days categorizing all the responses. Along the way I learned that classifying qualitative data is not easy but I gave it my best shot. I ended up building a custom tool to make it faster to categorize everything.

As with all of my surveys the methodology isn't particularly scientific. I just posted the survey to Mastodon and Twitter, ran it for a couple of days, and got answers from whoever happened to see it and felt like responding.

Here are the top categories of frustrations! I think it's worth keeping in mind while reading these comments that

- 40% of people answering this survey have been using the terminal for 21+ years
- 95% of people answering the survey have been using the terminal for at least 4 years

These comments aren't coming from total beginners. Here are the categories of frustrations! The number in brackets is the number of people with that frustration. Honestly I don't know how interesting this is to other people – I'm just writing this up for myself because I'm trying to write a zine about the terminal and I wanted to get a sense for what people are having trouble with.

remembering syntax (115)

People talked about struggles remembering:

- the syntax for CLI tools like awk, jq, sed, etc
- the syntax for redirects
- keyboard shortcuts for tmux, text editing, etc

One example comment:

There are just so many little "trivia" details to remember for full functionality. Even after all these years I'll sometimes forget where it's 2 or 1 for stderr, or forget which is which for > and >>.

switching terminals is hard (91)

People talked about struggling with switching systems (for example home/work computer or when SSHing) and running into:

- OS differences in keyboard shortcuts (like Linux vs Mac)
- systems which don't have their preferred text editor ("no vim" or "only vim")
- different versions of the same command (like Mac OS grep vs GNU grep)
- no tab completion
- a shell they aren't used to ("the subtle differences between zsh and bash")

as well as differences inside the same system, like pagers not being consistent with each other (git diff pagers, other pagers). One example comment:

I got used to fish and vi mode which are not available when I ssh into servers, containers.

color (85)

Lots of problems with color, like:

- programs setting colors that are unreadable with a light background color
- finding a colorscheme they like (and getting it to work consistently across different apps)
- color not working inside several layers of SSH/tmux/etc
- not liking the defaults
- not wanting color at all and struggling to turn it off

This comment felt relatable to me:

Getting my terminal theme configured in a reasonable way between the terminal emulator and fish (I did this years ago and remember it being tedious and fiddly and now feel like I'm locked into my current theme because it works and I dread touching any of that configuration ever again).

keyboard shortcuts (84)

Half of the comments on keyboard shortcuts were about how on Linux/Windows, the keyboard shortcut to copy/paste in the terminal is different from in the rest of the OS.
Some other issues with keyboard shortcuts other than copy/paste:

- using Ctrl-W in a browser-based terminal and closing the window
- the terminal only supports a limited set of keyboard shortcuts (no Ctrl-Shift-, no Super, no Hyper, lots of ctrl- shortcuts aren't possible like Ctrl-,)
- the OS stopping you from using a terminal keyboard shortcut (like by default Mac OS uses Ctrl+left arrow for something else)
- issues using emacs in the terminal
- backspace not working (2)

other copy and paste issues (75)

Aside from "the keyboard shortcut for copy and paste is different", there were a lot of OTHER issues with copy and paste, like:

- copying over SSH
- how tmux and the terminal emulator both do copy/paste in different ways
- dealing with many different clipboards (system clipboard, vim clipboard, the "middle click" clipboard on Linux, tmux's clipboard, etc) and potentially synchronizing them
- random spaces added when copying from the terminal
- pasting multiline commands which automatically get run in a terrifying way
- wanting a way to copy text without using the mouse

discoverability (55)

There were lots of comments about this, which all came down to the same basic complaint – it's hard to discover useful tools or features! This comment kind of summed it all up:

How difficult it is to learn independently. Most of what I know is an assorted collection of stuff I've been told by random people over the years.

steep learning curve (44)

A lot of comments about it generally having a steep learning curve. A couple of example comments:

After 15 years of using it, I'm not much faster using it than I was 5 or maybe even 10 years ago.

and

That I know I could make my life easier by learning more about the shortcuts and commands and configuring the terminal but I don't spend the time because it feels overwhelming.

history (42)

Some issues with shell history:

- history not being shared between terminal tabs (16)
- limits that are too short (4)
- history not being restored when terminal tabs are restored
- losing history because the terminal crashed
- not knowing how to search history

One example comment:

It wasted a lot of time until I figured it out and still annoys me that "history" on zsh has such a small buffer; I have to type "history 0" to get any useful length of history.

bad documentation (37)

People talked about:

- documentation being generally opaque
- lack of examples in man pages
- programs which don't have man pages

Here's a representative comment:

Finding good examples and docs. Man pages often not enough, have to wade through stack overflow

scrollback (36)

A few issues with scrollback:

- programs printing out too much data making you lose scrollback history
- resizing the terminal messes up the scrollback
- lack of timestamps
- GUI programs that you start in the background printing stuff out that gets in the way of other programs' outputs

One example comment:

When resizing the terminal (in particular: making it narrower) leads to broken rewrapping of the scrollback content because the commands formatted their output based on the terminal window width.

"it feels outdated" (33)

Lots of comments about how the terminal feels hampered by legacy decisions and how users often end up needing to learn implementation details that feel very esoteric. One example comment:

Most of the legacy cruft, it would be great to have a green field implementation of the CLI interface.

shell scripting (32)

Lots of complaints about POSIX shell scripting.
There's a general feeling that shell scripting is difficult but also that switching to a different, less standard scripting language (fish, nushell, etc) brings its own problems.

Shell scripting. My tolerance to ditch a shell script and go to a scripting language is pretty low. It's just too messy and powerful. Screwing up can be costly so I don't even bother.

more issues

Some more issues that were mentioned at least 10 times:

- (31) inconsistent command line arguments: is it -h or help or --help?
- (24) keeping dotfiles in sync across different systems
- (23) performance (e.g. "my shell takes too long to start")
- (20) window management (potentially with some combination of tmux tabs, terminal tabs, and multiple terminal windows. Where did that shell session go?)
- (17) generally feeling scared/uneasy ("The debilitating fear that I'm going to do some mysterious Bad Thing with a command and I will have absolutely no idea how to fix or undo it or even really figure out what happened")
- (16) terminfo issues ("Having to learn about terminfo if/when I try a new terminal emulator and ssh elsewhere.")
- (16) lack of image support (sixel etc)
- (15) SSH issues (like having to start over when you lose the SSH connection)
- (15) various tmux/screen issues (for example lack of integration between tmux and the terminal emulator)
- (15) typos & slow typing
- (13) the terminal getting messed up for various reasons (pressing Ctrl-S, cat-ing a binary, etc)

that's all!

I'm not going to make a lot of commentary on these results, but here are a couple of categories that feel related to me:

- remembering syntax & history (often the thing you need to remember is something you've run before!)
- discoverability & the learning curve (the lack of discoverability is definitely a big part of what makes it hard to learn)