More from Computer Things
I occasionally receive emails asking me to look at the writer's new language/library/tool. Sometimes it's in an area I know well, like formal methods. Other times, I'm a complete stranger to the field. Regardless, I'm generally happy to check it out.

When starting out, this is the biggest question I'm looking to answer: What does this technology make easy that's normally hard? What justifies me learning and migrating to a new thing as opposed to fighting through my problems with the tools I already know?

The new thing has to have some sort of value proposition, which could be something like "better performance" or "more secure". The most universal value, and the most direct to show, is "takes less time and mental effort to do something". I can't accurately judge two benchmarks, but I can see two demos or code samples and compare which one feels easier to me.

Examples

Functional programming

What drew me originally to functional programming was higher order functions.

# Without HOFs
out = []
for x in input {
    if test(x) {
        out.append(x)
    }
}

# With HOFs
filter(test, input)

We can also compare the easiness of various tasks between examples within the same paradigm. If I know FP via Clojure, what could be appealing about Haskell or F#? For one, null safety is a lot easier when I've got option types.

Array Programming

Array programming languages like APL or J make certain classes of computation easier. For example, finding all of the indices where two arrays match. Here it is in Python:

x = [1, 4, 2, 3, 4, 1, 0, 0, 0, 4]
y = [2, 3, 1, 1, 2, 3, 2, 0, 2, 4]

>>> [i for i, (a, b) in enumerate(zip(x, y)) if a == b]
[7, 9]

And here it is in J:

x =: 1 4 2 3 4 1 0 0 0 4
y =: 2 3 1 1 2 3 2 0 2 4

I. x = y
7 9

Not every tool is meant for every programmer, because you might not have any of the problems a tool makes easier. What comes up more often for you: filtering a list or finding all the indices where two lists match? Statistically speaking, functional programming is more useful to you than array programming. But I have this problem often enough to justify learning array programming.

LLMs

I think a lot of the appeal of LLMs is that they make a lot of specialist tasks easy for nonspecialists. One thing I recently did was convert some rst list tables to csv tables. Normally I'd have to write some tricky parsing and serialization code to automatically convert between the two. With LLMs, it's just

    Convert the following rst list-table into a csv-table: [table]

"Easy" can trump "correct" as a value. The LLM might get some translations wrong, but it's so convenient I'd rather manually review all the translations for errors than write a specialized script that is correct 100% of the time.

Let's not take this too far

A college friend once claimed that he cracked the secret of human behavior: humans do whatever makes them happiest. "What about the martyr who dies for their beliefs?" "Well, in their last second of life they get REALLY happy."

We can do the same here, fitting every value proposition into the frame of "easy". CUDA makes it easier to do matrix multiplication. Rust makes it easier to write low-level code without memory bugs. TLA+ makes it easier to find errors in your design. Monads make it easier to sequence computations in a lazy environment. Making everything about "easy" obscures other reasons for adopting new things.

That whole "simple vs easy" thing

Sometimes people think that "simple" is better than "easy", because "simple" is objective and "easy" is subjective.
This comes from the famous talk Simple Made Easy. I'm not sure I agree that simple is better or more objective: the speaker claims that polymorphism and typeclasses are "simpler" than conditionals, and I doubt everybody would agree with that. The problem is that "simple" is used to mean both "not complicated" and "not complex". And everybody agrees that "complicated" and "complex" are different, even if they can't agree what the difference is. This idea should probably be expanded into its own newsletter.

It's also a lot harder to pitch a technology on being "simpler". Simplicity by itself doesn't make a tool better equipped to solve problems. Simplicity can unlock other benefits, like compositionality or tractability, that provide the actual value. And often that value is in the form of "makes some tasks easier".
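(Back to the LLM example for a second: to give a sense of what "tricky parsing and serialization code" means there, here's a rough sketch of the kind of converter I'd otherwise write by hand. It's my own illustration, and it only handles flat tables with single-line cells.)

import re

def list_table_to_csv_table(rst: str) -> str:
    """Convert a simple rst list-table directive into a csv-table directive."""
    out, rows, row = [], [], None
    for line in rst.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(".. list-table::"):
            out.append(line.replace("list-table", "csv-table"))
        elif re.match(r":\S+:", stripped) or not stripped:
            if row is None:                      # options like :header-rows: pass through
                out.append(line)
        elif stripped.startswith("* -"):         # "* - cell" starts a new row
            if row is not None:
                rows.append(row)
            row = [stripped.split("-", 1)[1].strip()]
        elif stripped.startswith("-"):           # "  - cell" continues the current row
            row.append(stripped.split("-", 1)[1].strip())
    if row is not None:
        rows.append(row)
    out += ['   ' + ", ".join(f'"{cell}"' for cell in r) for r in rows]
    return "\n".join(out)

Even this only covers the happy path (no multi-line cells, no nested markup), which is exactly why "paste it into a prompt and review the output" is so appealing.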
First of all, I just released version 0.6 of Logic for Programmers! You can get it here. Release notes in the footnote.1

I've been thinking about my next project after the book's done. One idea is to do a survey of new formal specification languages. There have been a lot of new ones in the past few years (P, Quint, etc), plus some old ones I haven't critically examined (SPIN, mCRL2). I'm thinking of a brief overview of each, what's interesting about it, and some examples of the corresponding models.

For this I'd want a set of "Rosetta" examples. Rosetta Code is a collection of programming tasks done in different languages. For example, "99 bottles of beer on the wall" in over 300 languages. If I wanted to make a Rosetta Code for specifications of concurrent systems, what examples would I use?

What makes a good Rosetta example?

A good Rosetta example would be simple enough to understand and implement but also showcase the differences between the languages. A good example of a Rosetta example is leftpad for code verification. Proving leftpad correct is short in whatever verification language you use. But the proofs themselves are different enough that you can compare what it's like to use code contracts vs dependent types, etc.

A bad Rosetta example is "hello world". While it's good for showing how to run a language, it doesn't clearly differentiate languages. Haskell's "hello world" is almost identical to BASIC's "hello world".

Rosetta examples don't have to be flashy, but I want mine to be flashy. Formal specification is niche enough that regardless of my medium, most of my audience hasn't used it and may be skeptical. I always have to be selling. This biases me away from using things like dining philosophers or two-phase commit.

So with that in mind, three ideas:

1. Wrapped Counter

A counter that starts at 1 and counts to N, after which it wraps around to 1 again.

Why it's good

This is a good introductory formal specification: it's a minimal possible stateful system without concurrency or nondeterminism. You can use it to talk about the basic structure of a spec, how a verifier works, etc. It's also a good way of introducing "boring" semantics, like conditionals and arithmetic, and checking if the language does anything unusual with them. Alloy, for example, defaults to 4-bit signed integers, so you run into problems if you set N too high.2

At the same time, wrapped counters are a common building block of complex systems. Lots of things can be represented this way: N=2 is a flag or blinker, N=3 is a traffic light, N=24 is a clock, etc. The next example is better for showing basic safety and liveness properties, but this will do in a pinch.

2. Threads

A counter starts at 0. N threads each simultaneously try to update the counter. They do this nonatomically: first they read the value of the counter and store that in a thread-local tmp, then they increment tmp, then they set the counter to tmp. The expected behavior is that the final value of the counter will be N.

Why it's good

The system as described is bugged. If two threads interleave the setlocal commands, one thread's update can "clobber" the other's and the counter can go backwards. To my surprise, most people do not see this error. So it's a good showcase of how the language actually finds real bugs, and how it can verify fixes.

As to actual language topics: the spec covers concurrency and tracks process-local state. A good spec language should make it possible to adjust N without having to add any new variables.
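For intuition, here's a quick illustration in plain Python (my own sketch, not a spec): enumerate every interleaving of two threads' read/increment/write steps and collect the possible final counter values.

from itertools import permutations

def run(schedule):
    """Execute one interleaving; each occurrence of a thread's name means it takes its next step."""
    counter = 0
    tmp = {}                               # each thread's local copy of the counter
    pc = {t: 0 for t in set(schedule)}     # which of its three steps each thread is on
    for t in schedule:
        if pc[t] == 0:
            tmp[t] = counter               # step 1: read the shared counter
        elif pc[t] == 1:
            tmp[t] += 1                    # step 2: increment the local copy
        else:
            counter = tmp[t]               # step 3: write the local copy back
        pc[t] += 1
    return counter

# Every interleaving of two threads, each taking its three steps in order.
schedules = set(permutations(["A"] * 3 + ["B"] * 3))
print(sorted({run(s) for s in schedules}))  # [1, 2] -- a final value of 1 is the clobbered update

A model checker does essentially this enumeration for you, over every reachable state of the spec, and checks your properties against each one.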
And it "naturally" introduces safety, liveness, and action properties. Finally, the thread spec is endlessly adaptable. I've used variations of it to teach refinement, resource starvation, fairness, livelocks, and hyperproperties. Tweak it a bit and you get dining philosophers. 3. Bounded buffer We have a bounded buffer with maximum length X. We have R reader and W writer processes. Before writing, writers first check if the buffer is full. If full, the writer goes to sleep. Otherwise, the writer wakes up a random sleeping process, then pushes an arbitrary value. Readers work the same way, except they pop from the buffer (and go to sleep if the buffer is empty). The only way for a sleeping process to wake up is if another process successfully performs a read or write. Why it's good This shows process-local nondeterminism (in choosing which sleeping process to wake up), different behavior for different types of processes, and deadlocks: it's possible for every reader and writer to be asleep at the same time. The beautiful thing about this example: the spec can only deadlock if X . This is the kind of bug you'd struggle to debug in real code. An in fact, people did struggle: even when presented with a minimal code sample and told there was a bug, many testing experts couldn't find it. Whereas a formal model of the same code finds the bug in seconds. If a spec language can model the bounded buffer, then it's good enough for production systems. On top of that, the bug happens regardless of what writers actually put in the buffer, so you can abstract that all away. This example can demonstrate that you can leave implementation details out of a spec and still find critical errors. Caveat This is all with a heavy TLA+ bias. I've modeled all of these systems in TLA+ and it works pretty well for them. That is to say, none of these do things TLA+ is bad at: reachability, subtyping, transitive closures, unbound spaces, etc. I imagine that as I cover more specification languages I'll find new Rosettas. Exercises are more compact, answers now show name of exercise in title "Conditionals" chapter has new section on nested conditionals "Crash course" chapter significantly rewritten Starting migrating to use consistently use == for equality and = for definition. Not everything is migrated yet "Beyond Logic" appendix does a slightly better job of covering HOL and constructive logic Addressed various reader feedback Two new exercises ↩ You can change the int size in a model run, so this is more "surprising footgun and inconvenience" than "fundamental limit of the specification language." Something still good to know! ↩
Happy new year everyone! I released the first Logic for Programmers alpha six months ago. There have been four new versions since then, with the November release putting us in beta. Between work and holidays I didn't make much progress in December, but there will be a 0.6 release in the next week or two.

People have asked me if the book will ever be available in print, and my answer to that is "when it's done". To keep "when it's done" from being "never", I'm committing myself to have the book finished by July. That means roughly six more releases between now and the official First Edition. Then I will start looking for a way to get it printed.

The Current State and What Needs to be Done

Right now the book is 26,000 words. For the most part, the structure is set: I don't plan to reorganize the chapters much. But I still need to fix shortcomings identified by the reader feedback. In particular, a few topics need more on real-world applications, and the Alloy chapter is pretty weak. There's also a bunch of notes and todos and "fix this"s I need to go over. I also need to rewrite the introduction and predicate logic chapters. Those haven't changed much since 0.1 and I need to go over them very carefully. After that comes copyediting.

Ugh, Copyediting

Copyediting means going through the entire book to make word- and sentence-level changes to the flow. An example would be changing

    From: I said predicates are just "boolean functions". That isn't quite true.
    To:   It's easy to think of predicates as just "boolean" functions, but there is a subtle and important difference.

It's a tiny difference but it reads slightly better to me and makes the book slightly better. Now repeat that for all 3000-odd sentences in the book and I'm done with copyediting! For the first pass, anyway.

Copyediting is miserable. Some of the changes I need to make come from reader feedback, but most will come from going through it line-by-line with a copyeditor. Someone's kindly offered to do some of this for free, but I want to find a professional too. If you know anybody, let me know.

Formatting

The book, if I'm being honest, looks ugly. I'm using the default sphinx/latex combination for layout and typesetting. My thinking is it's not worth making the book pretty until it's worth reading. But I also want the book, when it's eventually printed, to look nice. At the very least it shouldn't have "self-published" vibes. I've found someone who's been giving me excellent advice on layout and I'm slowly mastering the LaTeX formatting arcana. It's gonna take a few iterations to get things right.

Front cover

Currently the front cover is this:

[current front cover image]

It works but gives "programmer spent ten minutes in Inkscape" vibes. I have a vision in my head for what would be nicer. A few people have recommended using Fiverr. So far the results haven't been that good.

Fixing Epub

Ugh. I thought making an epub version would be kinder for phone reading, but it's such a painful format to develop for. Did you know that epub backlinks work totally differently on Kindle vs other ereaders? Did you know the only way to test if you got em working right is to load them up in a virtual Kindle? The feedback loops are miserable. So I've been treating epub as a second-class citizen for now and only fixing the worst errors (like math not rendering properly), but that'll have to change as the book finalizes.

What comes next?

After 1.0, I get my book an ISBN and figure out how to make print copies.
The margin on print is way lower than on ebooks, especially if it's on-demand: the net royalties for Amazon direct publishing would be 7 dollars on a 20-dollar book (as opposed to Leanpub's 16 dollars). Would having a print version double the sales? I hope so! Either way, a lot of people have been asking about a print version, so I want to make that possible. (I also want to figure out how to give people who already have the ebook a discount on print, but I don't know if that's feasible.)

Then, I dunno, maybe make a talk or a workshop I can pitch to conferences. Once I have that I think I can call LfP complete... at least until the second edition.

Anyway none of that is actually technical so here's a quick fun thing. I spent a good chunk of my break reading the mCRL2 book. mCRL2 defines an "algebra" for "communicating processes". As a very broad explanation, that's defining what it means to "add" and "multiply" two processes. What's interesting is that according to their definition, the algebra follows the distributive law, but only if you multiply on the right. E.g.:

// VALID
(a+b)*c = a*c + b*c

// INVALID
a*(b+c) = a*b + a*c

This is the first time I've ever seen this in practice! Jury's still out on the rest of the language.

Videos and Stuff

My DDD Europe talk is now out! What We Know We Don't Know is about empirical software engineering in general, and software engineering research on Domain Driven Design in particular. I was interviewed in the last video on Craft vs Cruft's "Year of Formal Methods". Check it out!
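One more note on that mCRL2 aside: here's a toy Python sketch (my own illustration, nothing to do with mCRL2's actual tooling) of why right-distribution is fine but left-distribution isn't. It models a process as its outgoing (action, next-process) transitions and checks strong bisimilarity by recursion.

STOP = ()                          # the terminated process: no transitions

def act(a):                        # a single action, then stop
    return ((a, STOP),)

def seq(p, q):                     # p * q : run p to completion, then q
    if p == STOP:
        return q
    return tuple((a, seq(rest, q)) for a, rest in p)

def alt(p, q):                     # p + q : offer both sets of transitions
    return p + q

def bisimilar(p, q):               # strong bisimilarity, by recursion on finite trees
    return (all(any(a == b and bisimilar(p2, q2) for b, q2 in q) for a, p2 in p) and
            all(any(a == b and bisimilar(p2, q2) for b, p2 in p) for a, q2 in q))

a, b, c = act("a"), act("b"), act("c")
print(bisimilar(seq(alt(a, b), c), alt(seq(a, c), seq(b, c))))   # True:  (a+b)*c is bisimilar to a*c + b*c
print(bisimilar(seq(a, alt(b, c)), alt(seq(a, b), seq(a, c))))   # False: a*(b+c) is NOT bisimilar to a*b + a*c

The intuition: in a*(b+c) the choice between b and c is still open after a happens, while in a*b + a*c the choice is made up front, and bisimilarity can tell those apart.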
Channukah's next week and that means my favorite pastime, complaining about how Dreidel is a bad game. Last year I formally modeled it in PRISM to prove the game's not fun. But because I limited the model to only a small case, I couldn't prove the game was truly bad. It's time to finish the job.

The Story so far

You can read last year's newsletter here, but here are the high-level notes.

The Game of Dreidel

- Every player starts with N pieces (usually chocolate coins). This is usually 10-15 pieces per player.
- At the beginning of the game, and whenever the pot is empty, every player antes one coin into the pot.
- Turns consist of spinning the dreidel. Outcomes are:
  - נ (Nun): nothing happens.
  - ה (He): player takes half the pot, rounded up.
  - ג (Gimmel): player takes the whole pot, everybody antes.
  - ש (Shin): player adds one of their coins to the pot.
- If a player ever has zero coins, they are eliminated. Play continues until only one player remains.
- If you don't have a dreidel, you can instead use a four-sided die, but for the authentic experience you should wait eight seconds before looking at your roll.

PRISM

PRISM is a probabilistic modeling language, meaning you can encode a system with random chances of doing things and it can answer questions like "on average, how many spins does it take before one player loses" (64, for 4 players/10 coins) and "what's more likely to knock the first player out, shin or ante" (ante is 2.4x more likely). You can see last year's model here.

The problem with PRISM is that it is absurdly inexpressive: it's a thin abstraction for writing giant stochastic matrices and lacks basic affordances like lists or functions. I had to hardcode every possible roll for every player. This meant last year's model had two limits. First, it only handles four players, and I would have to write a new model for three or five players. Second, I made the game end as soon as one player lost:

formula done = (p1=0) | (p2=0) | (p3=0) | (p4=0);

To fix both of these things, I thought I'd have to treat PRISM as a compilation target, writing a program that took a player count and output the corresponding model. But then December got super busy and I ran out of time to write a program. Instead, I stuck with four hardcoded players and extended the old model to run until victory.

The new model

These are all changes to last year's model. First, instead of running until one player is out of money, we run until three players are out of money.

- formula done = (p1=0) | (p2=0) | (p3=0) | (p4=0);
+ formula done =
+   ((p1=0) & (p2=0) & (p3=0)) |
+   ((p1=0) & (p2=0) & (p4=0)) |
+   ((p1=0) & (p3=0) & (p4=0)) |
+   ((p2=0) & (p3=0) & (p4=0));

Next, we change the ante formula. Instead of adding four coins to the pot and subtracting a coin from each player, we add one coin for each player left. min(p1, 1) is 1 if player 1 is still in the game, and 0 otherwise.

+ formula ante_left = min(p1, 1) + min(p2, 1) + min(p3, 1) + min(p4, 1);

We also have to make sure anteing doesn't end a player with negative money.

- [ante] (pot = 0) & !done -> (pot'=pot+4) & (p1' = p1-1) & (p2' = p2-1) & (p3' = p3-1) & (p4' = p4-1);
+ [ante] (pot = 0) & !done -> (pot'=pot+ante_left) & (p1' = max(p1-1, 0)) & (p2' = max(p2-1, 0)) & (p3' = max(p3-1, 0)) & (p4' = max(p4-1, 0));

Finally, we have to add logic for a player being "out". Instead of moving to the next player after each turn, we move to the next player still in the game.
Also, if someone starts their turn without any coins (f.ex if they just anted their last coin), we just skip their turn.

+ formula p1n = (p2 > 0 ? 2 : p3 > 0 ? 3 : 4);
+ [lost] ((pot != 0) & !done & (turn = 1) & (p1 = 0)) -> (turn' = p1n);

- [spin] ((pot != 0) & !done & (turn = 1)) ->
+ [spin] ((pot != 0) & !done & (turn = 1) & (p1 != 0)) ->
      0.25: (p1' = p1-1) & (pot' = min(pot+1, maxval))
-       & (turn' = 2) //shin
+       & (turn' = p1n) //shin

We make similar changes for all of the other players. You can see the final model here.

Querying the model

So now we have a full game of Dreidel that runs until the game ends. And now, finally, we can see the average number of spins a 4 player game will last.

./prism dreidel.prism -const M=10 -pf 'R=? [F done]'

In English: each player starts with ten coins. R=? means "expected value of the 'reward'", where 'reward' in this case means number of spins. [F done] weights the reward over all behaviors that reach ("Finally") the done state.

Result: 760.5607582661091
Time for model checking: 384.17 seconds.

So there's the number: 760 spins.1 At 8 seconds a spin, that's almost two hours for one game.

…Jesus, look at that runtime. Six minutes to test one query. PRISM has over a hundred settings that affect model checking, with descriptions like "Pareto curve threshold" and "Use Backwards Pseudo SOR". After looking through them all, I found this perfect combination of configurations that gets the runtime to a more manageable level:

  ./prism dreidel.prism -const M=10 -pf 'R=? [F done]'
+   -heuristic speed

Result: 760.816255997373
Time for model checking: 13.44 seconds.

Yes, that's a literal "make it faster" flag.

Anyway, that's only the "average" number of spins, weighted across all games. Dreidel has a very long tail. To find that out, we'll use a variation on our query:

const C0;
P=? [F<=C0 done]

P=? is the Probability something happens. F<=C0 done means we Finally reach state done in at most C0 steps. By passing in different values of C0 we can get a sense of how long a game takes. Since "steps" includes passes and antes, this will overestimate the length of the game. But antes take time too and it should only "pass" on a player once per player, so this should still be a good metric for game length.

./prism dreidel.prism -const M=10 -const C0=1000:1000:5000 -pf 'const C0; P=? [F<=C0 done]'

A full 10% of games don't finish in 2000 steps, and 2% pass the 3000 step barrier. At 8 seconds a roll/ante, 3000 steps is over six hours.

Dreidel is a bad game.

More fun properties

As a sanity check, let's confirm last year's result, that it takes an average of 64ish spins before one player is out. In that model, we just needed to get the total reward. Now we instead want to get the reward until the first state where any of the players have zero coins.2

./prism dreidel.prism -const M=10 -pf 'R=? [F (p1=0 | p2=0 | p3=0 | p4=0)]' -heuristic speed

Result: 63.71310116083396
Time for model checking: 2.017 seconds.

Yep, looks good.

With our new model we can also get the average point where two players are out and two players are left. PRISM's lack of abstraction makes expressing the condition directly a little painful, but we can cheat and look for the first state where ante_left <= 2.3

./prism dreidel.prism -const M=10 -pf 'R=? [F (ante_left <= 2)]'

It takes twice as long to eliminate the second player as it takes to eliminate the first, and the remaining two players have to go for another 600 spins.

Dreidel is a bad game.

The future

There's two things I want to do next with this model.
The first is to script up something that can generate the PRISM model for me, so I can easily adjust the number of players to 3 or 5. The second is that PRISM has a filter-query feature I don't understand but I think it could be used for things like "if a player gets 75% of the pot, what's the probability they lose anyway". Otherwise you have to write wonky queries like (P=? [F p1 = 30 & (F p1 = 0)]) / (P=? [F p1 = 30]).4

But I'm out of time again, so this saga will have to conclude next year. I'm also faced with the terrible revelation that I might be the biggest non-academic user of PRISM.

Logic for Programmers Khanukah Sale

Still going on! You can get LFP for 40% off here from now until the end of Xannukkah (Jan 2).5

I'm in the Raku Advent Calendar! My piece is called counting up concurrencies. It's about using Raku to do some combinatorics! Read the rest of the blog too, it's great.

1. This is different from the original anti-Dreidel article: Ben got 860 spins. That's the average spins if you round down on He, not up. Rounding up on He leads to a shorter game because it means He can empty the pot, which means more antes, and antes are what knocks most players out. ↩

2. PRISM calls this "co-safe LTL reward" and does not explain what that means, nor do most of the papers I found referencing "co-safe LTL". Eventually I found one that defined it as "any property that only uses X, U, F". ↩

3. Here's the exact point where I realize I could have defined done as ante_left = 1. Also, checking for F (ante_left = 2) gives an expected number of spins of "infinity". I have no idea why. ↩

4. 10% chance at 4 players / 10 coins. And it takes a minute even with fast mode enabled. ↩

5. This joke was funnier before I made the whole newsletter about Chanukahh. ↩
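Postscript: a cheap way to cross-check a model like this is a quick Monte Carlo simulation. Here's a rough Python sketch (my own script, not derived from the PRISM model): four players, ten coins, He rounds up, broke players get skipped, and the game runs until one player is left.

import random
from statistics import mean

def play(players=4, coins=10):
    """Simulate one game; return the number of spins until only one player has coins."""
    stacks, pot, spins, turn = [coins] * players, 0, 0, 0
    while sum(s > 0 for s in stacks) > 1:
        if pot == 0:                           # everyone still in antes one coin
            for i in range(players):
                if stacks[i] > 0:
                    stacks[i] -= 1
                    pot += 1
        if stacks[turn] > 0:                   # players with no coins are skipped
            spins += 1
            roll = random.choice("NHGS")
            if roll == "H":                    # He: take half the pot, rounded up
                take = (pot + 1) // 2
                stacks[turn] += take
                pot -= take
            elif roll == "G":                  # Gimmel: take the whole pot
                stacks[turn] += pot
                pot = 0
            elif roll == "S":                  # Shin: add one coin to the pot
                stacks[turn] -= 1
                pot += 1
        turn = (turn + 1) % players
    return spins

print(mean(play() for _ in range(10_000)))     # average spins per simulated game

It's sampling rather than exact probability, so it won't match PRISM's numbers digit-for-digit, but it's a fast way to catch modeling mistakes before a six-minute model-checking run.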