That boolean should probably be something else

from ntietz.com blog - technically a blog [alt+shift+b] in programming

One of the first types we learn about is the boolean. It's pretty natural to use, because boolean logic underpins much of modern computing. And yet, it's one of the types we should probably be using a lot less of. In almost every single instance when you use a boolean, it should be something else. The trick is figuring out what "something else" is. Doing this is worth the effort. It tells you a lot about your system, and it will improve your design (even if you end up using a boolean). There are a few possible types that come up often, hiding as booleans. Let's take a look at each of these, as well as the case where using a boolean does make sense. This isn't exhaustive—[1]there are surely other types that can make sense, too. Datetimes A lot of boolean data is representing a temporal event having happened. For example, websites often have you confirm your email. This may be stored as a boolean column, is_confirmed, in the database. It makes a lot of sense. But, you're throwing away...

2 months ago

Remove from reading list Add to reading list [alt+a] Read now [→]

Comments

Improve your reading experience

Logged in users get linked directly to articles resulting in a better reading experience. Please login for free, it takes less than 1 minute.

More from ntietz.com blog - technically a blog

Visualizing distributions with pepperoni pizza (and javascript)

There's a pizza shop near me that serves a normal pizza. I mean, they distribute the toppings in a normal way. They're not uniform at all. The toppings are random, but not the way I want. The colloquial understanding of "random" is kind of the Platonic ideal of a pizza: slightly chaotic but things are more or less spread out over the whole piece in a regular way. If you take a slice you'll get more of less the same amount of pepperoni as any other slice. And every bite will have roughly the same amount of pepperoni as every other bite. I think it would look something like this. Regenerate this pie! This pizza to me is pretty much the canonical mental pizza. It looks pretty random, but you know what you're gonna get. And it is random! Here's how we made it, with the visualiztion part glossed over. First, we make a helper function, since Math.random() gives us values from 0 to 1, but we want values from -1 to 1. // return a uniform random value in [-1, 1] function randUniform() { return 2*Math.random() - 1; } Then, we make a simple function that gives us the coordinates of where to put a pepperoni piece, from the uniform distribution. function uniformPepperoniPosition() { var [centerX, centerY, radius] = pepperoniBounds(); let x = radius*2; let y = radius*2; while (x**2 + y**2 >= radius**2) { x = randUniform() * radius; y = randUniform() * radius; } return [x+centerX, y+centerY]; } And we cap it off with placing 300 fresh pieces of pepperoni on this pie, before we send it into the oven. (It's an outrageous amount of very small pepperoni, chosen in both axes for ease of visualizing the distribution rather than realism.) function drawUniformPizza() { drawBackground(); drawPizzaCrust(); drawCheese(); var [_, _, radius] = pepperoniBounds(); for (let p = 0; p < 300; p++) { let [x,y] = uniformPepperoniPosition(); drawPepperoni(x, y); } } But it's not what my local pizza shop's pizza's look like. That's because they're not using the same probability distribution. This pizza is using a uniform distribution. That means that for any given pepperoni, every single position on the pizza is equally likely for it to land on. These are normal pizzas We are using a uniform distribution here, but there are plenty of other distributions we could use as well. One of the other most familiar distributions is normal distribution. This is the distribution that has the normal "bell curve" that we are used to seeing. And this is probably what people are talking about most of the time when they talk about how many standard deviations something is away from something else. So what would it look like if we did a normal distribution on a pizza? The very first thing we need to answer that is a way of getting the values from the normal distribution. This isn't included with JavaScript by default, but we can implement it pretty simply using the Box-Muller transform. This might be a scary name, but it's really easy to use. Is a way of generating numbers in the normal distribution using number sampled from the uniform distribution. We can implement it like this: function randNormal() { let theta = 2*Math.PI*Math.random(); let r = Math.sqrt(-2*Math.log(Math.random())); let x = r * Math.cos(theta); let y = r * Math.sin(theta); return [x,y]; } Then we can make a pretty simple function again which gives us coordinates for where to place pepperoni in this distribution. The only little weird thing here is that I scale the radius down by a factor of 3. Without this, the pizza ends up a little bit indistinguishable from the uniform distribution, but the scaling is arbitrary and you can do whatever you want. function normalPepperoniPosition() { var [centerX, centerY, radius] = pepperoniBounds(); let x = radius*2; let y = radius*2; while (x**2 + y**2 >= radius**2) { [x,y] = randNormal(); x = x * radius/3; y = y * radius/3; } return [x + centerX, y + centerX]; } And then once again we cap it off with a 300 piece pepperoni pizza. function drawNormalPizza() { drawBackground(); drawPizzaCrust(); drawCheese(); for (let p = 0; p < 300; p++) { let [x,y] = normalPepperoniPosition(); drawPepperoni(x, y); } } Regenerate this pie! Ouch. It's not my platonic ideal of a pizza, that's for sure. It also looks closer to the pizzas my local shop serves, but it's missing something... See, this one is centered around, you know, the center. Theirs are not that. They're more chaotic with a few handfuls of toppings. What if we did the normal distributions, but multiple times, with different centers? First we have to update our position picking function to accept a center for the cluster. We'll do this by passing in the center and generating coordinates around those, while still checking that we're within the bounds of the circle formed by the crust of the pizza. function normal(cx, cy) { var [centerX, centerY, radius] = pepperoniBounds(); let x = radius*2; let y = radius*2; while ((x-centerX)**2 + (y-centerY)**2 >= radius**2) { [x,y] = randNormal(); x = cx + x * radius/3; y = cy + y * radius/3; } return [x, y]; } And then instead of one single loop for all 300 pieces, we can do 3 loops of 100 pieces each, with different (randomly chosen) centers for each. function drawClusterPizza() { const settings = initializeCanvas("drawing-3"); drawBackground(settings); drawPizzaCrust(settings); drawCheese(settings); var [centerX, centerY, radius] = pepperoniBounds(settings); for (let c = 0; c < 3; c++) { let [cx, cy] = uniform(settings, centerX, centerY, 1); console.log(cx, cy); for (let p = 0; p < 100; p++) { let [x, y] = normal(settings, cx, cy, 4); drawPepperoni(settings, x, y); } } } Regenerate this pie! That looks more like it. Well, probably. This one is more chaotic, and sometimes things work out okay, but other times they're weird. Just like the real pizzas. Click that "regenerate" button a few times to see a few examples! Okay, but when do you want one? So, this is all great. But, when would we want this? I mean, first of all, boring. We don't need a reason except that it's fun! But, there's one valid use case that a medical professional and I came up with[1]: hot honey[2]. The ideal pepperoni pizza just might be one that has uniformly distributed pepperoni with normally distributed hot honey or hot sauce. You'd start with more intense heat, then it would taper off as you go toward the crust, so you maintain the heat without getting overwhelmed by it. The room to play here is endless! We can come up with a lot of other fun distributions and map them in similar ways. Unfortunately, we probably can't make a Poisson pizza, since that's a distribution for discrete variables. I really do talk about weird things with all my medical providers. And everyone else I meet. I don't know, life's too short to go "hey, this is a professional interaction, let's not chatter on and on about whatever irrelevant topic is on our mind." ↩ The pizza topping, not my pet name. ↩

2 weeks ago • 20 votes

Covers as a way of learning music and code

When you're just getting started with music, you have so many skills to learn. You have to be able to play your instrument and express yourself through it. You need to know the style you're playing, and its idioms and conventions. You may want to record your music, and need all the skills that come along with it. Music is, mostly, subjective: there's not an objective right or wrong way to do things. And that can make it really hard! Each of these skills is then couched in this subjectivity of trying to see if it's good enough. Playing someone else's music, making a cover, is great because it can make it objective. It gives you something to check against. When you're playing your own music, you're in charge of the entire thing. You didn't play a wrong note, because, well, you've just changed the piece! But when you play someone else's music, now there's an original and you can try to get as close to it as possible. Recreating it gives you a lot of practice in figuring out what someone did and how they did it. It also lets you peek into why they did it. Maybe a particular chord voicing is hard for you to play. Okay, let's simplify it and play an easier voicing. How does it sound now? How does it sound with the harder one? Play around with those differences and you start to see the why behind it all. * * * The same thing holds true for programming. One of my friends is a C++ programmer[1] and he was telling me about how he learned C++ and data structures really well early on: He reimplemented parts of the Boost library. This code makes heavy use of templates, a hard thing in C++. And it provides fundamental data structures with robust implementations and good performance[2]. What he would do is look at the library and pick a slice of it to implement. He'd look at what the API for it is, how it was implemented, what it was doing under the hood. Then he'd go ahead and try to do it himself, without any copy-pasting and without real-time copying from the other screen. Sometimes, he'd run into things which didn't make sense. Why is this a doubly-linked list here, when it seems a singly-linked list would do just fine? And in those moments, if you can't find a reason? You get to go down that path, make it the singly-linked version, and then find out later: oh, ohhh. Ohhhh, they did that for a reason. It lets you run into some of the hard problems, grapple with them, and understand why the original was written how it was. You get to study with some really strong programmers, by proxy via their codebase. Their code is your tutor and your guide for understanding how to write similar things in the future. * * * There's a lot of judgment out there about doing original works. This kind of judgment of covers and of reimplementing things that already exist, just to learn. So many people have internalized this, and I've heard countless times "I want to make a new project, but everything I think of, someone else has already done!" And to that, I say: do it anyway[3]. If someone else has done it, that's great. That means that you had an idea so good that someone else thought it was a good idea, too. And that means that, because someone else has done it, you have a reference now. You can compare notes, and you can see how they did it, and you can learn. I'm a recovering C++ programmer myself, and had some unpleasant experiences associated with the language. This friend is a game developer, and his industry is one where C++ makes a lot of sense to use because of the built-up code around it. ↩ He said they're not perfect, but that they're really good and solid and you know a lot of people thought for a long time about how to do them. You get to follow in their footsteps and benefit from all that hard thinking time. ↩ But: you must always give credit when you are using someone else's work. If you're reimplementing someone else's library, or covering someone's song, don't claim it's your own original invention. ↩

a month ago • 34 votes

Proving that every program halts

One of the best known hard problems in computer science is the halting problem. In fact, it's widely thought[1] that you cannot write a program that will, for any arbitrary program as input, tell you correctly whether or not it will terminate. This is written from the framing of computers, though: can we do better with a human in the loop? It turns out, we can. And we can use a method that's generalizable, which many people can follow for many problems. Not everyone can use the method, which you'll see why in a bit. But lots of people can apply this proof technique. Let's get started. * * * We'll start by formalizing what we're talking about, just a little bit. I'm not going to give the full formal proof—that will be reserved for when this is submitted to a prestigious conference next year. We will call the set of all programs P. We want to answer, for any p in P, whether or not p will eventually halt. We will call this h(p) and h(p) = true if p eventually finished and false otherwise. Actually, scratch that. Let's simplify it and just say that yes, every program does halt eventually, so h(p) = true for all p. That makes our lives easier. Now we need to get from our starting assumptions, the world of logic we live in, to the truth of our statement. We'll call our goal, that h(p) = true for all p, the statement H. Now let's start with some facts. Fact one: I think it's always an appropriate time to play the saxophone. *honk*! Fact two: My wife thinks that it's sometimes inappropriate to play the saxophone, such as when it's "time for bed" or "I was in the middle of a sentence![2] We'll give the statement "It's always an appropriate time to play the saxophone" the name A. We know that I believe A is true. And my wife believes that A is false. So now we run into the snag: Fact three: The wife is always right. This is a truism in American culture, useful for settling debates. It's also useful here for solving major problems in computer science because, babe, we're both the wife. We're both right! So now that we're both right, we know that A and !A are both true. And we're in luck, we can apply a whole lot of fancy classical logic here. Since A and !A we know that A is true and we also know that !A is true. From A being true, we can conclude that A or H is true. And then we can apply disjunctive syllogism[3] which says that if A or H is true and !A is true, then H must be true. This makes sense, because if you've excluded one possibility then the other must be true. And we do have !A, so that means: H is true! There we have it. We've proved our proposition, H, which says that for any program p, p will eventually halt. The previous logic is, mostly, sound. It uses the principle of explosion, though I prefer to call it "proof by married lesbian." * * * Of course, we know that this is wrong. It falls apart with our assumptions. We built the system on contradictory assumptions to begin with, and this is something we avoid in logic[4]. If we allow contradictions, then we can prove truly anything. I could have also proved (by married lesbian) that no program will terminate. This has been a silly traipse through logic. If you want a good journey through logic, I'd recommend Hillel Wayne's Logic for Programmers. I'm sure that, after reading it, you'll find absolutely no flaws in my logic here. After all, I'm the wife, so I'm always right. It's widely thought because it's true, but we don't have to let that keep us from a good time. ↩ I fact checked this with her, and she does indeed hold this belief. ↩ I had to look this up, my uni logic class was a long time ago. ↩ The real conclusion to draw is that, because of proof by contradiction, it's certainly not true that the wife is always right. Proved that one via married lesbians having arguments. Or maybe gay relationships are always magical and happy and everyone lives happily ever after, who knows. ↩

2 months ago • 41 votes

Taking a break

I've been publishing at least one blog post every week on this blog for about 2.5 years. I kept it up even when I was very sick last year with Lyme disease. It's time for me to take a break and reset. This is the right time, because the world is very difficult for me to move through right now and I'm just burnt out. I need to focus my energy on things that give me energy and right now, that's not writing and that's not tech. I'll come back to this, and it might look a little different. This is my last post for at least a month. It might be longer, if I still need more time, but I won't return before the end of May. I know I need at least that long to heal, and I also need that time to focus on music. I plan to play a set at West Philly Porchfest, so this whole month I'll be prepping that set. If you want to follow along with my music, you can find it on my bandcamp (only one track, but I'll post demos of the others that I prepare for Porchfest as they come together). And if you want to reach out, my inbox is open. Be kind to yourself. Stay well, drink some water. See you in a while.

4 months ago • 33 votes

More in programming

Apologies and forgiveness

The first in a series of posts about doing things the right way

12 hours ago • 4 votes

Understanding Bazel remote caching

A deep dive into the Action Cache, the CAS, and the security issues that arise from using Bazel with a remote cache but without remote execution

13 hours ago • 4 votes

Trying to Make Sense of Casing Conventions on the Web

(I present to you my stream of consciousness on the topic of casing as it applies to the web platform.) I’m reading about the new command and commandfor attributes — which I’m super excited about, declarative behavior invocation in HTML? YES PLEASE!! — and one thing that strikes me is the casing in these APIs. For example, the command attribute has a variety of values in HTML which correspond to APIs in JavaScript. The show-popover attribute value maps to .showPopover() in JavaScript. hide-popover maps to .hidePopover(), etc. So what we have is: lowercase in attribute names e.g. commandfor="..." kebab-case in attribute values e.g. show-popover camelCase for JS counterparts e.g. showPopover() After thinking about this a little more, I remember that HTML attributes names are case insensitive, so the browser will normalize them to lowercase during parsing. Given that, I suppose you could write commandFor="..." but it’s effectively the same. Ok, lowercase attribute names in HTML makes sense. The related popover attributes follow the same convention: popovertarget popovertargetaction And there are many other attribute names in HTML that are lowercase, e.g.: maxlength novalidate contenteditable autocomplete formenctype So that all makes sense. But wait, there are some attribute names with hyphens in them, like aria-label="..." and data-value="...". So why isn’t it command-for="..."? Well, upon further reflection, I suppose those attributes were named that way for extensibility’s sake: they are essentially wildcard attributes that represent a family of attributes that are all under the same namespace: aria-* and data-*. But wait, isn’t that an argument for doing popover-target and popover-target-action? Or command and command-for? But wait (I keep saying that) there are kebab-case attribute names in HTML — like http-equiv on the <meta> tag, or accept-charset on the form tag — but those seem more like legacy exceptions. It seems like the only answer here is: there is no rule. Naming is driven by convention and decisions are made on a case-by-case basis. But if I had to summarize, it would probably be that the default casing for new APIs tends to follow the rules I outlined at the start (and what’s reflected in the new command APIs): lowercase for HTML attributes names kebab-case for HTML attribute values camelCase for JS counterparts Let’s not even get into SVG attribute names We need one of those “bless this mess” signs that we can hang over the World Wide Web. Email · Mastodon · Bluesky

2 days ago • 6 votes

The Angels and Demons of Nondeterminism

Greetings everyone! You might have noticed that it's September and I don't have the next version of Logic for Programmers ready. As penance, here's ten free copies of the book. So a few months ago I wrote a newsletter about how we use nondeterminism in formal methods. The overarching idea: Nondeterminism is when multiple paths are possible from a starting state. A system preserves a property if it holds on all possible paths. If even one path violates the property, then we have a bug. An intuitive model of this is that for this is that when faced with a nondeterministic choice, the system always makes the worst possible choice. This is sometimes called demonic nondeterminism and is favored in formal methods because we are paranoid to a fault. The opposite would be angelic nondeterminism, where the system always makes the best possible choice. A property then holds if any possible path satisfies that property.1 This is not as common in FM, but it still has its uses! "Players can access the secret level" or "We can always shut down the computer" are reachability properties, that something is possible even if not actually done. In broader computer science research, I'd say that angelic nondeterminism is more popular, due to its widespread use in complexity analysis and programming languages. Complexity Analysis P is the set of all "decision problems" (basically, boolean functions) can be solved in polynomial time: there's an algorithm that's worst-case in O(n), O(n²), O(n³), etc.2 NP is the set of all problems that can be solved in polynomial time by an algorithm with angelic nondeterminism.3 For example, the question "does list l contain x" can be solved in O(1) time by a nondeterministic algorithm: fun is_member(l: List[T], x: T): bool { if l == [] {return false}; guess i in 0..<(len(l)-1); return l[i] == x; } Say call is_member([a, b, c, d], c). The best possible choice would be to guess i = 2, which would correctly return true. Now call is_member([a, b], d). No matter what we guess, the algorithm correctly returns false. and just return false. Ergo, O(1). NP stands for "Nondeterministic Polynomial". (And I just now realized something pretty cool: you can say that P is the set of all problems solvable in polynomial time under demonic nondeterminism, which is a nice parallel between the two classes.) Computer scientists have proven that angelic nondeterminism doesn't give us any more "power": there are no problems solvable with AN that aren't also solvable deterministically. The big question is whether AN is more efficient: it is widely believed, but not proven, that there are problems in NP but not in P. Most famously, "Is there any variable assignment that makes this boolean formula true?" A polynomial AN algorithm is again easy: fun SAT(f(x1, x2, …: bool): bool): bool { N = num_params(f) for i in 1..=num_params(f) { guess x_i in {true, false} } return f(x_1, x_2, …) } The best deterministic algorithms we have to solve the same problem are worst-case exponential with the number of boolean parameters. This a real frustrating problem because real computers don't have angelic nondeterminism, so problems like SAT remain hard. We can solve most "well-behaved" instances of the problem in reasonable time, but the worst-case instances get intractable real fast. Means of Abstraction We can directly turn an AN algorithm into a (possibly much slower) deterministic algorithm, such as by backtracking. This makes AN a pretty good abstraction over what an algorithm is doing. Does the regex (a+b)\1+ match "abaabaabaab"? Yes, if the regex engine nondeterministically guesses that it needs to start at the third letter and make the group aab. How does my PL's regex implementation find that match? I dunno, backtracking or NFA construction or something, I don't need to know the deterministic specifics in order to use the nondeterministic abstraction. Neel Krishnaswami has a great definition of 'declarative language': "any language with a semantics has some nontrivial existential quantifiers in it". I'm not sure if this is identical to saying "a language with an angelic nondeterministic abstraction", but they must be pretty close, and all of his examples match: SQL's selects and joins Parsing DSLs Logic programming's unification Constraint solving On top of that I'd add CSS selectors and planner's actions; all nondeterministic abstractions over a deterministic implementation. He also says that the things programmers hate most in declarative languages are features that "that expose the operational model": constraint solver search strategies, Prolog cuts, regex backreferences, etc. Which again matches my experiences with angelic nondeterminism: I dread features that force me to understand the deterministic implementation. But they're necessary, since P probably != NP and so we need to worry about operational optimizations. Eldritch Nondeterminism If you need to know the ratio of good/bad paths, the number of good paths, or probability, or anything more than "there is a good path" or "there is a bad path", you are beyond the reach of heaven or hell. Angelic and demonic nondeterminism are duals: angelic returns "yes" if some choice: correct and demonic returns "no" if !all choice: correct, which is the same as some choice: !correct. ↩ Pet peeve about Big-O notation: O(n²) is the set of all algorithms that, for sufficiently large problem sizes, grow no faster that quadratically. "Bubblesort has O(n²) complexity" should be written Bubblesort in O(n²), not Bubblesort = O(n²). ↩ To be precise, solvable in polynomial time by a Nondeterministic Turing Machine, a very particular model of computation. We can broadly talk about P and NP without framing everything in terms of Turing machines, but some details of complexity classes (like the existence "weak NP-hardness") kinda need Turing machines to make sense. ↩

2 days ago • 6 votes

Announcing the 2025 TokyoDev Developers Survey

The 2025 edition of the TokyoDev Developer Survey is now live! If you’re a software developer living in Japan, please take a few minutes to participate. All questions are optional, and it should take less than 10 minutes to complete. The survey will remain open until September 30th. Last year, we received over 800 responses. Highlights included: Median compensation remained stable. The pay gap between international and Japanese companies narrowed to 47%. Fewer respondents had the option to work fully remotely. For 2025, we’ve added several new questions, including a dedicated section on one of the most talked-about topics in development today: AI. The survey is completely anonymous, and only aggregated results will be shared—never personally identifiable information. The more responses we get, the deeper and more meaningful our insights will be. Please help by taking the survey and sharing it with your peers!

2 days ago • 9 votes

New here?