More from Tony Finch's blog
What does it mean when someone writes that a programming language is “strongly typed”?

I’ve known for many years that “strongly typed” is a poorly-defined term. Recently I was prompted on Lobsters to explain why it’s hard to understand what someone means when they use the phrase. I came up with more than five meanings!

how strong?

The various meanings of “strongly typed” are not clearly yes-or-no. Some developers like to argue that these kinds of integrity checks must be completely perfect or else they are entirely worthless. Charitably (it took me a while to think of a polite way to phrase this), that betrays a lack of engineering maturity.

Software engineers, like any engineers, have to create working systems from imperfect materials. To do so, we must understand what guarantees we can rely on, where our mistakes can be caught early, where we need to establish processes to catch mistakes, how we can control the consequences of our mistakes, and how to remediate when something breaks because of a mistake that wasn’t caught.

strong how?

So, what are the ways that a programming language can be strongly or weakly typed? In what ways are real programming languages “mid”?

Statically typed as opposed to dynamically typed? Many languages have a mixture of the two, such as run time polymorphism in OO languages (e.g. Java), or gradual type systems for dynamic languages (e.g. TypeScript).

Sound static type system? It’s common for static type systems to be deliberately unsound, such as covariant subtyping in arrays or functions (Java, again). Gradual type systems might have gaping holes for usability reasons (TypeScript, again). And some type systems might be unsound due to bugs. (There are a few of these in Rust.) Unsoundness isn’t a disaster, if a programmer won’t cause it without being aware of the risk. For example: in Lean you can write “sorry” as a kind of “to do” annotation that deliberately breaks soundness; and Idris 2 has type-in-type so it accepts Girard’s paradox.
Type safe at run time? Most languages have facilities for deliberately bypassing type safety, with an “unsafe” library module or “unsafe” language features, or things that are harder to spot. It can be more or less difficult to break type safety in ways that the programmer or language designer did not intend. JavaScript and Lua are very safe, treating type safety failures as security vulnerabilities. Java and Rust have controlled unsafety. In C everything is unsafe.

Fewer weird implicit coercions? There isn’t a total order here: for instance, C has implicit bool/int coercions, Rust does not; Rust has implicit deref, C does not. There’s a huge range in how much coercions are a convenience or a source of bugs. For example, the PHP and JavaScript == operators are made entirely of WAT, but at least you can use === instead.

How fancy is the type system? To what degree can you model properties of your program as types? Is it convenient to parse, not validate? Is the Curry-Howard correspondence something you can put into practice? Or is it only capable of describing the physical layout of data?

There are probably other meanings, e.g. I have seen “strongly typed” used to mean that runtime representations are abstract (you can’t see the underlying bytes); or in the past it sometimes meant a language with a heavy type annotation burden (as a mischaracterization of static type checking).

how to type

So, when you write (with your keyboard) the phrase “strongly typed”, delete it, and come up with a more precise description of what you really mean. The desiderata above are partly overlapping, sometimes partly orthogonal. Some of them you might care about, some of them not. But please try to communicate where you draw the line and how fuzzy your line is.
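To make one of the questions above concrete ("is it convenient to parse, not validate?"), here is a minimal Python sketch. The `Port` type and the `parse_port` and `connect_label` functions are hypothetical names invented for illustration; they are not from the post. The idea is that untrusted text is checked once, at the boundary, and the result is carried in a distinct type so later code does not need to re-check it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """A port number already known to be in range (hypothetical type)."""
    number: int


def parse_port(text: str) -> Port:
    """Parse untrusted text once; raise ValueError if it is not a valid port."""
    number = int(text)  # raises ValueError for non-numeric text
    if not 1 <= number <= 65535:
        raise ValueError(f"port out of range: {number}")
    return Port(number)


def connect_label(host: str, port: Port) -> str:
    # No range check here: the Port annotation records that it already happened.
    return f"{host}:{port.number}"
```

Python itself does not enforce the `Port` annotation at run time, so how much this buys you depends on whether a static checker is part of your workflow, which is the kind of "how strong, and in what way" detail the post asks writers to spell out.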
Previously, I wrote some sketchy ideas for what I call a p-fast trie, which is basically a wide fan-out variant of an x-fast trie. It allows you to find the longest matching prefix or nearest predecessor or successor of a query string in a set of names in O(log k) time, where k is the key length. My initial sketch was more complicated and greedy for space than necessary, so here’s a simplified revision. (“p” now stands for prefix.)

layout

A p-fast trie stores a lexicographically ordered set of names. A name is a sequence of characters from some small-ish character set. For example, DNS names can be represented as a set of about 50 letters, digits, punctuation and escape characters, usually one per byte of name. Names that are arbitrary bit strings can be split into chunks of 6 bits to make a set of 64 characters.

Every unique prefix of every name is added to a hash table. An entry in the hash table contains:

- A shared reference to the closest name lexicographically greater than or equal to the prefix. Multiple hash table entries will refer to the same name. A reference to a name might instead be a reference to a leaf object containing the name.
- The length of the prefix. To save space, each prefix is not stored separately, but implied by the combination of the closest name and prefix length.
- A bitmap with one bit per possible character, corresponding to the next character after this prefix. For every other prefix that matches this prefix and is one character longer than this prefix, a bit is set in the bitmap corresponding to the last character of the longer prefix.

search

The basic algorithm is a longest-prefix match. Look up the query string in the hash table. If there’s a match, great, done. Otherwise proceed by binary chop on the length of the query string.

If the prefix isn’t in the hash table, reduce the prefix length and search again. (If the empty prefix isn’t in the hash table then there are no names to find.)
If the prefix is in the hash table, check the next character of the query string in the bitmap. If its bit is set, increase the prefix length and search again. Otherwise, this prefix is the answer.

predecessor

Instead of putting leaf objects in a linked list, we can use a more complicated search algorithm to find names lexicographically closest to the query string. It’s tricky because a longest-prefix match can land in the wrong branch of the implicit trie. Here’s an outline of a predecessor search; successor requires more thought.

During the binary chop, when we find a prefix in the hash table, compare the complete query string against the complete name that the hash table entry refers to (the closest name greater than or equal to the common prefix).

If the name is greater than the query string we’re in the wrong branch of the trie, so reduce the length of the prefix and search again.

Otherwise search the set bits in the bitmap for one corresponding to the greatest character less than the query string’s next character; if there is one remember it and the prefix length. This will be the top of the sub-trie containing the predecessor, unless we find a longer match.

If the next character’s bit is set in the bitmap, continue searching with a longer prefix, else stop.

When the binary chop has finished, we need to walk down the predecessor sub-trie to find its greatest leaf. This must be done one character at a time – there’s no shortcut.

thoughts

In my previous note I wondered how the number of search steps in a p-fast trie compares to a qp-trie. I have some old numbers measuring the average depth of binary, 4-bit, 5-bit, 6-bit and 4-bit, 5-bit, dns qp-trie variants. A DNS-trie varies between 7 and 15 deep on average, depending on the data set. The number of steps for a search matches the depth for exact-match lookups, and is up to twice the depth for predecessor searches. A p-fast trie is at most 9 hash table probes for DNS names, and unlikely to be more than 7.
I didn’t record the average length of names in my benchmark data sets, but I guess they would be 8–32 characters, meaning 3–5 probes. Which is far fewer than a qp-trie, though I suspect a hash table probe takes more time than chasing a qp-trie pointer. (But this kind of guesstimate is notoriously likely to be wrong!) However, a predecessor search might need 30 probes to walk down the p-fast trie, which I think suggests a linked list of leaf objects is a better option.
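The layout and longest-prefix search described above can be sketched in Python roughly as follows. This is an illustration under simplifying assumptions, not the author's implementation: the hash table is a plain `dict` keyed by the prefix string (the post instead implies the prefix from the closest name plus a length, to save space), the bitmap is a Python `set` of next characters, and the names `Entry`, `build` and `longest_prefix` are invented here.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    closest: str   # smallest stored name >= this prefix
    length: int    # prefix length; the prefix is closest[:length]
    next_chars: set = field(default_factory=set)  # stands in for the bitmap


def build(names):
    """Add every prefix of every name to the table (a plain dict here)."""
    table = {}
    for name in sorted(names):
        for n in range(len(name) + 1):
            # Names arrive in sorted order, so the first name seen with a
            # given prefix is the closest name >= that prefix.
            entry = table.setdefault(name[:n], Entry(name, n))
            if n < len(name):
                entry.next_chars.add(name[n])
    return table


def longest_prefix(table, query):
    """Return the longest prefix of query present in the table, or None."""
    if "" not in table:
        return None                # no names at all
    if query in table:
        return query               # exact hit on the first probe
    lo, hi = 0, len(query)         # query[:lo] is known to be in the table
    while lo < hi:
        mid = (lo + hi + 1) // 2
        entry = table.get(query[:mid])
        if entry is None:
            hi = mid - 1           # prefix too long: chop shorter
        elif mid < len(query) and query[mid] in entry.next_chars:
            lo = mid + 1           # bitmap says one more character matches
        else:
            return query[:mid]     # present, but the next character is not
    return query[:lo]
```

For example, with the names `exam`, `example.com` and `example.org`, searching for `example.net` probes three prefixes after the initial exact-match attempt and stops at `example.`, because the bitmap for that prefix has no bit for `n`.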
Here’s a sketch of an idea that might or might not be a good idea. Dunno if it’s similar to something already described in the literature – if you know of something, please let me know via the links in the footer!

The gist is to throw away the tree and interior pointers from a qp-trie. Instead, the p-fast trie is stored using a hash map organized into stratified levels, where each level corresponds to a prefix of the key.

Exact-match lookups are normal O(1) hash map lookups. Predecessor / successor searches use binary chop on the length of the key. Where a qp-trie search is O(k), where k is the length of the key, a p-fast trie search is O(log k).

This smaller O(log k) bound is why I call it a “p-fast trie” by analogy with the x-fast trie, which has O(log log N) query time. (The “p” is for popcount.) I’m not sure if this asymptotic improvement is likely to be effective in practice; see my thoughts towards the end of this note.

layout

A p-fast trie consists of:

- Leaf objects, each of which has a name. Each leaf object refers to its successor forming a circular linked list. (The last leaf refers to the first.) Multiple interior nodes refer to each leaf object.
- A hash map containing every (strict) prefix of every name in the trie. Each prefix maps to a unique interior node. Names are treated as bit strings split into chunks of (say) 6 bits, and prefixes are whole numbers of chunks.

An interior node contains a (1<<6) == 64 wide bitmap with a bit set for each chunk where prefix+chunk matches a key. Following the bitmap is a popcount-compressed array of references to the leaf objects that are the closest predecessor of the corresponding prefix+chunk key.

Prefixes are strictly shorter than names so that we can avoid having to represent non-values after the end of a name, and so that it’s OK if one name is a prefix of another.

The size of chunks and bitmaps might change; 6 is a guess that I expect will work OK.
For restricted alphabets you can use something like my DNS trie name preparation trick to squash 8-bit chunks into sub-64-wide bitmaps.

In Rust where cross-references are problematic, there might have to be a hash map that owns the leaf objects, so that the p-fast trie can refer to them by name. Or use a pool allocator and refer to leaf objects by numerical index.

search

To search, start by splitting the query string at its end into prefix + final chunk of bits. Look up the prefix in the hash map and check the chunk’s bit in the bitmap. If it’s set, you can return the corresponding leaf object because it’s either an exact match or the nearest predecessor.

If it isn’t found, and you want the predecessor or successor, continue with a binary chop on the length of the query string.

Look up the chopped prefix in the hash map. The next chunk is the chunk of bits in the query string immediately after the prefix.

- If the prefix is present and the next chunk’s bit is set, remember the chunk’s leaf pointer, make the prefix longer, and try again.
- If the prefix is present and the next chunk’s bit is not set and there’s a lesser bit that is set, return the leaf pointer for the lesser bit. Otherwise make the prefix shorter and try again.
- If the prefix isn’t present, make the prefix shorter and try again.

When the binary chop bottoms out, return the longest-matching leaf you remembered. The leaf’s key and successor bracket the query string.

modify

When inserting a name, all its prefixes must be added to the hash map from longest to shortest. At the point where it finds that the prefix already exists, the insertion routine needs to walk down the (implicit) tree of successor keys, updating pointers that refer to the new leaf’s predecessor so they refer to the new leaf instead.

Similarly, when deleting a name, remove every prefix from longest to shortest from the hash map where they only refer to this leaf.
At the point where the prefix has sibling nodes, walk down the (implicit) tree of successor keys, updating pointers that refer to the deleted leaf so they refer to its predecessor instead.

I can’t “just” use a concurrent hash map and expect these algorithms to be thread-safe, because they require multiple changes to the hashmaps. I wonder if the search routine can detect when the hash map is modified underneath it and retry.

thoughts

It isn’t obvious how a p-fast trie might compare to a qp-trie in practice.

A p-fast trie will use a lot more memory than a qp-trie because it requires far more interior nodes. They need to exist so that the random-access binary chop knows whether to shorten or lengthen the prefix.

To avoid wasting space the hash map keys should refer to names in leaf objects, instead of making lots of copies. This is probably tricky to get right.

In a qp-trie the costly part of the lookup is less than O(k) because non-branching interior nodes are omitted. How does that compare to a p-fast trie’s O(log k)?

Exact matches in a p-fast trie are just a hash map lookup. If they are worth optimizing then a qp-trie could also be augmented with a hash map.

Many steps of a qp-trie search are checking short prefixes of the key near the root of the tree, which should be well cached. By contrast, a p-fast trie search will typically skip short prefixes and instead bounce around longer prefixes, which suggests its cache behaviour won’t be so friendly.

A qp-trie predecessor/successor search requires two traversals, one to find the common prefix of the key and another to find the prefix’s predecessor/successor. A p-fast trie requires only one.
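The interior node described in the layout section of this note (a 64-bit bitmap followed by a popcount-compressed array) can be sketched in Python as below. This is a sketch under assumptions, not code from the post: `InteriorNode` and its method names are hypothetical, leaf references are stood in for by arbitrary Python objects, and a Python list plays the role of the packed array.

```python
class InteriorNode:
    """A 64-wide bitmap plus an array with one slot per set bit."""

    def __init__(self):
        self.bitmap = 0    # bit c is set when prefix+chunk c matches a key
        self.leaves = []   # one reference per set bit, in chunk order

    def _index(self, chunk):
        # Popcount of the bits below `chunk` gives its slot in the array.
        below = self.bitmap & ((1 << chunk) - 1)
        return bin(below).count("1")

    def get(self, chunk):
        """Return the reference for this chunk, or None if its bit is clear."""
        if not (self.bitmap >> chunk) & 1:
            return None
        return self.leaves[self._index(chunk)]

    def set(self, chunk, leaf):
        """Store a reference for this chunk, keeping the array packed."""
        i = self._index(chunk)
        if (self.bitmap >> chunk) & 1:
            self.leaves[i] = leaf
        else:
            self.bitmap |= 1 << chunk
            self.leaves.insert(i, leaf)

    def lesser(self, chunk):
        """Greatest set chunk strictly below `chunk`, or None."""
        below = self.bitmap & ((1 << chunk) - 1)
        return below.bit_length() - 1 if below else None
```

The `lesser` method corresponds to the "lesser bit that is set" check in the search steps. In a language with fixed-width integers the same operations would typically map onto popcount and count-leading-zeros instructions.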
Here are a few tangentially-related ideas vaguely near the theme of comparison operators.

- comparison style
- clamp style
- clamp is median
- clamp in range
- range style
- style clash?

comparison style

Some languages such as BCPL, Icon, Python have chained comparison operators, like

    if min <= x <= max:
        ...

In languages without chained comparison, I like to write comparisons as if they were chained, like,

    if min <= x && x <= max {
        // ...
    }

A rule of thumb is to prefer less than (or equal) operators and avoid greater than. In a sequence of comparisons, order values from (expected) least to greatest.

clamp style

The clamp() function ensures a value is between some min and max,

    def clamp(min, x, max):
        if x < min: return min
        if max < x: return max
        return x

I like to order its arguments matching the expected order of the values, following my rule of thumb for comparisons. (I used that flavour of clamp() in my article about GCRA.) But I seem to be unusual in this preference, based on a few examples I have seen recently.

clamp is median

Last month, Fabian Giesen pointed out a way to resolve this difference of opinion: A function that returns the median of three values is equivalent to a clamp() function that doesn’t care about the order of its arguments.

This version is written so that it returns NaN if any of its arguments is NaN. (When an argument is NaN, both of its comparisons will be false.)
    fn med3(a: f64, b: f64, c: f64) -> f64 {
        match (a <= b, b <= c, c <= a) {
            (false, false, false) => f64::NAN,
            (false, false, true) => b, // a > b > c
            (false, true, false) => a, // c > a > b
            (false, true, true) => c, // b <= c <= a
            (true, false, false) => c, // b > c > a
            (true, false, true) => a, // c <= a <= b
            (true, true, false) => b, // a <= b <= c
            (true, true, true) => b, // a == b == c
        }
    }

When two of its arguments are constant, med3() should compile to the same code as a simple clamp(); but med3()’s misuse-resistance comes at a small cost when the arguments are not known at compile time.

clamp in range

If your language has proper range types, there is a nicer way to make clamp() resistant to misuse:

    use std::ops::RangeInclusive;

    fn clamp(x: f64, r: RangeInclusive<f64>) -> f64 {
        let (&min,&max) = (r.start(), r.end());
        if x < min { return min }
        if max < x { return max }
        return x;
    }

    let x = clamp(x, MIN..=MAX);

range style

For a long time I have been fond of the idea of a simple counting for loop that matches the syntax of chained comparisons, like

    for min <= x <= max:
        ...

By itself this is silly: too cute and too ad-hoc.

I’m also dissatisfied with the range or slice syntax in basically every programming language I’ve seen. I thought it might be nice if the cute comparison and iteration syntaxes were aspects of a more generally useful range syntax, but I couldn’t make it work.

Until recently when I realised I could make use of prefix or mixfix syntax, instead of confining myself to infix.

So now my fantasy pet range syntax looks like

    >= min < max // half-open
    >= min <= max // inclusive

And you might use it in a pattern match

    if x is >= min < max {
        // ...
    }

Or as an iterator

    for x in >= min < max {
        // ...
    }

Or to take a slice

    xs[>= min < max]

style clash?

It’s kind of ironic that these range examples don’t follow the left-to-right, lesser-to-greater rule of thumb that this post started off with. (x is not lexically between min and max!)
But that rule of thumb is really intended for languages such as C that don’t have ranges. Careful stylistic conventions can help to avoid mistakes in nontrivial conditional expressions. It’s much better if language and library features reduce the need for nontrivial conditions and catch mistakes automatically.
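The "clamp is median" observation earlier in this post can be checked with a small Python sketch. The `med3` below is a deliberately simple stand-in that sorts its three arguments. It is not a translation of the Rust version above, and it makes no attempt to reproduce that version's NaN handling. The parameter names `lo` and `hi` are used instead of `min` and `max` only to avoid shadowing Python built-ins.

```python
from itertools import permutations


def clamp(lo, x, hi):
    """Clamp x into [lo, hi], with arguments in least-to-greatest order."""
    if x < lo:
        return lo
    if hi < x:
        return hi
    return x


def med3(a, b, c):
    """Median of three values, via sorting (ignores NaN subtleties)."""
    return sorted((a, b, c))[1]


def agrees_in_any_order(lo, x, hi):
    """True if med3 matches clamp however the three arguments are shuffled."""
    expected = clamp(lo, x, hi)
    return all(med3(*p) == expected for p in permutations((lo, x, hi)))
```

With `lo <= hi`, `agrees_in_any_order(0, 5, 10)`, `agrees_in_any_order(0, -3, 10)` and `agrees_in_any_order(0, 42, 10)` all hold, which is the sense in which the median is a clamp that does not care about argument order.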
More in programming
Linus Torvalds, Creator of Git and Linux, on reducing cognitive load
Understanding how the architecture of a remote build system for Bazel helps implement verifiable action execution and end-to-end builds
You heard there was money in tech. You never cared about technology. You are an entryist piece of shit. But you won’t leave willingly. Give it all away to everyone for free. Then you’ll have no reason to be here.
Debates, at their finest, are about exploring topics together in search for truth. That probably sounds hopelessly idealistic to anyone who's ever perused a comment section on the internet, but ideals are there to remind us of what's possible, to inspire us to reach higher, even if reality falls short.

I've been reaching for those debating ideals for thirty years on the internet. I've argued with tens of thousands of people, first on Usenet, then in blog comments, then Twitter, now X, and also LinkedIn, as well as a million other places that have come and gone. It's mostly been about technology, but occasionally about society and morality too.

There have been plenty of heated moments during those three decades. It doesn't take much for a debate between strangers on this internet to escalate into something far lower than a "search for truth", and I've often felt willing to settle for just a cordial tone!

But for the majority of that time, I never felt like things might escalate beyond the keyboards and into the real world. That was until we had our big blow-up at 37signals back in 2021. I suddenly got to see a different darkness from the most vile corners of the internet. Heard from those who seem to prowl for a mob-sanctioned opportunity to threaten and intimidate those they disagree with.

It fundamentally changed me. But I used the experience as a mirror to reflect on the ways my own engagement with the arguments occasionally felt too sharp, too personal. And I've since tried to refocus way more of my efforts on the positive and the productive. I'm by no means perfect, and the internet often tempts the worst in us, but I resist better now than I did then.

What I cannot come to terms with, though, is the modern equation of words with violence. The growing sense of permission that if the disagreement runs deep enough, then violence is a justified answer to settle it.
That sounds so obvious that we shouldn't need to state it in a civil society, but clearly it is not. Not even in technology. Not even in programming. There are plenty of factions here who've taken to justifying their violent fantasies by referring to their ideological opponents as "nazis", "fascists", or "racists". And then follow that up with a call to "punch a nazi" or worse.

When you hear something like that often enough, it's easy to grow glib about it. That it's just a saying. They don't mean it. But I'm afraid many of them really do.

Which brings us to Charlie Kirk. And the technologists who name drinks at their bar after his mortal wound just hours after his death, to name but one of the many morbid celebrations of the famous conservative debater's death.

It's sickening. Deeply, profoundly sickening. And my first instinct was exactly what such people would delight in happening. To watch the rest of us recoil, then retract, and perhaps even eject. To leave the internet for a while or forever.

But I can't do that. We shouldn't do that. Instead, we should double down on the opposite. Continue to show up with our ideals held high while we debate strangers in that noble search for the truth. Where we share our excitement, our enthusiasm, and our love of technology, country, and humanity.

I think that's what Charlie Kirk did so well. Continued to show up for the debate. Even on hostile territory. Not because he thought he was ever going to convince everyone, but because he knew he'd always reach some with a good argument, a good insight, or at least a different perspective.

You could agree or not. Counter or be quiet. But the earnest exploration of the topics in a live exchange with another human is as fundamental to our civilization as Socrates himself. Don't give up, don't give in. Keep debating.
In my old age I’ve mostly given up trying to convince anyone of anything. Most people do not care to find the truth, they care about what pumps their bags. Some people go as far as to believe that perception is reality and that truth is a construction. I hope there’s a special place in hell for those people. It’s why the world wasted $10B+ on self driving car companies that obviously made no sense. There’s a much bigger market for truths that pump bags vs truths that don’t. So here’s your new truth that there’s no market for. Do you believe a compiler can code? If so, then go right on believing that AI can code. But if you don’t, then AI is no better than a compiler, and arguably in its current form, worse. The best model of a programming AI is a compiler. You give it a prompt, which is “the code”, and it outputs a compiled version of that code. Sometimes you’ll use it interactively, giving updates to the prompt after it has returned code, but you find that, like most IDEs, this doesn’t work all that well and you are often better off adjusting the original prompt and “recompiling”. While noobs and managers are excited that the input language to this compiler is English, English is a poor language choice for many reasons. It’s not precise in specifying things. The only reason it works for many common programming workflows is because they are common. The minute you try to do new things, you need to be as verbose as the underlying language. AI workflows are, in practice, highly non-deterministic. While different versions of a compiler might give different outputs, they all promise to obey the spec of the language, and if they don’t, there’s a bug in the compiler. English has no similar spec. Prompts are highly non local, changes made in one part of the prompt can affect the entire output. tl;dr, you think AI coding is good because compilers, languages, and libraries are bad. This isn’t to say “AI” technology won’t lead to some extremely good tools. 
But I argue this comes from increased amounts of search and optimization and patterns to crib from, not from any magic “the AI is doing the coding”. You are still doing the coding, you are just using a different programming language. That anyone uses LLMs to code is a testament to just how bad tooling and languages are. And that LLMs can replace developers at companies is a testament to how bad that company’s codebase and hiring bar is. AI will eventually replace programming jobs in the same way compilers replaced programming jobs. In the same way spreadsheets replaced accounting jobs. But the sooner we start thinking about it as a tool in a workflow and a compiler—through a lens where tons of careful thought has been put in—the better. I can’t believe anyone bought those vibe coding crap things for billions. Many people in self driving accused me of just being upset that I didn’t get the billions, and I’m sure it’s the same thoughts this time. Is your way of thinking so fucking broken that you can’t believe anyone cares more about the actual truth than make believe dollars? From this study, AI makes you feel 20% more productive but in reality makes you 19% slower. How many more billions are we going to waste on this? Or we could, you know, do the hard work and build better programming languages, compilers, and libraries. But that can’t be hyped up for billions.