More from Tony Finch's blog
What does it mean when someone writes that a programming language is “strongly typed”? I’ve known for many years that “strongly typed” is a poorly-defined term. Recently I was prompted on Lobsters to explain why it’s hard to understand what someone means when they use the phrase. I came up with more than five meanings!

how strong?

The various meanings of “strongly typed” are not clearly yes-or-no. Some developers like to argue that these kinds of integrity checks must be completely perfect or else they are entirely worthless. Charitably (it took me a while to think of a polite way to phrase this), that betrays a lack of engineering maturity. Software engineers, like any engineers, have to create working systems from imperfect materials. To do so, we must understand what guarantees we can rely on, where our mistakes can be caught early, where we need to establish processes to catch mistakes, how we can control the consequences of our mistakes, and how to remediate when something breaks because of a mistake that wasn’t caught.

strong how?

So, what are the ways that a programming language can be strongly or weakly typed? In what ways are real programming languages “mid”?

- Statically typed, as opposed to dynamically typed? Many languages have a mixture of the two, such as run time polymorphism in OO languages (e.g. Java), or gradual type systems for dynamic languages (e.g. TypeScript).

- Sound static type system? It’s common for static type systems to be deliberately unsound, such as covariant subtyping in arrays or functions (Java, again). Gradual type systems might have gaping holes for usability reasons (TypeScript, again). And some type systems might be unsound due to bugs. (There are a few of these in Rust.) Unsoundness isn’t a disaster, if a programmer won’t cause it without being aware of the risk. For example: in Lean you can write “sorry” as a kind of “to do” annotation that deliberately breaks soundness; and Idris 2 has type-in-type so it accepts Girard’s paradox.

- Type safe at run time? Most languages have facilities for deliberately bypassing type safety, with an “unsafe” library module or “unsafe” language features, or things that are harder to spot. It can be more or less difficult to break type safety in ways that the programmer or language designer did not intend. JavaScript and Lua are very safe, treating type safety failures as security vulnerabilities. Java and Rust have controlled unsafety. In C everything is unsafe.

- Fewer weird implicit coercions? There isn’t a total order here: for instance, C has implicit bool/int coercions, Rust does not; Rust has implicit deref, C does not. (There’s a short Rust sketch of this contrast below.) There’s a huge range in how much coercions are a convenience or a source of bugs. For example, the PHP and JavaScript == operators are made entirely of WAT, but at least you can use === instead.

- How fancy is the type system? To what degree can you model properties of your program as types? Is it convenient to parse, not validate? Is the Curry-Howard correspondence something you can put into practice? Or is it only capable of describing the physical layout of data?

There are probably other meanings, e.g. I have seen “strongly typed” used to mean that runtime representations are abstract (you can’t see the underlying bytes); or in the past it sometimes meant a language with a heavy type annotation burden (as a mischaracterization of static type checking).
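As a small illustration of that missing total order, here is a minimal Rust sketch (standard language behaviour, not something from the discussion above): Rust will implicitly deref-coerce a &String to a &str at a call site, but unlike C it will not implicitly coerce an integer to a bool.

```rust
fn takes_str(s: &str) -> usize {
    s.len()
}

fn main() {
    let owned = String::from("hello");
    // Implicit deref coercion: &String is silently converted to &str here.
    let n = takes_str(&owned);
    assert_eq!(n, 5);

    // But there is no implicit bool/int coercion; the line below does not compile:
    // let _ = if 1 { "yes" } else { "no" }; // error[E0308]: expected `bool`, found integer
}
```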
how to type

So, when you write (with your keyboard) the phrase “strongly typed”, delete it, and come up with a more precise description of what you really mean. The desiderata above are partly overlapping, sometimes partly orthogonal. Some of them you might care about, some of them not. But please try to communicate where you draw the line and how fuzzy your line is.
Previously, I wrote some sketchy ideas for what I call a p-fast trie, which is basically a wide fan-out variant of an x-fast trie. It allows you to find the longest matching prefix or nearest predecessor or successor of a query string in a set of names in O(log k) time, where k is the key length. My initial sketch was more complicated and greedy for space than necessary, so here’s a simplified revision. (“p” now stands for prefix.)

layout

A p-fast trie stores a lexicographically ordered set of names. A name is a sequence of characters from some small-ish character set. For example, DNS names can be represented using a set of about 50 letters, digits, punctuation and escape characters, usually one per byte of name. Names that are arbitrary bit strings can be split into chunks of 6 bits to make a set of 64 characters.

Every unique prefix of every name is added to a hash table. An entry in the hash table contains:

- A shared reference to the closest name lexicographically greater than or equal to the prefix. Multiple hash table entries will refer to the same name. A reference to a name might instead be a reference to a leaf object containing the name.

- The length of the prefix. To save space, each prefix is not stored separately, but implied by the combination of the closest name and the prefix length.

- A bitmap with one bit per possible character, corresponding to the next character after this prefix. For every other prefix that matches this prefix and is one character longer, a bit is set in the bitmap corresponding to the last character of the longer prefix.

search

The basic algorithm is a longest-prefix match. Look up the query string in the hash table. If there’s a match, great, done. Otherwise proceed by binary chop on the length of the query string. If the prefix isn’t in the hash table, reduce the prefix length and search again. (If the empty prefix isn’t in the hash table then there are no names to find.) If the prefix is in the hash table, check the next character of the query string in the bitmap. If its bit is set, increase the prefix length and search again. Otherwise, this prefix is the answer. (There’s a code sketch of this search at the end of this note.)

predecessor

Instead of putting leaf objects in a linked list, we can use a more complicated search algorithm to find names lexicographically closest to the query string. It’s tricky because a longest-prefix match can land in the wrong branch of the implicit trie. Here’s an outline of a predecessor search; successor requires more thought.

During the binary chop, when we find a prefix in the hash table, compare the complete query string against the complete name that the hash table entry refers to (the closest name greater than or equal to the common prefix). If the name is greater than the query string we’re in the wrong branch of the trie, so reduce the length of the prefix and search again. Otherwise, search the set bits in the bitmap for one corresponding to the greatest character less than the query string’s next character; if there is one, remember it and the prefix length. This will be the top of the sub-trie containing the predecessor, unless we find a longer match. If the next character’s bit is set in the bitmap, continue searching with a longer prefix, else stop.

When the binary chop has finished, we need to walk down the predecessor sub-trie to find its greatest leaf. This must be done one character at a time – there’s no shortcut.

thoughts

In my previous note I wondered how the number of search steps in a p-fast trie compares to a qp-trie.
I have some old numbers measuring the average depth of binary, 4-bit, 5-bit, 6-bit, and DNS qp-trie variants. A DNS-trie varies between 7 and 15 deep on average, depending on the data set. The number of steps for a search matches the depth for exact-match lookups, and is up to twice the depth for predecessor searches. A p-fast trie needs at most 9 hash table probes for DNS names, and is unlikely to need more than 7. I didn’t record the average length of names in my benchmark data sets, but I guess they would be 8–32 characters, meaning 3–5 probes. That is far fewer than a qp-trie, though I suspect a hash table probe takes more time than chasing a qp-trie pointer. (But this kind of guesstimate is notoriously likely to be wrong!) However, a predecessor search might need 30 probes to walk down the p-fast trie, which I think suggests a linked list of leaf objects is a better option.
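To make the hash-table layout and the longest-prefix search above concrete, here is a rough Rust sketch. It is only an illustration under simplifying assumptions: the table is keyed directly by the prefix bytes rather than by the closest-name-plus-length trick described in the layout section, characters are assumed to already be mapped into the range 0–63, and the predecessor logic is left out.

```rust
use std::collections::HashMap;

/// One entry per unique prefix of every stored name (simplified: keyed
/// by the prefix bytes themselves rather than by closest name + length).
struct Entry {
    /// Closest name lexicographically >= this prefix; shared in the real
    /// layout, and only needed by the predecessor search (not sketched here).
    closest: String,
    /// One bit per possible next character after this prefix.
    /// Assumes characters have already been mapped into 0..64.
    bitmap: u64,
}

struct PFastTrie {
    prefixes: HashMap<Vec<u8>, Entry>,
}

impl PFastTrie {
    /// Longest-prefix match by binary chop on the length of the query.
    /// Returns the length of the longest stored prefix of `query`,
    /// or None if the trie is empty.
    fn longest_prefix(&self, query: &[u8]) -> Option<usize> {
        // Exact match: the whole query is a stored prefix.
        if self.prefixes.contains_key(query) {
            return Some(query.len());
        }
        // If the empty prefix is missing there are no names to find.
        self.prefixes.get(&query[..0])?;
        // Invariant: the prefix of length `lo` is in the table,
        // the prefix of length `hi` is not.
        let (mut lo, mut hi) = (0, query.len());
        while lo + 1 < hi {
            let mid = (lo + hi) / 2;
            match self.prefixes.get(&query[..mid]) {
                // Prefix not present: chop shorter.
                None => hi = mid,
                Some(entry) => {
                    let next = query[mid]; // next character after this prefix
                    debug_assert!(next < 64);
                    if (entry.bitmap >> next) & 1 != 0 {
                        // A longer prefix of the query exists: chop longer.
                        lo = mid;
                    } else {
                        // No stored prefix extends this one along the query.
                        return Some(mid);
                    }
                }
            }
        }
        Some(lo)
    }
}
```

The predecessor search would extend the same loop with the comparison against the entry’s closest name and the bookkeeping for the greatest lesser set bit, as outlined above.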
Here’s a sketch of an idea that might or might not be a good idea. Dunno if it’s similar to something already described in the literature – if you know of something, please let me know via the links in the footer!

The gist is to throw away the tree and interior pointers from a qp-trie. Instead, the p-fast trie is stored using a hash map organized into stratified levels, where each level corresponds to a prefix of the key. Exact-match lookups are normal O(1) hash map lookups. Predecessor / successor searches use binary chop on the length of the key. Where a qp-trie search is O(k), where k is the length of the key, a p-fast trie search is O(log k). This smaller O(log k) bound is why I call it a “p-fast trie” by analogy with the x-fast trie, which has O(log log N) query time. (The “p” is for popcount.) I’m not sure if this asymptotic improvement is likely to be effective in practice; see my thoughts towards the end of this note.

layout

A p-fast trie consists of:

- Leaf objects, each of which has a name. Each leaf object refers to its successor, forming a circular linked list. (The last leaf refers to the first.) Multiple interior nodes refer to each leaf object.

- A hash map containing every (strict) prefix of every name in the trie. Each prefix maps to a unique interior node.

Names are treated as bit strings split into chunks of (say) 6 bits, and prefixes are whole numbers of chunks. An interior node contains a (1<<6) == 64 wide bitmap with a bit set for each chunk where prefix+chunk matches a key. Following the bitmap is a popcount-compressed array of references to the leaf objects that are the closest predecessor of the corresponding prefix+chunk key. (There’s a code sketch of this node layout at the end of this note.)

Prefixes are strictly shorter than names so that we can avoid having to represent non-values after the end of a name, and so that it’s OK if one name is a prefix of another.

The size of chunks and bitmaps might change; 6 is a guess that I expect will work OK. For restricted alphabets you can use something like my DNS trie name preparation trick to squash 8-bit chunks into sub-64-wide bitmaps.

In Rust, where cross-references are problematic, there might have to be a hash map that owns the leaf objects, so that the p-fast trie can refer to them by name. Or use a pool allocator and refer to leaf objects by numerical index.

search

To search, start by splitting the query string at its end into prefix + final chunk of bits. Look up the prefix in the hash map and check the chunk’s bit in the bitmap. If it’s set, you can return the corresponding leaf object because it’s either an exact match or the nearest predecessor.

If it isn’t found, and you want the predecessor or successor, continue with a binary chop on the length of the query string. Look up the chopped prefix in the hash map. The next chunk is the chunk of bits in the query string immediately after the prefix.

- If the prefix is present and the next chunk’s bit is set, remember the chunk’s leaf pointer, make the prefix longer, and try again.

- If the prefix is present and the next chunk’s bit is not set and there’s a lesser bit that is set, return the leaf pointer for the lesser bit. Otherwise make the prefix shorter and try again.

- If the prefix isn’t present, make the prefix shorter and try again.

When the binary chop bottoms out, return the longest-matching leaf you remembered. The leaf’s key and successor bracket the query string.

modify

When inserting a name, all its prefixes must be added to the hash map from longest to shortest.
At the point where it finds that the prefix already exists, the insertion routine needs to walk down the (implicit) tree of successor keys, updating pointers that refer to the new leaf’s predecessor so they refer to the new leaf instead.

Similarly, when deleting a name, remove every prefix from longest to shortest from the hash map where they only refer to this leaf. At the point where the prefix has sibling nodes, walk down the (implicit) tree of successor keys, updating pointers that refer to the deleted leaf so they refer to its predecessor instead.

I can’t “just” use a concurrent hash map and expect these algorithms to be thread-safe, because they require multiple changes to the hash map. I wonder if the search routine can detect when the hash map is modified underneath it and retry.

thoughts

It isn’t obvious how a p-fast trie might compare to a qp-trie in practice.

- A p-fast trie will use a lot more memory than a qp-trie because it requires far more interior nodes. They need to exist so that the random-access binary chop knows whether to shorten or lengthen the prefix. To avoid wasting space the hash map keys should refer to names in leaf objects, instead of making lots of copies. This is probably tricky to get right.

- In a qp-trie the costly part of the lookup is less than O(k) because non-branching interior nodes are omitted. How does that compare to a p-fast trie’s O(log k)?

- Exact matches in a p-fast trie are just a hash map lookup. If they are worth optimizing then a qp-trie could also be augmented with a hash map.

- Many steps of a qp-trie search are checking short prefixes of the key near the root of the tree, which should be well cached. By contrast, a p-fast trie search will typically skip short prefixes and instead bounce around longer prefixes, which suggests its cache behaviour won’t be so friendly.

- A qp-trie predecessor/successor search requires two traversals, one to find the common prefix of the key and another to find the prefix’s predecessor/successor. A p-fast trie requires only one.
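To illustrate the popcount-compressed interior node described in the layout section, here is a rough Rust sketch of the bitmap-plus-rank indexing. LeafRef is a stand-in (a numerical index into a pool of leaf objects, one of the options suggested above), not a committed design.

```rust
/// A stand-in for a reference to a leaf object, e.g. an index into a
/// pool allocator as suggested above.
type LeafRef = usize;

/// Interior node: a 64-wide bitmap plus a popcount-compressed array of
/// leaf references, one per set bit, in chunk order.
struct Node {
    bitmap: u64,
    leaves: Vec<LeafRef>, // leaves.len() == bitmap.count_ones() as usize
}

impl Node {
    /// Fetch the leaf reference stored for a 6-bit chunk, if its bit is set.
    fn get(&self, chunk: u8) -> Option<LeafRef> {
        debug_assert!(chunk < 64);
        let bit = 1u64 << chunk;
        if self.bitmap & bit == 0 {
            return None;
        }
        // Rank of this chunk among the set bits = index into the
        // compressed array (this is the popcount trick).
        let rank = (self.bitmap & (bit - 1)).count_ones() as usize;
        Some(self.leaves[rank])
    }

    /// Greatest set chunk strictly less than `chunk`, if any: used by the
    /// predecessor search to find the "lesser bit that is set".
    fn lesser(&self, chunk: u8) -> Option<u8> {
        let below = self.bitmap & ((1u64 << chunk) - 1);
        if below == 0 {
            None
        } else {
            Some(63 - below.leading_zeros() as u8)
        }
    }
}
```

Insertion into a node would set the chunk’s bit and splice the new reference into the array at the same rank.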
Here’s a story from nearly 10 years ago.

the bug

I think it was my friend Richard Kettlewell who told me about a bug he encountered with Let’s Encrypt in its early days in autumn 2015: it was failing to validate mail domains correctly.

the context

At the time I had been responsible for Cambridge University’s email anti-spam system for about 10 years, and in 2014 I had been given responsibility for Cambridge University’s DNS. So I knew how Let’s Encrypt should validate mail domains.

Let’s Encrypt was about one year old. Unusually, the code that runs their operations, Boulder, is free software and open to external contributors.

Boulder is written in Golang, and I had not previously written any code in Golang. But it has a reputation for being easy to get to grips with.

So, in principle, the bug was straightforward for me to fix. How difficult would it be as a Golang newbie? And what would Let’s Encrypt’s contribution process be like?

the hack

I cloned the Boulder repository and had a look around the code. As is pretty typical, there are a couple of stages to fixing a bug in an unfamiliar codebase:

- work out where the problem is

- try to understand if the obvious fix could be better

In this case, I remember discovering a relatively substantial TODO item that intersected with the bug. I can’t remember the details, but I think there were wider issues with DNS lookups in Boulder. I decided it made sense to fix the immediate problem without getting involved in things that would require discussion with Let’s Encrypt staff.

I faffed around with the code and pushed something that looked like it might work.

A fun thing about this hack is that I never got a working Boulder test setup on my workstation (or even Golang, I think!) – I just relied on the Let’s Encrypt cloud test setup. The feedback time was very slow, but it was tolerable for a simple one-off change.

the fix

My pull request was small, +48-14. After a couple of rounds of review and within a few days, it was merged and put into production! A pleasing result.

the upshot

I thought Golang (at least as it was used in the Boulder codebase) was as easy to get to grips with as promised. I did not touch it again until several years later, because there was no need to, but it seemed fine.

I was very impressed by the Let’s Encrypt continuous integration and automated testing setup, and by their low-friction workflow for external contributors. One of my fastest drive-by patches to get into worldwide production.

My fix was always going to be temporary, and all trace of it was overwritten years ago. It’s good when “temporary” turns out to be true!

the point

I was reminded of this story in the pub this evening, and I thought it was worth writing down. It demonstrated to me that Let’s Encrypt really were doing all the good stuff they said they were doing.

So thank you to Let’s Encrypt for providing an exemplary service and for giving me a happy little anecdote.
More in programming
The best engineering teams take control of their tools. They help develop the frameworks and libraries they depend on, and they do this by running production code on edge — the unreleased next version. That's where progress is made, that's where participation matters most.

This sounds scary at first. Edge? Isn't that just another word for danger? What if there's a bug?! Yes, what if? Do you think bugs either just magically appear or disappear? No, they're put there by programmers and removed by the very same. If you want bug-free frameworks and libraries, you have to work for it, but if you do, the reward for your responsibility is increased engineering excellence.

Take Rails 8.1 as an example. We just released the first beta version at Rails World, but Shopify, GitHub, 37signals, and a handful of other frontier teams have already been running this code in production for almost a year. Of course, there were bugs along the way, but good automated testing and diligent programmers caught virtually all of them before they went to production.

It didn't always use to be this way. Once upon a time, I felt like I had one of the only teams running Rails on edge in production. But now two of the most important web apps in the world are doing the same! At an incredible scale and criticality. This has allowed both of them, and the few others with the same frontier ambition, to foster a truly elite engineering culture. One that isn't just a consumer of open source software, but a real-time co-creator. This is a step function in competence and prowess for any team.

It's also an incredible motivation boost. When your programmers are able to directly influence the tools they're working with, they're far more likely to do so, and thus they go deeper, learn more, and create connections to experts in the same situation elsewhere. But this requires being able to immediately use the improvements or bug fixes they help devise. It doesn't work if you sit around waiting patiently for the next release before you dare dive in.

Far more companies could do this. Far more companies should do this. Whether it's with Ruby, Rails, Omarchy, or whatever you're using, your team could level up by getting more involved, taking responsibility for finding issues on edge, and reaping the reward of excellence in the process.

So what are you waiting on?
Here on a summer night in the grass and lilac smell Drunk on the crickets and the starry sky, Oh what fine stories we could tell With this moonlight to tell them by. A summer night, and you, and paradise, So lovely and so filled with grace, Above your head, the universe has hung its … Continue reading Dreams of Late Summer →
The first in a series of posts about doing things the right way
A deep dive into the Action Cache, the CAS, and the security issues that arise from using Bazel with a remote cache but without remote execution
(I present to you my stream of consciousness on the topic of casing as it applies to the web platform.)

I’m reading about the new command and commandfor attributes — which I’m super excited about, declarative behavior invocation in HTML? YES PLEASE!! — and one thing that strikes me is the casing in these APIs.

For example, the command attribute has a variety of values in HTML which correspond to APIs in JavaScript. The show-popover attribute value maps to .showPopover() in JavaScript. hide-popover maps to .hidePopover(), etc. So what we have is:

- lowercase in attribute names, e.g. commandfor="..."

- kebab-case in attribute values, e.g. show-popover

- camelCase for JS counterparts, e.g. showPopover()

After thinking about this a little more, I remember that HTML attribute names are case-insensitive, so the browser will normalize them to lowercase during parsing. Given that, I suppose you could write commandFor="..." but it’s effectively the same. Ok, lowercase attribute names in HTML makes sense.

The related popover attributes follow the same convention:

- popovertarget

- popovertargetaction

And there are many other attribute names in HTML that are lowercase, e.g.:

- maxlength

- novalidate

- contenteditable

- autocomplete

- formenctype

So that all makes sense. But wait, there are some attribute names with hyphens in them, like aria-label="..." and data-value="...". So why isn’t it command-for="..."? Well, upon further reflection, I suppose those attributes were named that way for extensibility’s sake: they are essentially wildcard attributes that represent a family of attributes that are all under the same namespace: aria-* and data-*.

But wait, isn’t that an argument for doing popover-target and popover-target-action? Or command and command-for?

But wait (I keep saying that) there are kebab-case attribute names in HTML — like http-equiv on the <meta> tag, or accept-charset on the form tag — but those seem more like legacy exceptions.

It seems like the only answer here is: there is no rule. Naming is driven by convention and decisions are made on a case-by-case basis. But if I had to summarize, it would probably be that the default casing for new APIs tends to follow the rules I outlined at the start (and what’s reflected in the new command APIs):

- lowercase for HTML attribute names

- kebab-case for HTML attribute values

- camelCase for JS counterparts

Let’s not even get into SVG attribute names.

We need one of those “bless this mess” signs that we can hang over the World Wide Web.