Full Width [alt+shift+f] FOCUS MODE Shortcuts [alt+shift+k]
Sign Up [alt+shift+s] Log In [alt+shift+l]
31
Here are a couple of algorithms for generating uniformly distributed floating point numbers 0.0 <= n < 1.0 using an unbiased random bit generator and IEEE 754 double precision arithmetic. Both of them depend on details of how floating point numbers work, so before getting into the algorithms I’ll review IEEE 754. The first algorithm uses bit hacking and type punning. The second uses a hexadecimal floating point literal. They are both fun and elegant in their own ways, but these days the second one is better. ieee 754 exponent fraction the problem conversion bithacking solution hexadecimal solution favourite solution I wrote some follow-up notes on more random floating point numbers discussing another model of uniformity, and making better use of floating point’s dynamic range. ieee 754 A floating point number has three parts: a sign bit an exponent a significand aka mantissa aka fraction It works like a binary version of decimal scientific notation, so its value is +/- 1.fff...fff *...
over a year ago

Comments

Improve your reading experience

Logged in users get linked directly to articles resulting in a better reading experience. Please login for free, it takes less than 1 minute.

More from Tony Finch's blog

strongly typed?

What does it mean when someone writes that a programming language is “strongly typed”? I’ve known for many years that “strongly typed” is a poorly-defined term. Recently I was prompted on Lobsters to explain why it’s hard to understand what someone means when they use the phrase. I came up with more than five meanings! how strong? The various meanings of “strongly typed” are not clearly yes-or-no. Some developers like to argue that these kinds of integrity checks must be completely perfect or else they are entirely worthless. Charitably (it took me a while to think of a polite way to phrase this), that betrays a lack of engineering maturity. Software engineers, like any engineers, have to create working systems from imperfect materials. To do so, we must understand what guarantees we can rely on, where our mistakes can be caught early, where we need to establish processes to catch mistakes, how we can control the consequences of our mistakes, and how to remediate when somethng breaks because of a mistake that wasn’t caught. strong how? So, what are the ways that a programming language can be strongly or weakly typed? In what ways are real programming languages “mid”? Statically typed as opposed to dynamically typed? Many languages have a mixture of the two, such as run time polymorphism in OO languages (e.g. Java), or gradual type systems for dynamic languages (e.g. TypeScript). Sound static type system? It’s common for static type systems to be deliberately unsound, such as covariant subtyping in arrays or functions (Java, again). Gradual type systems migh have gaping holes for usability reasons (TypeScript, again). And some type systems might be unsound due to bugs. (There are a few of these in Rust.) Unsoundness isn’t a disaster, if a programmer won’t cause it without being aware of the risk. For example: in Lean you can write “sorry” as a kind of “to do” annotation that deliberately breaks soundness; and Idris 2 has type-in-type so it accepts Girard’s paradox. Type safe at run time? Most languages have facilities for deliberately bypassing type safety, with an “unsafe” library module or “unsafe” language features, or things that are harder to spot. It can be more or less difficult to break type safety in ways that the programmer or language designer did not intend. JavaScript and Lua are very safe, treating type safety failures as security vulnerabilities. Java and Rust have controlled unsafety. In C everything is unsafe. Fewer weird implicit coercions? There isn’t a total order here: for instance, C has implicit bool/int coercions, Rust does not; Rust has implicit deref, C does not. There’s a huge range in how much coercions are a convenience or a source of bugs. For example, the PHP and JavaScript == operators are made entirely of WAT, but at least you can use === instead. How fancy is the type system? To what degree can you model properties of your program as types? Is it convenient to parse, not validate? Is the Curry-Howard correspondance something you can put into practice? Or is it only capable of describing the physical layout of data? There are probably other meanings, e.g. I have seen “strongly typed” used to mean that runtime representations are abstract (you can’t see the underlying bytes); or in the past it sometimes meant a language with a heavy type annotation burden (as a mischaracterization of static type checking). how to type So, when you write (with your keyboard) the phrase “strongly typed”, delete it, and come up with a more precise description of what you really mean. The desiderata above are partly overlapping, sometimes partly orthogonal. Some of them you might care about, some of them not. But please try to communicate where you draw the line and how fuzzy your line is.

a week ago 19 votes
p-fast trie, but smaller

Previously, I wrote some sketchy ideas for what I call a p-fast trie, which is basically a wide fan-out variant of an x-fast trie. It allows you to find the longest matching prefix or nearest predecessor or successor of a query string in a set of names in O(log k) time, where k is the key length. My initial sketch was more complicated and greedy for space than necessary, so here’s a simplified revision. (“p” now stands for prefix.) layout A p-fast trie stores a lexicographically ordered set of names. A name is a sequence of characters from some small-ish character set. For example, DNS names can be represented as a set of about 50 letters, digits, punctuation and escape characters, usually one per byte of name. Names that are arbitrary bit strings can be split into chunks of 6 bits to make a set of 64 characters. Every unique prefix of every name is added to a hash table. An entry in the hash table contains: A shared reference to the closest name lexicographically greater than or equal to the prefix. Multiple hash table entries will refer to the same name. A reference to a name might instead be a reference to a leaf object containing the name. The length of the prefix. To save space, each prefix is not stored separately, but implied by the combination of the closest name and prefix length. A bitmap with one bit per possible character, corresponding to the next character after this prefix. For every other prefix that matches this prefix and is one character longer than this prefix, a bit is set in the bitmap corresponding to the last character of the longer prefix. search The basic algorithm is a longest-prefix match. Look up the query string in the hash table. If there’s a match, great, done. Otherwise proceed by binary chop on the length of the query string. If the prefix isn’t in the hash table, reduce the prefix length and search again. (If the empty prefix isn’t in the hash table then there are no names to find.) If the prefix is in the hash table, check the next character of the query string in the bitmap. If its bit is set, increase the prefix length and search again. Otherwise, this prefix is the answer. predecessor Instead of putting leaf objects in a linked list, we can use a more complicated search algorithm to find names lexicographically closest to the query string. It’s tricky because a longest-prefix match can land in the wrong branch of the implicit trie. Here’s an outline of a predecessor search; successor requires more thought. During the binary chop, when we find a prefix in the hash table, compare the complete query string against the complete name that the hash table entry refers to (the closest name greater than or equal to the common prefix). If the name is greater than the query string we’re in the wrong branch of the trie, so reduce the length of the prefix and search again. Otherwise search the set bits in the bitmap for one corresponding to the greatest character less than the query string’s next character; if there is one remember it and the prefix length. This will be the top of the sub-trie containing the predecessor, unless we find a longer match. If the next character’s bit is set in the bitmap, continue searching with a longer prefix, else stop. When the binary chop has finished, we need to walk down the predecessor sub-trie to find its greatest leaf. This must be done one character at a time – there’s no shortcut. thoughts In my previous note I wondered how the number of search steps in a p-fast trie compares to a qp-trie. I have some old numbers measuring the average depth of binary, 4-bit, 5-bit, 6-bit and 4-bit, 5-bit, dns qp-trie variants. A DNS-trie varies between 7 and 15 deep on average, depending on the data set. The number of steps for a search matches the depth for exact-match lookups, and is up to twice the depth for predecessor searches. A p-fast trie is at most 9 hash table probes for DNS names, and unlikely to be more than 7. I didn’t record the average length of names in my benchmark data sets, but I guess they would be 8–32 characters, meaning 3–5 probes. Which is far fewer than a qp-trie, though I suspect a hash table probe takes more time than chasing a qp-trie pointer. (But this kind of guesstimate is notoriously likely to be wrong!) However, a predecessor search might need 30 probes to walk down the p-fast trie, which I think suggests a linked list of leaf objects is a better option.

a month ago 23 votes
p-fast trie: lexically ordered hash map

Here’s a sketch of an idea that might or might not be a good idea. Dunno if it’s similar to something already described in the literature – if you know of something, please let me know via the links in the footer! The gist is to throw away the tree and interior pointers from a qp-trie. Instead, the p-fast trie is stored using a hash map organized into stratified levels, where each level corresponds to a prefix of the key. Exact-match lookups are normal O(1) hash map lookups. Predecessor / successor searches use binary chop on the length of the key. Where a qp-trie search is O(k), where k is the length of the key, a p-fast trie search is O(log k). This smaller O(log k) bound is why I call it a “p-fast trie” by analogy with the x-fast trie, which has O(log log N) query time. (The “p” is for popcount.) I’m not sure if this asymptotic improvement is likely to be effective in practice; see my thoughts towards the end of this note. layout A p-fast trie consists of: Leaf objects, each of which has a name. Each leaf object refers to its successor forming a circular linked list. (The last leaf refers to the first.) Multiple interior nodes refer to each leaf object. A hash map containing every (strict) prefix of every name in the trie. Each prefix maps to a unique interior node. Names are treated as bit strings split into chunks of (say) 6 bits, and prefixes are whole numbers of chunks. An interior node contains a (1<<6) == 64 wide bitmap with a bit set for each chunk where prefix+chunk matches a key. Following the bitmap is a popcount-compressed array of references to the leaf objects that are the closest predecessor of the corresponding prefix+chunk key. Prefixes are strictly shorter than names so that we can avoid having to represent non-values after the end of a name, and so that it’s OK if one name is a prefix of another. The size of chunks and bitmaps might change; 6 is a guess that I expect will work OK. For restricted alphabets you can use something like my DNS trie name preparation trick to squash 8-bit chunks into sub-64-wide bitmaps. In Rust where cross-references are problematic, there might have to be a hash map that owns the leaf objects, so that the p-fast trie can refer to them by name. Or use a pool allocator and refer to leaf objects by numerical index. search To search, start by splitting the query string at its end into prefix + final chunk of bits. Look up the prefix in the hash map and check the chunk’s bit in the bitmap. If it’s set, you can return the corresponding leaf object because it’s either an exact match or the nearest predecessor. If it isn’t found, and you want the predecessor or successor, continue with a binary chop on the length of the query string. Look up the chopped prefix in the hash map. The next chunk is the chunk of bits in the query string immediately after the prefix. If the prefix is present and the next chunk’s bit is set, remember the chunk’s leaf pointer, make the prefix longer, and try again. If the prefix is present and the next chunk’s bit is not set and there’s a lesser bit that is set, return the leaf pointer for the lesser bit. Otherwise make the prefix shorter and try again. If the prefix isn’t present, make the prefix shorter and try again. When the binary chop bottoms out, return the longest-matching leaf you remembered. The leaf’s key and successor bracket the query string. modify When inserting a name, all its prefixes must be added to the hash map from longest to shortest. At the point where it finds that the prefix already exists, the insertion routine needs to walk down the (implicit) tree of successor keys, updating pointers that refer to the new leaf’s predecessor so they refer to the new leaf instead. Similarly, when deleting a name, remove every prefix from longest to shortest from the hash map where they only refer to this leaf. At the point where the prefix has sibling nodes, walk down the (implicit) tree of successor keys, updating pointers that refer to the deleted leaf so they refer to its predecessor instead. I can’t “just” use a concurrent hash map and expect these algorithms to be thread-safe, because they require multiple changes to the hashmaps. I wonder if the search routine can detect when the hash map is modified underneath it and retry. thoughts It isn’t obvious how a p-fast trie might compare to a qp-trie in practice. A p-fast trie will use a lot more memory than a qp-trie because it requires far more interior nodes. They need to exist so that the random-access binary chop knows whether to shorten or lengthen the prefix. To avoid wasting space the hash map keys should refer to names in leaf objects, instead of making lots of copies. This is probably tricky to get right. In a qp-trie the costly part of the lookup is less than O(k) because non-branching interior nodes are omitted. How does that compare to a p-fast trie’s O(log k)? Exact matches in a p-fast trie are just a hash map lookup. If they are worth optimizing then a qp-trie could also be augmented with a hash map. Many steps of a qp-trie search are checking short prefixes of the key near the root of the tree, which should be well cached. By contrast, a p-fast trie search will typically skip short prefixes and instead bounce around longer prefixes, which suggests its cache behaviour won’t be so friendly. A qp-trie predecessor/successor search requires two traversals, one to find the common prefix of the key and another to find the prefix’s predecessor/successor. A p-fast trie requires only one.

a month ago 26 votes
clamp / median / range

Here are a few tangentially-related ideas vaguely near the theme of comparison operators. comparison style clamp style clamp is median clamp in range range style style clash? comparison style Some languages such as BCPL, Icon, Python have chained comparison operators, like if min <= x <= max: ... In languages without chained comparison, I like to write comparisons as if they were chained, like, if min <= x && x <= max { // ... } A rule of thumb is to prefer less than (or equal) operators and avoid greater than. In a sequence of comparisons, order values from (expected) least to greatest. clamp style The clamp() function ensures a value is between some min and max, def clamp(min, x, max): if x < min: return min if max < x: return max return x I like to order its arguments matching the expected order of the values, following my rule of thumb for comparisons. (I used that flavour of clamp() in my article about GCRA.) But I seem to be unusual in this preference, based on a few examples I have seen recently. clamp is median Last month, Fabian Giesen pointed out a way to resolve this difference of opinion: A function that returns the median of three values is equivalent to a clamp() function that doesn’t care about the order of its arguments. This version is written so that it returns NaN if any of its arguments is NaN. (When an argument is NaN, both of its comparisons will be false.) fn med3(a: f64, b: f64, c: f64) -> f64 { match (a <= b, b <= c, c <= a) { (false, false, false) => f64::NAN, (false, false, true) => b, // a > b > c (false, true, false) => a, // c > a > b (false, true, true) => c, // b <= c <= a (true, false, false) => c, // b > c > a (true, false, true) => a, // c <= a <= b (true, true, false) => b, // a <= b <= c (true, true, true) => b, // a == b == c } } When two of its arguments are constant, med3() should compile to the same code as a simple clamp(); but med3()’s misuse-resistance comes at a small cost when the arguments are not known at compile time. clamp in range If your language has proper range types, there is a nicer way to make clamp() resistant to misuse: fn clamp(x: f64, r: RangeInclusive<f64>) -> f64 { let (&min,&max) = (r.start(), r.end()); if x < min { return min } if max < x { return max } return x; } let x = clamp(x, MIN..=MAX); range style For a long time I have been fond of the idea of a simple counting for loop that matches the syntax of chained comparisons, like for min <= x <= max: ... By itself this is silly: too cute and too ad-hoc. I’m also dissatisfied with the range or slice syntax in basically every programming language I’ve seen. I thought it might be nice if the cute comparison and iteration syntaxes were aspects of a more generally useful range syntax, but I couldn’t make it work. Until recently when I realised I could make use of prefix or mixfix syntax, instead of confining myself to infix. So now my fantasy pet range syntax looks like >= min < max // half-open >= min <= max // inclusive And you might use it in a pattern match if x is >= min < max { // ... } Or as an iterator for x in >= min < max { // ... } Or to take a slice xs[>= min < max] style clash? It’s kind of ironic that these range examples don’t follow the left-to-right, lesser-to-greater rule of thumb that this post started off with. (x is not lexically between min and max!) But that rule of thumb is really intended for languages such as C that don’t have ranges. Careful stylistic conventions can help to avoid mistakes in nontrivial conditional expressions. It’s much better if language and library features reduce the need for nontrivial conditions and catch mistakes automatically.

2 months ago 36 votes
Golang and Let's Encrypt: a free software story

Here’s a story from nearly 10 years ago. the bug I think it was my friend Richard Kettlewell who told me about a bug he encountered with Let’s Encrypt in its early days in autumn 2015: it was failing to validate mail domains correctly. the context At the time I had previously been responsible for Cambridge University’s email anti-spam system for about 10 years, and in 2014 I had been given responsibility for Cambridge University’s DNS. So I knew how Let’s Encrypt should validate mail domains. Let’s Encrypt was about one year old. Unusually, the code that runs their operations, Boulder, is free software and open to external contributors. Boulder is written in Golang, and I had not previously written any code in Golang. But its reputation is to be easy to get to grips with. So, in principle, the bug was straightforward for me to fix. How difficult would it be as a Golang newbie? And what would Let’s Encrypt’s contribution process be like? the hack I cloned the Boulder repository and had a look around the code. As is pretty typical, there are a couple of stages to fixing a bug in an unfamiliar codebase: work out where the problem is try to understand if the obvious fix could be better In this case, I remember discovering a relatively substantial TODO item that intersected with the bug. I can’t remember the details, but I think there were wider issues with DNS lookups in Boulder. I decided it made sense to fix the immediate problem without getting involved in things that would require discussion with Let’s Encrypt staff. I faffed around with the code and pushed something that looked like it might work. A fun thing about this hack is that I never got a working Boulder test setup on my workstation (or even Golang, I think!) – I just relied on the Let’s Encrypt cloud test setup. The feedback time was very slow, but it was tolerable for a simple one-off change. the fix My pull request was small, +48-14. After a couple of rounds of review and within a few days, it was merged and put into production! A pleasing result. the upshot I thought Golang (at least as it was used in the Boulder codebase) was as easy to get to grips with as promised. I did not touch it again until several years later, because there was no need to, but it seemed fine. I was very impressed by the Let’s Encrypt continuous integration and automated testing setup, and by their low-friction workflow for external contributors. One of my fastest drive-by patches to get into worldwide production. My fix was always going to be temporary, and all trace of it was overwritten years ago. It’s good when “temporary” turns out to be true! the point I was reminded of this story in the pub this evening, and I thought it was worth writing down. It demonstrated to me that Let’s Encrypt really were doing all the good stuff they said they were doing. So thank you to Let’s Encrypt for providing an exemplary service and for giving me a happy little anecdote.

2 months ago 49 votes

More in programming

Stumbling upon

Something like a channel changer, for the web. That's what the idea was at first. But it led to a whole new path of discovery that even the site's creators couldn't have predicted. The post Stumbling upon appeared first on The History of the Web.

12 hours ago 5 votes
btrfs on a Raspberry Pi

I’m something of a filesystem geek, I guess. I first wrote about ZFS on Linux 14 years ago, and even before I used ZFS, I had used ext2/3/4, jfs, reiserfs, xfs, and no doubt some others. I’ve also used btrfs. I last posted about it in 2014, when I noted it has some advantages over … Continue reading btrfs on a Raspberry Pi →

10 hours ago 2 votes
By the way, what are AA?
yesterday 4 votes
Some Love For Python

I really enjoyed watching Python: The Documentary (from CultRepo, formerly Honeypot, same makers as the TypeScript documentary). Personally, I don’t write much Python and am not involved in the broader Python community. That said, I love how this documentary covers a lot of the human problems in tech and not just the technical history of Python as language. For example: How do you handle succession from a pivotal creator? How do you deal with poor representation? How do you fund and steer open projects? How do you build community? How do you handle the fallout of major version changes? And honestly, all the stories around these topics as told from the perspective of Python feel like lessons to learn from. Here are a few things that stood out to me. Guido van Rossum, Creator of Python, Sounds Cool The film interviews Drew Houston, Founder/CEO at Dropbox, because he hired Python’s creator Guido van Rossum for a stint. This is what Drew had to say about his time working with Guido: It’s hard for me to think of someone who has had more impact with lower ego [than Guido] For tech, that’s saying something! Now that is a legacy if you ask me. The Python Community Sounds Cool Brett Cannon famously gave a talk at a Python conference where he said he “came for the language, but stayed for the community”. In the documentary they interview him and he adds: The community is the true strength of Pyhon. It’s not just the language, it’s the people. ❤️ This flies in the face of the current era we’re in, where it’s the technology that matters. How it disrupts or displaces people is insignificant next to the fantastic capabilities it purports to wield. But here’s this language surrounded by people who acknowledge that the community around the language is its true strength. People are the true strength. Let me call this out again, in case it’s not sinking in: Here’s a piece of technology where the people around it seem to acknowledge that the technology itself is only secondary to the people it was designed to serve. How incongruous is that belief with so many other pieces of technology we’ve seen through the years? What else do we have, if not each other? That’s something worth amplifying. Mariatta, Python Core Developer, Sounds Cool I absolutely loved the story of @mariatta@fosstodon.org. If you’re not gonna watch the documentary, at least watch the ~8 minutes of her story. Watched it? Ok, here’s my quick summary: She loves to program, but everywhere she looks it’s men. At work. At conferences. On core teams. She hears about pyladies and wants to go to Pycon where she can meet them. She goes to Pycon and sees Guido van Rossum stand up and say he wants 2 core contributors to Python that are female. She thinks, “Oh that’s cool! I’m not good enough for that, but I bet they’ll find someone awesome.” The next year she goes to the conference and finds out they’re still looking for those 2 core contributors. She thinks “Why not me?” and fires off an email to Guido. Here’s her recollection on composing that email: I felt really scared. I didn’t feel like I deserved mentorship from Guido van Rossum. I really hesitated to send this email to him, but in the end I realized I want to try. This was a great opportunity for me. I hit the send button. And later, her feelings on becoming the first female core contributor to Python: When you don’t have role models you can relate to, you don’t believe you can do it. ❤️ Mad respect. I love her story. As Jessica McKellar says in the film, Mariatta’s is an inspiring story and “a vision of what is possible in other communities”. Python Is Refreshing I’ve spent years in “webdev” circles — and there are some great ones — but this Python documentary was, to me, a tall, refreshing glass of humanity. Go Python! Email · Mastodon · Bluesky

2 days ago 3 votes
Benjie's Humanoid Olympic Games

A gauntlet thrown

2 days ago 10 votes