Full Width [alt+shift+f] Shortcuts [alt+shift+k] TRY SIMPLE MODE
Sign Up [alt+shift+s] Log In [alt+shift+l]
43
This is a reminder that random load balancing is unevenly distributed. If we distribute a set of items randomly across a set of servers (e.g. by hashing, or by randomly selecting a server), the average number of items on each server is num_items / num_servers. It is easy to assume this means each server has close to the same number of items. However, since we are selecting servers at random, they will have different numbers of items, and the imbalance can be important. For load balancing, a reasonable model is that each server has fixed capacity (e.g. it can serve 3000 requests/second, or store 100 items, etc.). We need to divide the total workload over the servers, so that each server stays below its capacity. This means the number of servers is determined by the most loaded server, not the average. This is a classic balls in bins problem that has been well studied, and there are some interesting theoretical results. However, I wanted some specific numbers, so I wrote a small...
a year ago

Improve your reading experience

Logged in users get linked directly to articles resulting in a better reading experience. Please login for free, it takes less than 1 minute.

More from Evan Jones - Software Engineer | Computer Scientist

Setenv is not Thread Safe and C Doesn't Want to Fix It

You can't safely use the C setenv() or unsetenv() functions in a program that uses threads. Those functions modify global state, and can cause other threads calling getenv() to crash. This also causes crashes in other languages that use those C standard library functions, such as Go's os.Setenv (Go issue) and Rust's std::env::set_var() (Rust issue). I ran into this in a Go program, because Go's built-in DNS resolver can call C's getaddrinfo(), which uses environment variables. This cost me 2 days to track down and file the Go bug. Sadly, this problem has been known for decades. For example, an article from January 2017 said: "None of this is new, but we do re-discover it roughly every five years. See you in 2022." This was only one year off! (She wrote an update in October 2023 after I emailed her about my Go bug.) This is a flaw in the POSIX standard, which extends the C Standard to allow modifying environment varibles. The most infuriating part is that many people who could influence the standard or maintain the C libraries don't see this as a problem. The argument is that the specification clearly documents that setenv() cannot be used with threads. Therefore, if someone does this, the crashes are their fault. We should apparently read every function's specification carefully, not use software written by others, and not use threads. These are unrealistic assumptions in modern software. I think we should instead strive to create APIs that are hard to screw up, and evolve as the ecosystem changes. The C language and standard library continue to play an important role at the base of most software. We either need to figure out how to improve it, or we need to figure out how to abandon it. Why is setenv() not thread-safe? The biggest problem is that getenv() returns a char*, with no need for applications to free it later. One thread could be using this pointer when another thread changes the same environment variable using setenv() or unsetenv(). The getenv() function is perfect if environment variables never change. For example, for accessing a process's initial table of environment variables (see the System V ABI: AMD64 Section 3.4.1). It turns out the C Standard only includes getenv(), so according to C, that is exactly how this should work. However, most implementations also follow the POSIX standard (e.g. POSIX.1-2017), which extends C to include functions that modify the environment. This means the current getenv() API is problematic. Even worse, putenv() adds a char* to the set of environment variables. It is explicitly required that if the application modifies the memory after putenv() returns, it modifies the environment variables. This means applications can modify the value passed to putenv() at any time, without any synchronization. FreeBSD used to implement putenv() by copying the value, but it changed it with FreeBSD 7 in 2008, which suggests some programs really do depend on modifying the environment in this fashion (see FreeBSD putenv man page). As a final problem, environ is a NULL-terminated array of pointers (char**) that an application can read and assign to (see definition in POSIX.1-2017). This is how applications can iterate over all environment variables. Accesses to this array are not thread-safe. However, in my experience many fewer applications use this than getenv() and setenv(). However, this does cause some libraries to not maintain the set of environment variables in a thread-safe way, since they directly update this table. Environment variable implementations Implementations need to choose what do do when an application overwrites an existing variable. I looked at glibc, musl, Solaris/Illumos, and FreeBSD/Apple's C standard libraries, and they make the following choices: Never free environment variables (glibc, Solaris/Illumos): Calling setenv() repeatedly is effectively a memory leak. However, once a value is returned from getenv(), it is immutable and can be used by threads safely. Free the environment variables (musl, FreeBSD/Apple): Using the pointer returned by getenv() after another thread calls setenv() can crash. A second problem is ensuring the set of environment variables is updated in a thread-safe fashion. This is what causes crashes in glibc. glibc uses an array to hold pointers to the "NAME=value" strings. It holds a lock in setenv() when changing this array, but not in getenv(). If a thread calling setenv() needs to resize the array of pointers, it copies the values to a new array and frees the previous one. This can cause other threads executing getenv() to crash, since they are now iterating deallocated memory. This is particularly annoying since glibc already leaks environment variables, and holds a lock in setenv(). All it needs to do is hold the lock inside getenv(), and it would no longer crash. This would make getenv() slightly slower. However, getenv() already uses a linear search of the array, so performance does not appear to be a concern. More sophisticated implementations are possible if this is a problem, such as Solaris/Illumos's lock-free implementation. Why do programs use environment variables? Environment variables useful for configuring shared libraries or language runtimes that are included in other programs. This allows users to change the configuration, without program authors needing to explicitly pass the configuration in. One alternative is command line flags, which requires programs to parse them and pass them in to the libraries. Another alternative are configuration files, which then need some other way to disable or configure, to be able to test new configurations. Environment variables are a simple solution. AS a result, many libraries call getenv() (see a partial list below). Since many libraries are configured through environment variables, a program may need to change these variables to configure the libraries it uses. This is common at application startup. This causes programs to need to call setenv(). Given this issue, it seems like libraries should also provide a way to explicitly configure any settings, and avoid using environment variables. We should fix this problem, and we can In my opinion, it is rediculous that this has been a known problem for so long. It has wasted thousands of hours of people's time, either debugging the problems, or debating what to do about it. We know how to fix the problem. First, we can make a thread-safe implementation, like Illumos/Solaris. This has some limitations: it leaks memory in setenv(), and is still unsafe if a program uses putenv() or the environ variable. However, this is an improvement over the current Linux and Apple implementations. The second solution is to add new APIs to get one and get all environment variables that are thread-safe by design, like Microsoft's getenv_s() (see below for the controversy around C11's "Annex K"). My preferred solution would be to do both. This would reduce the chances of hitting this problem for existing programs and libraries, and also provide a path to avoid the problems entirely for new code or languages like Go and Rust. My rough idea would be the following: Add a function to copy one single environment variable to a user-specified buffer, similar to getenv_s(). Add a thread-safe API to iterate over all environment variables, or to copy all variables out. Mark getenv() as deprecated, recommending the new thread-safe getenv() function instead. Mark putenv() as deprecated, recommending setenv() instead. Mark environ as deprecated, recommending environment variable functions instead. Update the implementation of environment varibles to be thread-safe. This requires leaking memory if getenv() is used on a variable, but we can detect if the old functions are used, and only leak memory in that case. This means programs written in other languages will avoid these problems as soon as their runtimes are updated. Update the C and POSIX standards to require the above changes. This would be progress. The getenv_s / C Standard Annex K controversy Microsoft provides getenv_s(), which copies the environment variable into a caller-provided buffer. This is easy to make thread-safe by holding a read lock while copying the variable. After the function returns, future changes to the environment have no effect. This is included in the C11 Standard as Annex K "Bounds Checking Interfaces". The C standard Annexes are optional features. This Annex includes new functions intended to make it harder to make mistakes with buffers that are the wrong size. The first draft of this extension was published in 2003. This is when Microsoft was focusing on "Trustworthy Computing" after a January 2002 memo from Bill Gates. Basically, Windows wasn't designed to be connected to the Internet, and now that it was, people were finding many security problems. Lots of them were caused by buffer handling mistakes. Microsoft developed new versions of a number of problematic functions, and added checks to the Visual C++ compiler to warn about using the old ones. They then attempted to standardize these functions. My understanding is the people responsible for the Unix POSIX standards did not like the design of these functions, so they refused to implement them. For more details, see Field Experience With Annex K published in September 2015, Stack Overflow: Why didn't glibc implement _s functions? updated March 2023, and Rich Felker of musl on both technical and social reasons for not implementing Annex K from February 2019. I haven't looked at the rest of the functions, but having spent way too long looking at getenv(), the general idea of getenv_s() seems like a good idea to me. Standardizing this would help avoid this problem. Incomplete list of common environment variables This is a list of some uses of environment variables from fairly widely used libraries and services. This shows that environment variables are pretty widely used. Cloud Provider Credentials and Services AWS's SDKs for credentials (e.g. AWS_ACCESS_KEY_ID) Google Cloud Application Default Credentials (e.g. GOOGLE_APPLICATION_CREDENTIALS) Microsoft Azure Default Azure Credential (e.g. AZURE_CLIENT_ID) AWS's Lambda serverless product: sets a large number of variables like AWS_REGION, AWS_LAMBDA_FUNCTION_NAME, and credentials like AWS_SECRET_ACCESS_KEY Google Cloud Run serverless product: configuration like PORT, K_SERVICE, K_REVISION Kubernetes service discovery: Defines variables SERVICE_NAME_HOST and SERVICE_NAME_PORT. Third-party C/C++ Libraries OpenTelemetry: Metrics and tracing. Many environment variables like OTEL_SERVICE_NAME and OTEL_RESOURCE_ATTRIBUTES. OpenSSL: many configurable variables like HTTPS_PROXY, OPENSSL_CONF, OPENSSL_ENGINES. BoringSSL: Google's fork of OpenSSL used in Chrome and others. It reads SSLKEYLOGFILE just like OpenSSL for logging TLS keys for debugging. Libcurl: proxies, SSL/TLS configuration and debugging like HTTPS_PROXY, CURL_SSL_BACKEND, CURL_DEBUG. Libpq Postgres client library: connection parameters including credentials like PGHOSTADDR, PGDATABASE, and PGPASSWORD. Rust Standard Library std::thread RUST_MIN_STACK: Calls std::env::var() on the first call to spawn() a new thread. It is cached in a static atomic variable and never read again. See implementation in thread::min_stack(). std::backtrace RUST_LIB_BACKTRACE: Calls std::env::var() on the first call to capture a backtrace. It is cached in a static atomic variable and never read again. See implementation in Backtrace::enabled().

a year ago 53 votes
Nanosecond timestamp collisions are common

I was wondering: how often do nanosecond timestamps collide on modern systems? The answer is: very often, like 5% of all samples, when reading the clock on all 4 physical cores at the same time. As a result, I think it is unsafe to assume that a raw nanosecond timestamp is a unique identifier. I wrote a small test program to test this. I used Go, which records both the "absolute" time and the "monotonic clock" relative time on each call to time.Now(), so I compared both the relative difference between consecutive timestamps, as well as just the absolute timestamps. As expected, the behavior depends on the system, so I observe very different results on Mac OS X and Linux. On Linux, within a single thread, both the absolute and monotonic times always increase. On my system, the minimum increment was 32 ns. Between threads, approximately 5% of the absolute times were exactly the same as other threads. Even with 2 threads on a 4 core system, approximately 2% of timestamps collided. On Mac OS X: the absolute time has microsecond resolution, so there are an astronomical number of collisions when I repeat this same test. Even within a thread I often observe the monotonic clock not increment. See the test program on Github if you are curious.

over a year ago 37 votes
How much does the read/write buffer size matter for socket throughput?

The read() and write() system calls take a variable-length byte array as an argument. As a simplified model, the time for the system call should be some constant "per-call" time, plus time directly proportional to the number of bytes in the array. That is, the time for each call should be time = (per_call_minimum_time) + (array_len) × (per_byte_time). With this model, using a larger buffer should increase throughput, asymptotically approaching 1/per_byte_time. I was curious: do real system calls behave this way? What are the ideal buffer sizes for read() and write() if we want to maximize throughput? I decided to do some experiments with blocking I/O. These are not rigorous, and I suspect the results will vary significantly if the hardware and software are different than one the system I tested. The really short answer is that a buffer of 32 KiB is a good starting point on today's systems, and I would want to measure the performance to go beyond that. However, for large writes, performance can increase. On Linux, the simple model holds for small buffers (≤ 4 KiB), but once the program approaches the maximum throughput, the throughput becomes highly variable and in many cases decreases as the buffers get larger. For blocking I/O, approximately 32 KiB is large enough to hit the maximum throughput for read(), but write() throughput improves with buffers up to around 256 KiB - 1 MiB. The reason for the asymmetry is that the Linux kernel will only write less than the entire buffer (a "short write") if there is an error (e.g. a signal causing EINTR). Thus, larger write buffers means the operating system needs to switch to the process less often. On the other head, "short reads", where a read() returns less than the maximum length, become increasingly common as the buffer size increases, which diminishes the benefit. There is a SO_RCVLOWAT socket option to change this that I did not test. The experiments were run on two 16 CPU Google Cloud T2D instances, which use AMD EPYC Milan processors (3rd generation, released in 2021). Each core is a real physical core. I used Ubuntu 23.04 running kernel 6.2.0-1005-gcp. My benchmark program is written in Rust and is available on Github. On localhost, Unix sockets were able to transfer data at approximately 9000 MiB/s. Localhost TCP sockets were a bit slower, around 7000 MiB/s. When using two separate cloud VMs with a networking throughput limit of 32 Gbps = 3800 MiB/s, I needed to use 6 TCP sockets to reliably reach that maximum throughput. A single TCP socket gets around 1400 MiB/s with 256 KiB buffers, with peaks as high as 2200 MiB/s. Experiment 1: /dev/zero and /dev/urandom My first experiment is reading from the /dev/zero and /dev/urandom devices. These are software devices implemented by the kernel, so they should have low overhead and low variability, since other tasks are not involved. Reading from /dev/urandom should be much slower than /dev/zero since the kernel must generate random bytes, rather than just zeros. The chart below shows the throughput for reading from /dev/zero as the buffer size is increased. The results show that the basic linear time per system call model holds until the system reaches maximum throughput (256 kiB buffer = 39000 MiB/s for /dev/zero, or 16 kiB = 410 MiB/s for /dev/urandom). As the buffer size increases further, the throughput decreases as the buffers get too big. This suggests that some other cost for larger buffers starts to outweigh the reduction in number of system calls. Perhaps CPU caches become less effective? The AMD EPYC Milan (3rd gen) CPU I tested on has 32 KiB of L1 data cache and 512 KiB of L2 data cache per core. The performance decreases don't exactly line up with these numbers, but it still seems plausible. The numbers for /dev/urandom are substantially lower, but otherwise similar. I did a linear least-squares fit on the average time per system call, shown in the following chart. If I use all the data, the fit is not good, because the trend changes for larger buffers. However, if I use the data up to the maximum throughput at 256 KiB, the fit is very good, as shown on the chart below. The linear fit models the minimum time per system call as 167 ns, with 0.0235 ns/byte additional time. If we want to use smaller buffers, using a 64 KiB buffer for reading from /dev/zero gets within 95% of the maximum throughput. Experiment 2: Unix and localhost TCP sockets Exchanging data with other processes is the thing I am actually interested in, so I tested Unix and TCP sockets on a single machine. In this case, I varied both the write buffer size and the read buffer size. Unfortunately, these results vary a lot. A more robust comparison would require running each experiment many times, and using some sort of statistical comparison. However, this "quick and dirty" experiment satisfied my curiousity, so I didn't do that. As a result, my conclusions here are vague. The simple model that increasing buffer size should decrease overhead is true, but only until the buffers are about 4 KiB. Above that point, the results start to be highly variable, and it is much harder to draw general conclusion. However, appears that increasing the write buffer size generally is quite helpful up to at least 256 KiB, and often needed as much as 1 MiB to get the highest localhost throughput. I suspect this is because on Linux with blocking sockets, write() will not return until it has written all the data in the buffer, unless there is an error (e.g. EINTR). As a result, passing a large buffer means the kernel can do a lot of the work without needing to switch back to user space. Unfortunately, the same is not true for read(), which often returns "short reads" with any data that is available in the buffer. This starts with buffer sizes around 2 KiB, with the percentage of short reads increasing as the buffer size gets larger. This means the simple model does not hold, because we aren't actually increasing the bytes per read call. I suspect this is a factor which means this microbenchmark is likely not representative of real programs. A real program will do something with the buffer, which will provide time for more data to be buffered in the kernel, and would probably decrease the number of short reads. This likely means larger buffers are in practice more useful than this microbenchmark suggests. As a result of this, the highest throughput often was achievable with small read buffers. I'm somewhat arbitrarily selecting 16 KiB at the best read buffer, and 256 KiB as the best write buffer, although a 1 MiB write buffer seems to be To give a sense of how variable the results are, the plot below shows the local Unix socket throughput for each read and write buffer throughput size. I apologize for the ugly plot. I did not want to spend the time to make it more beautiful. This plot is interactive so you can slice the data to the area of interest. I recommend zooming in to the left hand size with read buffers up to about 300 KiB. The first thing to note is at least on Linux with blocking sockets, the writer will almost never have a "short write", where the write system call returns before writing all the data in the buffer. Unless there is a signal (EINTR) or some other "error" condition, write() will not return until all the bytes are written. The same is not true for reads. The read() system call will often return a "short" read, starting around buffer sizes of 2 KiB. The percentage of short reads generally increases as buffer sizes get bigger, which is logical. Another note is that sockets have in-kernel send and receive buffers. I did not tune these at all. It is possible that better performance is possible by tuning these settings, but that was not my goal. I wanted to know what happens "out of the box" for general-purpose programs without any special tuning. Experiment 3: TCP between two hosts In this experiment, I used two separate hosts connected with 32 Gbps networking in Google Cloud. I first tested the TCP throughput using iperf, to independently verify the network performance. A single TCP connection with iperf is not enough to fully utilize the network. I tried fiddling with some command line options and with Kernel settings like net.ipv4.tcp_rmem and wasn't able to get much better than about 12 Gb/s = 1400 MiB/s. The throughput is also highly varible. Here is some example output with iperf reporting at 2 second intervals, where you can see the throughput ranging from 10 to 19 Gb/s, with an average over the entire interval of 12 Gb/s. To hit the maximum network throughput, I need to use 6 or more parallel TCP connections (iperf -c IP_ADDRESS --time 60 --interval 2 -l 262144 -P 6). Using 3 connections gets around 26 Gb/s, and using 4 or 5 will occasionally hit the maximum, but will also occasionally drop down. Using at least 6 seems to reliably stay at the maximum. Due to this variability, it is hard to draw any conclusions about buffer size. In particular: a single TCP connection is not limited by CPU. The system uses about 40% of a single CPU core, basically all in the kernel. This is more about how the buffer sizes may impact scheduling choices. That said, it is clear that you cannot hit the maximum throughput with a small write buffer. The experiments with 4 KiB write buffers reached approximately 300 MiB/s, while an 8 KiB write buffer was much faster, around 1400 MiB/s. Larger still generally seems better, up to around 256 KiB, which occasionally reached 2200 MiB/s = 17.6 Gb/s. The plot below shows the TCP socket throughput for each read and write buffer size. Again, I apologize for the ugly plot.

over a year ago 90 votes
The C Standard Library Function isspace() Depends on Locale

This is a post for myself, because I wasted a lot of time understanding this bug, and I want to be able to remember it in the future. I expect close to zero others to be interested. The C standard library function isspace() returns a non-zero value (true) for the six "standard" ASCII white-space characters ('\t', '\n', '\v', '\f', '\r', ' '), and any locale-specific characters. By default, a program starts in the "C" locale, which will only return true for the six ASCII white-space characters. However, if the program changes locales, it can return true for other values. As a result, unless you really understand locales, you should use your own version of this function, or ICU4C's u_isspace() function. An implementation of isspace() for ASCII is one line: /* Returns true for the 6 ASCII white-space characters: \t \n \v \f \r ' '. */ int isspace_ascii(int c) { return c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r' || c == ' '; } I ran into this because On Mac OS X, Postgres switches to the system's default locale, which is something that uses UTF-8 (e.g. en_US.UTF-8, fr_CA.UTF-8, etc). In this case, isspace() returns true for Unicode white-space values, which includes 0x85 = NEL = Next Line, and 0xA0 = NBSP = No-Break Space. This caused a bug in parsing Postgres Hstore values that use Unicode. I have attempted to submit a patch to fix this (mailing list post, commitfest entry). For a program to demonstrate the behaviour on different systems, see isspace_locale on Github.

over a year ago 97 votes

More in programming

It's beginning to feel like the 80s in America again

Have I told you how much I've come to dislike the 90s? The depressive music, the ironic distance to everything, the deconstructive narratives, the moral relativism, and the total cultural takeover of postmodern ideology. Oh, I did that just last week? Well, allow me another go. But rather than railing against the 90s, let me tell you about the 80s. They were amazing. America was firing on all cylinders, Reagan had brought the morning back, and the Soviet Union provided a clear black-and-white adversarial image. But it was the popular culture of the era that still fills me with hiraeth. It was the time of earnest storytelling. When Rocky could just train real hard to avenge the death of his friend by punching Dolph Lundgreen for 10 minutes straight in a montage of blood, sweat, and tears. After which even the Russians couldn't help themselves but cheer for him. Not a shred of irony. Just pure "if you work real hard, you can do the right thing" energy. Or what about Weird Science from 1985? Two nerds bring Barbie to life, and she teaches them to talk to real girls. It was goofy, it was kitsch, but it was also earnest and honest. Nerdy teenage boys have a hard time talking to girls! But they can learn how, and if they do, it'll all work out. That movie was actually the earliest memory I have of wanting to move to America. I don't remember exactly when I saw it, but I remember at the end thinking, "I have to go there." Such was the magnetic power of that American 80s earnest optimism! Or what about the music? Do you have any idea what the 1986 music video for Sabrina's BOYS could do for a young Danish boy who'd only just discovered the appeal of girls? Wonders, is what. Wonders. And again, it depicted this goofy but earnest energy. Boys and girls like each other! They have fun with the chase. The genders are not doomed to opposing trenches on the Eastern Front for all eternity. So back to today. It feels like we're finally emerging from this constant 90s Seattle drizzle to sunny 80s LA vibes in America. The constant pessimism, the cancellation militias, and the walking-on-eggshells atmosphere have given way to something far brighter, bolder, and, yes, better. An optimism, a levity, a confidence. First, America has Sweeney fever. The Sydney Sweeney Has Great Jeans campaign has been dominating the discourse for weeks, but rather than back down with a groveling apology, American Eagle has doubled down with a statement and now the Las Vegas sphere! And Sweeney herself has also kept her foot on the gas. Now there's a new Baskin-Robbins campaign that's equal parts Weird Science (two nerds!) and "Boys Boys Boys" music video. No self-referential winks to how this might achksually be ProBLeMAtIC. Just a confident IT GIRL grabbing the attention of a nation. The male role model too has been going through a rehabilitation lately. I absolutely loved F1: The Movie. It's classic Jerry Bruckheimer Top Gun pathos: Real men doing dangerous stuff for the chase, the glory, and the next generation with a love story that involves a competent yet feminine female partner.  Again, it's an entirely sincere story with great morals. The generational gap is real, but we can learn from each other. Young hotshots have speed but lack experience. Old-timers can be grumpy but their wisdom was hard-won. Even the diversity in that movie fails to feel forced! It's a woman leading the engineering in a male-dominated world, and she can't quite hack it at first. Her designs are too by-the-book. But then Brad Pitt sells the gamble of a dirty-air fighter design, she steps up her game, and wins the crown by her wits and talent. Tell me that isn't a wholesome story! David Foster Wallace nailed this all in his critique of postmodernism. He called out the irony, cynicism, and irreverence that had fully permeated the culture from the 90s forward. And frankly, he was bored with it! It's an amazing interview to watch today. Wallace had the diagnosis nailed back in 1997. But it's taken us until these mid-2020s to fully embrace its conclusions. We need earnest values and virtues. We need sincere stories that are not afraid of grand narratives. That don't constantly have to deconstruct "but what is good really?", and dare embrace a solid defense of "some ways of being really are better." We also need to have fun! We need to throw away these shit-tinted glasses that see everything in the world as a problematic example of some injustice or oppression. We a bit of gratitude for technology and progress! That's what the Sweeney campaign is doing. That's what Brad Pitt is racing for in F1: The Movie. That's what I'm here for! Because as much as I love the croissants of the Old World, I find myself craving that uniquely American brand of optimism, enthusiasm, and determination more whenever I've been back in Europe for too long. Give me some Weird Science! Give me some Sabrina at the pool! Give me some American 80s vibes!

9 hours ago 2 votes
Writing: Blog Posts and Songs

I was listening to a podcast interview with the Jackson Browne (American singer/songwriter, political activist, and inductee into the Rock and Roll Hall of Fame) and the interviewer asks him how he approaches writing songs with social commentaries and critiques — something along the lines of: “How do you get from the New York Times headline on a social subject to the emotional heart of a song that matters to each individual?” Browne discusses how if you’re too subtle, people won’t know what you’re talking about. And if you’re too direct, you run the risk of making people feel like they’re being scolded. Here’s what he says about his songwriting: I want this to sound like you and I were drinking in a bar and we’re just talking about what’s going on in the world. Not as if you’re at some elevated place and lecturing people about something they should know about but don’t but [you think] they should care. You have to get to people where [they are, where] they do care and where they do know. I think that’s a great insight for anyone looking to have a connecting, effective voice. I know for me, it’s really easily to slide into a lecturing voice — you “should” do this and you “shouldn’t” do that. But I like Browne’s framing of trying to have an informal, conversational tone that meets people where they are. Like you’re discussing an issue in the bar, rather than listening to a sermon. Chris Coyier is the canonical example of this that comes to mind. I still think of this post from CSS Tricks where Chris talks about how to have submit buttons that go to different URLs: When you submit that form, it’s going to go to the URL /submit. Say you need another submit button that submits to a different URL. It doesn’t matter why. There is always a reason for things. The web is a big place and all that. He doesn’t conjure up some universally-applicable, justified rationale for why he’s sharing this method. Nor is there any pontificating on why this is “good” or “bad”. Instead, like most of Chris’ stuff, I read it as a humble acknowledgement of the practicalities at hand — “Hey, the world is a big place. People have to do crafty things to make their stuff work. And if you’re in that situation, here’s something that might help what ails ya.” I want to work on developing that kind of a voice because I love reading voices like that. Email · Mastodon · Bluesky

2 days ago 5 votes
Doing versus Delegating

A staff+ skill

2 days ago 8 votes
p-fast trie, but smaller

Previously, I wrote some sketchy ideas for what I call a p-fast trie, which is basically a wide fan-out variant of an x-fast trie. It allows you to find the longest matching prefix or nearest predecessor or successor of a query string in a set of names in O(log k) time, where k is the key length. My initial sketch was more complicated and greedy for space than necessary, so here’s a simplified revision. (“p” now stands for prefix.) layout A p-fast trie stores a lexicographically ordered set of names. A name is a sequence of characters from some small-ish character set. For example, DNS names can be represented as a set of about 50 letters, digits, punctuation and escape characters, usually one per byte of name. Names that are arbitrary bit strings can be split into chunks of 6 bits to make a set of 64 characters. Every unique prefix of every name is added to a hash table. An entry in the hash table contains: A shared reference to the closest name lexicographically greater than or equal to the prefix. Multiple hash table entries will refer to the same name. A reference to a name might instead be a reference to a leaf object containing the name. The length of the prefix. To save space, each prefix is not stored separately, but implied by the combination of the closest name and prefix length. A bitmap with one bit per possible character, corresponding to the next character after this prefix. For every other prefix that matches this prefix and is one character longer than this prefix, a bit is set in the bitmap corresponding to the last character of the longer prefix. search The basic algorithm is a longest-prefix match. Look up the query string in the hash table. If there’s a match, great, done. Otherwise proceed by binary chop on the length of the query string. If the prefix isn’t in the hash table, reduce the prefix length and search again. (If the empty prefix isn’t in the hash table then there are no names to find.) If the prefix is in the hash table, check the next character of the query string in the bitmap. If its bit is set, increase the prefix length and search again. Otherwise, this prefix is the answer. predecessor Instead of putting leaf objects in a linked list, we can use a more complicated search algorithm to find names lexicographically closest to the query string. It’s tricky because a longest-prefix match can land in the wrong branch of the implicit trie. Here’s an outline of a predecessor search; successor requires more thought. During the binary chop, when we find a prefix in the hash table, compare the complete query string against the complete name that the hash table entry refers to (the closest name greater than or equal to the common prefix). If the name is greater than the query string we’re in the wrong branch of the trie, so reduce the length of the prefix and search again. Otherwise search the set bits in the bitmap for one corresponding to the greatest character less than the query string’s next character; if there is one remember it and the prefix length. This will be the top of the sub-trie containing the predecessor, unless we find a longer match. If the next character’s bit is set in the bitmap, continue searching with a longer prefix, else stop. When the binary chop has finished, we need to walk down the predecessor sub-trie to find its greatest leaf. This must be done one character at a time – there’s no shortcut. thoughts In my previous note I wondered how the number of search steps in a p-fast trie compares to a qp-trie. I have some old numbers measuring the average depth of binary, 4-bit, 5-bit, 6-bit and 4-bit, 5-bit, dns qp-trie variants. A DNS-trie varies between 7 and 15 deep on average, depending on the data set. The number of steps for a search matches the depth for exact-match lookups, and is up to twice the depth for predecessor searches. A p-fast trie is at most 9 hash table probes for DNS names, and unlikely to be more than 7. I didn’t record the average length of names in my benchmark data sets, but I guess they would be 8–32 characters, meaning 3–5 probes. Which is far fewer than a qp-trie, though I suspect a hash table probe takes more time than chasing a qp-trie pointer. (But this kind of guesstimate is notoriously likely to be wrong!) However, a predecessor search might need 30 probes to walk down the p-fast trie, which I think suggests a linked list of leaf objects is a better option.

2 days ago 5 votes
Software books I wish I could read

New Logic for Programmers Release! v0.11 is now available! This is over 20% longer than v0.10, with a new chapter on code proofs, three chapter overhauls, and more! Full release notes here. Software books I wish I could read I'm writing Logic for Programmers because it's a book I wanted to have ten years ago. I had to learn everything in it the hard way, which is why I'm ensuring that everybody else can learn it the easy way. Books occupy a sort of weird niche in software. We're great at sharing information via blogs and git repos and entire websites. These have many benefits over books: they're free, they're easily accessible, they can be updated quickly, they can even be interactive. But no blog post has influenced me as profoundly as Data and Reality or Making Software. There is no blog or talk about debugging as good as the Debugging book. It might not be anything deeper than "people spend more time per word on writing books than blog posts". I dunno. So here are some other books I wish I could read. I don't think any of them exist yet but it's a big world out there. Also while they're probably best as books, a website or a series of blog posts would be ok too. Everything about Configurations The whole topic of how we configure software, whether by CLI flags, environmental vars, or JSON/YAML/XML/Dhall files. What causes the configuration complexity clock? How do we distinguish between basic, advanced, and developer-only configuration options? When should we disallow configuration? How do we test all possible configurations for correctness? Why do so many widespread outages trace back to misconfiguration, and how do we prevent them? I also want the same for plugin systems. Manifests, permissions, common APIs and architectures, etc. Configuration management is more universal, though, since everybody either uses software with configuration or has made software with configuration. The Big Book of Complicated Data Schemas I guess this would kind of be like Schema.org, except with a lot more on the "why" and not the what. Why is important for the Volcano model to have a "smokingAllowed" field?1 I'd see this less as "here's your guide to putting Volcanos in your database" and more "here's recurring motifs in modeling interesting domains", to help a person see sources of complexity in their own domain. Does something crop up if the references can form a cycle? If a relationship needs to be strictly temporary, or a reference can change type? Bonus: path dependence in data models, where an additional requirement leads to a vastly different ideal data model that a company couldn't do because they made the old model. (This has got to exist, right? Business modeling is a big enough domain that this must exist. Maybe The Essence of Software touches on this? Man I feel bad I haven't read that yet.) Computer Science for Software Engineers Yes, I checked, this book does not exist (though maybe this is the same thing). I don't have any formal software education; everything I know was either self-taught or learned on the job. But it's way easier to learn software engineering that way than computer science. And I bet there's a lot of other engineers in the same boat. This book wouldn't have to be comprehensive or instructive: just enough about each topic to understand why it's an area of study and appreciate how research in it eventually finds its way into practice. MISU Patterns MISU, or "Make Illegal States Unrepresentable", is the idea of designing system invariants in the structure of your data. For example, if a Contact needs at least one of email or phone to be non-null, make it a sum type over EmailContact, PhoneContact, EmailPhoneContact (from this post). MISU is great. Most MISU in the wild look very different than that, though, because the concept of MISU is so broad there's lots of different ways to achieve it. And that means there are "patterns": smart constructors, product types, properly using sets, newtypes to some degree, etc. Some of them are specific to typed FP, while others can be used in even untyped languages. Someone oughta make a pattern book. My one request would be to not give them cutesy names. Do something like the Aarne–Thompson–Uther Index, where items are given names like "Recognition by manner of throwing cakes of different weights into faces of old uncles". Names can come later. The Tools of '25 Not something I'd read, but something to recommend to junior engineers. Starting out it's easy to think the only bit that matters is the language or framework and not realize the enormous amount of surrounding tooling you'll have to learn. This book would cover the basics of tools that enough developers will probably use at some point: git, VSCode, very basic Unix and bash, curl. Maybe the general concepts of tools that appear in every ecosystem, like package managers, build tools, task runners. That might be easier if we specialize this to one particular domain, like webdev or data science. Ideally the book would only have to be updated every five years or so. No LLM stuff because I don't expect the tooling will be stable through 2026, to say nothing of 2030. A History of Obsolete Optimizations Probably better as a really long blog series. Each chapter would be broken up into two parts: A deep dive into a brilliant, elegant, insightful historical optimization designed to work within the constraints of that era's computing technology What we started doing instead, once we had more compute/network/storage available. c.f. A Spellchecker Used to Be a Major Feat of Software Engineering. Bonus topics would be brilliance obsoleted by standardization (like what people did before git and json were universal), optimizations we do today that may not stand the test of time, and optimizations from the past that did. Sphinx Internals I need this. I've spent so much goddamn time digging around in Sphinx and docutils source code I'm gonna throw up. Systems Distributed Talk Today! Online premier's at noon central / 5 PM UTC, here! I'll be hanging out to answer questions and be awkward. You ever watch a recording of your own talk? It's real uncomfortable! In this case because it's a field on one of Volcano's supertypes. I guess schemas gotta follow LSP too ↩

2 days ago 10 votes