One of the neat things about the PCG random number generator by Melissa O'Neill is its use of instruction-level parallelism: the PCG state update can run in parallel with its output permutation. However, PCG only has a limited amount of ILP, about 3 instructions. Its overall speed is limited by the rate at which a CPU can run a sequence where the output of one multiply-add feeds into the next multiply-add.

… Or is it? With some linear algebra and some AVX512, I can generate random numbers from a single instance of pcg32 at 200 Gbit/s on a single core. This is the same sequence of random numbers generated in the same order as normal pcg32, but more than 4x faster. You can look at the benchmark in my pcg-dxsm repository.

skip ahead

One of the slightly weird features that PCG gets from its underlying linear congruential generator is "seekability": you can skip ahead k steps in the stream of random numbers in log(k) time. The PCG paper (in section 4.3.1) cites Forrest Brown's paper, "Random numbers with arbitrary strides", which explains that the skip-ahead feature is useful for reproducibility of Monte Carlo simulations. But what caught my eye is the skip-ahead formula. Rephrased in programmer style,

```
state[n+k] = state[n] * pow(MUL, k)
           + inc * (pow(MUL, k) - 1) / (MUL - 1)
```

the insight

The skip-ahead formula says that we can calculate a future state using a couple of multiplications. The skip-ahead multipliers depend only on the LCG multiplier, not on the variable state, nor on the configurable increment. That means that for a fixed skip-ahead distance, we can precalculate the multipliers at compile time. The skip-ahead formula allows us to unroll the PCG data dependency chain.
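The formula is easy to sanity-check against plain iteration. Below is a Python sketch of that check: the 64-bit multiplier is pcg32's usual LCG constant, and the increment is an arbitrary odd number chosen for illustration. One wrinkle: MUL - 1 is even, so it has no modular inverse mod 2**64; the division in the formula is exact anyway, because the quotient is just the geometric series 1 + MUL + … + MUL**(k-1), which is how it's computed here.

```python
# Check the skip-ahead formula against plain iteration.
# MUL is pcg32's usual 64-bit LCG multiplier; INC is an arbitrary odd
# increment picked for this sketch.
MUL = 6364136223846793005
INC = 1442695040888963407
MASK = (1 << 64) - 1  # all state arithmetic is mod 2**64

def lcg_step(state):
    return (state * MUL + INC) & MASK

def skip_ahead(state, k):
    # state[n+k] = state[n] * MUL**k + INC * (MUL**k - 1) / (MUL - 1)
    # The division is exact, but MUL - 1 isn't invertible mod 2**64, so
    # compute the quotient as the series 1 + MUL + ... + MUL**(k-1).
    mul_k = pow(MUL, k, 1 << 64)
    inc_k = 0
    for _ in range(k):
        inc_k = (inc_k * MUL + 1) & MASK
    return (state * mul_k + INC * inc_k) & MASK
```

The series loop here is O(k); Brown's paper shows how to get both terms in O(log k), which matters for big jumps but not for the handful of steps used below.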
Normally, four iterations of the PCG state update look like:

```
state0 = rng->state
state1 = state0 * MUL + rng->inc
state2 = state1 * MUL + rng->inc
state3 = state2 * MUL + rng->inc
state4 = state3 * MUL + rng->inc
rng->state = state4
```

With the skip-ahead multipliers it looks like:

```
state0 = rng->state
state1 = state0 * MULs1 + rng->inc * MULi1
state2 = state0 * MULs2 + rng->inc * MULi2
state3 = state0 * MULs3 + rng->inc * MULi3
state4 = state0 * MULs4 + rng->inc * MULi4
rng->state = state4
```

These state calculations can be done in parallel using NEON or AVX vector instructions. The disadvantage is that calculating future states in parallel requires more multiplications than doing so in series, but that's OK because modern CPUs have lots of ALUs.

multipliers

The skip-ahead formula is useful for jumping ahead long distances, because (as Forrest Brown explained) you can do the exponentiation in log(k) time using repeated squaring. (The same technique is used for modexp in RSA.) But I'm only interested in the first few skip-ahead multipliers. I'll define the linear congruential generator as:

```
lcg(s, inc) = s * MUL + inc
```

which is used in PCG's normal state update like:

```
rng->state = lcg(rng->state, rng->inc)
```

To precalculate the first few skip-ahead multipliers, we iterate the LCG starting from one and zero, like this:

```
MULs0 = 1
MULs1 = lcg(MULs0, 0)
MULs2 = lcg(MULs1, 0)

MULi0 = 0
MULi1 = lcg(MULi0, 1)
MULi2 = lcg(MULi1, 1)
```

My benchmark code's commentary includes a proof by induction, which I wrote to convince myself that these multipliers are correct.

trying it out

To explore how well this skip-ahead idea works, I have written a couple of variants of my pcg32_bytes() function, which simply iterates pcg32 and writes the results to a byte array. The variants have an adjustable amount of parallelism. One variant is written as scalar code in a loop that has been unrolled by hand a few times. I wanted to see if standard C gets a decent speedup, perhaps from autovectorization.
The other variant uses the GNU C portable vector extensions to calculate pcg32 in an explicitly parallel manner. The benchmark also ensures the output from every variant matches the baseline pcg32_bytes().

results

The output from the benchmark harness lists:

- the function variant: either the baseline version, or uN for a scalar loop unrolled N times, or xN for vector code with N lanes
- its speed in bytes per nanosecond (aka gigabytes per second)
- its performance relative to the baseline

There are small differences in style between the baseline and u1 functions, but their performance ought to be basically the same.

Apple clang 16, MacBook Pro M1 Pro. This compiler is eager and fairly effective at autovectorizing. ARM NEON isn't big enough to get a speedup from 8 lanes of parallelism.

```
__  3.66 bytes/ns x 1.00
u1  3.90 bytes/ns x 1.07
u2  6.40 bytes/ns x 1.75
u3  7.66 bytes/ns x 2.09
u4  8.52 bytes/ns x 2.33
x2  7.59 bytes/ns x 2.08
x4 10.49 bytes/ns x 2.87
x8 10.40 bytes/ns x 2.84
```

The following results were from my AMD Ryzen 9 7950X running Debian 12 "bookworm", comparing gcc vs clang, and AVX2 vs AVX512. gcc is less keen to autovectorize so it doesn't do very well with the unrolled loops. (Dunno why u1 is so much slower than the baseline.)
gcc 12.2 -march=x86-64-v3

```
__  5.57 bytes/ns x 1.00
u1  5.13 bytes/ns x 0.92
u2  5.03 bytes/ns x 0.90
u3  7.01 bytes/ns x 1.26
u4  6.83 bytes/ns x 1.23
x2  3.96 bytes/ns x 0.71
x4  8.00 bytes/ns x 1.44
x8 12.35 bytes/ns x 2.22
```

clang 16.0 -march=x86-64-v3

```
__  4.89 bytes/ns x 1.00
u1  4.08 bytes/ns x 0.83
u2  8.76 bytes/ns x 1.79
u3 10.43 bytes/ns x 2.13
u4 10.81 bytes/ns x 2.21
x2  6.67 bytes/ns x 1.36
x4 12.67 bytes/ns x 2.59
x8 15.27 bytes/ns x 3.12
```

gcc 12.2 -march=x86-64-v4

```
__  5.53 bytes/ns x 1.00
u1  5.53 bytes/ns x 1.00
u2  5.55 bytes/ns x 1.00
u3  6.99 bytes/ns x 1.26
u4  6.79 bytes/ns x 1.23
x2  4.75 bytes/ns x 0.86
x4 17.14 bytes/ns x 3.10
x8 20.90 bytes/ns x 3.78
```

clang 16.0 -march=x86-64-v4

```
__  5.53 bytes/ns x 1.00
u1  4.25 bytes/ns x 0.77
u2  7.94 bytes/ns x 1.44
u3  9.31 bytes/ns x 1.68
u4 15.33 bytes/ns x 2.77
x2  9.07 bytes/ns x 1.64
x4 21.74 bytes/ns x 3.93
x8 26.34 bytes/ns x 4.76
```

That last result is pcg32 generating random numbers at 200 Gbit/s.
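To see the whole trick end to end without any SIMD machinery, here's a Python model that precomputes the skip-ahead multipliers as described in the multipliers section and generates the stream in lanes, checking the result against a serial generator. It uses the classic pcg32 output permutation (XSH-RR) for illustration; the benchmark repo's generator may use a different PCG variant, and the function names here are mine, so treat this as a model of the technique rather than a port of the benchmark.

```python
# Model of lane-parallel pcg32: same stream, same order as serial.
# MUL is pcg32's usual 64-bit LCG constant; the output permutation is
# classic pcg32 XSH-RR.
MUL = 6364136223846793005
MASK = (1 << 64) - 1  # state arithmetic is mod 2**64

def xsh_rr(state):
    # pcg32 output: xorshift the high bits, rotate by the top 5 bits.
    xorshifted = (((state >> 18) ^ state) >> 27) & 0xffffffff
    rot = state >> 59
    return ((xorshifted >> rot) | (xorshifted << ((-rot) & 31))) & 0xffffffff

def pcg32_serial(state, inc, n):
    # Baseline: one multiply-add per output, each feeding the next.
    # (Reference pcg32 permutes the pre-update state; both versions here
    # permute the post-update state, which doesn't change the
    # serial-vs-lanes equivalence being demonstrated.)
    out = []
    for _ in range(n):
        state = (state * MUL + inc) & MASK
        out.append(xsh_rr(state))
    return out

def pcg32_lanes(state, inc, n, lanes=4):
    # Precompute MULs[k] = MUL**k and MULi[k] = 1 + MUL + ... + MUL**(k-1)
    # by iterating the LCG from 1 and 0, as in the multipliers section.
    MULs, MULi = [1], [0]
    for _ in range(lanes):
        MULs.append((MULs[-1] * MUL) & MASK)
        MULi.append((MULi[-1] * MUL + 1) & MASK)
    out = []
    while len(out) < n:
        # Every future state depends only on the current state, so a real
        # implementation computes this batch in vector lanes.
        batch = [(state * MULs[k] + inc * MULi[k]) & MASK
                 for k in range(1, lanes + 1)]
        out.extend(xsh_rr(s) for s in batch)
        state = batch[-1]
    return out[:n]
```

Each batch does two multiplies per output instead of one, but they are independent, which is exactly the trade described above: more ALU work in exchange for breaking the dependency chain.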
I was talking to a friend about how to add a directory to your PATH today. It's something that feels "obvious" to me since I've been using the terminal for a long time, but when I searched for instructions for how to do it, I actually couldn't find something that explained all of the steps – a lot of them just said "add this to ~/.bashrc", but what if you're not using bash? What if your bash config is actually in a different file? And how are you supposed to figure out which directory to add anyway? So I wanted to try to write down some more complete directions and mention some of the gotchas I've run into over the years.

Here's a table of contents:

- step 1: what shell are you using?
- step 2: find your shell's config file
  - a note on bash's config file
- step 3: figure out which directory to add
  - step 3.1: double check it's the right directory
- step 4: edit your shell config
- step 5: restart your shell
- problems:
  - problem 1: it ran the wrong program
  - problem 2: the program isn't being run from your shell
- notes:
  - a note on source
  - a note on fish_add_path

step 1: what shell are you using?

If you're not sure what shell you're using, here's a way to find out. Run this:

```
ps -p $$ -o pid,comm=
```

- if you're using bash, it'll print out 97295 bash
- if you're using zsh, it'll print out 97295 zsh
- if you're using fish, it'll print out an error like "In fish, please use $fish_pid" ($$ isn't valid syntax in fish, but in any case the error message tells you that you're using fish, which you probably already knew)

Also bash is the default on Linux and zsh is the default on Mac OS (as of 2024). I'll only cover bash, zsh, and fish in these directions.
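If you're curious what `ps -p $$` is actually doing: `$$` expands to the shell's own process ID, and `ps` looks up that process's name. From a script, the equivalent trick is to look at the parent process. Here's a small Python sketch of that (the `/proc` lookup is Linux-only, and the function names are mine; on other systems you'd shell out to `ps` instead):

```python
import os

def process_name(pid):
    """Read a process's name from /proc (Linux-only sketch).

    Returns "" if the process doesn't exist or /proc isn't available.
    """
    try:
        with open(f"/proc/{pid}/comm") as f:
            return f.read().strip()
    except OSError:
        return ""

def parent_shell():
    # $$ in the shell is the shell's own PID; from a child process the
    # equivalent is the parent PID, which is usually the shell that
    # launched the script.
    return process_name(os.getppid())
```

Running `parent_shell()` from a script started in a terminal usually prints `bash`, `zsh`, or `fish`; from an IDE or cron job it prints whatever launched the script instead, which is a preview of "problem 2" below.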
step 2: find your shell's config file

- in zsh, it's probably ~/.zshrc
- in bash, it might be ~/.bashrc, but it's complicated, see the note in the next section
- in fish, it's probably ~/.config/fish/config.fish (you can run echo $__fish_config_dir if you want to be 100% sure)

a note on bash's config file

Bash has three possible config files: ~/.bashrc, ~/.bash_profile, and ~/.profile. If you're not sure which one your system is set up to use, I'd recommend testing this way:

1. Add echo hi there to your ~/.bashrc
2. Restart your terminal
3. If you see "hi there", that means ~/.bashrc is being used! Hooray!
4. Otherwise remove it and try the same thing with ~/.bash_profile
5. You can also try ~/.profile if the first two options don't work.

(there are a lot of elaborate flow charts out there that explain how bash decides which config file to use but IMO it's not worth it and just testing is the fastest way to be sure)

step 3: figure out which directory to add

Let's say that you're trying to install and run a program called http-server and it doesn't work, like this:

```
$ npm install -g http-server
$ http-server
bash: http-server: command not found
```

How do you find what directory http-server is in? Honestly in general this is not that easy – often the answer is something like "it depends on how npm is configured". A few ideas:

- Often when setting up a new installer (like cargo, npm, homebrew, etc), when you first set it up it'll print out some directions about how to update your PATH. So if you're paying attention you can get the directions then.
- Sometimes installers will automatically update your shell's config file to update your PATH for you
- Sometimes just Googling "where does npm install things?" will turn up the answer
- Some tools have a subcommand that tells you where they're configured to install things, like:
  - Homebrew: brew --prefix (and then append /bin/ and /sbin/ to what that gives you)
  - Node/npm: npm config get prefix (then append /bin/)
  - Go: go env | grep GOPATH (then append /bin/)
  - asdf: asdf info | grep ASDF_DIR (then append /bin/ and /shims/)

step 3.1: double check it's the right directory

Once you've found a directory you think might be the right one, make sure it's actually correct! For example, I found out that on my machine, http-server is in ~/.npm-global/bin. I can make sure that it's the right directory by trying to run the program http-server in that directory like this:

```
$ ~/.npm-global/bin/http-server
Starting up http-server, serving ./public
```

It worked! Now that you know what directory you need to add to your PATH, let's move to the next step!

step 4: edit your shell config

Now we have the 2 critical pieces of information we need:

1. which directory you're trying to add to your PATH (like ~/.npm-global/bin/)
2. where your shell's config is (like ~/.bashrc, ~/.zshrc, or ~/.config/fish/config.fish)

Now what you need to add depends on your shell:

bash and zsh instructions: Open your shell's config file, and add a line like this:

```
export PATH=$PATH:~/.npm-global/bin/
```

(obviously replace ~/.npm-global/bin with the actual directory you're trying to add)

fish instructions: In fish, the syntax is different:

```
set PATH $PATH ~/.npm-global/bin
```

(in fish you can also use fish_add_path, some notes on that further down)

step 5: restart your shell

Now, an extremely important step: updating your shell's config won't take effect if you don't restart it!
Two ways to do this:

1. open a new terminal (or terminal tab), and maybe close the old one so you don't get confused
2. Run bash to start a new shell (or zsh if you're using zsh, or fish if you're using fish)

I've found that both of these usually work fine.

And you should be done! Try running the program you were trying to run and hopefully it works now. If not, here are a couple of problems that you might run into:

problem 1: it ran the wrong program

If the wrong version of a program is running, you might need to add the directory to the beginning of your PATH instead of the end. For example, on my system I have two versions of python3 installed, which I can see by running which -a:

```
$ which -a python3
/usr/bin/python3
/opt/homebrew/bin/python3
```

The one your shell will use is the first one listed. If you want to use the Homebrew version, you need to add that directory (/opt/homebrew/bin) to the beginning of your PATH instead, by putting this in your shell's config file (it's /opt/homebrew/bin/:$PATH instead of the usual $PATH:/opt/homebrew/bin/):

```
export PATH=/opt/homebrew/bin/:$PATH
```

or in fish:

```
set PATH /opt/homebrew/bin $PATH
```

problem 2: the program isn't being run from your shell

All of these directions only work if you're running the program from your shell. If you're running the program from an IDE, from a GUI, in a cron job, or some other way, you'll need to add the directory to your PATH in a different way, and the exact details might depend on the situation.

in a cron job

Some options:

- use the full path to the program you're running, like /home/bork/bin/my-program
- put the full PATH you want as the first line of your crontab (something like PATH=/bin:/usr/bin:/usr/local/bin:…). You can get the full PATH you're using in your shell by running echo "PATH=$PATH".

I'm honestly not sure how to handle it in an IDE/GUI because I haven't run into that in a long time, will add directions here if someone points me in the right direction.
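Back on problem 1 for a moment: the "first match wins" rule is easy to see in a short Python sketch, a simplified version of what `which -a` does (the function name `which_all` is mine, not a standard API):

```python
import os

def which_all(program):
    """Simplified `which -a`: every match for program on PATH, in order.

    The shell runs the first entry; that's why prepending a directory to
    PATH changes which version wins.
    """
    matches = []
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches
```

So `export PATH=/opt/homebrew/bin/:$PATH` works because it moves the Homebrew directory to the front of the list this function walks through.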
a note on source

When you install cargo (Rust's installer) for the first time, it gives you these instructions for how to set up your PATH, which don't mention a specific directory at all:

```
This is usually done by running one of the following (note the leading DOT):
. "$HOME/.cargo/env"            # For sh/bash/zsh/ash/dash/pdksh
source "$HOME/.cargo/env.fish"  # For fish
```

The idea is that you add that line to your shell's config, and their script automatically sets up your PATH (and potentially other things) for you. This is pretty common (Homebrew and asdf have something similar), and there are two ways to approach this:

1. Just do what the tool suggests (add . "$HOME/.cargo/env" to your shell's config)
2. Figure out which directories the script they're telling you to run would add to your PATH, and then add those manually.

Here's how I'd do the second one:

1. Run . "$HOME/.cargo/env" in my shell (or the fish version if using fish)
2. Run echo "$PATH" | tr ':' '\n' | grep cargo to figure out which directories it added
3. See that it says /Users/bork/.cargo/bin and shorten that to ~/.cargo/bin
4. Add the directory ~/.cargo/bin to PATH (with the directions in this post)

I don't think there's anything wrong with doing what the tool suggests (it might be the "best way"!), but personally I usually use the second approach because I prefer knowing exactly what configuration I'm changing.

a note on fish_add_path

fish has a handy function called fish_add_path that you can run to add a directory to your PATH like this:

```
fish_add_path /some/directory
```

This will add the directory to your PATH, and automatically update all running fish shells with the new PATH. You don't have to update your config at all! This is EXTREMELY convenient, but one downside (and the reason I've personally stopped using it) is that if you ever need to remove the directory from your PATH a few weeks or months later because maybe you made a mistake, it's kind of hard to do (there are instructions in the comments of this GitHub issue though).
that’s all Hopefully this will help some people. Let me know (on Mastodon or Bluesky) if you there are other major gotchas that have tripped you up when adding a directory to your PATH, or if you have questions about this post!
A surprising number of strategies are doomed from inception because their authors get attached to one particular approach without considering alternatives that would work better for their current circumstances. This happens when engineers want to pick tools solely because they are trending, and when executives insist on adopting the tech stack from their prior organization where they felt comfortable. Exploration is the antidote to early anchoring, forcing you to consider the problem widely before evaluating any of the paths forward. Exploration is about updating your priors before assuming the industry hasn't evolved since you last worked on a given problem. Exploration is continuing to believe that things can get better when you're not watching.

This chapter covers:

- The goals of the exploration phase of strategy creation
- When to explore (always first!) and when it makes sense to stop exploring
- How to explore a topic, including discussion of the most common mechanisms: mining for internal precedent, reading industry papers and books, and leveraging your external network
- Why avoiding judgment is an essential part of exploration

By the end of this chapter, you'll be able to conduct an exploration for the current or next strategy that you work on.

What is exploration?

One of the frequent senior leadership anti-patterns I've encountered in my career is The Grand Migration, where a new leader declares that a massive migration to a new technology stack–typically the stack used by their former employer–will solve every pressing problem. What's distinguishing about the Grand Migration is not the initially bad selection, but the single-minded ferocity with which the senior leader pushes for their approach, even when it becomes abundantly clear to others that it doesn't solve the problem at hand. These senior leaders are very intelligent, but have allowed themselves to be framed in by their initial thinking from prior experiences.
Accepting those early thoughts as the foundation of their strategy, they build the entire strategy on top of those ideas, and eventually there is so much weight standing on those early assumptions that it becomes impossible to acknowledge the errors.

Exploration is the deliberate practice of searching through a strategy's problem and solution spaces before allowing yourself to commit to a given approach. It's understanding how others have approached the same problem recently and in the past. It's doing this both in trendy companies you admire, and in practical companies that actually resemble yours. Most exploration will be external to your team, but depending on your company, much of your exploration might be internal to the company. If you're in a massive engineering organization of 100,000, there are likely existing internal solutions to your problem that you've never heard of. Conversely, if you're in an organization of 50 engineers, it's likely that much of your exploration will be external.

When to explore

Exploration is the first step of good strategy work. Even when you want to skip it, you will always regret skipping it, because you'll inadvertently frame yourself into whatever approach you focus on first. Especially when it comes to problems that you've solved previously, exploration is the only thing preventing you from over-indexing on your prior experiences.

Try to continue exploration until you know how three similar teams within your company and three similar companies have recently solved the same problem. Further, make sure you are able to explain the thinking behind those decisions. At that point, you should be ready to stop exploring and move on to the diagnosis step of strategy creation. Exploration should always come with a minimum and maximum timeframe: less than a few hours is very suspicious, and more than a week is generally questionable as well.
How to explore

While the details of each exploration will differ a bit, the overarching approach tends to be pretty similar across strategies. After I open up the draft strategy document I'm working on, my general approach to exploration is:

1. Start throwing in every resource I can think of related to that problem. For example, in the Uber service provisioning strategy, I started by collecting recent papers on Mesos, Kubernetes, and Aurora to understand the state of the industry on orchestration.
2. Do some web searching, foundational model prompting, and checking with a few current and prior colleagues about what topics and resources I might be missing. For example, for the Calm engineering strategy, I focused on talking with industry peers on tools they'd used to focus a team with diffuse goals.
3. Summarize the list of resources I've gathered, organizing them by which I want to explore, and which I won't spend time on but are worth mentioning. For example, the Large Language Model adoption strategy's exploration section documents the variety of resources the team explored before completing it.
4. Work through the list one by one, continuing to collect notes in the strategy document. When you're done, synthesize those into a concise, readable summary of what you've learned. For example, the monolith decomposition strategy synthesizes the exploration of a broad topic into four paragraphs, with links out to references.
5. Stop once I generally understand how a handful of similar internal and external teams have recently approached this problem.

Of all the steps in strategy creation, exploration is inherently open-ended, and you may find a different approach works better for you. If you're not sure what to do, try following the above steps closely. If you have a different approach that you're confident in–as long as it's not skipping exploration!–then go ahead and try that instead.
While not discussed in this chapter, you can also use some techniques like Wardley mapping, covered in the Refinement chapter, to support your exploration phase. Wardley mapping is a strategy tool designed within a different strategy tradition, and consequently categorizing it as either solely an exploration tool or a refinement tool ignores some of its potential uses. There's no perfect way to do strategy: take what works for you and use it.

Mine internal precedent

One of the most powerful forms of strategy is simply documenting how similar decisions have been made internally: often this is enough to steer how similar future decisions are made within your organization. This approach, documented in Staff Engineer's "Write five, then synthesize", is also the most valuable step of exploration for those working in established companies.

If you are a tenured engineer within your organization, then it's somewhat safe to assume that you are aware of the typical internal approaches. Even then, it's worth poking around to see if there are any related skunkworks projects happening internally. This is doubly true if you've joined the organization recently, or are distant from the codebase itself. In that case, it's almost always worth poking around to see what already exists.

Sometimes the internal approach isn't ideal, but it's still superior because it's already been implemented and there's someone else maintaining it. In the long run, your strategy can ride along as someone else addresses the issues that aren't perfect fits.

Using your network

The exploration section of "How should we control access to user data?" begins with:

    Our experience is that best practices around managing internal access to user data are widely available through our networks, and otherwise hard to find.
The exact rationale for this is hard to determine. While there are many topics with significant public writing out there, my experience is that there are many topics where there's very little you can learn without talking directly to practitioners. This is especially true for security, compliance, operating at truly large scale, and competitive processes like optimizing advertising spend. Further, it's surprisingly common to find that how people publicly describe solving a problem and how they actually approach the problem are largely divorced.

This is why having a broad personal network is exceptionally powerful, and makes it possible to quickly understand the breadth of possible solutions. It also provides access to the practical downsides to various approaches, which are often omitted when talking to public proponents. In a recent strategy session, a proposal came up that seemed off to me, and I was able to text–and get answers to those texts–industry peers before the meeting ended, which invalidated the room's assumptions about what was and was not possible. A disagreement that might have taken weeks to resolve was instead resolved in a few minutes, and we were able to figure out next steps in that meeting rather than waiting a week for the next meeting when we'd realized our mistake.

Of course, it's also important to hold information from your network with skepticism. I've certainly had my network be wrong, and your network never knows how your current circumstances differ from theirs, so blindly accepting guidance from your network is never the right decision either.

If you're looking for more detailed coverage on building your network, this topic has also come up in Staff Engineer's chapter on "Build a network of peers", and The Engineering Executive's Primer's chapter on "Building your executive network". It feels silly to cover the same topic a third time, but it's a foundational technique for effective decision making.
Read widely; read narrowly

Reading has always been an important part of my strategy work. There are two distinct motions to this approach: read widely on an ongoing basis to broaden your thinking, and read narrowly on the specific topic you're working on.

Starting with reading widely, I make an effort each year to read ten to twenty industry-relevant works. These are not necessarily new releases, but are new releases for me. Importantly, I try to read things that I don't know much about or that I initially disagree with. Some of my recent reads were Chip War, Building Green Software, Tidy First?, and How Big Things Get Done. From each of these books, I learned something, and over time they've built a series of bookmarks in my head about ideas that might apply to new problems.

On the other end of things is reading narrowly. When I recently started working on an AI agents strategy, the first thing I did was read through Chip Huyen's AI Engineering, which was an exceptionally helpful survey. Similarly, when we started thinking about Uber's service migration, we read a number of industry papers, including "Large-scale cluster management at Google with Borg" and "Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center". None of these readings had all the answers to the problems I was working on, but they did an excellent job at helping me understand the range of options, as well as identifying other references to consult in my exploration.

I'll mention two nuances that will help a lot here. First, I highly encourage getting comfortable with skimming books. Even tightly edited books will have a lot of content that isn't particularly relevant to your current goals, and you should skip that content liberally. Second, what you read doesn't have to be books. It can be blog posts, essays, interview transcripts, or certainly it can be books. In this context, "reading" doesn't even have to actually be reading.
There are conference talks that contain just as much as a blog post, and conferences that cover as much breadth as a book. There are also conference talks without a written equivalent, such as Dan Na's excellent Pushing Through Friction.

Each job is an education

Experience is frequently disregarded in the technology industry, and there are ways to misuse experience by copying too liberally the solutions that worked in different circumstances, but the most effective, and the slowest, mechanism for exploring is continuing to work in the details of meaningful problems. You probably won't choose every job to optimize for learning, but the ability to instantly explore more complex problems over time–recognizing that a bit of your data will have become stale each time–is uniquely valuable.

Save judgment for later

As I've mentioned several times, the point of exploration is to go broad with the goal of understanding approaches you might not have considered, and invalidating things you initially think are true. Both of those things are only possible if you save judgment for later: if you're passing judgment about whether approaches are "good" or "bad", then your exploration is probably going astray. As a soft rule, I'd argue that if no one involved in a strategy has changed their mind about something they believed when you started the exploration step, then you're not done exploring.

This is especially true when it comes to strategy work by senior leaders. Their beliefs are often well-justified by years of experience, but it's unclear which parts of their experience have become stale over time.

Summary

At this point, I hope you feel comfortable exploring as the first step of your strategy work, and understand the likely consequences of skipping this step.
It’s not an overstatement to say that every one of the worst strategic failures I’ve encountered would have been prevented by its primary author taking a few days to explore the space before anchoring on a particular approach. A few days of feeling slow are always worth avoiding years of misguided efforts.