Switching Things Over to ikiwiki 2023-12-17

I’ve done it again. My personal website is no longer generated with barf but is instead built on top of ikiwiki. The old RSS feed (btxx.org/atom.xml) still exists but will no longer receive updates. The new feed can be found on the bottom of the homepage (index.rss).

Why a Wiki?

I love the simplicity of a minimal blog, which is why I always gravitated towards purely “static” site builders. Over time, though, I found two minor issues that slowly chipped away at me: ease of use and flexibility.

I had a vision, back when I began tinkering with my own place on the web, of building out my own personal “resource center” or wiki. Oftentimes, through work or personal projects, I stumble into little problems that I need to solve. Most times I find a solution and move on with my life. The problem with this approach is the lack of documentation. What if I come across that issue at a later point in time? Will I even remember my old solution? Probably not. So,...
a year ago


More from bt RSS Feed

Installing OpenBSD on Linveo KVM VPS

Installing OpenBSD on Linveo KVM VPS 2024-10-21

I recently came across an amazing deal for a VPS on Linveo. For just $15 a year they provide:

- AMD KVM 1GB
- 1024 MB RAM
- 1 CPU Core
- 25 GB NVMe SSD
- 2000 GB Bandwidth

It’s a pretty great deal and I suggest you look more into it if you’re interested! But this post is more focused on setting up OpenBSD via the custom ISO option in the KVM dashboard. Linveo already provides several Linux OS options, along with FreeBSD by default (which is great!). Since there is no OpenBSD template, we need to do things manually.

Getting Started

Once you have your initial VPS up and running, log in to the main dashboard and navigate to the Media tab. Under CD/DVD-ROM you’ll want to click “Custom CD/DVD” and enter the direct link to the install76.iso:

https://cdn.openbsd.org/pub/OpenBSD/7.6/amd64/install76.iso

The "Media" tab of the Linveo Dashboard. Use the official ISO link and set the Boot Order to CD/DVD.

Select “Insert”, then set your Boot Order to CD/DVD and click “Apply”. Once complete, restart your server.

Installing via VNC

With the server rebooting, jump over to Options and click on “Browser VNC” to launch the web-based VNC client. From here we will boot into the OpenBSD installer and get things going!

Follow the installer as you normally would when installing OpenBSD (if you’re unsure, I have a step-by-step walkthrough) until you reach the IPv4 selection. At this point you will want to input your server’s IPv4 and IPv6 addresses, found under the Network section of your dashboard. Next you will want to set the IPv6 route to the first default listed option (not “none”). After that is complete, choose cd0 for your install media (don’t worry about http yet).

Continue with the rest of the install (create users if desired, etc.) until it tells you to reboot the machine. Go back to the Linveo dashboard, switch your Boot Order back to “Harddrive” and reboot the machine directly.

Booting into OpenBSD

Load into the VNC client again. If you did everything correctly you should be greeted with the OpenBSD login prompt. There are a few tweaks we still need to make, so log in as the root user.

Remember how we installed our sets directly from the cd0? We’ll want to change that. Since we are running OpenBSD “virtually” through KVM, our target network interface will be vio0. Edit the /etc/hostname.vio0 file and add the following:

dhcp
!route add default <your_gateway_ip>

The <your_gateway_ip> can be found under the Network tab of your dashboard.

The next file we need to tweak is /etc/resolv.conf. Add the following to it:

nameserver 8.8.8.8
nameserver 1.1.1.1

These nameservers are based on your selected IPs under the Resolvers section of Network in the Linveo dashboard. Change these as you see fit, so long as they match what you place in the resolv.conf file.

Finally, the last file we need to edit is /etc/pf.conf. Like the others, add the following:

pass out proto { tcp, udp } from any to any port 53

Final Stretch

Now just reboot the server. Log back in as your desired user and everything should be working as expected! You can perform a simple test to check:

ping openbsd.org

This should work, meaning your network is up and running! Now you’re free to enjoy the beauty that is OpenBSD.

9 months ago 83 votes
Vertical Tabs in Safari

Vertical Tabs in Safari 2024-09-26

I use Firefox as my main browser (specifically the Nightly build), which has vertical tabs built in. There are instances where I need to use Safari, such as debugging or testing iOS devices, and in those instances I prefer to have a similar experience to that of Firefox. Luckily, Apple has finally made it fairly straightforward to do so.

1. Click the Sidebar icon in the top left of the Safari browser
2. Right click and group your current tab(s) (I normally name mine something uninspired like “My Tabs” or simply “Tabs”)
3. For an extra “clean look”, remove the horizontal tabs by right clicking the top bar, selecting Customize Toolbar and dragging the tabs out

When everything is set properly, you’ll have something that looks like this:

One minor drawback is not having access to a direct URL input, since we have removed the horizontal tab bar altogether. Using a set of curated bookmarks could help avoid the need for direct input, along with setting our new tab page to DuckDuckGo or any other search engine.

10 months ago 86 votes
Build and Deploy Websites Automatically with Git

Build and Deploy Websites Automatically with Git 2024-09-20

I recently began the process of setting up my self-hosted1 cgit server as my main code forge. Updating repos via cgit on NearlyFreeSpeech on its own has been simple enough, but it lacked the “wow-factor” of having some sort of automated build process. I looked into a bunch of different tools that I could add to my workflow to automate deploying changes. The problem was they all seemed to be fairly bloated or overly complex for my needs.

Then I realized I could simply use post-receive hooks, which are already built into git! You can’t get more simple than that... So I thought it would be best to document my full process. These notes are more for my future self when I inevitably forget this, but hopefully others can benefit from it!

Before We Begin

This “tutorial” assumes that you already have a git server set up. It shouldn’t matter what kind of forge you’re using, so long as you have access to the hooks/ directory and have the ability to write a custom post-receive script. For my purposes I will be running standard git via the web through cgit, hosted on NearlyFreeSpeech (FreeBSD based).

Overview

Here is a quick rundown of what we plan to do:

- Write a custom post-receive script in the repo of our choice
- Build and deploy our project when a remote push to master is made

Nothing crazy. Once you get the hang of things it’s really simple.

Prepping Our Servers

Before we get into the nitty-gritty, there are a few items we need to take care of first:

- Your main git repo needs ssh access to your web hosting (deploy) server. Make sure to add your public key and run a connection test first (before running the post-receive hook) in order to approve the “fingerprinting”.
- You will need to git clone your main git repo in a private/admin area of your deploy server. In the examples below, mine is cloned under /home/private/_deploys

Once you do both of those tasks, continue with the rest of the article!

The post-receive Script

I will be using my own personal website as the main project for this example. My site is built with wruby, so the build instructions are specific to that generator. If you use Jekyll or something similar, you will need to tweak those commands for your own purposes.

Head into your main git repo (not the cloned one on your deploy server), navigate under the hooks/ directory and create a new file named post-receive containing the following:

#!/bin/bash

# Get the branch that was pushed
while read oldrev newrev ref
do
    branch=$(echo $ref | cut -d/ -f3)

    if [ "$branch" == "master" ]; then
        echo "Deploying..."

        # Build on the remote server
        ssh user@deployserver.net << EOF
        set -e  # Stop on any error
        cd /home/private/_deploys/btxx.org
        git pull origin master
        gem install 'kramdown:2.4.0' 'rss:0.3.0'
        make build
        rsync -a build/* ~/public/btxx.org/
EOF

        echo "Build synced to the deployment server."
        echo "Deployment complete."
    fi
done

Let’s break everything down.

First we check if the branch being pushed to the remote server is master. Only if this is true do we proceed. (Feel free to change this if you prefer something like production or deploy.)

if [ "$branch" == "master" ]; then

Then we ssh into the server (i.e. deployserver.net) which will perform the build commands and also host these built files.

ssh user@deployserver.net << EOF

Setting set -e ensures that the script stops if any errors are triggered.

set -e  # Stop on any error

Next, we navigate into the previously mentioned “private” directory, pull the latest changes from master, and run the required build commands (in this case installing gems and running make build).

cd /home/private/_deploys/btxx.org
git pull origin master
gem install 'kramdown:2.4.0' 'rss:0.3.0'
make build

Finally, rsync is run to copy just the build directory to our public-facing site directory.

rsync -a build/* ~/public/btxx.org/

With that saved and finished, be sure to give this file proper permissions:

chmod +x post-receive

That’s all there is to it!

Time to Test!

Now make changes to your main git project and push those up into master. You should see the post-receive commands printing out into your terminal successfully. Now check out your website to see the changes. Good stuff.

Still Using sourcehut

My go-to code forge was previously handled through sourcehut, which will now be used for mirroring my repos and handling mailing lists (since I don’t feel like hosting something like that myself - yet!). This switchover was nothing against sourcehut itself but more of a “I want to control all aspects of my projects” mentality.

I hope this was helpful and please feel free to reach out with suggestions or improvements!

1. By self-hosted I mean a NearlyFreeSpeech instance ↩

10 months ago 99 votes
Burning & Playing PS2 Games without a Modded Console

Burning & Playing PS2 Games without a Modded Console 2024-09-02 Important: I do not support pirating or obtaining illegal copies of video games. This process should only be used to copy your existing PS2 games for backup, in case of accidental damage to the original disc. Requirements Note: This tutorial is tailored towards macOS users, but most things should work similar on Windows or Linux. You will need: An official PS2 game disc (the one you wish to copy) A PS2 Slim console An Apple device with a optical DVD drive (or a portable USB DVD drive) Some time and a coffee! (or tea) Create an ISO Image of Your PS2 Disc: Insert your PS2 disc into your optical drive. Open Disk Utility (Applications > Utilities) In Disk Utility, select your PS2 disc from the sidebar Click on the File menu, then select New Image > Image from [Disc Name] Choose a destination to save the ISO file and select the format as DVD/CD Master Name your file and click Save. Disk Utility will create a .cdr file, which is essentially an ISO file Before we move on, we will need to convert that newly created cdr file into ISO. Navigate to the directory where the .cdr file is located and use the hdiutil command to convert the .cdr file to an ISO file: hdiutil convert yourfile.cdr -format UDTO -o yourfile.iso You’ll end up with a file named yourfile.iso.cdr. Rename it by removing the .cdr extension to make it an .iso file: mv yourfile.iso.cdr yourfile.iso Done and done. Getting Started For Mac and Linux users, you will need to install Wine in order to run the patcher: # macOS brew install wine-stable # Linux (Debian) apt install wine Clone & Run the Patcher Clone the FreeDVDBoot ESR Patcher: git clone https://git.sr.ht/~bt/fdvdb-esr Navigate to the cloned project folder: cd /path/to/fdvdb-esr The run the executable: wine FDVDB_ESR_Patcher.exe Now you need to select your previously cloned ISO file, use the default Payload setting and then click Patch!. After a few seconds your file should be patched. Burning Our ISO to DVD It’s time for the main event! Insert a blank DVD-R into your disc drive and mount it. Then right click on your patched ISO file and run “Burn Disk Image to Disc...". From here, you want to make sure you select the slowest write speed and enable verification. Once the file is written to the disc and verified (verification might fail - it is safe to ignore) you can remove the disc from the drive. Before Playing the Game Make sure you change the PS2 disc speed from Standard to Fast in the main “Browser” setting before you put the game into your console. After that, enjoy playing your cloned PS2 game!

11 months ago 75 votes
"This Key is Useless Now. Discard?"

“This Key is Useless Now. Discard?” 2024-08-28

The title of this article probably triggers nostalgic memories for old-school Resident Evil veterans like myself. My personal favourite in the series (not that anyone asked) was the original, 1998 version of Resident Evil 2 (RE2). I believe that game stands the test of time and is very close to a masterpiece. The recent remake lost a lot of the charm and nuance that made the original so great, which is why I consistently fire up the PS1 version on my PS2 Slim.

Resident Evil 2 (PS1) running on my PS2, hooked up to my Toshiba CRT TV.

But the point of this post isn’t to gush over RE2. Instead I would like to discuss how well RE2 handled its interface and user experience across multiple in-game systems.

HUD? What HUD?

Just like the first Resident Evil that came before it, RE2 has no in-game HUD (heads-up display) to speak of. It’s just your playable character and the environment. No ammo counters. No health bars. No “quest” markers. Nothing.

This is how the game looks while you play. Zero HUD elements.

The game does provide you with an inventory system, which holds your core items, weapons and notes you find along your journey. Opening up this sub-menu allows you to heal, reload weapons, combine objects or puzzle items, or read through previously collected documents. Not only is this more immersive (HUDs don’t exist for us in the real world; we need to look through our packs as well...), it also gets out of the way.

The main inventory screen. Shows everything you need to know, only when you need it. (I can hear this screenshot...)

I don’t need a visual element in the bottom corner showing me a list of “items” I can cycle through. I don’t want an ammo counter cluttering up my screen with information I only need to see in combat or while manually reloading. If those are pieces of information I need, I’ll explicitly open and look for them. Don’t make assumptions about what is important to me on screen. Capcom took this concept of less visual clutter even further in regards to maps and the character health status.

Where We’re Going, We Don’t Need Roads: Mini-Maps

A great deal of newer games come pre-packaged with a mini-map on the main interface. In certain instances this works just fine, but most are 100% UI clutter. Something to add to the screen. I can only assume some devs believe it is “helpful”. Most times it’s simply a distraction. Thank goodness most games allow you to disable them.

As for RE2, you collect maps throughout your adventure and, just like most other systems in the game, you need to consciously open the map menu to view them. You know, just like in real life. This creates a higher tension as well, since you need to constantly reference your map (on initial playthroughs) to figure out where the heck to go. You feel the pressure of someone frantically pulling out a physical map and scanning their surroundings. It also helps the player build a mental model in their head, thus providing even more of that sweet, sweet immersion.

The map of the Raccoon City Police Station.

No Pain, No Gain

The game doesn’t display any health bar or player status information. In order to view your current status (symbolized by “Fine”, “Caution” or “Danger”) you need to open your inventory screen. From here you can heal yourself (if needed) and see the status type change in real time.

The "condition" health status. This is fine.

But that isn’t the only way to visually see your current status. Here’s a scenario: you’re traveling down a hallway, turn a corner and run right into the arms of a zombie. She takes a couple good bites out of your neck before you push her aside. You unload some handgun rounds into her and down she goes. As you run over her body she reaches out and chomps on your leg as a final “goodbye”. You break free and move along but notice something different in your character’s movement - they’re holding their stomach and limping.

Here we can see the character "Hunk" holding his stomach and limping, indicating an injury without the need for a custom HUD element.

If this was your first time playing, you would most likely instinctively open the inventory menu, where your character’s health is displayed, and (in this instance) be greeted with a “Caution” status. This is another example of subtle UX design. I don’t need to know the health status of my character until an action is required (in this example: healing). The health system is out of the way but not hidden. This keeps the focus on immersion, not baby-sitting the game’s interface.

A Key to Every Lock

Hey! This section is in reference to the title of the article. We made it!

...But yes, discarding keys in RE2 is a subtle example of fantastic user experience. As a player, I know for certain this key is no longer needed. I can safely discard it and free up precious space in my inventory. There is also a sense of accomplishment, a feeling of “I’ve completed a task” or an internal checkbox being ticked. Progress has been made!

Don’t overlook how powerful an interaction this little text prompt is. Ask any veteran of the series and they will tell you this prompt is almost euphoric.

The game's prompt asking if you'd like to discard a useless key. Perfection.

Inspiring Greatness

RE2 is certainly not the first or last game to implement these “minimal” game systems. A more “modern” example is Dead Space (DS), along with its very faithful remake. In DS the character’s health is displayed directly on the character model itself, and a similar inventory screen is used to manage items. An ammo counter is visible but only when the player aims their weapon. Pretty great stuff and another masterpiece of survival horror.

In Dead Space, the character's health bar is set as part of their spacesuit.

The Point

I guess my main takeaway is that designers and developers should try their best to keep user experience intuitive. I know that sounds extremely generic but it is a lot more complex than one might think. Try to be as direct as possible while remaining subtle. It’s a delicate balance, but experiences like RE2 show us it is attainable. I’d love to talk more, but I have another play-through of RE2 to complete...

11 months ago 77 votes

More in programming

The Framework Desktop is a beast

I've been running the Framework Desktop for a few months here in Copenhagen now. It's an incredible machine. It's completely quiet, even under heavy, stress-all-cores load. It's tiny too, at just 4.5L of volume, especially compared to my old beautiful but bulky North tower running the 7950X — yet it's faster! And finally, it's simply funky, quirky, and fun!

In some ways, the Framework Desktop is a curious machine. Desktop PCs are already very user-repairable! So why is Framework even bringing their talents to this domain? In the laptop realm, they're basically alone with that concept, but in the desktop space, it's rather crowded already. Yet it somehow still makes sense. Partly because Framework has gone with the AMD Ryzen AI Max 395+, which is technically a laptop CPU. You can find it in the ASUS ROG Flow Z13 and the HP ZBook Ultra. Which means it'll fit in a tiny footprint, and Framework apparently just wanted to see what they could do in that form factor.

They clearly had fun with it. Look at mine:

There are 21 little tiles on the front that you can get in a bunch of different colors or with logos from Framework. Or you can 3D print your own! It's a welcome change in aesthetic from the brushed-aluminum or gamer-focused-RGB approach that most of the competition is taking.

But let's cut to the benchmarks. That's really why you'd buy a machine like the Framework Desktop. There are significantly cheaper mini PCs available from Beelink and others, but so far, Framework has the only AMD 395+ unit on sale that's completely silent (the GMKtec very much is not, nor is the Flow Z13). And for me, that's just a dealbreaker. I can't listen to roaring fans anymore.

Here's the key benchmark for me:

That's the only type of multi-core workload I really sit around waiting on these days, and the Framework Desktop absolutely crushes it. It's almost twice as fast as the Beelink SER8 and still a solid third faster than the Beelink SER9 too. Of course, it's also a lot more expensive, but you're clearly getting some multi-core bang for your buck here! It's an even more dramatic difference against the Macs: a solid 40% faster than the M4 Max and 50% faster than the M4 Pro!

Now some will say "that's just because Docker is faster on Linux," and they're not entirely wrong. Docker runs natively on Linux, so for this test, where the MySQL/Redis/ElasticSearch data stores run in Docker while Ruby and the app code runs natively, that's part of the answer. Last I checked, it was about 25% of the difference. But so what? Docker is an integral part of the workflow for tons of developers. We use it to be able to run different versions of MySQL, Redis, and ElasticSearch for different applications on the same machine at the same time. You can't really do that without Docker. So this is what Real World benchmarks reveal.

It's not just about having a Docker advantage, though. The AMD 395+ is also incredibly potent in raw CPU performance. Those 16 Zen 5 cores are running at 5.1GHz, and in Geekbench 6 multicore, this is how they stack up:

Basically matching the M4 Max! And a good chunk faster than the M4 Pro (as well as other AMDs and Intel's 14900K!). No wonder it's crazy quick with a full-core stress test like running 30,000 assertions for our HEY test suite.

To be fair, the M4s are faster in single-core performance. Apple holds the crown there. It's about 20%. And you'll see that in benchmarks like Speedometer, which mostly measures JavaScript single-core performance. The Framework Desktop puts out 670 vs 744 on the M4 Pro on Speedometer 2.1. On Speedometer 3.1, it's an even bigger difference, with 35 vs 50. But I've found that all these computers feel fast enough in single-core performance these days. I can't actually feel the difference browsing on a machine that does 670 vs 744 on SP2.1. Hell, I can barely feel the difference between the SER8, which does 506, and the M4 Pro! The only time I actually feel like I'm waiting on anything is in multi-core workloads like the HEY test suite, and here the AMD 395+ is very near the fastest you can get for a consumer desktop machine today at any price.

It gets even better when you bring price into the equation, though. The Framework Desktop with 64GB RAM + 2TB NVMe is $1,876. To get a Mac Studio with similar specs — M4 Max, 64GB RAM, 2TB NVMe — you'll literally spend nearly twice as much at $3,299! If you go for 128GB RAM, you'll spend $2,276 on the Framework, but $4,099 on the Mac. And it'll still be way slower for development work using Docker! The Framework Desktop is simply a great deal.

Speaking of 64GB vs 128GB, I've been running the 64GB version, and I almost never get anywhere close to the limits. I think the highest I've seen in regular use is about 20GB of RAM in action. Linux is really efficient. Especially when you're using a window manager like Hyprland, as we do in Omarchy. The only reason you really want to go for the full 128GB RAM is to run local LLM models. The AMD 395+ uses unified memory, like Apple, so nearly all of it is addressable by the GPU. That means you can run monster models, like the new 120b gpt-oss from OpenAI. Framework has a video showing them pushing out 40 tokens/second doing just that. That seems about in range of the numbers I've seen from the M4 Max, which also seem to be in the 40-50 token/second range, but I'll defer to folks who benchmark local LLMs for the exact details on that.

I tried running the new gpt-oss-20b on my 64GB machine, though, and I wasn't exactly blown away by the accuracy. In fact, I'd say it was pretty bad. I mean, exceptionally cool that it's doable, but very far off the frontier models we have access to as SaaS. So personally, this isn't yet something I actually use all that much in day-to-day development. I want the best models running at full speed, and right now that means SaaS.

So if you just want the best small computer that runs Linux superbly well out of the box, you should buy the Framework Desktop. It's completely quiet, fantastically fast, and super fun to look at. But I think it's also fair to mention that you can get something like a Beelink SER9 for half the price! Yes, it's also only 2/3 the performance in multi-core, but it's just as fast in single-core. Most developers could totally get away with the SER9 and barely notice what they were missing. But there are just as many people for whom the extra $1,000 is worth the price to run the test suite 40 seconds quicker! You know who you are.

Oh, before I close, I also need to mention that this thing is a gaming powerhouse. It basically punches about as hard as an RTX 4060! With an iGPU! That's kinda crazy. Totally new territory on the PC side for integrated graphics. ETA Prime has a video showing the same chip in the GMKtec running premier games at 1440p High settings at great frame rates. You can run most games under Linux these days too (thanks, Valve and Steam Deck!), but if you need to dual boot with Windows, the dual NVMe slots in the Framework Desktop come in very handy.

Framework did good with this one. AMD really blew it out of the water with the 395+. We're spoiled to have such incredible hardware available for Linux at such appealing discounts over similar stuff from Cupertino. What a great time to love open source software and tinker-friendly hardware!

19 hours ago 4 votes
Writing: Blog Posts and Songs

I was listening to a podcast interview with Jackson Browne (American singer/songwriter, political activist, and inductee into the Rock and Roll Hall of Fame), and the interviewer asked him how he approaches writing songs with social commentaries and critiques — something along the lines of: “How do you get from the New York Times headline on a social subject to the emotional heart of a song that matters to each individual?”

Browne discusses how if you’re too subtle, people won’t know what you’re talking about. And if you’re too direct, you run the risk of making people feel like they’re being scolded. Here’s what he says about his songwriting:

    I want this to sound like you and I were drinking in a bar and we’re just talking about what’s going on in the world. Not as if you’re at some elevated place and lecturing people about something they should know about but don’t but [you think] they should care. You have to get to people where [they are, where] they do care and where they do know.

I think that’s a great insight for anyone looking to have a connecting, effective voice. I know for me, it’s really easy to slide into a lecturing voice — you “should” do this and you “shouldn’t” do that. But I like Browne’s framing of trying to have an informal, conversational tone that meets people where they are. Like you’re discussing an issue in the bar, rather than listening to a sermon.

Chris Coyier is the canonical example of this that comes to mind. I still think of this post from CSS-Tricks where Chris talks about how to have submit buttons that go to different URLs:

    When you submit that form, it’s going to go to the URL /submit. Say you need another submit button that submits to a different URL. It doesn’t matter why. There is always a reason for things. The web is a big place and all that.

He doesn’t conjure up some universally applicable, justified rationale for why he’s sharing this method. Nor is there any pontificating on why this is “good” or “bad”. Instead, like most of Chris’ stuff, I read it as a humble acknowledgement of the practicalities at hand — “Hey, the world is a big place. People have to do crafty things to make their stuff work. And if you’re in that situation, here’s something that might help what ails ya.”

I want to work on developing that kind of a voice because I love reading voices like that.

Email · Mastodon · Bluesky

2 days ago 4 votes
Doing versus Delegating

A staff+ skill

2 days ago 7 votes
p-fast trie, but smaller

Previously, I wrote some sketchy ideas for what I call a p-fast trie, which is basically a wide fan-out variant of an x-fast trie. It allows you to find the longest matching prefix or nearest predecessor or successor of a query string in a set of names in O(log k) time, where k is the key length. My initial sketch was more complicated and greedy for space than necessary, so here’s a simplified revision. (“p” now stands for prefix.)

layout

A p-fast trie stores a lexicographically ordered set of names. A name is a sequence of characters from some small-ish character set. For example, DNS names can be represented as a set of about 50 letters, digits, punctuation and escape characters, usually one per byte of name. Names that are arbitrary bit strings can be split into chunks of 6 bits to make a set of 64 characters.

Every unique prefix of every name is added to a hash table. An entry in the hash table contains:

- A shared reference to the closest name lexicographically greater than or equal to the prefix. Multiple hash table entries will refer to the same name. A reference to a name might instead be a reference to a leaf object containing the name.
- The length of the prefix. To save space, each prefix is not stored separately, but implied by the combination of the closest name and prefix length.
- A bitmap with one bit per possible character, corresponding to the next character after this prefix. For every other prefix that matches this prefix and is one character longer than this prefix, a bit is set in the bitmap corresponding to the last character of the longer prefix.

search

The basic algorithm is a longest-prefix match. Look up the query string in the hash table. If there’s a match, great, done. Otherwise proceed by binary chop on the length of the query string. If the prefix isn’t in the hash table, reduce the prefix length and search again. (If the empty prefix isn’t in the hash table then there are no names to find.) If the prefix is in the hash table, check the next character of the query string in the bitmap. If its bit is set, increase the prefix length and search again. Otherwise, this prefix is the answer.

predecessor

Instead of putting leaf objects in a linked list, we can use a more complicated search algorithm to find names lexicographically closest to the query string. It’s tricky because a longest-prefix match can land in the wrong branch of the implicit trie. Here’s an outline of a predecessor search; successor requires more thought.

During the binary chop, when we find a prefix in the hash table, compare the complete query string against the complete name that the hash table entry refers to (the closest name greater than or equal to the common prefix). If the name is greater than the query string we’re in the wrong branch of the trie, so reduce the length of the prefix and search again. Otherwise search the set bits in the bitmap for one corresponding to the greatest character less than the query string’s next character; if there is one, remember it and the prefix length. This will be the top of the sub-trie containing the predecessor, unless we find a longer match. If the next character’s bit is set in the bitmap, continue searching with a longer prefix, else stop.

When the binary chop has finished, we need to walk down the predecessor sub-trie to find its greatest leaf. This must be done one character at a time – there’s no shortcut.

thoughts

In my previous note I wondered how the number of search steps in a p-fast trie compares to a qp-trie.
I have some old numbers measuring the average depth of binary, 4-bit, 5-bit, 6-bit, and dns qp-trie variants. A DNS-trie varies between 7 and 15 deep on average, depending on the data set. The number of steps for a search matches the depth for exact-match lookups, and is up to twice the depth for predecessor searches.

A p-fast trie is at most 9 hash table probes for DNS names, and unlikely to be more than 7. I didn’t record the average length of names in my benchmark data sets, but I guess they would be 8–32 characters, meaning 3–5 probes. Which is far fewer than a qp-trie, though I suspect a hash table probe takes more time than chasing a qp-trie pointer. (But this kind of guesstimate is notoriously likely to be wrong!) However, a predecessor search might need 30 probes to walk down the p-fast trie, which I think suggests a linked list of leaf objects is a better option.
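As an aside, here is a toy Python sketch of the layout and longest-prefix search described above. It is my own illustration, not part of the original design note: it skips the space-saving tricks (prefixes are stored directly as dict keys and the bitmap is a plain set of characters), but the binary chop over prefix lengths is the same.

# Toy model of the p-fast trie: every prefix of every name goes in a hash
# table, with the closest name >= that prefix and the set of next characters.
class PFastTrie:
    def __init__(self, names):
        self.table = {}  # prefix -> (closest name >= prefix, set of next chars)
        for name in sorted(names):
            for length in range(len(name) + 1):
                prefix = name[:length]
                if prefix not in self.table:
                    # first name seen in sorted order is the closest name >= prefix
                    self.table[prefix] = (name, set())
                if length < len(name):
                    self.table[prefix][1].add(name[length])

    def longest_prefix(self, query):
        # Longest prefix of `query` shared with some stored name, found with
        # O(log k) hash probes by binary chop on the prefix length.
        if query in self.table:
            return query
        if "" not in self.table:
            return None                    # no names stored at all
        lo, hi = 0, len(query) - 1         # invariant: query[:lo] is in the table
        while lo < hi:
            mid = (lo + hi + 1) // 2
            entry = self.table.get(query[:mid])
            if entry is None:
                hi = mid - 1               # prefix too long: shrink
            elif query[mid] in entry[1]:
                lo = mid                   # next character extends it: grow
            else:
                return query[:mid]         # in the table, not extendable: answer
        return query[:lo]

t = PFastTrie(["foo.example", "foobar.example", "www.example"])
assert t.longest_prefix("food.example") == "foo"
assert t.longest_prefix("www.example") == "www.example"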

2 days ago 4 votes
Software books I wish I could read

New Logic for Programmers Release!

v0.11 is now available! This is over 20% longer than v0.10, with a new chapter on code proofs, three chapter overhauls, and more! Full release notes here.

Software books I wish I could read

I’m writing Logic for Programmers because it’s a book I wanted to have ten years ago. I had to learn everything in it the hard way, which is why I’m ensuring that everybody else can learn it the easy way.

Books occupy a sort of weird niche in software. We’re great at sharing information via blogs and git repos and entire websites. These have many benefits over books: they’re free, they’re easily accessible, they can be updated quickly, they can even be interactive. But no blog post has influenced me as profoundly as Data and Reality or Making Software. There is no blog or talk about debugging as good as the Debugging book. It might not be anything deeper than "people spend more time per word on writing books than blog posts". I dunno.

So here are some other books I wish I could read. I don’t think any of them exist yet, but it’s a big world out there. Also, while they’re probably best as books, a website or a series of blog posts would be ok too.

Everything about Configurations

The whole topic of how we configure software, whether by CLI flags, environmental vars, or JSON/YAML/XML/Dhall files. What causes the configuration complexity clock? How do we distinguish between basic, advanced, and developer-only configuration options? When should we disallow configuration? How do we test all possible configurations for correctness? Why do so many widespread outages trace back to misconfiguration, and how do we prevent them?

I also want the same for plugin systems. Manifests, permissions, common APIs and architectures, etc. Configuration management is more universal, though, since everybody either uses software with configuration or has made software with configuration.

The Big Book of Complicated Data Schemas

I guess this would kind of be like Schema.org, except with a lot more on the "why" and not the what. Why is it important for the Volcano model to have a "smokingAllowed" field?1 I’d see this less as "here’s your guide to putting Volcanos in your database" and more "here’s recurring motifs in modeling interesting domains", to help a person see sources of complexity in their own domain. Does something crop up if the references can form a cycle? If a relationship needs to be strictly temporary, or a reference can change type?

Bonus: path dependence in data models, where an additional requirement leads to a vastly different ideal data model that a company couldn’t do because they made the old model. (This has got to exist, right? Business modeling is a big enough domain that this must exist. Maybe The Essence of Software touches on this? Man, I feel bad I haven’t read that yet.)

Computer Science for Software Engineers

Yes, I checked, this book does not exist (though maybe this is the same thing). I don’t have any formal software education; everything I know was either self-taught or learned on the job. But it’s way easier to learn software engineering that way than computer science. And I bet there’s a lot of other engineers in the same boat. This book wouldn’t have to be comprehensive or instructive: just enough about each topic to understand why it’s an area of study and appreciate how research in it eventually finds its way into practice.

MISU Patterns

MISU, or "Make Illegal States Unrepresentable", is the idea of designing system invariants in the structure of your data.
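Here is one rough sketch of what that can look like, in Python (my own illustration, not from the post): a Contact that must carry at least one of an email or a phone number, with the legal combinations encoded as separate types.

# Hypothetical sketch: a Contact needs at least one of email or phone.
# Instead of two nullable fields, each legal combination gets its own type.
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class EmailContact:
    name: str
    email: str

@dataclass(frozen=True)
class PhoneContact:
    name: str
    phone: str

@dataclass(frozen=True)
class EmailPhoneContact:
    name: str
    email: str
    phone: str

# No variant lacks both fields, so the illegal state cannot be constructed.
Contact = Union[EmailContact, PhoneContact, EmailPhoneContact]

def contact_email(c: Contact):
    # Callers branch on the variant instead of checking fields for None.
    if isinstance(c, (EmailContact, EmailPhoneContact)):
        return c.email
    return None

c = PhoneContact(name="Ada", phone="+1-555-0100")
assert contact_email(c) is None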
For example, if a Contact needs at least one of email or phone to be non-null, make it a sum type over EmailContact, PhoneContact, EmailPhoneContact (from this post). MISU is great. Most MISU in the wild look very different than that, though, because the concept of MISU is so broad there’s lots of different ways to achieve it. And that means there are "patterns": smart constructors, product types, properly using sets, newtypes to some degree, etc. Some of them are specific to typed FP, while others can be used in even untyped languages. Someone oughta make a pattern book.

My one request would be to not give them cutesy names. Do something like the Aarne–Thompson–Uther Index, where items are given names like "Recognition by manner of throwing cakes of different weights into faces of old uncles". Names can come later.

The Tools of '25

Not something I’d read, but something to recommend to junior engineers. Starting out, it’s easy to think the only bit that matters is the language or framework and not realize the enormous amount of surrounding tooling you’ll have to learn. This book would cover the basics of tools that enough developers will probably use at some point: git, VSCode, very basic Unix and bash, curl. Maybe the general concepts of tools that appear in every ecosystem, like package managers, build tools, task runners. That might be easier if we specialize this to one particular domain, like webdev or data science. Ideally the book would only have to be updated every five years or so. No LLM stuff because I don’t expect the tooling will be stable through 2026, to say nothing of 2030.

A History of Obsolete Optimizations

Probably better as a really long blog series. Each chapter would be broken up into two parts:

- A deep dive into a brilliant, elegant, insightful historical optimization designed to work within the constraints of that era’s computing technology
- What we started doing instead, once we had more compute/network/storage available. c.f. A Spellchecker Used to Be a Major Feat of Software Engineering.

Bonus topics would be brilliance obsoleted by standardization (like what people did before git and json were universal), optimizations we do today that may not stand the test of time, and optimizations from the past that did.

Sphinx Internals

I need this. I’ve spent so much goddamn time digging around in Sphinx and docutils source code I’m gonna throw up.

Systems Distributed Talk Today!

The online premiere is at noon central / 5 PM UTC, here! I’ll be hanging out to answer questions and be awkward. You ever watch a recording of your own talk? It’s real uncomfortable!

1. In this case because it’s a field on one of Volcano’s supertypes. I guess schemas gotta follow LSP too ↩

2 days ago 9 votes