More from Posts on Made of Bugs
About a month ago, the CPython project merged a new implementation strategy for their bytecode interpreter. The initial headline results were very impressive, showing a 10-15% performance improvement on average across a wide range of benchmarks and a variety of platforms. Unfortunately, as I will document in this post, these impressive performance gains turned out to be primarily due to inadvertently working around a regression in LLVM 19. When benchmarked against a better baseline (such as GCC, clang-18, or LLVM 19 with certain tuning flags), the performance gain drops to 1-5% or so depending on the exact setup.
Earlier this month, I used Claude to port (parts of) an Emacs package into Rust, shrinking the execution time by a factor of 1000 or more (in one concrete case: from 90s to about 15ms). This is a variety of yak-shave that I do somewhat routinely, both professionally and in service of my personal computing environment. However, this time, Claude was able to execute substantially the entire project under my supervision, with me writing almost no code myself, which sped up the project considerably compared to doing it by hand.
Suppose we have a large collection of documents, and we wish to identify which documents are approximately the same as each other. For instance, we may have crawled the web over some period of time, and expect to have fetched the “same page” several times, but to see slight differences in metadata, or that we have several revisions of a page following small edits. In this post I want to explore the method of approximate deduplication via Jaccard similarity and the MinHash approximation trick.
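As a rough illustration of the idea named in this excerpt, here is a minimal sketch of Jaccard similarity and a MinHash estimate of it. It is not taken from the post itself: the function names (`shingles`, `jaccard`, `minhash_signature`, `estimate_jaccard`), the 3-word shingle size, the use of 128 seeded BLAKE2b hashes, and the sample sentences are all assumptions made for this example.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles in text (k=3 is an arbitrary choice)."""
    words = text.split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def jaccard(a, b):
    """Exact Jaccard similarity: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def _hash(item, seed):
    """64-bit hash of item under a given seed (a stand-in for a random permutation)."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=128):
    """For each seeded hash function, keep the minimum hash over the (non-empty) set."""
    return [min(_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of signature positions that agree estimates the Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


doc_a = shingles("the quick brown fox jumps over the lazy dog")
doc_b = shingles("the quick brown fox jumps over the lazy cat")
print("exact:   ", jaccard(doc_a, doc_b))
print("estimate:", estimate_jaccard(minhash_signature(doc_a), minhash_signature(doc_b)))
```

The point of the signature is that it has a fixed size regardless of document length, so two documents can be compared without holding both full shingle sets; more hash functions give a tighter estimate at the cost of more work per document.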
I worked at Stripe for about seven years, from 2012 to 2019. Over that time, I used and contributed to many generations of Stripe’s developer environment – the tools that engineers used daily to write and test code. I think Stripe did a pretty good job designing and building that developer experience, and since leaving, I’ve found myself repeatedly describing features of that environment to friends and colleagues. This post is an attempt to record the salient features of that environment as I remember it.
I was recently introduced to the paper “Seeing the Invisible: Perceptual-Cognitive Aspects of Expertise” by Gary Klein and Robert Hoffman. It’s excellent and I recommend you read it when you have a chance. Klein and Hoffman discuss the ability of experts to “see what is not there”: in addition to observing data and cues that are present in the environment, experts perceive implications of these cues, such as the absence of expected or “typical” information, the typicality or atypicality of observed data, and likely/possible past and future time trajectories of a system based on a point-in-time snapshot or limited duration of observation.
More in technology
Since my last piece about Bluesky, I’ve been using the service a lot more. Just about everyone I followed on other services is there now, and it’s way more fun than late-stage Twitter ever was. Halifax is particularly into Bluesky, which reminds me of our local scene during the late-2000s/early-2010s Twitter era.
That said, I still have reservations about the service, primarily around the whole decentralized/federated piece. The Bluesky team continues to work toward the goal of creating a decentralized and open protocol, but they’ve got quite a way to go.
Part of my fascination with Bluesky is due to its radical openness. There is no similar service that allows users unauthenticated access to the firehose, or that publishes in-depth stats around user behaviour and retention. I like watching numbers go up, so I enjoy following those stats and collecting some of my own.
A few days ago I noticed that the rate of user growth was accelerating. Growth had dropped off steadily since late January. As of this writing, around 5 users a second are signing up for the service. This was happening around the same time as tariff news was dropping, but that didn’t seem like a major driver.
It turned out that the bigger cause was a new TikTok-like video sharing app called Skylight Social. I was a bit behind on tech news, so I missed when TechCrunch covered the app. It’s gathered more steam since then, and today is one of the highest days for new Bluesky signups since the US election.
As per the TechCrunch story, Skylight has been given some initial funding by Mark Cuban. It’s also selling itself as “decentralized” and “unbannable”. I’m happy for their success, especially given how unclear the TikTok situation is, but I continue to feel like everyone’s getting credit for work they haven’t done yet.
Skylight Social goes out of its way to say that it’s powered by the AT Protocol. They’re not lying, but I think it’s truer at the moment to say that the app is powered by Bluesky. In fact, the first thing you see when launching the app is a prompt to sign up for a “BlueSky” account [1] if you don’t already have one. The Bluesky team are working on better ways to handle this, but it’s work that isn’t completed. At the moment, Skylight is not decentralized.
I decided to sign up and test the service out, but this wasn’t a smooth experience. I started by creating an App Password, and tried logging in using the “Continue with Bluesky” button. I used both my username and email address along with the app password, but both failed with a “wrong identifier or password” error. I saw a few other people having the same issue. It wasn’t until later that I tried using the “Sign in to your PDS” route, which ended up working fine. The only issue: I don’t run my own PDS! I just use a custom domain name on top of Bluesky’s first-party PDS. In fact, it looks like third-party PDSs might not even be supported at the moment.
Even if/when you can sign up with a third-party PDS, this is just a data storage and authentication platform. You’re still relying on Skylight and Bluesky’s services to shuttle the data around and show it to you.
I’m not trying to beat up on Skylight specifically. I want more apps to be built with open standards, and I think TikTok could use a replacement, especially given that something is about to happen tomorrow. I honestly wish them luck! I just think the “decentralized” and “unbannable” copy on their website should currently be taken with a shaker or two of salt.
[1] I don’t know why, but seeing “BlueSky” camel-cased drives me nuts. Most of the Skylight Social marketing material doesn’t make this mistake, but I find it irritating to see during the first launch experience.
Nintendo Life: “Nintendo Delays Switch 2 Pre-Orders in the US Amidst New Trump Tariffs”. Nintendo has delayed pre-orders for the Switch 2 in the US while it evaluates the potential impact of new tariffs from the Trump administration. And: “A $2,300 Apple iPhone? Trump tariffs could make that happen.”
China’s sulfur emissions, Japan’s new semiconductor effort, declining sunbelt housing construction, water competition in Texas, and more.
A Minnesota cybersecurity and computer forensics expert whose testimony has featured in thousands of courtroom trials over the past 30 years is facing questions about his credentials and an inquiry from the Federal Bureau of Investigation (FBI). Legal experts say the inquiry could be grounds to reopen a number of adjudicated cases in which the expert's testimony may have been pivotal.
What's that Skippy? Another Ivanti Connect Secure vulnerability? At this point, regular readers will know all about Ivanti (and a handful of other vendors of the same class of devices), from our regular analysis. Do you know the fun things about these posts? We can copy text from