Most of the projects I've been working on today have fairly strict code review policies. My work requires code review on most of our code, and as we bring on an army of interns for the summer, I've been responsible for reviewing lots of code. Additionally, about five months ago BarnOwl, the console-based IM client I develop, adopted an official pre-commit review policy. And I have a confession to make: I hate mandatory code review.
over a year ago


More from Posts on Made of Bugs

Performance of the Python 3.14 tail-call interpreter

About a month ago, the CPython project merged a new implementation strategy for their bytecode interpreter. The initial headline results were very impressive, showing a 10-15% performance improvement on average across a wide range of benchmarks on a variety of platforms. Unfortunately, as I will document in this post, these impressive performance gains turned out to be primarily due to inadvertently working around a regression in LLVM 19. When benchmarked against a better baseline (such as GCC, clang-18, or LLVM 19 with certain tuning flags), the performance gain drops to 1-5% or so depending on the exact setup.

3 days ago 4 votes
Building personal software with Claude

Earlier this month, I used Claude to port (parts of) an Emacs package into Rust, shrinking the execution time by a factor of 1000 or more (in one concrete case: from 90s to about 15ms). This is a variety of yak-shave that I do somewhat routinely, both professionally and in service of my personal computing environment. However, this time, Claude was able to execute substantially the entire project under my supervision with me writing almost no code myself, completing the project considerably faster than doing it by hand.

a month ago 22 votes
Finding near-duplicates with Jaccard similarity and MinHash

Suppose we have a large collection of documents, and we wish to identify which documents are approximately the same as each other. For instance, we may have crawled the web over some period of time, and expect to have fetched the “same page” several times, but to see slight differences in metadata, or to have several revisions of a page following small edits. In this post I want to explore the method of approximate deduplication via Jaccard similarity and the MinHash approximation trick.
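The core idea can be sketched briefly: shingle each document into a set, then keep, for each of many independent hash functions, the minimum hash value over that set; the fraction of positions where two signatures agree estimates the Jaccard similarity. This is a minimal illustrative sketch, not the post's implementation; the shingle size, hash family (seeded `blake2b` here), and signature length are all assumptions.

```python
import hashlib

def shingles(text, k=5):
    """Break a document into the set of its character k-grams."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}

def minhash_signature(shingle_set, num_hashes=128):
    """One signature entry per seeded hash function: the minimum
    hash value over the whole shingle set."""
    sig = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(16, "little")  # blake2b accepts up to 16 salt bytes
        sig.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode(), salt=salt).digest()[:8],
                "little")
            for s in shingle_set))
    return sig

def estimated_jaccard(sig_a, sig_b):
    """P(min-hashes agree) = |A ∩ B| / |A ∪ B|, so the agreement
    rate across positions estimates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

With 128 hash functions the estimate's standard error is roughly `sqrt(J(1-J)/128)`, so near-duplicates score close to 1 while unrelated documents score near 0.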

8 months ago 21 votes
Stripe's monorepo developer environment

I worked at Stripe for about seven years, from 2012 to 2019. Over that time, I used and contributed to many generations of Stripe’s developer environment – the tools that engineers used daily to write and test code. I think Stripe did a pretty good job designing and building that developer experience, and since leaving, I’ve found myself repeatedly describing features of that environment to friends and colleagues. This post is an attempt to record the salient features of that environment as I remember it.

9 months ago 14 votes
Performance engineering, profilers, and seeing the invisible

I was recently introduced to the paper “Seeing the Invisible: Perceptual-Cognitive Aspects of Expertise” by Gary Klein and Robert Hoffman. It’s excellent and I recommend you read it when you have a chance. Klein and Hoffman discuss the ability of experts to “see what is not there”: in addition to observing data and cues that are present in the environment, experts perceive implications of these cues, such as the absence of expected or “typical” information, the typicality or atypicality of observed data, and likely/possible past and future time trajectories of a system based on a point-in-time snapshot or limited duration of observation.

a year ago 16 votes

More in technology

This Arduino device helps ‘split the G’ on a pint of Guinness

Guinness is one of those beers (specifically, a stout) that people take seriously, and the Guinness brand has taken full advantage of that in their marketing. They even sell a glass designed specifically for enjoying their flagship creation, which has led to a trend that the company surely appreciates: “splitting the G.” But that’s difficult […] The post This Arduino device helps ‘split the G’ on a pint of Guinness appeared first on Arduino Blog.

11 hours ago 2 votes
Why Website Taxonomies Drift (and What to Do about It)

AI is everywhere, but most websites are still managed manually by humans using content management systems like WordPress and Drupal. These systems provide means for tagging and categorizing content. But over time, these structures degrade. Without vigilance and maintenance, taxonomies become less useful and relevant. Users struggle to find stuff. Ambiguity creeps in. Search results become incomplete and unreliable. And as terms proliferate, the team struggles to maintain the site, making things worse. The site stops working as well as it could. Sales, engagement, and trust suffer. And the problem only gets worse over time. Eventually, the team embarks on a redesign. But hitting the reset button only fixes things for a while.

Entropy is the nature of things. Systems tend toward disorder unless we invest in keeping them organized. But it’s hard: small teams have other priorities. They’re under pressure to publish quickly. Turnover is high. Not ideal conditions for consistent tagging. Many content teams don’t have governance processes for taxonomies. Folks create new terms on the fly, often without checking whether similar ones exist. But even when teams have the structures and processes needed to do it right, content and taxonomies themselves change over time as the org’s needs and contexts evolve. The result is taxonomy drift, the gradual misalignment of the system’s structures and content. It’s a classic “boiled frog” situation: since it happens slowly, teams don’t usually recognize it until symptoms emerge. By then, the problem is harder and more expensive to fix.

Avoiding taxonomy drift calls for constant attention and manual tweaking, which can be overwhelming for resource-strapped teams. But there’s good news on the horizon: this is exactly the kind of gradual, large-scale, boring challenge where AIs can shine. I’ve worked on IA redesigns for content-heavy websites and have seen the effects of taxonomy drift firsthand. Often, one person is responsible for keeping the website organized, and they’re overwhelmed. After a redesign, they face three challenges: implementing the new taxonomy on the older corpus; learning to use the new taxonomy in their workflows; and adapting and evolving the taxonomy so it remains useful and consistent over time.

AI is well-suited to tackling these challenges. LLMs excel at pattern matching and categorizing existing text at scale. Unlike humans, AIs don’t get overwhelmed or bored when categorizing thousands of items over and over again. And with predefined taxonomies, they’re not as prone to hallucinations. I’ve been experimenting with using AI to solve taxonomy drift, and the results are promising. I’m building a product to tackle this issue, and looking to implement the approach in real-world scenarios. If you or someone you know is struggling to keep a content-heavy website organized, please get in touch.
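One small piece of the governance problem, checking whether a similar term already exists before a new one is created, needs no AI at all. The sketch below is a hypothetical helper (not from the post) that maps a proposed tag onto a controlled vocabulary via fuzzy matching, so near-duplicate terms like “eCommerce” and “E-commerce” don’t proliferate; the cutoff and normalization rules are assumptions.

```python
import difflib

def normalize_tag(proposed, taxonomy, cutoff=0.8):
    """Return (canonical_term, matched): the existing taxonomy term the
    proposed tag should collapse into, or the proposal itself with
    matched=False if it appears to be genuinely new."""
    canon = proposed.strip().lower().replace("-", " ")
    # Index normalized forms back to the taxonomy's canonical spellings.
    index = {t.lower().replace("-", " "): t for t in taxonomy}
    if canon in index:
        return index[canon], True
    # Fuzzy match against existing terms before admitting a new one.
    close = difflib.get_close_matches(canon, index.keys(), n=1, cutoff=cutoff)
    if close:
        return index[close[0]], True
    return proposed.strip(), False
```

A drift-prone workflow would call this at tagging time and flag unmatched terms for human review, rather than silently adding them to the vocabulary.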

21 hours ago 1 votes
Why are sine waves so common?

A simple question that takes some effort to answer in a satisfying way.

yesterday 4 votes
Intel and the New Millenium

Losing the performance crown

2 days ago 4 votes
Apple might be cooking this fall

Tim Hardwick reporting on Gurman’s reporting in Bloomberg, which I don’t have access to, so I’m quoting the MacRumors article: While specific details are scarce, it's supposedly the biggest update to iOS since iOS 7, and the biggest update to macOS since

2 days ago 2 votes