More from Posts on Made of Bugs
Suppose we have a large collection of documents, and we wish you identify which documents are approximately the same as each other. For instance, we may have crawled the web over some period of time, and expect to have fetched the “same page” several times, but to see slight differences in metadata, or that we have several revisions of a page following small edits. In this post I want to explore the method of approximate deduplication via Jaccard similarity and the MinHash approximation trick.
I worked at Stripe for about seven years, from 2012 to 2019. Over that time, I used and contributed to many generations of Stripe’s developer environment – the tools that engineers used daily to write and test code. I think Stripe did a pretty good job designing and building that developer experience, and since leaving, I’ve found myself repeatedly describing features of that environment to friends and colleagues. This post is an attempt to record the salient features of that environment as I remember it.
I was recently introduced to the paper “Seeing the Invisible: Perceptual-Cognitive Aspects of Expertise” by Gary Klein and Robert Hoffman. It’s excellent and I recommend you read it when you have a chance. Klein and Hoffman discuss the ability of experts to “see what is not there”: in addition to observing data and cues that are present in the environment, experts perceive implications of these cues, such as the absence of expected or “typical” information, the typicality or atypicality of observed data, and likely/possible past and future time trajectories of a system based on a point-in-time snapshot or limited duration of observation.
This December, the imp of the perverse struck me, and I decided to see how many days of Advent of Code I could do purely in compile-time C++ metaprogramming. As of this writing, I’ve done two days, and I’m not sure I’ll make it any further. However, that’s one more day than I planned to do as of yesterday, which is in turn further than I thought I’d make it after my first attempt.
More in technology
Waymo’s factory, a map of US land values, ships in the Arctic Circle, battery industry trends, and more.
What `git config` settings should be defaults by now? Here are some settings that even the core developers change.
Dean Burnett writing for Science Focus: Your Brain Is Hard-Wired to Avoid Exercise. Here's Why So why, even though we’ve evolved to do it, doesn't everyone enjoy exercise? The baffling complexity of the human brain is to blame. Evolving an ability doesn’t
It’s been fantastic being in the Philippines for this year’s WordCamp Asia. We have attendees from 71 countries, over 1,800 tickets sold, and contributor day had over 700 people! It’s an interesting contrast to US and EU WordCamps as well in that the audience is definitely a lot younger, and there’s very little interest in … Continue reading WordCamp Asia and Maha Kumbh Mela →
I’m not trying to send any hate at a random MacRumors forum commentor who I don’t know, so please take this more as me explaining why there’s more skepticism about the Vision Pro than other gen-1 Apple products. From the post: BTW, I also