More from pcloadletter
I write this blog because I enjoy writing. Some people enjoy reading what I write, which makes me feel really great! Recently, I took down a post and stopped writing for a few months because I didn't love the reaction I was getting on social media sites like Reddit and Hacker News.

On these social networks, there seems to be an epidemic of "gotcha" commenters, contrarians, and know-it-alls. No matter what you post, you can be sure that folks will come with their sharpest pitchforks to try to skewer you. I'm not sure exactly what it is about those two websites in particular. I suspect it's the gamification of the comment system (more upvotes = more points = dopamine hit). Unfortunately, it seems the easiest way to win points on these sites is to tear down the original content.

At any rate, I really don't enjoy bad faith Internet comments, and I have a decent-enough following outside of these social networks that I don't really have to endure them. Some might argue I need thicker skin. I don't think that's really true: your experience on the Internet is what you make of it. You don't have to participate in parts of it if you don't want to. Also, I know many of you reading this post (likely RSS subscribers at this point) came from Reddit or Hacker News in the first place. I don't mean to insult you or to suggest that everyone, or even the majority of users, on these sites is acting in bad faith. Still, I have taken a page from Tom MacWright's playbook and decided to add a bit of javascript to my website that helpfully redirects users from these two sites elsewhere:

```js
try {
  const bannedReferrers = [/news\.ycombinator\.com/i, /reddit\.com/i];

  if (document.referrer) {
    const ref = new URL(document.referrer);
    if (bannedReferrers.some((r) => r.test(ref.host))) {
      window.location.href = "https://google.com/";
    }
  }
} catch (e) {}
```

After implementing this redirect, I feel a lot more energized to write! I'm no longer worried about having to endlessly caveat my work for fear of getting bludgeoned on social media. I'm writing what I want to write and, for those of you here to join me, I say thank you!
The older I get, the more I dislike clever code. This is not a controversial take; it is pretty well agreed upon that clever code is bad. But I particularly like the on-call responsibility framing: write code that you can understand when you get paged at 2am. If you have never been lucky enough to get paged at 2am, I'll paint the picture for you:

A critical part of the app is down. Your phone starts dinging on your nightstand next to you. You wake up with a start, not quite sure who you are or where you are. You put on your glasses and squint at the way-too-bright screen of your phone. It's PagerDuty. "Oh shit," you think.

You pop open your laptop, open the PagerDuty web app, and read the alert. You go to your telemetry and logging systems and figure out approximately where in the codebase the issue is. You open your IDE and start sweating: "I have no idea what the hell any of this code means." The git blame shows you wrote the code 2 years ago. You thought that abstraction was pretty clever at the time, but now you're paying a price: your code is inscrutable to an exhausted, stressed version of yourself who just wants to get the app back online.

Reasons for clever code #

There are a few reasons for clever code that I have seen over my career.

Thinking clever code is inherently good #

I think at some point a lot of engineers end up in a place where they become very skilled in a language before they understand the importance of writing clean, readable code. Consider the following two javascript snippets:

snippet 1

```js
const sum = items.reduce(
  (acc, el) => (typeof el === "number" ? acc + el : acc),
  0
);
```

snippet 2

```js
let sum = 0;
for (const item of items) {
  if (typeof item === "number") {
    sum = sum + item;
  }
}
```

At one point in my career, I would have assumed the first snippet was superior: fewer lines and uses the reduce method! But I promise far more engineers can very quickly and easily understand what's going on in the second snippet. I would much rather have the second snippet in my codebase any day.

Premature abstraction #

Premature abstractions tend to be pretty common in object-oriented languages. This stackexchange answer made me laugh quite a bit, so I'll use it as an example. Let's say you have a system with employee information. Well, perhaps you decide employees are types of humans, so we'd better have a human class, and humans are a type of mammal, so we'd better have a mammal class, and so on. All of a sudden, you might have to navigate several layers up to the animal class to see an employee's properties and methods. As the stackexchange answer succinctly put it:

As a result, we ended up with code that really only needed to deal with, say, records of employees, but were carefully written to be ready if you ever hired an arachnid or maybe a crustacean.

DRY dogma #

Don't Repeat Yourself (DRY) is a coding philosophy where you try to minimize the amount of code repeated in your software. In theory, even repeating code once increases the chance that you'll miss updating the code in both places or end up with inconsistent behavior when you have to implement the code somewhere else. In practice, DRYing up code can sometimes be complex. Perhaps there is a little repeated code shared between client and server. Do we need to create a way to share this logic? If it's only one small instance, it simply may not be worth the complexity of sharing logic. If this is going to be a common issue in the codebase, then perhaps centralizing the logic is worth it.
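To make that concrete, here is a small hypothetical sketch (not from the original post): the client and the server both validate usernames with the same one-line rule. With a single duplicated function, copying it is probably fine; a shared module only starts paying for itself once the rule shows up in more places.

```js
// Hypothetical example: the same tiny rule duplicated on client and server.

// client/validate.js
function isValidUsername(name) {
  return /^[a-z0-9_]{3,20}$/i.test(name);
}

// server/validate.js (the exact same rule, copied)
function isValidUsername(name) {
  return /^[a-z0-9_]{3,20}$/i.test(name);
}

// The DRY alternative is a shared module both sides import, e.g.
//   import { isValidUsername } from "./shared/validate.js";
// Whether that's worth the extra packaging and build complexity is the judgment call.
```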
But importantly, we can't just assume that one instance of repeated code means we must eliminate the redundancy.

What should we aim for instead? #

There's definitely a balance to be struck. We can't have purely dumb code with no abstractions: that ends up being pretty error-prone. Imagine you're working with an API that has some set of required headers. Forcing all engineers to remember to include those headers with every API call is error-prone.

file1

```js
fetch("/api/users", {
  headers: {
    Authorization: `Bearer ${token}`,
    AppVersion: version,
    XsrfToken: xsrfToken,
  },
});

fetch(`/api/users/${userId}`, {
  headers: {
    Authorization: `Bearer ${token}`,
    AppVersion: version,
    XsrfToken: xsrfToken,
  },
});
```

file2

```js
fetch("/api/transactions", {
  headers: {
    Authorization: `Bearer ${token}`,
    AppVersion: version,
    XsrfToken: xsrfToken,
  },
});
```

file3

```js
fetch("/api/settings", {
  headers: {
    Authorization: `Bearer ${token}`,
    AppVersion: version,
    XsrfToken: xsrfToken,
  },
});
```

Furthermore, having to track down every instance of that API call to update the headers (or any other required info) could be challenging. In this instance, it makes a lot of sense to create some kind of API service that encapsulates the header logic:

service

```js
function apiRequest(url, { headers = {}, ...rest } = {}) {
  // Merge the required headers into every request so callers can't forget them.
  return fetch(url, {
    ...rest,
    headers: {
      ...headers,
      Authorization: `Bearer ${token}`,
      AppVersion: version,
      XsrfToken: xsrfToken,
    },
  });
}
```

file1

```js
apiRequest("/api/users");
apiRequest(`/api/users/${userId}`);
```

file2

```js
apiRequest("/api/transactions");
```

file3

```js
apiRequest("/api/settings");
```

The apiRequest function is a pretty helpful abstraction. It helps that it is a very minimal abstraction: just enough to prevent future engineers from making mistakes but not so much that it's confusing. These kinds of abstractions, however, can get out of hand. I have seen code where making a request looks something like this:

```js
const API_PATH = "api";
const USER_PATH = "user";
const TRANSACTIONS_PATH = "transactions";
const SETTINGS_PATH = "settings";

createRequest(
  endpointGenerationFn,
  [API_PATH, USER_PATH],
  getHeaderOverrides("authenticated")
);
createRequest(
  endpointGenerationFn,
  [API_PATH, USER_PATH, userId],
  getHeaderOverrides("authenticated")
);
```

There's really no need for this. You're not saving all that much by making variables instead of using strings for paths. In fact, this ends up making it really hard for someone debugging the code to search! Typically, I'd look for the string "api/user" in my IDE to try to find the location of the request. Would I be able to find it with this abstraction? Would I be able to find it at 2am? Furthermore, passing an endpoint-generation function that consumes the path parts seems like overkill and may be inscrutable to more junior engineers (or, again, 2am you).

Keep it as simple as possible #

So I think in the end my message is to keep your code as simple as possible. Don't create some abstraction that may or may not be needed eventually. Weigh the maintenance value of DRYing up parts of your codebase against readability.
I have noticed a trend in a handful of products I've worked on at big tech companies. I have friends at other big tech companies who have noticed a similar trend: the products are kind of crummy. Here are some experiences that I have often encountered:

- the UI is flaky and/or unintuitive
- there is a lot of cruft in the codebase that has never been cleaned up
- bugs have "acceptable" workarounds that never get fixed
- packages/dependencies are badly out of date
- the developer experience is crummy (bad build times, easily breakable processes)

One of the reasons I have found for these issues is that we simply aren't investing enough time to increase product quality: we have poor or nonexistent quality metrics, invest minimally in testing infrastructure (and actually writing tests), and don't invest in improving the inner loop. But why is this? My experience has been that quality is simply a hard sell in big tech.

Let's first talk about something that's an easy sell right now: AI everything. Why is this an easy sell? Well, Microsoft could announce they put ChatGPT in a toaster and their stock price would jump $5/share. The sad truth is that big tech is hyper-focused on doing the things that make their stock prices go up in the short term. It's hard to make this connection with quality initiatives. If your software is slightly less shitty, the stock price won't jump next week. So instead of being able to sell the obvious benefit of shiny new features, you need to have an Engineering Manager willing to risk having lower impact for the sake of having a better product. Even if there is broad consensus in your team, group, or org that these quality improvements are necessary, there's a point up the corporate hierarchy where it simply doesn't matter to them. Certainly not as much as shipping some feature to great fanfare.

Part of a bigger strategy? #

Cory Doctorow has said some interesting things about enshittification in big tech: "enshittification is a three-stage process: first, surpluses are allocated to users until they are locked in. Then they are withdrawn and given to business-customers until they are locked in. Then all the value is harvested for the company's shareholders, leaving just enough residual value in the service to keep both end-users and business-customers glued to the platform."

At a macro level, it's possible this is the strategy: hook users initially, make them dependent on your product, cram in superficial features that make the stock go up but don't offer real value, and keep the customers simply because they really have no choice but to use your product (an enterprise Office 365 customer probably isn't switching anytime soon). This does seem to have been a good strategy in the short term: look at Microsoft's stock ever since they started cranking out AI everything. But how can the quality corner-cutting work long-term?

I hope the hubris will backfire #

Something will have to give. Big tech products can't just keep getting shittier, can they? I'd like to think some smaller competitors will come eat their lunch, but I'm not sure. Hopefully we're not all too entrenched in the big tech ecosystem for this to happen.
Coding interviews are controversial. It can be unpleasant to code in front of someone else, knowing you're being judged. And who likes failing? Especially when it feels like you failed intellectually. But coding interviews are effective.

One big criticism of coding interviews is that they end up filtering out a lot of great candidates. It's true: plenty of great developers don't do well in coding interviews. Maybe they don't perform well under pressure. Or perhaps they don't have time (or desire) to cram leetcode. So if this is happening, then how can coding interviews be effective?

Minimizing risk #

Coding interviews are optimized towards minimizing risk, and hiring a bad candidate is far worse than not hiring a good candidate. In other words, the hiring process is geared towards minimizing false positives, not false negatives.

The truth is, there are typically a bunch of good candidates who apply for a job. There are also not-so-great candidates. As long as a company hires one of the good ones, they don't really care if they lose all the rest of the good ones. They just need to make sure they don't hire one of the not-so-great ones. Coding interviews are a decent way to screen out the false positives. Watching someone solve coding challenges gives you some assurance that they can, well, code.

Why I myself like coding interviews #

Beyond why coding interviews are beneficial for the company, I actually enjoy them as an interviewer. It's not that I like making people uncomfortable or judging them (I don't), but rather I like seeing how potential future colleagues think. How do they think about problems? Do they plan their solution or just jump in? This is a person with whom I'll be working closely. How do they respond to their code being scrutinized? Do I feel comfortable having to "own" their code?

On automated online assessments (OAs) #

The junior developer market right now is extremely competitive, and therefore it is common to use automated coding challenges (OAs) as an initial screen. OAs kind of accomplish the false positive filtering mentioned above, but that assumes candidates aren't cheating. But some are. So you're filtering your candidate pool down to good candidates and dishonest candidates. Maybe that's worth it? Additionally, OAs don't give you any chance to interact with candidates, so you get no sense of what they'd really be like to work with. All in all, I'm not a fan of OAs.

Far from perfect #

Coding interviews are far from perfect. They're a terrible simulation of actual working conditions. They favor individuals who have time to do the prep work (e.g., grind leetcode). They're subject to myriad biases of the interviewer. But there's a reason companies still use them: they're effective in minimizing hiring risk for the company. And to them, that's the ball game.
More in science
The future of Ukraine, of Europe, freedom of speech, and Germany’s economy
By screening films in a brain scanner, neuroscientists discovered a rich library of neural scripts — from a trip through an airport to a marriage proposal — that form scaffolds for memories of our experiences. The post How ‘Event Scripts’ Structure Our Personal Memories first appeared on Quanta Magazine
I am fascinated by the technologies that live largely behind the scenes. These are not generally consumer devices, but they may be components of consumer products, or may largely have a role in industry – but they make our modern world possible, or make it much better. In addition I think that material science is […] The post Thermoelectric Cooling – It’s Cooler Than You Think first appeared on NeuroLogica Blog.
It is estimated that 1.5 million pairs of waders breed in Iceland, most of which spend the winter in West Europe and West Africa. There is a lot of guesswork associated with this number and little national monitoring information to assess whether species are doing well or badly. In this context, a 2025 paper in … Continue reading Iceland’s waders in decline
Shortly after the Trump administration took office in the United States in late January, more than 8,000 pages across several government websites and databases were taken down, the New York Times found. Though many of these have now been restored, thousands of pages were purged of references to gender and diversity initiatives, for example, and others, including the U.S. Agency for International Development (USAID) website, remain down. By 11 February, a federal judge ruled that the government agencies must restore public access to pages and datasets maintained by the Centers for Disease Control and Prevention (CDC) and the Food and Drug Administration (FDA).

While many scientists fled to online archives in a panic, ironically, the Justice Department had argued that the physicians who brought the case were not harmed because the removed information was available on the Internet Archive’s Wayback Machine. In response, a federal judge wrote, “The Court is not persuaded,” noting that a user must know the original URL of an archived page in order to view it. The administration’s legal argument “was a bit of an interesting accolade,” says Mark Graham, director of the Wayback Machine, who believes the judge’s ruling was “apropos.”

Over the past few weeks, the Internet Archive and other archival sites have received attention for preserving government databases and websites. But these projects have been ongoing for years. The Internet Archive, for example, was founded as a nonprofit dedicated to providing universal access to knowledge nearly 30 years ago, and it now records more than a billion URLs every day, says Graham. Since 2008, the Internet Archive has also hosted an accessible copy of the End of Term Web Archive, a collaboration that documents changes to federal government sites before and after administration changes. In the most recent collection, it has already archived more than 500 terabytes of material.

Complementary Crawls

The Internet Archive’s strength is scale, Graham says. “We can often [preserve] things quickly, at scale. But we don’t have deep experience in analysis.” Meanwhile, groups like the Environmental Data and Governance Initiative and the Association of Health Care Journalists provide help for activists and academics identifying and documenting changes.

The Library Innovation Lab at Harvard Law School has also joined the efforts with its archive of data.gov, a 16 TB collection that includes more than 311,000 public datasets and is being updated daily with new data. The project began in late 2024, when the library realized that data sets are often missed in other web crawls, says Jack Cushman, a software engineer and director of the Library Innovation Lab.

A typical crawl has no trouble capturing basic HTML, PDF, or CSV files. But archiving interactive web services that are driven by databases poses a challenge. It would be impossible to archive a site like Amazon, for example, says Graham. The datasets the Library Innovation Lab (LIL) is working to archive are similarly tricky to capture. “If you’re doing a web crawl and just clicking from link to link, as the End of Term archive does, you can miss anything where you have to interact with JavaScript or with a button or with a form, where you have to ask for permission and then register or download something,” explains Cushman.
“We wanted to do something that was complementary to existing web crawls, and the way we did that was to go into APIs,” he says. By going into the APIs, which bypass web pages to access data directly, the LIL’s program could fetch a complete catalog of the data sets—whether CSV, Excel, XML, or other file types—and pull the associated URLs to create an archive. In the case of data.gov, Cushman and his colleagues wrote a script to send the right 300 queries that would fetch 1,000 items per query, then go through the 300,000 total items to gather the data. “What we’re looking for is areas where some automation will unlock a lot of new data that wouldn’t otherwise be unlocked,” says Cushman.

The other important factor for the LIL archive was to make sure the data was in a usable format. “You might get something in a web crawl where [the data] is there across 100,000 web pages, but it’s very hard to get it back out into a spreadsheet or something that you can analyze,” Cushman says. Making it usable, both in the data format and the user interface, helps create a sustainable archive.

Lots Of Copies Keep Stuff Safe

The key to preserving the internet’s data is a principle that goes by the acronym LOCKSS: Lots Of Copies Keep Stuff Safe. When the Internet Archive suffered a cyberattack last October, the Archive took down the site for a three-and-a-half-week period to audit the entire site and implement security upgrades. “Libraries have traditionally always been under attack, so this is no different,” Graham says. As part of its defense, the Archive now has several copies of the materials in disparate physical locations, both inside and outside the U.S.

“The US government is the world’s largest publisher,” Graham notes. It publishes material on a wide range of topics, and “much of it is beneficial to people, not only in this country, but throughout the world, whether that is about energy or health or agriculture or security.”

And the fact that many individuals and organizations are contributing to the preservation of the digital world is actually a good thing. “The goal is for those copies to be diverse across every metric that you can think of. They should be on different kinds of media. They should be controlled by different people, with different funding sources, in different formats,” says Cushman. “Every form of similarity between your backups creates a risk of loss.”

The data.gov archive has its primary copy stored through a cloud service, with others as backup. The archive also includes open source software to make it easy to replicate. In addition to maintaining copies, Cushman says it’s important to include cryptographic signatures and timestamps. Each time an archive is created, it’s signed with cryptographic proof of the creator’s email address and time, which can help verify the validity of an archive.

An Ongoing Challenge

Since President Trump took office, a lot of material has been removed from US federal websites—quantifiably more than under previous new administrations, says Graham. On a global scale, however, this isn’t unprecedented, he adds. In the U.S., official government websites have been changed with each new administration since Bill Clinton’s, notes Jason Scott, a “free range archivist” at the Internet Archive and co-founder of digital preservation site Archive Team. “This one’s more chaotic,” Scott says. But “the web is a very high entropy entity ... Google is an archive like a supermarket is a food museum.”

The job of digital archivists is a difficult one, especially with a backlog of sites that have existed across the evolution of internet standards. But these efforts are not new. “The ramping up will only be in terms of disk space and bandwidth resources, not the process that has been ongoing,” says Scott.

For Cushman, working on this project has underscored the value of public data. “The government data that we have is like a GPS signal,” he says. “It doesn’t tell us where to go, but it tells us what’s around us, so that we can make decisions. Engaging with it for the first time this way has really helped me appreciate what a treasure we have.”
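As a rough illustration of the kind of paginated, API-driven harvest Cushman describes, the sketch below shows the general pattern. It is hypothetical: this is not the Library Innovation Lab's code, and the endpoint, query parameters, and response shape are invented for illustration only.

```js
// Hypothetical sketch of a paginated catalog harvest (Node 18+, global fetch).
// The endpoint, parameters, and response shape are assumptions for illustration;
// the real data.gov harvest described above used its own catalog API.
const BASE_URL = "https://example.gov/api/catalog"; // hypothetical endpoint
const PAGE_SIZE = 1000; // items per query, per the article's description
const TOTAL_ITEMS = 300_000; // roughly 300 queries of 1,000 items each

async function harvestCatalog() {
  const datasets = [];
  for (let offset = 0; offset < TOTAL_ITEMS; offset += PAGE_SIZE) {
    const res = await fetch(`${BASE_URL}?limit=${PAGE_SIZE}&offset=${offset}`);
    if (!res.ok) {
      throw new Error(`Catalog request failed at offset ${offset}: ${res.status}`);
    }
    const page = await res.json();
    // Each entry is assumed to carry dataset metadata plus the URLs of its files
    // (CSV, Excel, XML, ...), which a second pass would download and archive.
    datasets.push(...page.results);
  }
  return datasets;
}

harvestCatalog().then((all) => console.log(`Collected ${all.length} dataset records`));
```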