More from Irrational Exuberance
Back in 2018, I wrote lethain/systems as a domain-specific language for writing runnable systems models, and introduced it with this blog post modeling a hiring funnel. While it's far from a perfect system, I've gotten a lot of value out of it over the last seven years, because it allows me to maintain systems models in version control.

As I've been playing with writing Model Context Protocol (MCP) servers, one I've been thinking about frequently is a server to help write systems syntax, and I finally put that together in the lethain/systems-mcp repository. More detailed installation and usage instructions are in the GitHub repository, so I'll just share a couple of screenshots and comments here.

The first tool is load_systems_documentation, which loads a copy of lethain/systems/README.md and a file of example systems models into the context window. The biggest challenge of properly writing DSLs with an LLM is providing enough in-context learning (ICL) examples, and tools specifically designed to provide that context strike me as a very interesting idea. Eventually I imagine there will be generalized tools for this, e.g. a search index of the best ICL examples for a wide variety of DSLs. Until then, my guess is that this sort of tool is particularly valuable.

The second tool is run_systems_model, which passes the DSL (and an optional parameter for the number of rounds) to the tool and then returns the result. I experimented with the interface design here, initially trying to return a rendered chart of the results, but ultimately even multi-modal models are just much better at working with text than with images. This meant I had the best results returning JSON of the results and then having the LLM build a tool for interacting with those results.

Altogether, a fun little experiment, and another confirmation in my mind that the most interesting part of designing MCPs today is deciding where to introduce and eliminate complexity from the LLM. Introduce too little and the tool lacks power; eliminate too little and the combination rarely works.
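For a sense of the shape of such a server, here is a minimal sketch using the Python MCP SDK's FastMCP helper. The tool names mirror the ones described above, but the bodies are illustrative stand-ins, not the actual code from lethain/systems-mcp:

```python
# Minimal sketch of a systems-mcp style server (illustrative only,
# not the lethain/systems-mcp implementation).
import json
from pathlib import Path

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("systems-sketch")

@mcp.tool()
def load_systems_documentation() -> str:
    """Load the systems DSL docs and ICL examples into context."""
    # Assumes local copies of lethain/systems/README.md and an
    # examples file sit next to this script.
    readme = Path("README.md").read_text()
    examples = Path("examples.md").read_text()
    return readme + "\n\n" + examples

@mcp.tool()
def run_systems_model(spec: str, rounds: int = 100) -> str:
    """Run a systems model and return per-round results as JSON."""
    # Returning JSON text works better than a rendered chart, per the
    # note above about models handling text better than images.
    results = simulate(spec, rounds)
    return json.dumps(results)

def simulate(spec: str, rounds: int) -> list[dict]:
    """Hypothetical stand-in for the lethain/systems runtime."""
    raise NotImplementedError("wire in the real simulation here")

if __name__ == "__main__":
    mcp.run()
```

The interesting design decision is the return type of run_systems_model: plain JSON text that the model can reason over, rather than an image it would handle poorly.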
At Carta, we recently ran a reading group for Facilitating Software Architecture by Andrew Harmel-Law. We already loosely followed the ideas of an architectural advice process (from this 2021 article by the same Andrew Harmel-Law), but in practice we found that internal tech spec and architecture decision record (ADR) authors tended to share their documents only within their team rather than more widely.

When we asked authors why they preferred sharing locally, the most common answer was that they got enough feedback from their own team that it wasn't worth the time overhead of sharing widely. The wider feedback wasn't necessarily bad or combative; it just wasn't good enough to compensate for the additional time it cost to process. This made sense from the authors' perspective, but it didn't work well from my executive perspective, as I was seeing teams make misaligned decisions due to a lack of cross-team communication.

As one step in reducing the overhead of sharing documents widely, I wrote up and shared this recommended process for providing feedback on documents:

1. Before starting, remember that the goal of providing feedback on a document is to help the author. Optimizing for anything else, even if it's a worthy cause, discourages authors from sharing their future writing.

2. Start by skimming the document to understand its structure and where various kinds of topics are addressed. Why? This helps avoid giving feedback on ways the document's actual structure diverges from how you imagined it would be structured. It also reduces questions about topics that are answered later in the document. Both of these sorts of feedback are a distraction during a discussion of a tech spec, and in general it's better to avoid them. (If you notice an author making the same significant structural mistake across several ADRs, it's worth delivering that feedback separately.)

3. After skimming, reread the document, leaving comments on your concerns. Each comment should include:
   - what your suggested change or concern is,
   - why you believe it's meaningful to address, and
   - how important it seems (from ignorable nitpick to critical).

4. If you find yourself leaving more than three or four issues, either raise your threshold for commenting or schedule time with the individual to talk through the feedback. If the document is unreasonably weak, it's appropriate to nudge their leadership to dig into what's happening on that team.

The most important idea behind these steps is that your goal as a feedback giver is to help the document's author. It is not to protect your team's strategy or platform, and it is not to optimize for your own goals: it's to help the author. This might feel wrong, but ultimately optimizing for anything else will lead to an environment where sharing widely is an irrational behavior.

As a final aside, I think the user experience around commenting on documents is fundamentally wrong in most document editors. For example, Google Docs treats individual comments as first-order objects, similar to how old version control systems like CVS tracked changes to individual files without tracking the overall state of the project. Ultimately, you want to collect all your comments into a bundle, review that bundle for consistency and duplicates, and then submit the bundle as commentary, but editors don't support that flow particularly well.
A few years ago I wrote about reading a Profit & Loss statement, which is a foundational executive skill, and I subsequently wrote about ways to measure your engineering organization. Despite having written those, I still spend a lot of time wondering about effective ways to represent an engineering organization to your board of directors.

Over the past few years, one of the most useful charts I've found for explaining an R&D organization is a scatterplot of R&D spend as a percentage of margin versus year-over-year growth of last-twelve-months (LTM) revenue. Unlike so many other measures, this is an explicit measure of your R&D organization's value as an investment relative to peer organizations.

Until recently, I assumed building this dataset required reading financial filings, but my strategic finance partner at Carta, Tyler Braslow, pointed out that you can get all of this data for the tech sector from Meritech Analytics, for free. When you log in to Meritech, you're dropped into a table of public company comparables for tech companies: exactly the dataset I'd been looking for to build this chart.

After logging in, copy the contents of that table into Google Sheets, Excel, or whatever you're most comfortable with. Within that sheet, the columns you care about are:

- % YoY Growth LTM Rev (column Q for me): how much last-twelve-months revenue has grown year over year, as a percentage
- % LTM Margins for R&D (column U for me): R&D spend as a percentage of last-twelve-months margin
- LTM Revenue (column O for me): although I don't show this in the scatterplot, I find it useful for debugging outlier values

Hiding the other columns gives you a much simpler table, and from that table you can build the scatterplot. Note that being "higher" means your R&D spend as a percentage of LTM margin is higher, which is a bad thing: the best companies are to the bottom right, and the worst companies are to the top left.

With this chart as a starting point, you can plot your own company and show where you stand. You could also show how your company's position in the chart has evolved over time, hopefully improving. Finally, you might want to cull some of these data points to better match your public company comparables: the Meritech dataset has 106 entries, but you might prefer a more representative thirty.
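If you'd rather script the chart than build it in a spreadsheet, here's a minimal sketch in Python, assuming you've exported the Meritech table to a CSV. The file name, column names, and the "your company" coordinates are placeholders to adapt to your own export:

```python
# Sketch of the R&D-efficiency scatterplot from an exported CSV.
# Column names are hypothetical; map them to your export's headers.
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("meritech_comps.csv")  # placeholder path

fig, ax = plt.subplots()
ax.scatter(df["yoy_growth_ltm_rev"], df["rd_pct_ltm_margin"], alpha=0.6)

# Bottom right is best: high revenue growth, low R&D spend as a
# percentage of LTM margin.
ax.set_xlabel("% YoY growth, LTM revenue")
ax.set_ylabel("R&D spend as % of LTM margin")
ax.set_title("R&D spend vs. revenue growth, public comps")

# Overlay your own company on the comparables (placeholder values).
ax.scatter([0.25], [0.40], color="red", label="Your company")
ax.legend()

plt.savefig("rd_scatter.png", dpi=150)
```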
Every few years I take a pass at reducing the chaos in my personal inboxes. There are simply too many emails to deal with, and that generally leads to me increasingly failing to follow up on important email. Up to this point, my strategy has largely been filtering out emails that I never want to read. But there's another category of email: messages I often want to read while they're fresh, but never want to read afterwards. For example, calendar reminders, some mailing lists, some newsletters, and so on. I decided to figure out how to set up a system where I could mark a number of things as "filter three days after receipt." This is a nice compromise, because I do want to see those things, but I don't want to have to remember to archive them after the fact.

You can write a search query for this in Gmail:

    from:(calendar-notification@google.com) older_than:3d

However, if you try to create a Gmail filter using that query, it turns older_than:3d into a fixed point in time rather than doing what you want, and it seems this is unsolvable within Gmail itself. Some quick searching suggested it was possible to solve it with a Google Apps Script, so I asked Claude to write the script for me.

Following those instructions, I went to script.google.com, which I hadn't visited in many years. I edited the generated script from Claude to use the label "TempMsg", to actually archive messages (originally it had those lines commented out), and to limit itself to the first fifty items matching that label. You can find the full code in this gist.

I attempted to run this as-is, and got an error message that I needed to grant permissions, which takes three clicks within the Google Scripts UI, including approving the somewhat scary message that I trust myself. From there I tried to run the script, and it failed because the TempMsg label didn't exist in my inbox. So I created the label and set up some filters to assign it to certain email senders. After that, I was able to run the script, and it worked properly.

Note that I briefly convinced myself it was failing, because it doesn't remove messages from the past three days. That is exactly how it's supposed to work, but I would run it, see messages with the label still there, and think it was broken. Whoops.

After convincing myself it was working, I added a periodic trigger to run the script daily, which has given me a nice new tool for managing my email a bit better. I also used the label manager to hide the TempMsg label in the inbox, so I don't have to see it everywhere; if I ever need to debug things, I can always make it visible again.
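The gist itself is a Google Apps Script, but the core logic is small. As a sketch of the same archive-after-three-days behavior using the Gmail API from Python instead (assuming you've already completed the standard OAuth flow to obtain creds, and that TempMsg is the label described above):

```python
# Sketch of the same logic via the Gmail API, rather than the Apps
# Script in the author's gist. Assumes `creds` came from the usual
# google-auth OAuth flow with Gmail modify scope.
from googleapiclient.discovery import build

def archive_stale_tempmsg(creds, max_messages: int = 50) -> None:
    service = build("gmail", "v1", credentials=creds)
    # Same query shape as the search above: labeled TempMsg and
    # older than three days.
    resp = service.users().messages().list(
        userId="me",
        q="label:TempMsg older_than:3d",
        maxResults=max_messages,
    ).execute()
    for msg in resp.get("messages", []):
        # Archiving a message means removing its INBOX label.
        service.users().messages().modify(
            userId="me",
            id=msg["id"],
            body={"removeLabelIds": ["INBOX"]},
        ).execute()
```

In Apps Script, the equivalent pieces are GmailApp.search for the query and a time-driven trigger in place of scheduling this function with cron.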
More in programming
Today we go over how hydration errors happen in react-router and how to fix them.
Imagine we have a list of paths to Parquet files on R2. We need to fetch the Parquet footer of each file. However, we don't know in advance whether we'll need the footers of all the files, and we want to avoid fetching extra data. Rust has a streams abstraction. It is kind of like an iterator, but with async operations permitted. Like iterators, streams are lazy: they don't do anything unless explicitly polled. This sounds like it ticks all the boxes, so let's try to implement it.
A 20-minute video: learn how to inspect registers, step through instructions, and investigate crashes using GDB.
A clip from “Buy Now! The Shopping Conspiracy” features a former executive of an online retailer explaining how motivated they were to make buying easy. Like, incredibly easy. So easy, in fact, that their goal was to “reduce your time to think a little bit more critically about a purchase you thought you wanted to make.” Why? Because if you pause for even a moment, you might realize you don’t actually want whatever you’re about to buy.

Been there. Ready to buy something and the slightest inconvenience surfaces, like when I can’t remember the precise order of my credit card’s CVV number and realize I’ll have to find my credit card and look it up, and that’s enough for me to say, “Wait a second, do I actually want to move my slug of a body and find my credit card? Nah.”

That feels like the socials too. The algorithms. The endless feeds. The social interfaces. All engineered to make you think less about what you’re consuming, to think less critically about reacting or responding or engaging. Don’t think, just scroll. Don’t think, just like. Don’t think, just repost. And now, with AI, don’t think at all.[1]

Because if you have to think, that’s friction. Friction is an engagement killer on content, especially the low-grade stuff. Friction makes people ask, “Is this really worth my time?”

Maybe we need a little more friction in the world. More things that merit our time. Fewer things that don’t. It’s kind of ironic how the things we need present so much friction in our lives (like getting healthcare) while the things we don’t need, which siphon money from our pockets (like online gambling[2]), present so little friction you could almost inadvertently slip right into them. It’s as if The Good Things™️ in life are full of friction while the hollow ones are frictionless.

[1] Nicholas Carr said, “The endless labor of self-expression cries out for the efficiency of automation.” Why think when you can prompt a probability machine to stitch together a facade of thinking for you?

[2] John Oliver did a segment on sports betting if you want to feel sad.