More from macwright.com
Google published Zanzibar: Google’s Consistent, Global Authorization System in 2019. It describes a system for authorization – enforcing who can do what – which maxes out both flexibility and scalability. Google has lots of different apps that rely on Zanzibar, and bigger scale than practically any other company, so it needed Zanzibar. The Zanzibar paper made quite a stir. There are at least four companies that advertise products as being inspired by or based on Zanzibar. It says a lot for everyone to loudly reference this paper on homepages and marketing materials: companies aren’t advertising their own innovation as much as simply saying they’re following the gospel. A short list of companies & OSS products I found: Companies WorkOS FGA Authzed auth0 FGA Ory Permify Open source Ory Keto (Go) Warrant (Go) probably the basis for WorkOS FGA, since WorkOS acquired Warrant. SpiceDB (Go) the basis for Authzed. Permify (Go) OpenFGA (Go) the basis of auth0 FGA. I read the paper, and have a few notes, but the Google Zanzibar Paper, annotated by AuthZed is the same thing from a real domain expert (albeit one who works for one of these companies), so read that too, or instead. Features My brief summary is that the Zanzibar paper describes the features of the system succinctly, and those features are really appealing. They’ve figured out a few primitives from which developers can build really flexible authorization rules for almost any kind of application. They avoid making assumptions about ID formats, or any particular relations, or how groups are set up. It’s abstract and beautiful. The gist of the system is: Objects: things in your data model, like documents Users: needs no explanation Namespaces: for isolating applications Usersets: groups of users Userset rewrite rules: allow usersets to inherit from each other or have other kinds of set relationships Tuples, which are like (object)#(relation)@(user), and are sort of the core ‘rule’ construct for saying who can access what There’s then a neat configuration language which looks like this in an example: name: "doc" relation { name: "owner"} relation { name: "editor" userset_rewrite { union { child { _this f } } child { computed_userset { relation: "owner" } } relation { name: "viewer" userset_rewrite { union { child {_this f} } child { computed_userset & relation: "editor" 3 } child { tuple_to_userset { tupleset { relation: "parent" } computed_userset { object: $TUPLE_USERSET_OBJECT # parent folder relation: "viewer" } } } } } } It’s pretty neat. At this point in the paper I was sold on Zanzibar: I could see this as being a much nicer way to represent authorization than burying it in a bunch of queries. Specifications & Implementation details And then the paper discusses specifications: how much scale it can handle, and how it manages consistency. This is where it becomes much more noticeably Googley. So, with Google’s scale and international footprint, all of their services need to be globally distributed. So Zanzibar is a distributed system, and it is also a system that needs good consistency guarantees so that it avoid the “new enemy” problem, nobody is able to access resources that they shouldn’t, and applications that are relying on Zanzibar can get a consistent view of its data. Pages 5-11 are about this challenge, and it is a big one with a complex, high-end solution, and a lot of details that are very specific to Google. Most noticeably, Zanzibar is built with Spanner Google’s distributed database, and Spanner has the ability to order timestamps using TrueTime, which relies on atomic clocks and GPS antennae: this is not standard equipment for a server. Even CockroachDB, which is explicitly modeled off of Spanner, can’t rely on having GPS & atomic clocks around so it has to take a very different approach. But this time accuracy idea is pretty central to Zanzibar’s idea of zookies, which are sort of like tokens that get sent around in its API and indicate what time reference the client expects so that a follow-up response doesn’t accidentally include stale data. To achieve scalability, Zanzibar is also a multi-server architecture: there are aclservers, watchservers, a Leopard indexing system that creates compressed skip list-based representations of usersets. There’s also a clever solution to the caching & hot-spot problem, in which certain objects or tuples will get lots of requests all at once so their database shard gets overwhelmed. Conclusions Zanzibar is two things: A flexible, relationship-based access control model A system to provide that model to applications at enormous scale and with consistency guarantees My impressions of these things match with AuthZed’s writeup so I’ll just quote & link them: There seems to be a lot of confusion about Zanzibar. Some people think all relationship-based access control is “Zanzibar”. This section really brings to light that the ReBAC concepts have already been explored in depth, and that Zanzibar is really the scaling achievement of bringing those concepts to Google’s scale needs. link And Zookies are very clearly important to Google. They get a significant amount of attention in the paper and are called out as a critical component in the conclusion. Why then do so many of the Zanzibar-like solutions that are cropping up give them essentially no thought? link I finished the paper having absorbed a lot of tricky ideas about how to solve the distributed-consistency problems, and if I were to describe Zanzibar, those would be a big part of the story. But maybe that’s not what people mean when they say Zanzibar, and it’s more a description of features? I did find that Permify has a zookie-like Snap Token, AuthZed/SpiceDB has ZedTokens, and Warrant has Warrant-Tokens. Whereas OpenFGA doesn’t have anything like zookies and neither does Ory Keto. So it’s kind of mixed on whether these Zanzibar-inspired products have Zanzibar-inspired implementations, or focus more on exposing the same API surface. For my own needs, zookies and distributed consistency to the degree described in the Zanzibar paper are overkill. There’s no way that we’d deploy a sharded five-server system for authorization when the main application is doing just fine with single-instance Postgres. I want the API surface that Zanzibar describes, but would trade some scalability for simplicity. Or use a third-party service for authorization. Ideally, I wish there was something like these products but smaller, or delivered as a library rather than a server.
I watched a large part of All Watched Over By Machines of Loving Grace this month. This also counts as a “listening” item, because the theme song, “Baby Love Child” by Pizzicato Five, is also spectacular. Guitar Moves is a good series of interviews by Matt Sweeney, who I mostly know via his involvement in Bonnie Prince Billy. It’s a really cool format. I like how he interviews guitarists with recognizable sounds, and you get to see how little they need to play to sound just like themselves. The episode with St. Vincent is excellent too: she’s one of my guitar heroes: check out the guitar solo in Just The Same But Brand New, or her version of Dig a Pony. I also watched No Other Land. Everyone should watch No Other Land. AI thoughts roundup I don’t have a conclusion. Really, that’s my current state: ambivalence. I acknowledge that these tools are incredibly powerful, I’ve even started incorporating them into my work in certain limited ways (low-stakes code like POCs and unit tests seem like an ideal use case), but I absolutely hate them. I hate the way they’ve taken over the software industry, I hate how they make me feel while I’m using them, and I hate the human-intelligence-insulting postulation that a glorified Excel spreadsheet can do what I can but better. Nolan Lawson: AI ambivalence As I always say, the purpose of the system is what it does. Or, in this case, how I think about AI stuff is mostly affected by how people use AI stuff, and how people use AI stuff is a real mixed bag. There’s the tidal wave of spam, the aesthetic of fascism, the low-effort marketing materials with nonsense images, the non-consensual AI porn. I see all of the bad stuff every day both online and in the odd subway ad. The good stuff seems pretty theoretical, though: the press releases about AI-driven medical advances never seem to break into the real world. The stories about engineers 10x’ing their ability seem pretty mixed: we’re already at the hangover-and-regret phase with programmers bemoaning how they’ve generated so much slop and lost so much knowledge. Anyway, I’m mildly optimistic about the potential! But it’s a lot like crypto in that you could theoretically use the technology for something good but most people loudly used it for bad stuff, and people including me judged it based on what it did. AI has to start doing some good stuff soon. Potential isn’t enough. I think one thing chatGPT’s invention has revealed is how many people - including some very important people in society - find just basic reading and writing to be laborious and cumbersome to perform, and how oddly closely that type of strained literacy correlates with having other shitty opinions. From mtsw on Bluesky, about this story about Andrew Cuomo using ChatGPT to half-assedly write a policy platform. Right off the bat I should say that judging people for their level of literacy reeks of classism and so on. My own ability to read & write has a lot to do with my place in society: I went to good schools, had a stable home life, and smart parents. However, “the way that society was set up” kind of evened this out. Extremely social people with cultural capital and chiseled jawlines and biceps would get their rewards, and people like… myself, we would get rewarded for literacy and critical thinking. When one group needed the other, it was usually some kind of payment or partnership: Cuomo pays his scriptwriter, the TV show creator pays the actors. And some people can do both sides of the equation. But LLMs definitely indicate that people do not like this deal. Whew, they don’t like writing, but they also don’t like paying the writers or reading what they write. Maybe they could rejigger the system so that they could do it all. They have ideas for music and art but no interest in learning about music, practicing instruments, going to art school, or concentrating on a task for a long time, so why not generate it all? Why not, well - there are reasons, those reasons being that the generated output usually passes their own vibe check but once someone who looks closely at things or reads all the words encounters it, everyone points at the slop and it’s embarrassing. (Cuomo will never be embarrassed) Plus, you’re always going to get average results by asking a device that is incapable of creativity or thought. Also, you’ll miss out on the human experience of creating. And you’ll be indirectly feeding output data into training data for future LLMs, consequentially making their output worse. (Cuomo does not care about consequences) Colophon update I’ve moved the images for this website to Bunny (that’s an affiliate link, here’s a non-affiliate link if that’s what you prefer). When I initially moved my photos to this website, I set them up with Amazon S3 for storage and CloudFront to serve them with a CDN. Using AWS is painful for me, so I moved them to Cloudflare R2, which is Cloudflare’s equivalent to S3, and Cloudflare as a CDN. Thanks to owning my own domains, swapping out image hosts is pretty quick: switching to Bunny took all of five minutes. So what’s the deal with Bunny? Partly I’ve become a little more negative on Cloudflare and R2: I think Cloudflare’s technology is neat, but R2 has iffy reliability and Cloudflare has iffy politics. I’m also intrigued by diversifying my dependencies geographically. Bunny is a Slovenian company, and my email is from an Australian company. This probably won’t have any practical effect, but it feels kind of good for obvious reasons to even minutely hedge my bets here. So far Bunny has been great. They don’t support the S3 protocol but they do support SFTP, which works just as well for my purposes and works great with the beautiful Transmit app. Before, with R2, I was using the significantly less beautiful Cyberduck application because Cloudflare R2 doesn’t support all of the S3 protocol. It seems to be just as fast as Cloudflare was, too. And I’m somewhat reassured by the prospect of paying Bunny. I don’t like the feeling of getting “free” services like I can from Cloudflare. I want that customer relationship. Reading Then again, pop culture is powerful, and even the dumbest marketing both affects and reflects it. Busch Light’s can holder shaped like a cup that holds beer is dumb, which is fine, because most beer promos are. But the fact that the brand frames it as a functional, masculine alternative to Stanley’s H2.0 Flowstate affirms a similarly retrograde outlook on gender roles to the one that young American men are seeking out on the political right. From my friend Dave’s article about Busch Light’s weird attempt to riff on the Stanley Tumbler trend. I was once a loyal listener of the Chapo Trap House podcast, but fell off of it in 2020 when their support of Bernie Sanders led them to be jerks about Elizabeth Warren. But reading this Vanity Fair article about the cohosts of the podcast endeared them to me a bit. “Like to thank” is linguistic phlegm. “I’d like to thank the Academy.” They’d “like to thank” me. Well I’d like to be 6’3” and drive a G Wagon, thanks. I’d like you to accept my novella. I’d like to quit paying three dollars to Submittable every time I want to send a story out. The world is full of actions I would like to do. The most direct way to say thank you is just to say it: “thank you, name, for doing X.” “I’d like to thank” is a performative thanks, a thanks with a smirk and a blink, eyeing for extra credit. Just because people say it in their award show acceptance speeches doesn’t mean you should say it, too. In fact, that’s the reason you shouldn’t say it. Loved this article “Close reading my rejections” from friend of the blog Barrett Hathcock.
Reading Whether it’s cryptocurrency scammers mining with FOSS compute resources or Google engineers too lazy to design their software properly or Silicon Valley ripping off all the data they can get their hands on at everyone else’s expense… I am sick and tired of having all of these costs externalized directly into my fucking face. Drew DeVault on the annoyance and cost of AI scrapers. I share some of that pain: Val Town is routinely hammered by some AI company’s poorly-coded scraping bot. I think it’s like this for everyone, and it’s hard to tell if AI companies even care that everyone hates them. And perhaps most recently, when a person who publishes their work under a free license discovers that work has been used by tech mega-giants to train extractive, exploitative large language models? Wait, no, not like that. Molly White wrote a more positive article about the LLM scraping problem, but I have my doubts about its positivity. For example, she suggests that Wikimedia’s approach with “Wikimedia Enterprise” gives LLM companies a way to scrape the site without creating too much cost. But that doesn’t seem like it’s working. The problem is that these companies really truly do not care. Harberger taxes represent an elegant theoretical solution that fails in practice for immobile property. Just as mobile home residents face exploitation through sudden ground rent increases, property owners under a Harberger system would face similar hold-up problems. This creates an impossible dilemma: pay increasingly burdensome taxes or surrender investments at below-market values. Progress and Poverty, a blog about Georgism, has this post about Herberger taxes, which are a super neat idea. The gist is that you would be in charge of saying how much your house is worth, but the added wrinkle is that by saying a price you are bound to be open to selling your house at that price. So if you go too low, someone will buy it, or too high, and you’re paying too much in taxes. It’s clever but doesn’t work, and the analysis points to the vital difference between housing and other goods: that buying, selling, and moving between houses is anything but simple. I’ve always been a little skeptical of the line that the AI crowd feels contempt for artists, or that such a sense is particularly widespread—because certainly they all do not!—but it’s hard to take away any other impression from a trend so widely cheered in its halls as AI Ghiblification. Brian Merchant on the OpenAI Studio Ghibli ‘trend’ is a good read. I can’t stop thinking that AI is in danger of being right-wing coded, the examples of this, like the horrifying White House tweet mentioned in that article, are multiplying. I feel bad when I recoil to innocent usage of the tool by good people who just want something cute. It is kind of fine, on the micro level. But with context, it’s so bad in so many ways. Already the joy and attachment I’ve felt to the graphic style is fading as more shitty Studio Ghibli knockoffs have been created in the last month than in all of the studio’s work. Two days later, at a state dinner in the White House, Mark gets another chance to speak with Xi. In Mandarin, he asks Xi if he’ll do him the honor of naming his unborn child. Xi refuses. Careless People was a good read. It’s devastating for Zuckerberg, Joel Kaplan, and Sheryl Sandberg, as well as a bunch of global leaders who are eager to provide tax loopholes for Facebook. Perhaps the only person who ends the book as a hero is President Obama, who sees through it all. In a March 26 Slack message, Lavingia also suggested that the agency should do away with paper forms entirely, aiming for “full digitization.” “There are over 400 vet-facing forms that the VA supports, and only about 10 percent of those are digitized,” says a VA worker, noting that digitizing forms “can take years because of the sensitivity of the data” they contain. Additionally, many veterans are elderly and prefer using paper forms because they lack the technical skills to navigate digital platforms. “Many vets don’t have computers or can’t see at all,” they say. “My skin is crawling thinking about the nonchalantness of this guy.” Perhaps because of proximity, the story that Sahil Lavingia has been working for DOGE seems important. It was a relief when a few other people noticed it and started retelling the story to the tech sphere, like Dan Brown’s “Gumroad is not open source” and Ernie Smith’s “Gunkroad”, but I have to nitpick on the structure here: using a non-compliant open source license is not the headline, collaborating with fascists and carelessly endangering disabled veterans is. Listening Septet by John Carroll Kirby I saw John Carroll Kirby play at Public Records and have been listening to them constantly ever since. The music is such a paradox: the components sound like elevator music or incredibly cheesy jazz if you listen to a few seconds, but if you keep listening it’s a unique, deep sound. Sierra Tracks by Vega Trails More new jazz! Mammoth Hands and Portico Quartet overlap with Vega Trails, which is a beautiful minimalist band. Watching This short video with John Wilson was great. He says a bit about having a real physical video camera, not just a phone, which reminded me of an old post of mine, Carrying a Camera.
I used to make little applications just for myself. Sixteen years ago (oof) I wrote a habit tracking application, and a keylogger that let me keep track of when I was using a computer, and generate some pretty charts. I’ve taken a long break from those kinds of things. I love my hobbies, but they’ve drifted toward the non-technical, and the idea of keeping a server online for a fun project is unappealing (which is something that I hope Val Town, where I work, fixes). Some folks maintain whole ‘homelab’ setups and run Kubernetes in their basement. Not me, at least for now. But I have been tiptoeing back into some little custom tools that only I use, with a focus on just my own computing experience. Here’s a quick tour. Hammerspoon Hammerspoon is an extremely powerful scripting tool for macOS that lets you write custom keyboard shortcuts, UIs, and more with the very friendly little language Lua. Right now my Hammerspoon configuration is very simple, but I think I’ll use it for a lot more as time progresses. Here it is: hs.hotkey.bind({"cmd", "shift"}, "return", function() local frontmost = hs.application.frontmostApplication() if frontmost:name() == "Ghostty" then frontmost:hide() else hs.application.launchOrFocus("Ghostty") end end) Not much! But I recently switched to Ghostty as my terminal, and I heavily relied on iTerm2’s global show/hide shortcut. Ghostty doesn’t have an equivalent, and Mikael Henriksson suggested a script like this in GitHub discussions, so I ran with it. Hammerspoon can do practically anything, so it’ll probably be useful for other stuff too. SwiftBar I review a lot of PRs these days. I wanted an easy way to see how many were in my review queue and go to them quickly. So, this script runs with SwiftBar, which is a flexible way to put any script’s output into your menu bar. It uses the GitHub CLI to list the issues, and jq to massage that output into a friendly list of issues, which I can click on to go directly to the issue on GitHub. #!/bin/bash # <xbar.title>GitHub PR Reviews</xbar.title> # <xbar.version>v0.0</xbar.version> # <xbar.author>Tom MacWright</xbar.author> # <xbar.author.github>tmcw</xbar.author.github> # <xbar.desc>Displays PRs that you need to review</xbar.desc> # <xbar.image></xbar.image> # <xbar.dependencies>Bash GNU AWK</xbar.dependencies> # <xbar.abouturl></xbar.abouturl> DATA=$(gh search prs --state=open -R val-town/val.town --review-requested=@me --json url,title,number,author) echo "$(echo "$DATA" | jq 'length') PR" echo '---' echo "$DATA" | jq -c '.[]' | while IFS= read -r pr; do TITLE=$(echo "$pr" | jq -r '.title') AUTHOR=$(echo "$pr" | jq -r '.author.login') URL=$(echo "$pr" | jq -r '.url') echo "$TITLE ($AUTHOR) | href=$URL" done Tampermonkey Tampermonkey is essentially a twist on Greasemonkey: both let you run your own JavaScript on anybody’s webpage. Sidenote: Greasemonkey was created by Aaron Boodman, who went on to write Replicache, which I used in Placemark, and is now working on Zero, the successor to Replicache. Anyway, I have a few fancy credit cards which have ‘offers’ which only work if you ‘activate’ them. This is an annoying dark pattern! And there’s a solution to it - CardPointers - but I neither spend enough nor care enough about points hacking to justify the cost. Plus, I’d like to know what code is running on my bank website. So, Tampermonkey to the rescue! I wrote userscripts for Chase, American Express, and Citi. You can check them out on this Gist but I strongly recommend to read through all the code because of the afore-mentioned risks around running untrusted code on your bank account’s website! Obsidian Freeform This is a plugin for Obsidian, the notetaking tool that I use every day. Freeform is pretty cool, if I can say so myself (I wrote it), but could be much better. The development experience is lackluster because you can’t preview output at the same time as writing code: you have to toggle between the two states. I’ll fix that eventually, or perhaps Obsidian will add new API that makes it all work. I use Freeform for a lot of private health & financial data, almost always with an Observable Plot visualization as an eventual output. For example, when I was switching banks and one of the considerations was mortgage discounts in case I ever buy a house (ha 😢), it was fun to chart out the % discounts versus the required AUM. It’s been really nice to have this kind of visualization as ‘just another document’ in my notetaking app. Doesn’t need another server, and Obsidian is pretty secure and private.
More in programming
The best engineering teams take control of their tools. They help develop the frameworks and libraries they depend on, and they do this by running production code on edge — the unreleased next version. That's where progress is made, that's where participation matters most. This sounds scary at first. Edge? Isn't that just another word for danger? What if there's a bug?! Yes, what if? Do you think bugs either just magically appear or disappear? No, they're put there by programmers and removed by the very same. If you want bug-free frameworks and libraries, you have to work for it, but if you do, the reward for your responsibility is increased engineering excellence. Take Rails 8.1, as an example. We just released the first beta version at Rails World, but Shopify, GitHub, 37signals, and a handful of other frontier teams have already been running this code in production for almost a year. Of course, there were bugs along the way, but good automated testing and diligent programmers caught virtually all of them before they went to production. It didn't always used to be this way. Once upon a time, I felt like I had one of the only teams running Rails on edge in production. But now two of the most important web apps in the world are doing the same! At an incredible scale and criticality. This has allowed both of them, and the few others with the same frontier ambition, to foster a truly elite engineering culture. One that isn't just a consumer of open source software, but a real-time co-creator. This is a step function in competence and prowess for any team. It's also an incredible motivation boost. When your programmers are able to directly influence the tools they're working with, they're far more likely to do so, and thus they go deeper, learn more, and create connections to experts in the same situation elsewhere. But this requires being able to immediately use the improvements or bug fixes they help devise. It doesn't work if you sit around waiting patiently for the next release before you dare dive in. Far more companies could do this. Far more companies should do this. Whether it's with Ruby, Rails, Omarchy, or whatever you're using, your team could level up by getting more involved, taking responsibility for finding issues on edge, and reaping the reward of excellence in the process. So what are you waiting on?
Here on a summer night in the grass and lilac smell Drunk on the crickets and the starry sky, Oh what fine stories we could tell With this moonlight to tell them by. A summer night, and you, and paradise, So lovely and so filled with grace, Above your head, the universe has hung its … Continue reading Dreams of Late Summer →
The first in a series of posts about doing things the right way
A deep dive into the Action Cache, the CAS, and the security issues that arise from using Bazel with a remote cache but without remote execution
(I present to you my stream of consciousness on the topic of casing as it applies to the web platform.) I’m reading about the new command and commandfor attributes — which I’m super excited about, declarative behavior invocation in HTML? YES PLEASE!! — and one thing that strikes me is the casing in these APIs. For example, the command attribute has a variety of values in HTML which correspond to APIs in JavaScript. The show-popover attribute value maps to .showPopover() in JavaScript. hide-popover maps to .hidePopover(), etc. So what we have is: lowercase in attribute names e.g. commandfor="..." kebab-case in attribute values e.g. show-popover camelCase for JS counterparts e.g. showPopover() After thinking about this a little more, I remember that HTML attributes names are case insensitive, so the browser will normalize them to lowercase during parsing. Given that, I suppose you could write commandFor="..." but it’s effectively the same. Ok, lowercase attribute names in HTML makes sense. The related popover attributes follow the same convention: popovertarget popovertargetaction And there are many other attribute names in HTML that are lowercase, e.g.: maxlength novalidate contenteditable autocomplete formenctype So that all makes sense. But wait, there are some attribute names with hyphens in them, like aria-label="..." and data-value="...". So why isn’t it command-for="..."? Well, upon further reflection, I suppose those attributes were named that way for extensibility’s sake: they are essentially wildcard attributes that represent a family of attributes that are all under the same namespace: aria-* and data-*. But wait, isn’t that an argument for doing popover-target and popover-target-action? Or command and command-for? But wait (I keep saying that) there are kebab-case attribute names in HTML — like http-equiv on the <meta> tag, or accept-charset on the form tag — but those seem more like legacy exceptions. It seems like the only answer here is: there is no rule. Naming is driven by convention and decisions are made on a case-by-case basis. But if I had to summarize, it would probably be that the default casing for new APIs tends to follow the rules I outlined at the start (and what’s reflected in the new command APIs): lowercase for HTML attributes names kebab-case for HTML attribute values camelCase for JS counterparts Let’s not even get into SVG attribute names We need one of those “bless this mess” signs that we can hang over the World Wide Web. Email · Mastodon · Bluesky