More from Eric Bailey
I’ve been seeing, and enjoying reading these posts as they pop up in my RSS reader. Dave Rupert tagged me into the chain, so here we go! Why did you start blogging in the first place? With the gift of hindsight, I guess I came up being blog-adjacent. Like Dave, I also had a background in publishing as a youth. I worked for my high school newspaper, and had a part- and then later full-time job at my local newspaper. I also published a weirdo, monkey cheese nerd zine. Its main claims to fame were both pissing off the principal and preventing me from getting dates. Zines are cool and embracing cringe will set you free. I read a ton of blogs, but I never initially thought I’d be be someone who published one. This was due to fear of dog-piling criticism, as well as not thinking I had anything meaningful to contribute. Then I got Kivikoskied. Reader, I strongly encourage you to get Kivikoskied yourself. The first post I put on my site was a reaction to the WebAIM Millions report. Reading through it generated enough feelings that I needed a place to put them in a constructive way. What platform are you using to manage your blog and why did you choose it? The reaction to the WebAIM Millions report was originally just a HTML page with a dream. That page seemed to resonate with people, so with that encouragement I had to build blogging infrastructure after the fact. That infrastructure wound up being Eleventy. I love Eleventy, and it’s only gotten better since that initial adoption. Zach Leatherman is a mensch, and I sing the praises of his project every chance I can get. I love blogging with Eleventy because it prioritizes speed, stability, and performance. Static web pages generated via Markdown are easy enough to wrangle, and it means I can spend the majority of my time focusing on writing, and not managing dependencies or database updates. Have you blogged on other platforms before? WordPress, Jekyll, thoughtbot’s homegrown CMS, and a few others. May you never have to work with Méthode. How do you write your posts? For example, in a local editing tool, or in a panel/dashboard that’s part of your blog? I’ve evaluated countless writing apps, but find myself keep coming back to Dropbox Paper. I’m highly distractible, and love to fiddle and tinker. Because of this, I find that Paper’s intentional, designed simplicity keeps me focused and on-task. It’s a shame that we live in the rot economy—where innovation is synonymous with value extraction—and there is apparently no longer enough incentive to maintain it. The post is then exported as a Markdown file from Paper, has its contents pasted into VS Code, cleaned up a little bit, metadata is added, merged into GitHub, and voilà! Blog post! There are more efficient ways to do this, but I find the ritual of it all soothing. When do you feel most inspired to write? I’m going to share a little secret with you: Nearly every technical blog post I write is a longform subtweet. By this, I mean I use writing as a way to channel a lot of my anxieties and frustrations into something constructive. I wish I wrote more silly posts, but it’s difficult to prioritize them given the state of things. Do you publish immediately after writing, or do you let it simmer a bit as a draft? I’ll chip away at a draft for weeks, moving sections around and tweaking language until the entire thing feels cohesive. It’s less perfectionism and more wanting to be sure I’m communicating my thoughts as clearly as I can. There is also the inevitable flurry of edits that follow hitting publish. I’d bottle that feeling of sudden, panicked clarity if I could. What are you generally interested in writing about? The intersection of accessibility, usability, design systems, and the web platform. I’m also a sucker for CSS, tech culture, and a good metaphor. Who are you writing for? I write for people who are curious about the web, accessibility, and frontend technology at a medium-to-high level of familiarity. It has been so liberating to not have to explain the basics of accessibility and why it matters anymore. I also write for myself as augmented memory. This, along with services like Pinboard help with my memory. Blog posts are also conversations. It is also a disservice to both audiences if I’m not weaving a lot of contextually relevant voices into the work as outgoing links. What’s your favorite post on your blog? My favorite post on my website is my opus, Accessibility annotation kits only annotate. It took forever to put those thoughts into words. My favorite post on another website is Consider the Tomato. thoughtbot tolerated and encouraged a lot of my shenanigans, and I’m thankful for that. Any future plans for your blog? Maybe a redesign, a move to another platform, or adding a new feature? This website is in desperate need of a redesign, and the “updating in the open” banner is an albatross around my neck. Ironically, the time I should spend on that is spent writing blog posts. I’m now at the point where I fantasize about taking a month off of work to make said redesign happen. Grinning face with sweat emoji. Tag ‘em I’d tag everyone on my RSS reader, if I could. Until then: Adrian Roselli. I’m more or less contractually obligated to include a link to Adrian’s site any time I write about accessibility, as chances are he’s already covered it. Ben Myers. Another favorite accessibility author. I really enjoy his takes on disability and digital accessibility. Jan Maarten. Coworker and samebrain friend, whose longform pieces are always worth reading. Jim Nielsen. A Melanie Richards. Melanie is, in a word, prolific. I’m in awe of her digital gardening efforts. Miriam Suzanne. Less a triple threat and more a, uh, quintuple threat? Brilliance at every turn.
I debuted these principles in my axe-con 2025 talk, It is designed to break your heart: Cultivating a harm reduction mindset as an accessibility practitioner. They are adapted from The National Harm Reduction Coalition’s original eight principles. My adapted principles reflect philosophical and behavioral changes I’ve been cultivating. This is done to try and offset, and defend against systemic trauma and its resultant depression, burnout, and other negative experiences you can incur when doing digital accessibility work. If you have the time, I’d advise reading the original eight principles. I also recommend watching or reading the talk. I say this not in a self-promotional way, but instead that there is a lot of context that will be helpful in understanding: How these adapted principles came to be, and also The larger mindset shifts and practices that led to their creation. The principles There are eight principles in total. They are delivered in the context of how to approach evaluating a team’s efforts, and are: Accepting ableism and minimizing it Accepting, for better or worse, that ableism is a part of our world and choosing to work to minimize its harmful effects, rather than simply ignoring or condemning it. The original principle this is derived from is: “Accepts, for better or worse, that licit and illicit drug use is part of our world and chooses to work to minimize its harmful effects rather than simply ignore or condemn them.” Provisioning of resources is non-judgemental Calling for the non-judgemental provision of services and resources for people who create access barriers within the disciplines in which they work, in order to assist them in reducing harm. The original principle this is derived from is: “Calls for the non-judgmental, non-coercive provision of services and resources to people who use drugs and the communities in which they live in order to assist them in reducing attendant harm.” Do not minimize or ignore real harm Does not attempt to minimize or ignore the real and tragic harm and danger that can be created by inaccessible experiences. The original principle this is derived from is: “Does not attempt to minimize or ignore the real and tragic harm and danger that can be associated with illicit drug use.” Some barriers are worse than others Understands that how access barriers are created is a complex, multi-faceted phenomenon that encompasses a range of severities from life-endangering to annoying, and acknowledges that some barriers are clearly worse than others. The original principle this is derived from is: “Understands drug use as a complex, multi-faceted phenomenon that encompasses a continuum of behaviors from severe use to total abstinence, and acknowledges that some ways of using drugs are clearly safer than others.” Social inequalities affect vulnerability Recognizes that the realities of poverty, class, racism, social isolation, past trauma, sex-based discrimination, and other social inequalities affect both people’s vulnerability to, and capacity for effectively dealing with creating inaccessible experiences. The original principle this is derived from is: “Recognizes that the realities of poverty, class, racism, social isolation, past trauma, sex-based discrimination, and other social inequalities affect both people’s vulnerability to and capacity for effectively dealing with drug-related harm.” Improvement of quality is success Establishes quality of individual and team life and well-being—not necessarily cessation of all current workflows—as the criteria for successful interventions and policies. The original principle this is derived from is: “Establishes quality of individual and community life and well-being—not necessarily cessation of all drug use—as the criteria for successful interventions and policies.” Empowering people also helps their peers Affirms people who create access barriers themselves as the primary agents of reducing the harms of their efforts, and seeks to empower them to share information and support each other in creating and using remediation strategies that are effective for their daily workflows. The original principle this is derived from is: “Affirms people who use drugs themselves as the primary agents of reducing the harms of their drug use and seeks to empower people who use drugs to share information and support each other in strategies which meet their actual conditions of use.” Ensure that disabled people have a voice in change Ensures that people who are affected by access barriers, and those who have been affected by your organization’s access barriers, have a real voice in the creation of features and services designed to serve them. The original principle this is derived from is: “Ensures that people who use drugs and those with a history of drug use routinely have a real voice in the creation of programs and policies designed to serve them.” Reframe My talk digs deeper into into the parallels between the adapted and original principles, as well as the similarities between digital accessibility and harm reduction work. This is in the service of attempting to reframe our efforts. By this, I mean that we are miscategorized participants in imperfect, trauma-generating systems. The change in perspective I am advocating for also compels changes in behavior in order to not only survive, but also flourish as digital accessibility practitioners. The adapted principles are integral to making this effort successful.
I get asked about my opinion on overlay-adjacent accessibility products with enough frequency that I thought it could be helpful to write about it. There’s a category of third party products out there that are almost, but not quite an accessibility overlay. By this I mean that they seem a little less predatory, and a little more grounded in terms of the promises they make. Some of these products are widgets. Some are browser extensions. Some are apps. Some are an odd fourth thing. Sometimes it’s a case of a solutioneering disability dongle grift, sometimes its a case of good intentions executed in a less-than-optimal way, and sometimes it’s something legitimately helpful. Oftentimes it’s something that lies in the middle area of all of this. Many of them also have some sort of “AI” integration, which is the unfortunate upsell du jour we have to collectively endure for the time being. The rubric I use to evaluate these products remains very similar to how I scrutinize overlays. Hopefully it’s something that can be helpful for your own efforts. Should the product’s functionality be patented? I’m not very happy with the idea that the mechanism to operate something in an accessible way is inhibited by way of legal restriction. This artificially limits who can use it, which is in opposition to the overall mission of digital accessibility. Ideally the technology is the free bit, and the service that facilitates it is what generates the profit. Do I need to subscribe to use it? A subscription-based model is a great way to run a business, but you don’t need to pay a recurring fee to use an accessible website. The nature of the web’s technology means it can be operated via keyboard, voice control, and other assistive technology if constructed properly. Workarounds and community support also exist for some things where it’s not built well. Here I’d also like you to consider the disability tax, and how that factors into a rental model. It’s not great. Does the browser or operating system already have this functionality? A lot of the time this boils down to an issue of discovery, digital literacy, or identity. As touched on in the previous section, browsers and operating systems offer a lot to help you self-serve. Notable examples are reading mode, on-screen narration, color filters, interface and text zoom, and forced color inversion. Can it be used across multiple experiences, or just one website? Stability and predictability of operation and output are vital for technology like this. It’s why I am so bullish on utilizing existing browser and operating system features. Products built to “enhance” the accessibility of a single website or app can’t contribute towards this. Ironically, their presence may actually contribute friction towards someone’s existing method of using things. A tricky little twist here is products that target a single website are often advertised towards the website owner, and not the people who will be using said website. Can I use the keyboard to operate it? I’ve gotten in the habit of pressing Tab a few times when I first check out the product’s website and see if anything happens. It’s a quick and easy test to see if the company walks the walk in addition to talking the talk. Here, I regrettably encounter missing focus indicators and non-semantic interactive controls more often than not. I might also sometimes run the homepage through axe DevTools, to see if there are other egregious errors. I then try to use the product itself with a keyboard if a demo is offered. I am usually found wanting here. How reliable is the AI? There are two broad considerations here: How reliable is the output? How can bias affect someone’s interpretation of things? While I am a skeptic, I can also acknowledge that there are some good use cases for LLMs and related technology when it comes to disability. I think about reliability in terms of the output in terms of the “assistive” part of assistive technology. By this, I mean it actually helps you do what you need to get done. Here, I’d point to Salma Alam-Naylor’s experience with newer startups in this space versus established, community supported solutions. Then consider LLM-based image description products. Here we want to make sure the content is accurate and relevant. Remember that image descriptions are the mechanism that some people rely on to help them understand the world. If that description is not accurate, it impacts how they form an understanding of their environment. A step past that thought is the biases inherent in, and perpetuated by LLM-based technology. I recall Ben Myers’ thoughts on implicit, hegemonic normalization, as well as the sobering truth that this technology can exert influence over its users worldview at scale. Can the company be trusted with your data? A lot of assistive technology is purposely designed to not announce the fact that it is being used. This is to stave off things like discrimination or ineffective, separate-yet-equal “accessibility only” sites. There’s also the murky world of data brokerage, and if the company is selling off this information or not. AccessiBe comes to mind here, and not in a good way. Also consider if the product has access to everything you visit and interact with, and who has access to that information. As a companion concern, it is also worth considering the product’s data security practices—or lack thereof. Here, I would like to point out that startups tend to deprioritize this boring kind of infrastructure work in favor of feature creation. Not having any personal information present in a system is the best way to guard against its theft. Also know that there is no way to undo a data breach once it occurs. Leaked information stays leaked. Will the company last? Speaking of startups, know that more fail than succeed. Are you prepared for an outcome where the product you rely on is is no longer updated or supported because the company that made it went out of business? It could also be a case where the company still exists, but ceases to support the product you use. Here, know that sometimes these companies will actively squash attempts for community-based resurrection and support of the service because it represents potential liability. This concern is another reason why I’m bullish on operating system and browser functionality. They have a lot more resiliency and focus on the long view in this particular area. But also I’m not the arbiter of who can use what. In the spirit of “the best camera is the one you have on you:” if something works for your specific access needs, by all means use it.
A lieutenant colonel in the Soviet Air Defense Forces prevented the end of human civilization on September 26th, 1983. His name was Stanislav Petrov. Protocol dictated that the Soviet Union would retaliate against any nuclear strikes sent by the United States. This was a policy of mutually assured destruction, a doctrine that compels a horrifying logical conclusion. The second and third stage effects of this type of exchange would be even more catastrophic. Allies for each side would likely be pulled into the conflict. The resulting nuclear winter was projected to lead to 2 billion deaths due to starvation. This is to say nothing about those who would have been unfortunate enough to have survived. Petrov’s job was to monitor Oko, the computerized warning systems built to centralize Soviet satellite communications. Around midnight, he received a report that one of the satellites had detected the infrared signature of a single launch of a United States ICBM. While Petrov was deciding what to do about this report, the system detected four more incoming missile launches. He had minutes to make a choice about what to do. It is impossible to imagine the amount of pressure placed on him at this moment. Source: Stanislav Petrov, Soviet officer credited with averting nuclear war, dies at 77 by Schwartzreport. Petrov lived in a world of deterministic systems. The technologies that powered these warning systems have outputs that are guaranteed, provided the proper inputs are provided. However, deterministic does not mean infallible. The only reason you are alive and reading this is because Petrov understood that the systems he observed were capable of error. He was suspicious of what he was seeing reported, and chose not to escalate a retaliatory strike. There were two factors guiding his decision: A surprise attack would most likely have used hundreds of missiles, and not just five. The allegedly foolproof Oko system was new and prone to errors. An error in a deterministic system can still lead to expected outputs being generated. For the Oko system, infrared reflections of the sun shining off of the tops of clouds created a false positive that was interpreted as detection of a nuclear launch event. Source: US-K History by Kosmonavtika. The concept of erroneous truth is a deep thing to internalize, as computerized systems are presented as omniscient, indefective, and absolute. Petrov’s rewards for this action were reprimands, reassignment, and denial of promotion. This was likely for embarrassing his superiors by the politically inconvenient shedding of light on issues with the Oko system. A coerced early retirement caused a nervous breakdown, likely him having to grapple with the weight of his decision. It was only in the 1990s—after the fall of the Soviet Union—that his actions were discovered internationally and celebrated. Stanislav Petrov was given the recognition that he deserved, including being honored by the United Nations, awarded the Dresden Peace Prize, featured in a documentary, and being able to visit a Minuteman Missile silo in the United States. On January 31st, 2025, OpenAI struck a deal with the United States government to use its AI product for nuclear weapon security. It is unclear how this technology will be used, where, and to what extent. It is also unclear how OpenAI’s systems function, as they are black box technologies. What is known is that LLM-generated responses—the product OpenAI sells—are non-deterministic. Non-deterministic systems don’t have guaranteed outputs from their inputs. In addition, LLM-based technology hallucinates—it invents content with no self-knowledge that it is a falsehood. Non-deterministic systems that are computerized also have the perception as being authoritative, the same as their deterministic peers. It is not a question of how the output is generated, it is one of the output being perceived to come from a machine. These are terrifying things to know. Consider not only the systems this technology is being applied to, but also the thoughtless speed of their integration. Then consider how we’ve historically been conditioned and rewarded to interpret the output of these systems, and then how we perceive and treat skeptics. We don’t live in a purely deterministic world of technology anymore. Stanislav Petrov died on September 18th, 2017, before this change occurred. I would be incredibly curious to know his thoughts about our current reality, as well as the increasing abdication of human monitoring of automated systems in favor of notably biased, supposed “AI solutions.” In acknowledging Petrov’s skepticism in a time of mania and political instability, we acknowledge a quote from former U.S. Secretary of Defense William J. Perry’s memoir about the incident: [Oko’s false positives] illustrates the immense danger of placing our fate in the hands of automated systems that are susceptible to failure and human beings who are fallible.
More in programming
I like the job title “Design Engineer”. When required to label myself, I feel partial to that term (I should, I’ve written about it enough). Lately I’ve felt like the term is becoming more mainstream which, don’t get me wrong, is a good thing. I appreciate the diversification of job titles, especially ones that look to stand in the middle between two binaries. But — and I admit this is a me issue — once a title starts becoming mainstream, I want to use it less and less. I was never totally sure why I felt this way. Shouldn’t I be happy a title I prefer is gaining acceptance and understanding? Do I just want to rebel against being labeled? Why do I feel this way? These were the thoughts simmering in the back of my head when I came across an interview with the comedian Brian Regan where he talks about his own penchant for not wanting to be easily defined: I’ve tried over the years to write away from how people are starting to define me. As soon as I start feeling like people are saying “this is what you do” then I would be like “Alright, I don't want to be just that. I want to be more interesting. I want to have more perspectives.” [For example] I used to crouch around on stage all the time and people would go “Oh, he’s the guy who crouches around back and forth.” And I’m like, “I’ll show them, I will stand erect! Now what are you going to say?” And then they would go “You’re the guy who always feels stupid.” So I started [doing other things]. He continues, wondering aloud whether this aversion to not being easily defined has actually hurt his career in terms of commercial growth: I never wanted to be something you could easily define. I think, in some ways, that it’s held me back. I have a nice following, but I’m not huge. There are people who are huge, who are great, and deserve to be huge. I’ve never had that and sometimes I wonder, ”Well maybe it’s because I purposely don’t want to be a particular thing you can advertise or push.” That struck a chord with me. It puts into words my current feelings towards the job title “Design Engineer” — or any job title for that matter. Seven or so years ago, I would’ve enthusiastically said, “I’m a Design Engineer!” To which many folks would’ve said, “What’s that?” But today I hesitate. If I say “I’m a Design Engineer” there are less follow up questions. Now-a-days that title elicits less questions and more (presumed) certainty. I think I enjoy a title that elicits a “What’s that?” response, which allows me to explain myself in more than two or three words, without being put in a box. But once a title becomes mainstream, once people begin to assume they know what it means, I don’t like it anymore (speaking for myself, personally). As Brian says, I like to be difficult to define. I want to have more perspectives. I like a title that befuddles, that doesn’t provide a presumed sense of certainty about who I am and what I do. And I get it, that runs counter to the very purpose of a job title which is why I don’t think it’s good for your career to have the attitude I do, lol. I think my own career evolution has gone something like what Brian describes: Them: “Oh you’re a Designer? So you make mock-ups in Photoshop and somebody else implements them.” Me: “I’ll show them, I’ll implement them myself! Now what are you gonna do?” Them: “Oh, so you’re a Design Engineer? You design and build user interfaces on the front-end.” Me: “I’ll show them, I’ll write a Node server and setup a database that powers my designs and interactions on the front-end. Now what are they gonna do?” Them: “Oh, well, we I’m not sure we have a term for that yet, maybe Full-stack Design Engineer?” Me: “Oh yeah? I’ll frame up a user problem, interface with stakeholders, explore the solution space with static designs and prototypes, implement a high-fidelity solution, and then be involved in testing, measuring, and refining said solution. What are you gonna call that?” [As you can see, I have some personal issues I need to work through…] As Brian says, I want to be more interesting. I want to have more perspectives. I want to be something that’s not so easily definable, something you can’t sum up in two or three words. I’ve felt this tension my whole career making stuff for the web. I think it has led me to work on smaller teams where boundaries are much more permeable and crossing them is encouraged rather than discouraged. All that said, I get it. I get why titles are useful in certain contexts (corporate hierarchies, recruiting, etc.) where you’re trying to take something as complicated and nuanced as an individual human beings and reduce them to labels that can be categorized in a database. I find myself avoiding those contexts where so much emphasis is placed in the usefulness of those labels. “I’ve never wanted to be something you could easily define” stands at odds with the corporate attitude of, “Here’s the job req. for the role (i.e. cog) we’re looking for.” Email · Mastodon · Bluesky
We received over 2,200 applications for our just-closed junior programmer opening, and now we're going through all of them by hand and by human. No AI screening here. It's a lot of work, but we have a great team who take the work seriously, so in a few weeks, we'll be able to invite a group of finalists to the next phase. This highlights the folly of thinking that what it'll take to land a job like this is some specific list of criteria, though. Yes, you have to present a baseline of relevant markers to even get into consideration, like a great cover letter that doesn't smell like AI slop, promising projects or work experience or educational background, etc. But to actually get the job, you have to be the best of the ones who've applied! It sounds self-evident, maybe, but I see questions time and again about it, so it must not be. Almost every job opening is grading applicants on the curve of everyone who has applied. And the best candidate of the lot gets the job. You can't quantify what that looks like in advance. I'm excited to see who makes it to the final stage. I already hear early whispers that we got some exceptional applicants in this round. It would be great to help counter the narrative that this industry no longer needs juniors. That's simply retarded. However good AI gets, we're always going to need people who know the ins and outs of what the machine comes up with. Maybe not as many, maybe not in the same roles, but it's truly utopian thinking that mankind won't need people capable of vetting the work done by AI in five minutes.
Recently I got a question on formal methods1: how does it help to mathematically model systems when the system requirements are constantly changing? It doesn't make sense to spend a lot of time proving a design works, and then deliver the product and find out it's not at all what the client needs. As the saying goes, the hard part is "building the right thing", not "building the thing right". One possible response: "why write tests"? You shouldn't write tests, especially lots of unit tests ahead of time, if you might just throw them all away when the requirements change. This is a bad response because we all know the difference between writing tests and formal methods: testing is easy and FM is hard. Testing requires low cost for moderate correctness, FM requires high(ish) cost for high correctness. And when requirements are constantly changing, "high(ish) cost" isn't affordable and "high correctness" isn't worthwhile, because a kinda-okay solution that solves a customer's problem is infinitely better than a solid solution that doesn't. But eventually you get something that solves the problem, and what then? Most of us don't work for Google, we can't axe features and products on a whim. If the client is happy with your solution, you are expected to support it. It should work when your customers run into new edge cases, or migrate all their computers to the next OS version, or expand into a market with shoddy internet. It should work when 10x as many customers are using 10x as many features. It should work when you add new features that come into conflict. And just as importantly, it should never stop solving their problem. Canonical example: your feature involves processing requested tasks synchronously. At scale, this doesn't work, so to improve latency you make it asynchronous. Now it's eventually consistent, but your customers were depending on it being always consistent. Now it no longer does what they need, and has stopped solving their problems. Every successful requirement met spawns a new requirement: "keep this working". That requirement is permanent, or close enough to decide our long-term strategy. It takes active investment to keep a feature behaving the same as the world around it changes. (Is this all a pretentious of way of saying "software maintenance is hard?" Maybe!) Phase changes In physics there's a concept of a phase transition. To raise the temperature of a gram of liquid water by 1° C, you have to add 4.184 joules of energy.2 This continues until you raise it to 100°C, then it stops. After you've added two thousand joules to that gram, it suddenly turns into steam. The energy of the system changes continuously but the form, or phase, changes discretely. Software isn't physics but the idea works as a metaphor. A certain architecture handles a certain level of load, and past that you need a new architecture. Or a bunch of similar features are independently hardcoded until the system becomes too messy to understand, you remodel the internals into something unified and extendable. etc etc etc. It's doesn't have to be totally discrete phase transition, but there's definitely a "before" and "after" in the system form. Phase changes tend to lead to more intricacy/complexity in the system, meaning it's likely that a phase change will introduce new bugs into existing behaviors. Take the synchronous vs asynchronous case. A very simple toy model of synchronous updates would be Set(key, val), which updates data[key] to val.3 A model of asynchronous updates would be AsyncSet(key, val, priority) adds a (key, val, priority, server_time()) tuple to a tasks set, and then another process asynchronously pulls a tuple (ordered by highest priority, then earliest time) and calls Set(key, val). Here are some properties the client may need preserved as a requirement: If AsyncSet(key, val, _, _) is called, then eventually db[key] = val (possibly violated if higher-priority tasks keep coming in) If someone calls AsyncSet(key1, val1, low) and then AsyncSet(key2, val2, low), they should see the first update and then the second (linearizability, possibly violated if the requests go to different servers with different clock times) If someone calls AsyncSet(key, val, _) and immediately reads db[key] they should get val (obviously violated, though the client may accept a slightly weaker property) If the new system doesn't satisfy an existing customer requirement, it's prudent to fix the bug before releasing the new system. The customer doesn't notice or care that your system underwent a phase change. They'll just see that one day your product solves their problems, and the next day it suddenly doesn't. This is one of the most common applications of formal methods. Both of those systems, and every one of those properties, is formally specifiable in a specification language. We can then automatically check that the new system satisfies the existing properties, and from there do things like automatically generate test suites. This does take a lot of work, so if your requirements are constantly changing, FM may not be worth the investment. But eventually requirements stop changing, and then you're stuck with them forever. That's where models shine. As always, I'm using formal methods to mean the subdiscipline of formal specification of designs, leaving out the formal verification of code. Mostly because "formal specification" is really awkward to say. ↩ Also called a "calorie". The US "dietary Calorie" is actually a kilocalorie. ↩ This is all directly translatable to a TLA+ specification, I'm just describing it in English to avoid paying the syntax tax ↩
While Stripe is a widely admired company for things like its creation of the Sorbet typer project, I personally think that Stripe’s most interesting strategy work is also among its most subtle: its willingness to significantly prioritize API stability. This strategy is almost invisible externally. Internally, discussions around it were frequent and detailed, but mostly confined to dedicated API design conversations. API stability isn’t just a technical design quirk, it’s a foundational decision in an API-driven business, and I believe it is one of the unsung heroes of Stripe’s business success. This is an exploratory, draft chapter for a book on engineering strategy that I’m brainstorming in #eng-strategy-book. As such, some of the links go to other draft chapters, both published drafts and very early, unpublished drafts. Reading this document To apply this strategy, start at the top with Policy. To understand the thinking behind this strategy, read sections in reverse order, starting with Explore. More detail on this structure in Making a readable Engineering Strategy document. Policy & Operation Our policies for managing API changes are: Design for long API lifetime. APIs are not inherently durable. Instead we have to design thoughtfully to ensure they can support change. When designing a new API, build a test application that doesn’t use this API, then migrate to the new API. Consider how integrations might evolve as applications change. Perform these migrations yourself to understand potential friction with your API. Then think about the future changes that we might want to implement on our end. How would those changes impact the API, and how would they impact the application you’ve developed. At this point, take your API to API Review for initial approval as described below. Following that approval, identify a handful of early adopter companies who can place additional pressure on your API design, and test with them before releasing the final, stable API. All new and modified APIs must be approved by API Review. API changes may not be enabled for customers prior to API Review approval. Change requests should be sent to api-review email group. For examples of prior art, review the api-review archive for prior requests and the feedback they received. All requests must include a written proposal. Most requests will be approved asynchronously by a member of API Review. Complex or controversial proposals will require live discussions to ensure API Review members have sufficient context before making a decision. We never deprecate APIs without an unavoidable requirement to do so. Even if it’s technically expensive to maintain support, we incur that support cost. To be explicit, we define API deprecation as any change that would require customers to modify an existing integration. If such a change were to be approved as an exception to this policy, it must first be approved by the API Review, followed by our CEO. One example where we granted an exception was the deprecation of TLS 1.2 support due to PCI compliance obligations. When significant new functionality is required, we add a new API. For example, we created /v1/subscriptions to support those workflows rather than extending /v1/charges to add subscriptions support. With the benefit of hindsight, a good example of this policy in action was the introduction of the Payment Intents APIs to maintain compliance with Europe’s Strong Customer Authentication requirements. Even in that case the charge API continued to work as it did previously, albeit only for non-European Union payments. We manage this policy’s implied technical debt via an API translation layer. We release changed APIs into versions, tracked in our API version changelog. However, we only maintain one implementation internally, which is the implementation of the latest version of the API. On top of that implementation, a series of version transformations are maintained, which allow us to support prior versions without maintaining them directly. While this approach doesn’t eliminate the overhead of supporting multiple API versions, it significantly reduces complexity by enabling us to maintain just a single, modern implementation internally. All API modifications must also update the version transformation layers to allow the new version to coexist peacefully with prior versions. In the future, SDKs may allow us to soften this policy. While a significant number of our customers have direct integrations with our APIs, that number has dropped significantly over time. Instead, most new integrations are performed via one of our official API SDKs. We believe that in the future, it may be possible for us to make more backwards incompatible changes because we can absorb the complexity of migrations into the SDKs we provide. That is certainly not the case yet today. Diagnosis Our diagnosis of the impact on API changes and deprecation on our business is: If you are a small startup composed of mostly engineers, integrating a new payments API seems easy. However, for a small business without dedicated engineers—or a larger enterprise involving numerous stakeholders—handling external API changes can be particularly challenging. Even if this is only marginally true, we’ve modeled the impact of minimizing API changes on long-term revenue growth, and it has a significant impact, unlocking our ability to benefit from other churn reduction work. While we believe API instability directly creates churn, we also believe that API stability directly retains customers by increasing the migration overhead even if they wanted to change providers. Without an API change forcing them to change their integration, we believe that hypergrowth customers are particularly unlikely to change payments API providers absent a concrete motivation like an API change or a payment plan change. We are aware of relatively few companies that provide long-term API stability in general, and particularly few for complex, dynamic areas like payments APIs. We can’t assume that companies that make API changes are ill-informed. Rather it appears that they experience a meaningful technical debt tradeoff between the API provider and API consumers, and aren’t willing to consistently absorb that technical debt internally. Future compliance or security requirements—along the lines of our upgrade from TLS 1.2 to TLS 1.3 for PCI—may necessitate API changes. There may also be new tradeoffs exposed as we enter new markets with their own compliance regimes. However, we have limited ability to predict these changes at this point.
I've been biking in Brooklyn for a few years now! It's hard for me to believe it, but I'm now one of the people other bicyclists ask questions to now. I decided to make a zine that answers the most common of those questions: Bike Brooklyn! is a zine that touches on everything I wish I knew when I started biking in Brooklyn. A lot of this information can be found in other resources, but I wanted to collect it in one place. I hope to update this zine when we get significantly more safe bike infrastructure in Brooklyn and laws change to make streets safer for bicyclists (and everyone) over time, but it's still important to note that each release will reflect a specific snapshot in time of bicycling in Brooklyn. All text and illustrations in the zine are my own. Thank you to Matt Denys, Geoffrey Thomas, Alex Morano, Saskia Haegens, Vishnu Reddy, Ben Turndorf, Thomas Nayem-Huzij, and Ryan Christman for suggestions for content and help with proofreading. This zine is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, so you can copy and distribute this zine for noncommercial purposes in unadapted form as long as you give credit to me. Check out the Bike Brooklyn! zine on the web or download pdfs to read digitally or print here!