More from tonsky.me
Imagine you’re writing a project and need a library. Let’s call it libpupa. You look up its current version, which is 1.2.3, and add it to your dependencies: "libpupa": "1.2.3" In turn, the developer of libpupa, when writing its version 1.2.3, needed another library: liblupa. So they did the same thing: they looked up the version, which was 0.7.8 at the time, and added it to the dependencies of libpupa 1.2.3: "liblupa": "0.7.8" The version 0.7.8 of liblupa is immortalized forever in the dependencies of libpupa 1.2.3. No matter how many other versions of either liblupa or libpupa are released, libpupa 1.2.3 will always depend on liblupa 0.7.8. Our dependency resolution algorithm thus is like this: Get the top-level dependency versions Look up versions of libraries they depend on Look up versions of libraries they depend on ... The important point of this algorithm is that it’s fully deterministic. Given just the top-level dependencies, it will produce the entire dependency tree, identical every time. It’s also space-efficient: you don’t need to specify all the versions, just the top-level ones. Given libpupa 1.2.3, we will always arrive at liblupa 0.7.8. So why write it down in a separate file? And that’s it. End of story. Write down your top-level dependencies. The computer will figure out transitive ones. They are guaranteed to be the same, since everything is immutable. The sun is shining, the grass is green, and builds are fully reproducible. But people had to invent lockfiles. Imagine you voluntarily made your build non-reproducible by making them depend on time. If I build my app now, I get libpupa 1.2.3 and liblupa 0.7.8. If I repeat the same build in 10 minutes, I’ll get liblupa 0.7.9. Crazy, right? That would be chaos. But this is what version ranges essentially are. Instead of saying “libpupa 1.2.3 depends on liblupa 0.7.8”, they are saying “libpupa 1.2.3 depends on whatever the latest liblupa version is at the time of the build.” Notice that this is determined not at the time of publishing, but at the time of the build! If the author of libpupa has published 1.2.3 a year ago and I’m pulling it now, I might be using a liblupa version that didn’t even exist at the time of publishing! But... why would libpupa’s author write a version range that includes versions that don’t exist yet? How could they know that liblupa 0.7.9, whenever it will be released, will continue to work with libpupa? Surely they can’t see the future? Semantic versioning is a hint, but it has never been a guarantee. For that, kids, I have no good answer. The funny thing is, these version ranges end up not being used anyway. You lock your dependencies once in a lockfile and they stay there, unchanged. You don’t even get the good part! I guess, builds that depend on the calendar date are too crazy even for people who believe that referencing non-existing versions is fine. “But Niki, you can regenerate the lockfile and pull in all the new dependencies!” Sure. In exactly the same way you can update your top-level dependencies. “But Niki, lockfiles help resolve version conflicts!” In what way? Version conflicts don’t happen because of what’s written in dependency files. Your library might work with the newer dependency, or it might not. It doesn’t really depend on what the library’s author has guessed. Your library might have a version range of 0.7.*, work with 0.7.8, 0.7.9 but not with 0.7.10. Either way, the solution is the same: you have to pick the version that works. And the fact that someone somewhere long time ago wrote 0.7.* doesn’t really help you. “But Niki, if lockfiles exist, there must be a reason! People can’t be doing it for nothing!” You are new in IT, I see. People absolutely can and do things here for no good reason all the time. But if you want an existence proof: Maven. The Java library ecosystem has been going strong for 20 years, and during that time not once have we needed a lockfile. And we are pulling hundreds of libraries just to log two lines of text, so it is actively used at scale. In conclusion, lockfiles are an absolutely unnecessary concept that complicates things without a good reason. Dependency managers can and are working without it just the same.
Any person who has used a computer in the past ten years knows that doing meaningless tasks is just part of the experience. Millions of people create accounts, confirm emails, dismiss notifications, solve captchas, reject cookies, and accept terms and conditions—not because they particularly want to or even need to. They do it because that’s what the computer told them to do. Like it or not, we are already serving the machines. Well, now there is a new way to serve our silicon overlords. LLMs started to have opinions on how your API should look, and since 90% of all code will be written by AI comes September, we have no choice but to oblige. You might’ve heard a story of Soundslice adding a feature because ChatGPT kept telling people it exists. We see the same at Instant: for example, we used tx.update for both inserting and updating entities, but LLMs kept writing tx.create instead. Guess what: we now have tx.create, too. Is it good or is it bad? It definitely feels strange. In a sense, it’s helpful: LLMs here have seen millions of other APIs and are suggesting the most obvious thing, something every developer would think of first, too. It’s also a unique testing device: if developers use your API wrong, they blame themselves, read the documentation, and fix their code. In the end, you might never learn that they even had the problem. But with ChatGPT, you yourself can experience “newbie’s POV” at any time. Of course, this approach doesn’t work if you are trying to do something new and unique. LLMs just won’t “get it”. But how many of us are doing something new and unique? Maybe, API is not the place to get clever? Maybe, for most cases, it’s truly best if you did the most obvious thing? So welcome to the new era. AI is not just using tools we gave it. It now has opinions about how these tools should’ve been made. And instead of asking nicely, it gaslights everybody into thinking that’s how it’s always been.
Чем Datomic отличается от других баз данных и почему иногда остутствие оптимизатора лучше, чем его присутствие
We’ll explore the complexities of traditional stack (db-server-frontend), develop a theory of software evolution: which systems succeed and why. Then we’ll see how local-first fits into it and which local-first-adjacent practices are making software development easier and therefore are doomed for success (or not?)
More in programming
The 2025 edition of the TokyoDev Developer Survey is now live! If you’re a software developer living in Japan, please take a few minutes to participate. All questions are optional, and it should take less than 10 minutes to complete. The survey will remain open until September 30th. Last year, we received over 800 responses. Highlights included: Median compensation remained stable. The pay gap between international and Japanese companies narrowed to 47%. Fewer respondents had the option to work fully remotely. For 2025, we’ve added several new questions, including a dedicated section on one of the most talked-about topics in development today: AI. The survey is completely anonymous, and only aggregated results will be shared—never personally identifiable information. The more responses we get, the deeper and more meaningful our insights will be. Please help by taking the survey and sharing it with your peers!
Greetings everyone! You might have noticed that it's September and I don't have the next version of Logic for Programmers ready. As penance, here's ten free copies of the book. So a few months ago I wrote a newsletter about how we use nondeterminism in formal methods. The overarching idea: Nondeterminism is when multiple paths are possible from a starting state. A system preserves a property if it holds on all possible paths. If even one path violates the property, then we have a bug. An intuitive model of this is that for this is that when faced with a nondeterministic choice, the system always makes the worst possible choice. This is sometimes called demonic nondeterminism and is favored in formal methods because we are paranoid to a fault. The opposite would be angelic nondeterminism, where the system always makes the best possible choice. A property then holds if any possible path satisfies that property.1 This is not as common in FM, but it still has its uses! "Players can access the secret level" or "We can always shut down the computer" are reachability properties, that something is possible even if not actually done. In broader computer science research, I'd say that angelic nondeterminism is more popular, due to its widespread use in complexity analysis and programming languages. Complexity Analysis P is the set of all "decision problems" (basically, boolean functions) can be solved in polynomial time: there's an algorithm that's worst-case in O(n), O(n²), O(n³), etc.2 NP is the set of all problems that can be solved in polynomial time by an algorithm with angelic nondeterminism.3 For example, the question "does list l contain x" can be solved in O(1) time by a nondeterministic algorithm: fun is_member(l: List[T], x: T): bool { if l == [] {return false}; guess i in 0..<(len(l)-1); return l[i] == x; } Say call is_member([a, b, c, d], c). The best possible choice would be to guess i = 2, which would correctly return true. Now call is_member([a, b], d). No matter what we guess, the algorithm correctly returns false. and just return false. Ergo, O(1). NP stands for "Nondeterministic Polynomial". (And I just now realized something pretty cool: you can say that P is the set of all problems solvable in polynomial time under demonic nondeterminism, which is a nice parallel between the two classes.) Computer scientists have proven that angelic nondeterminism doesn't give us any more "power": there are no problems solvable with AN that aren't also solvable deterministically. The big question is whether AN is more efficient: it is widely believed, but not proven, that there are problems in NP but not in P. Most famously, "Is there any variable assignment that makes this boolean formula true?" A polynomial AN algorithm is again easy: fun SAT(f(x1, x2, …: bool): bool): bool { N = num_params(f) for i in 1..=num_params(f) { guess x_i in {true, false} } return f(x_1, x_2, …) } The best deterministic algorithms we have to solve the same problem are worst-case exponential with the number of boolean parameters. This a real frustrating problem because real computers don't have angelic nondeterminism, so problems like SAT remain hard. We can solve most "well-behaved" instances of the problem in reasonable time, but the worst-case instances get intractable real fast. Means of Abstraction We can directly turn an AN algorithm into a (possibly much slower) deterministic algorithm, such as by backtracking. This makes AN a pretty good abstraction over what an algorithm is doing. Does the regex (a+b)\1+ match "abaabaabaab"? Yes, if the regex engine nondeterministically guesses that it needs to start at the third letter and make the group aab. How does my PL's regex implementation find that match? I dunno, backtracking or NFA construction or something, I don't need to know the deterministic specifics in order to use the nondeterministic abstraction. Neel Krishnaswami has a great definition of 'declarative language': "any language with a semantics has some nontrivial existential quantifiers in it". I'm not sure if this is identical to saying "a language with an angelic nondeterministic abstraction", but they must be pretty close, and all of his examples match: SQL's selects and joins Parsing DSLs Logic programming's unification Constraint solving On top of that I'd add CSS selectors and planner's actions; all nondeterministic abstractions over a deterministic implementation. He also says that the things programmers hate most in declarative languages are features that "that expose the operational model": constraint solver search strategies, Prolog cuts, regex backreferences, etc. Which again matches my experiences with angelic nondeterminism: I dread features that force me to understand the deterministic implementation. But they're necessary, since P probably != NP and so we need to worry about operational optimizations. Eldritch Nondeterminism If you need to know the ratio of good/bad paths, the number of good paths, or probability, or anything more than "there is a good path" or "there is a bad path", you are beyond the reach of heaven or hell. Angelic and demonic nondeterminism are duals: angelic returns "yes" if some choice: correct and demonic returns "no" if !all choice: correct, which is the same as some choice: !correct. ↩ Pet peeve about Big-O notation: O(n²) is the set of all algorithms that, for sufficiently large problem sizes, grow no faster that quadratically. "Bubblesort has O(n²) complexity" should be written Bubblesort in O(n²), not Bubblesort = O(n²). ↩ To be precise, solvable in polynomial time by a Nondeterministic Turing Machine, a very particular model of computation. We can broadly talk about P and NP without framing everything in terms of Turing machines, but some details of complexity classes (like the existence "weak NP-hardness") kinda need Turing machines to make sense. ↩
(I present to you my stream of consciousness on the topic of casing as it applies to the web platform.) I’m reading about the new command and commandfor attributes — which I’m super excited about, declarative behavior invocation in HTML? YES PLEASE!! — and one thing that strikes me is the casing in these APIs. For example, the command attribute has a variety of values in HTML which correspond to APIs in JavaScript. The show-popover attribute value maps to .showPopover() in JavaScript. hide-popover maps to .hidePopover(), etc. So what we have is: lowercase in attribute names e.g. commandfor="..." kebab-case in attribute values e.g. show-popover camelCase for JS counterparts e.g. showPopover() After thinking about this a little more, I remember that HTML attributes names are case insensitive, so the browser will normalize them to lowercase during parsing. Given that, I suppose you could write commandFor="..." but it’s effectively the same. Ok, lowercase attribute names in HTML makes sense. The related popover attributes follow the same convention: popovertarget popovertargetaction And there are many other attribute names in HTML that are lowercase, e.g.: maxlength novalidate contenteditable autocomplete formenctype So that all makes sense. But wait, there are some attribute names with hyphens in them, like aria-label="..." and data-value="...". So why isn’t it command-for="..."? Well, upon further reflection, I suppose those attributes were named that way for extensibility’s sake: they are essentially wildcard attributes that represent a family of attributes that are all under the same namespace: aria-* and data-*. But wait, isn’t that an argument for doing popover-target and popover-target-action? Or command and command-for? But wait (I keep saying that) there are kebab-case attribute names in HTML — like http-equiv on the <meta> tag, or accept-charset on the form tag — but those seem more like legacy exceptions. It seems like the only answer here is: there is no rule. Naming is driven by convention and decisions are made on a case-by-case basis. But if I had to summarize, it would probably be that the default casing for new APIs tends to follow the rules I outlined at the start (and what’s reflected in the new command APIs): lowercase for HTML attributes names kebab-case for HTML attribute values camelCase for JS counterparts Let’s not even get into SVG attribute names We need one of those “bless this mess” signs that we can hang over the World Wide Web. Email · Mastodon · Bluesky
Learn how to use Vitest’s defaults to eliminate extra configuration and prevent flaky results, letting you write reliable tests with less effort.
In my previous post, I advocate turning against the unproductive. Whenever you decide to turn against a group, it’s very important to prevent purity spirals. There needs to be a bright line that doesn’t move. Here is that line. You should be, on net, producing more than you are consuming. You shouldn’t feel bad if you are producing less than you could be. But at the end of your life, total it all up. You should have produced more than you consumed. We used to make shit in this country, build shit. It needs to stop. I have to believe that the average person is net positive, because if they aren’t, we’re already too far gone, and any prospect of a democracy is over. But if we aren’t too far gone, we have to stop the hemorrhaging. The unproductive rich are in cahoots with the unproductive poor to take from you. And it’s really the unproductive rich that are the problem. They loudly frame helping the unproductive as a moral issue for helping the poor because they know deep down they are unproductive losers. But they aren’t beyond saving. They just need to make different choices. This cultural change starts with you. Private equity, market manipulators, real estate, lawyers, lobbyists. This is no longer okay. You know the type of person I’m talking about. Let’s elevate farmers, engineers, manufacturing, miners, construction, food prep, delivery, operations. Jobs that produce value that you can point to. There’s a role for everyone in society. From productive billionaires to the fry cook at McDonalds. They are both good people. But negative sum jobs need to no longer be socially okay. The days of living off the work of everyone else are over. We live in a society. You have to produce more than you consume.