Full Width [alt+shift+f] Shortcuts [alt+shift+k]
Sign Up [alt+shift+s] Log In [alt+shift+l]
17
No newsletter next week I’ll be speaking at Systems Distributed. My talk isn't close to done yet, which is why this newsletter is both late and short. Solving LinkedIn Queens in SMT The article Modern SAT solvers: fast, neat and underused claims that SAT solvers1 are "criminally underused by the industry". A while back on the newsletter I asked "why": how come they're so powerful and yet nobody uses them? Many experts responded saying the reason is that encoding SAT kinda sucked and they rather prefer using tools that compile to SAT. I was reminded of this when I read Ryan Berger's post on solving “LinkedIn Queens” as a SAT problem. A quick overview of Queens. You’re presented with an NxN grid divided into N regions, and have to place N queens so that there is exactly one queen in each row, column, and region. While queens can be on the same diagonal, they cannot be adjacently diagonal. (Important note: Linkedin “Queens” is a variation on the puzzle game Star Battle, which is the...
3 weeks ago

Improve your reading experience

Logged in users get linked directly to articles resulting in a better reading experience. Please login for free, it takes less than 1 minute.

More from Computer Things

Logical Quantifiers in Software

I realize that for all I've talked about Logic for Programmers in this newsletter, I never once explained basic logical quantifiers. They're both simple and incredibly useful, so let's do that this week! Sets and quantifiers A set is a collection of unordered, unique elements. {1, 2, 3, …} is a set, as are "every programming language", "every programming language's Wikipedia page", and "every function ever defined in any programming language's standard library". You can put whatever you want in a set, with some very specific limitations to avoid certain paradoxes.2 Once we have a set, we can ask "is something true for all elements of the set" and "is something true for at least one element of the set?" IE, is it true that every programming language has a set collection type in the core language? We would write it like this: # all of them all l in ProgrammingLanguages: HasSetType(l) # at least one some l in ProgrammingLanguages: HasSetType(l) This is the notation I use in the book because it's easy to read, type, and search for. Mathematicians historically had a few different formats; the one I grew up with was ∀x ∈ set: P(x) to mean all x in set, and ∃ to mean some. I use these when writing for just myself, but find them confusing to programmers when communicating. "All" and "some" are respectively referred to as "universal" and "existential" quantifiers. Some cool properties We can simplify expressions with quantifiers, in the same way that we can simplify !(x && y) to !x || !y. First of all, quantifiers are commutative with themselves. some x: some y: P(x,y) is the same as some y: some x: P(x, y). For this reason we can write some x, y: P(x,y) as shorthand. We can even do this when quantifying over different sets, writing some x, x' in X, y in Y instead of some x, x' in X: some y in Y. We can not do this with "alternating quantifiers": all p in Person: some m in Person: Mother(m, p) says that every person has a mother. some m in Person: all p in Person: Mother(m, p) says that someone is every person's mother. Second, existentials distribute over || while universals distribute over &&. "There is some url which returns a 403 or 404" is the same as "there is some url which returns a 403 or some url that returns a 404", and "all PRs pass the linter and the test suites" is the same as "all PRs pass the linter and all PRs pass the test suites". Finally, some and all are duals: some x: P(x) == !(all x: !P(x)), and vice-versa. Intuitively: if some file is malicious, it's not true that all files are benign. All these rules together mean we can manipulate quantifiers almost as easily as we can manipulate regular booleans, putting them in whatever form is easiest to use in programming. Speaking of which, how do we use this in in programming? How we use this in programming First of all, people clearly have a need for directly using quantifiers in code. If we have something of the form: for x in list: if P(x): return true return false That's just some x in list: P(x). And this is a prevalent pattern, as you can see by using GitHub code search. It finds over 500k examples of this pattern in Python alone! That can be simplified via using the language's built-in quantifiers: the Python would be any(P(x) for x in list). (Note this is not quantifying over sets but iterables. But the idea translates cleanly enough.) More generally, quantifiers are a key way we express higher-level properties of software. What does it mean for a list to be sorted in ascending order? That all i, j in 0..<len(l): if i < j then l[i] <= l[j]. When should a ratchet test fail? When some f in functions - exceptions: Uses(f, bad_function). Should the image classifier work upside down? all i in images: classify(i) == classify(rotate(i, 180)). These are the properties we verify with tests and types and MISU and whatnot;1 it helps to be able to make them explicit! One cool use case that'll be in the book's next version: database invariants are universal statements over the set of all records, like all a in accounts: a.balance > 0. That's enforceable with a CHECK constraint. But what about something like all i, i' in intervals: NoOverlap(i, i')? That isn't covered by CHECK, since it spans two rows. Quantifier duality to the rescue! The invariant is equivalent to !(some i, i' in intervals: Overlap(i, i')), so is preserved if the query SELECT COUNT(*) FROM intervals CROSS JOIN intervals … returns 0 rows. This means we can test it via a database trigger.3 There are a lot more use cases for quantifiers, but this is enough to introduce the ideas! Next week's the one year anniversary of the book entering early access, so I'll be writing a bit about that experience and how the book changed. It's crazy how crude v0.1 was compared to the current version. MISU ("make illegal states unrepresentable") means using data representations that rule out invalid values. For example, if you have a location -> Optional(item) lookup and want to make sure that each item is in exactly one location, consider instead changing the map to item -> location. This is a means of implementing the property all i in item, l, l' in location: if ItemIn(i, l) && l != l' then !ItemIn(i, l'). ↩ Specifically, a set can't be an element of itself, which rules out constructing things like "the set of all sets" or "the set of sets that don't contain themselves". ↩ Though note that when you're inserting or updating an interval, you already have that row's fields in the trigger's NEW keyword. So you can just query !(some i in intervals: Overlap(new, i')), which is more efficient. ↩

4 days ago 10 votes
You can cheat a test suite with a big enough polynomial

Hi nerds, I'm back from Systems Distributed! I'd heartily recommend it, wildest conference I've been to in years. I have a lot of work to catch up on, so this will be a short newsletter. In an earlier version of my talk, I had a gag about unit tests. First I showed the test f([1,2,3]) == 3, then said that this was satisfied by f(l) = 3, f(l) = l[-1], f(l) = len(l), f(l) = (129*l[0]-34*l[1]-617)*l[2] - 443*l[0] + 1148*l[1] - 182. Then I progressively rule them out one by one with more unit tests, except the last polynomial which stubbornly passes every single test. If you're given some function of f(x: int, y: int, …): int and a set of unit tests asserting specific inputs give specific outputs, then you can find a polynomial that passes every single unit test. To find the gag, and as SMT practice, I wrote a Python program that finds a polynomial that passes a test suite meant for max. It's hardcoded for three parameters and only finds 2nd-order polynomials but I think it could be generalized with enough effort. The code Full code here, breakdown below. from z3 import * # type: ignore s1, s2 = Solver(), Solver() Z3 is just the particular SMT solver we use, as it has good language bindings and a lot of affordances. As part of learning SMT I wanted to do this two ways. First by putting the polynomial "outside" of the SMT solver in a python function, second by doing it "natively" in Z3. I created two solvers so I could test both versions in one run. a0, a, b, c, d, e, f = Consts('a0 a b c d e f', IntSort()) x, y, z = Ints('x y z') t = "a*x+b*y+c*z+d*x*y+e*x*z+f*y*z+a0" Both Const('x', IntSort()) and Int('x') do the exact same thing, the latter being syntactic sugar for the former. I did not know this when I wrote the program. To keep the two versions in sync I represented the equation as a string, which I later eval. This is one of the rare cases where eval is a good idea, to help us experiment more quickly while learning. The polynomial is a "2nd-order polynomial", even though it doesn't have x^2 terms, as it has xy and xz terms. lambdamax = lambda x, y, z: eval(t) z3max = Function('z3max', IntSort(), IntSort(), IntSort(), IntSort()) s1.add(ForAll([x, y, z], z3max(x, y, z) == eval(t))) lambdamax is pretty straightforward: create a lambda with three parameters and eval the string. The string "a*x" then becomes the python expression a*x, a is an SMT symbol, while the x SMT symbol is shadowed by the lambda parameter. To reiterate, a terrible idea in practice, but a good way to learn faster. z3max function is a little more complex. Function takes an identifier string and N "sorts" (roughly the same as programming types). The first N-1 sorts define the parameters of the function, while the last becomes the output. So here I assign the string identifier "z3max" to be a function with signature (int, int, int) -> int. I can load the function into the model by specifying constraints on what z3max could be. This could either be a strict input/output, as will be done later, or a ForAll over all possible inputs. Here I just use that directly to say "for all inputs, the function should match this polynomial." But I could do more complicated constraints, like commutativity (f(x, y) == f(y, x)) or monotonicity (Implies(x < y, f(x) <= f(y))). Note ForAll takes a list of z3 symbols to quantify over. That's the only reason we need to define x, y, z in the first place. The lambda version doesn't need them. inputs = [(1,2,3), (4, 2, 2), (1, 1, 1), (3, 5, 4)] for g in inputs: s1.add(z3max(*g) == max(*g)) s2.add(lambdamax(*g) == max(*g)) This sets up the joke: adding constraints to each solver that the polynomial it finds must, for a fixed list of triplets, return the max of each triplet. for s, func in [(s1, z3max), (s2, lambdamax)]: if s.check() == sat: m = s.model() for x, y, z in inputs: print(f"max([{x}, {y}, {z}]) =", m.evaluate(func(x, y, z))) print(f"max([x, y, z]) = {m[a]}x + {m[b]}y", f"+ {m[c]}z +", # linebreaks added for newsletter rendering f"{m[d]}xy + {m[e]}xz + {m[f]}yz + {m[a0]}\n") Output: max([1, 2, 3]) = 3 # etc max([x, y, z]) = -133x + 130y + -10z + -2xy + 62xz + -46yz + 0 max([1, 2, 3]) = 3 # etc max([x, y, z]) = -17x + 16y + 0z + 0xy + 8xz + -6yz + 0 I find that z3max (top) consistently finds larger coefficients than lambdamax does. I don't know why. Practical Applications Test-Driven Development recommends a strict "red-green refactor" cycle. Write a new failing test, make the new test pass, then go back and refactor. Well, the easiest way to make the new test pass would be to paste in a new polynomial, so that's what you should be doing. You can even do this all automatically: have a script read the set of test cases, pass them to the solver, and write the new polynomial to your code file. All you need to do is write the tests! Pedagogical Notes Writing the script took me a couple of hours. I'm sure an LLM could have whipped it all up in five minutes but I really want to learn SMT and LLMs may decrease learning retention.1 Z3 documentation is not... great for non-academics, though, and most other SMT solvers have even worse docs. One useful trick I use regularly is to use Github code search to find code using the same APIs and study how that works. Turns out reading API-heavy code is a lot easier than writing it! Anyway, I'm very, very slowly feeling like I'm getting the basics on how to use SMT. I don't have any practical use cases yet, but I wanted to learn this skill for a while and glad I finally did. Caveat I have not actually read the study, for all I know it could have a sample size of three people, I'll get around to it eventually ↩

a week ago 14 votes
AI is a gamechanger for TLA+ users

New Logic for Programmers Release v0.10 is now available! This is a minor release, mostly focused on logic-based refactoring, with new material on set types and testing refactors are correct. See the full release notes at the changelog page. Due to conference pressure v0.11 will also likely be a minor release. AI is a gamechanger for TLA+ users TLA+ is a specification language to model and debug distributed systems. While very powerful, it's also hard for programmers to learn, and there's always questions of connecting specifications with actual code. That's why The Coming AI Revolution in Distributed Systems caught my interest. In the post, Cheng Huang claims that Azure successfully used LLMs to examine an existing codebase, derive a TLA+ spec, and find a production bug in that spec. "After a decade of manually crafting TLA+ specifications", he wrote, "I must acknowledge that this AI-generated specification rivals human work". This inspired me to experiment with LLMs in TLA+ myself. My goals are a little less ambitious than Cheng's: I wanted to see how LLMs could help junior specifiers write TLA+, rather than handling the entire spec automatically. Details on what did and didn't work below, but my takeaway is that LLMs are an immense specification force multiplier. All tests were done with a standard VSCode Copilot subscription, writing Claude 3.7 in Agent mode. Other LLMs or IDEs may be more or less effective, etc. Things Claude was good at Fixing syntax errors TLA+ uses a very different syntax than mainstream programming languages, meaning beginners make a lot of mistakes where they do a "programming syntax" instead of TLA+ syntax: NotThree(x) = \* should be ==, not = x != 3 \* should be #, not != The problem is that the TLA+ syntax checker, SANY, is 30 years old and doesn't provide good information. Here's what it says for that snippet: Was expecting "==== or more Module body" Encountered "NotThree" at line 6, column 1 That only isolates one error and doesn't tell us what the problem is, only where it is. Experienced TLA+ users get "error eyes" and can quickly see what the problem is, but beginners really struggle with this. The TLA+ foundation has made LLM integration a priority, so the VSCode extension naturally supports several agents actions. One of these is running SANY, meaning an agent can get an error, fix it, get another error, fix it, etc. Provided the above sample and asked to make it work, Claude successfully fixed both errors. It also fixed many errors in a larger spec, as well as figure out why PlusCal specs weren't compiling to TLA+. This by itself is already enough to make LLMs a worthwhile tool, as it fixes one of the biggest barriers to entry. Understanding error traces When TLA+ finds a violated property, it outputs the sequence of steps that leads to the error. This starts in plaintext, and VSCode parses it into an interactive table: Learning to read these error traces is a skill in itself. You have to understand what's happening in each step and how it relates back to the actually broken property. It takes a long time for people to learn how to do this well. Claude was successful here, too, accurately reading 20+ step error traces and giving a high-level explanation of what went wrong. It also could condense error traces: if ten steps of the error trace could be condensed into a one-sentence summary (which can happen if you're modeling a lot of process internals) Claude would do it. I did have issues here with doing this in agent mode: while the extension does provide a "run model checker" command, the agent would regularly ignore this and prefer to run a terminal command instead. This would be fine except that the LLM consistently hallucinated invalid commands. I had to amend every prompt with "run the model checker via vscode, do not use a terminal command". You can skip this if you're willing to copy and paste the error trace into the prompt. As with syntax checking, if this was the only thing LLMs could effectively do, that would already be enough1 to earn a strong recommend. Even as a TLA+ expert I expect I'll be using this trick regularly. Boilerplate tasks TLA+ has a lot of boilerplate. One of the most notorious examples is UNCHANGED rules. Specifications are extremely precise — so precise that you have to specify what variables don't change in every step. This takes the form of an UNCHANGED clause at the end of relevant actions: RemoveObjectFromStore(srv, o, s) == /\ o \in stored[s] /\ stored' = [stored EXCEPT ![s] = @ \ {o}] /\ UNCHANGED <<capacity, log, objectsize, pc>> Writing this is really annoying. Updating these whenever you change an action, or add a new variable to the spec, is doubly so. Syntax checking and error analysis are important for beginners, but this is what I wanted for myself. I took a spec and prompted Claude Add UNCHANGED <> for each variable not changed in an action. And it worked! It successfully updated the UNCHANGED in every action. (Note, though, that it was a "well-behaved" spec in this regard: only one "action" happened at a time. In TLA+ you can have two actions happen simultaneously, that each update half of the variables, meaning neither of them should have an UNCHANGED clause. I haven't tested how Claude handles that!) That's the most obvious win, but Claude was good at handling other tedious work, too. Some examples include updating vars (the conventional collection of all state variables), lifting a hard-coded value into a model parameter, and changing data formats. Most impressive to me, though, was rewriting a spec designed for one process to instead handle multiple processes. This means taking all of the process variables, which originally have types like Int, converting them to types like [Process -> Int], and then updating the uses of all of those variables in the spec. It didn't account for race conditions in the new concurrent behavior, but it was an excellent scaffold to do more work. Writing properties from an informal description You have to be pretty precise with your intended property description but it handles converting that precise description into TLA+'s formalized syntax, which is something beginners often struggle with. Things it is less good at Generating model config files To model check TLA+, you need both a specification (.tla) and a model config file (.cfg), which have separate syntaxes. Asking the agent to generate the second often lead to it using TLA+ syntax. It automatically fixed this after getting parsing errors, though. Fixing specs Whenever the ran model checking and discovered a bug, it would naturally propose a change to either the invalid property or the spec. Sometimes the changes were good, other times the changes were not physically realizable. For example, if it found that a bug was due to a race condition between processes, it would often suggest fixing it by saying race conditions were okay. I mean yes, if you say bugs are okay, then the spec finds that bugs are okay! Or it would alternatively suggest adding a constraint to the spec saying that race conditions don't happen. But that's a huge mistake in specification, because race conditions happen if we don't have coordination. We need to specify the mechanism that is supposed to prevent them. Finding properties of the spec After seeing how capable it was at translating my properties to TLA+, I started prompting Claude to come up with properties on its own. Unfortunately, almost everything I got back was either trivial, uninteresting, or too coupled to implementation details. I haven't tested if it would work better to ask it for "properties that may be violated". Generating code from specs I have to be specific here: Claude could sometimes convert Python into a passable spec, an vice versa. It wasn't good at recognizing abstraction. For example, TLA+ specifications often represent sequential operations with a state variable, commonly called pc. If modeling code that nonatomically retrieves a counter value and increments it, we'd have one action that requires pc = "Get" and sets the new value to "Inc", then another that requires it be "Inc" and sets it to "Done". I found that Claude would try to somehow convert pc into part of the Python program's state, rather than recognize it as a TLA+ abstraction. On the other side, when converting python code to TLA+ it would often try to translate things like sleep into some part of the spec, not recognizing that it is abstractable into a distinct action. I didn't test other possible misconceptions, like converting randomness to nondeterminism. For the record, when converting TLA+ to Python Claude tended to make simulators of the spec, rather than possible production code implementing the spec. I really wasn't expecting otherwise though. Unexplored Applications Things I haven't explored thoroughly but could possibly be effective, based on what I know about TLA+ and AI: Writing Java Overrides Most TLA+ operators are resolved via TLA+ interpreters, but you can also implement them in "native" Java. This lets you escape the standard language semantics and add capabilities like executing programs during model-checking or dynamically constrain the depth of the searched state space. There's a lot of cool things I think would be possible with overrides. The problem is there's only a handful of people in the world who know how to write them. But that handful have written quite a few overrides and I think there's enough there for Claude to work with. Writing specs, given a reference mechanism In all my experiments, the LLM only had my prompts and the occasional Python script as information. That makes me suspect that some of its problems with writing and fixing specs come down to not having a system model. Maybe it wouldn't suggest fixes like "these processes never race" if it had a design doc saying that the processes can't coordinate. (Could a Sufficiently Powerful LLM derive some TLA+ specification from a design document?) Connecting specs and code This is the holy grail of TLA+: taking a codebase and showing it correctly implements a spec. Currently the best ways to do this are by either using TLA+ to generate a test suite, or by taking logged production traces and matching them to TLA+ behaviors. This blog post discusses both. While I've seen a lot of academic research into these approaches there are no industry-ready tools. So if you want trace validation you have to do a lot of manual labour tailored to your specific product. If LLMs could do some of this work for us then that'd really amplify the usefulness of TLA+ to many companies. Thoughts Right now, agents seem good at the tedious and routine parts of TLA+ and worse at the strategic and abstraction parts. But, since the routine parts are often a huge barrier to beginners, this means that LLMs have the potential to make TLA+ far, far more accessible than it previously was. I have mixed thoughts on this. As an advocate, this is incredible. I want more people using formal specifications because I believe it leads to cheaper, safer, more reliable software. Anything that gets people comfortable with specs is great for our industry. As a professional TLA+ consultant, I'm worried that this obsoletes me. Most of my income comes from training and coaching, which companies will have far less demand of now. Then again, maybe this an opportunity to pitch "agentic TLA+ training" to companies! Anyway, if you're interested in TLA+, there has never been a better time to try it. I mean it, these tools handle so much of the hard part now. I've got a free book available online, as does the inventor of TLA+. I like this guide too. Happy modeling! Dayenu. ↩

a month ago 15 votes
What does "Undecidable" mean, anyway

Systems Distributed I'll be speaking at Systems Distributed next month! The talk is brand new and will aim to showcase some of the formal methods mental models that would be useful in mainstream software development. It has added some extra stress on my schedule, though, so expect the next two monthly releases of Logic for Programmers to be mostly minor changes. What does "Undecidable" mean, anyway Last week I read Against Curry-Howard Mysticism, which is a solid article I recommend reading. But this newsletter is actually about one comment: I like to see posts like this because I often feel like I can’t tell the difference between BS and a point I’m missing. Can we get one for questions like “Isn’t XYZ (Undecidable|NP-Complete|PSPACE-Complete)?” I've already written one of these for NP-complete, so let's do one for "undecidable". Step one is to pull a technical definition from the book Automata and Computability: A property P of strings is said to be decidable if ... there is a total Turing machine that accepts input strings that have property P and rejects those that do not. (pg 220) Step two is to translate the technical computer science definition into more conventional programmer terms. Warning, because this is a newsletter and not a blog post, I might be a little sloppy with terms. Machines and Decision Problems In automata theory, all inputs to a "program" are strings of characters, and all outputs are "true" or "false". A program "accepts" a string if it outputs "true", and "rejects" if it outputs "false". You can think of this as automata studying all pure functions of type f :: string -> boolean. Problems solvable by finding such an f are called "decision problems". This covers more than you'd think, because we can bootstrap more powerful functions from these. First, as anyone who's programmed in bash knows, strings can represent any other data. Second, we can fake non-boolean outputs by instead checking if a certain computation gives a certain result. For example, I can reframe the function add(x, y) = x + y as a decision problem like this: IS_SUM(str) { x, y, z = split(str, "#") return x + y == z } Then because IS_SUM("2#3#5") returns true, we know 2 + 3 == 5, while IS_SUM("2#3#6") is false. Since we can bootstrap parameters out of strings, I'll just say it's IS_SUM(x, y, z) going forward. A big part of automata theory is studying different models of computation with different strengths. One of the weakest is called "DFA". I won't go into any details about what DFA actually can do, but the important thing is that it can't solve IS_SUM. That is, if you give me a DFA that takes inputs of form x#y#z, I can always find an input where the DFA returns true when x + y != z, or an input which returns false when x + y == z. It's really important to keep this model of "solve" in mind: a program solves a problem if it correctly returns true on all true inputs and correctly returns false on all false inputs. (total) Turing Machines A Turing Machine (TM) is a particular type of computation model. It's important for two reasons: By the Church-Turing thesis, a Turing Machine is the "upper bound" of how powerful (physically realizable) computational models can get. This means that if an actual real-world programming language can solve a particular decision problem, so can a TM. Conversely, if the TM can't solve it, neither can the programming language.1 It's possible to write a Turing machine that takes a textual representation of another Turing machine as input, and then simulates that Turing machine as part of its computations. Property (1) means that we can move between different computational models of equal strength, proving things about one to learn things about another. That's why I'm able to write IS_SUM in a pseudocode instead of writing it in terms of the TM computational model (and why I was able to use split for convenience). Property (2) does several interesting things. First of all, it makes it possible to compose Turing machines. Here's how I can roughly ask if a given number is the sum of two primes, with "just" addition and boolean functions: IS_SUM_TWO_PRIMES(z): x := 1 y := 1 loop { if x > z {return false} if IS_PRIME(x) { if IS_PRIME(y) { if IS_SUM(x, y, z) { return true; } } } y := y + 1 if y > x { x := x + 1 y := 0 } } Notice that without the if x > z {return false}, the program would loop forever on z=2. A TM that always halts for all inputs is called total. Property (2) also makes "Turing machines" a possible input to functions, meaning that we can now make decision problems about the behavior of Turing machines. For example, "does the TM M either accept or reject x within ten steps?"2 IS_DONE_IN_TEN_STEPS(M, x) { for (i = 0; i < 10; i++) { `simulate M(x) for one step` if(`M accepted or rejected`) { return true } } return false } Decidability and Undecidability Now we have all of the pieces to understand our original definition: A property P of strings is said to be decidable if ... there is a total Turing machine that accepts input strings that have property P and rejects those that do not. (220) Let IS_P be the decision problem "Does the input satisfy P"? Then IS_P is decidable if it can be solved by a Turing machine, ie, I can provide some IS_P(x) machine that always accepts if x has property P, and always rejects if x doesn't have property P. If I can't do that, then IS_P is undecidable. IS_SUM(x, y, z) and IS_DONE_IN_TEN_STEPS(M, x) are decidable properties. Is IS_SUM_TWO_PRIMES(z) decidable? Some analysis shows that our corresponding program will either find a solution, or have x>z and return false. So yes, it is decidable. Notice there's an asymmetry here. To prove some property is decidable, I need just to need to find one program that correctly solves it. To prove some property is undecidable, I need to show that any possible program, no matter what it is, doesn't solve it. So with that asymmetry in mind, do are there any undecidable problems? Yes, quite a lot. Recall that Turing machines can accept encodings of other TMs as input, meaning we can write a TM that checks properties of Turing machines. And, by Rice's Theorem, almost every nontrivial semantic3 property of Turing machines is undecidable. The conventional way to prove this is to first find a single undecidable property H, and then use that to bootstrap undecidability of other properties. The canonical and most famous example of an undecidable problem is the Halting problem: "does machine M halt on input i?" It's pretty easy to prove undecidable, and easy to use it to bootstrap other undecidability properties. But again, any nontrivial property is undecidable. Checking a TM is total is undecidable. Checking a TM accepts any inputs is undecidable. Checking a TM solves IS_SUM is undecidable. Etc etc etc. What this doesn't mean in practice I often see the halting problem misconstrued as "it's impossible to tell if a program will halt before running it." This is wrong. The halting problem says that we cannot create an algorithm that, when applied to an arbitrary program, tells us whether the program will halt or not. It is absolutely possible to tell if many programs will halt or not. It's possible to find entire subcategories of programs that are guaranteed to halt. It's possible to say "a program constructed following constraints XYZ is guaranteed to halt." The actual consequence of undecidability is more subtle. If we want to know if a program has property P, undecidability tells us We will have to spend time and mental effort to determine if it has P We may not be successful. This is subtle because we're so used to living in a world where everything's undecidable that we don't really consider what the counterfactual would be like. In such a world there might be no need for Rust, because "does this C program guarantee memory-safety" is a decidable property. The entire field of formal verification could be unnecessary, as we could just check properties of arbitrary programs directly. We could automatically check if a change in a program preserves all existing behavior. Lots of famous math problems could be solved overnight. (This to me is a strong "intuitive" argument for why the halting problem is undecidable: a halt detector can be trivially repurposed as a program optimizer / theorem-prover / bcrypt cracker / chess engine. It's too powerful, so we should expect it to be impossible.) But because we don't live in that world, all of those things are hard problems that take effort and ingenuity to solve, and even then we often fail. To be pendantic, a TM can't do things like "scrape a webpage" or "render a bitmap", but we're only talking about computational decision problems here. ↩ One notation I've adopted in Logic for Programmers is marking abstract sections of pseudocode with backticks. It's really handy! ↩ Nontrivial meaning "at least one TM has this property and at least one TM doesn't have this property". Semantic meaning "related to whether the TM accepts, rejects, or runs forever on a class of inputs". IS_DONE_IN_TEN_STEPS is not a semantic property, as it doesn't tell us anything about inputs that take longer than ten steps. ↩

a month ago 17 votes

More in programming

My first year since coming back to Linux

<![CDATA[It has been a year since I set up my System76 Merkaat with Linux Mint. In July of 2024 I migrated from ChromeOS and the Merkaat has been my daily driver on the desktop. A year later I have nothing major to report, which is the point. Despite the occasional unplanned reinstallation I have been enjoying the stability of Linux and just using the PC. This stability finally enabled me to burn bridges with mainstream operating systems and fully embrace Linux and open systems. I'm ready to handle the worst and get back to work. Just a few years ago the frustration of troubleshooting a broken system would have made me seriously consider the switch to a proprietary solution. But a year of regular use, with an ordinary mix of quiet moments and glitches, gave me the confidence to stop worrying and learn to love Linux. linux a href="https://remark.as/p/journal.paoloamoroso.com/my-first-year-since-coming-back-to-linux"Discuss.../a Email | Reply @amoroso@oldbytes.space !--emailsub--]]>

2 hours ago 2 votes
Overanalyzing a minor quirk of Espressif’s reset circuit

The mystery In the previous article, I briefly mentioned a slight difference between the ESP-Prog and the reproduced circuit, when it comes to EN: Focusing on EN, it looks like the voltage level goes back to 3.3V much faster on the ESP-Prog than on the breadboard circuit. The grid is horizontally spaced at 2ms, so … Continue reading Overanalyzing a minor quirk of Espressif’s reset circuit → The post Overanalyzing a minor quirk of Espressif’s reset circuit appeared first on Quentin Santos.

2 hours ago 1 votes
Explaining nil interface{} gotcha in Go

Explaining nil interface{} gotcha in Go A footgun In Go empty interface is an interface without any methods, typed as interface{}. A zero value of interface{} is nil: var v interface{} // compiler sets this to nil, you could explicitly write = nil if v == nil { fmt.Printf("v is nil\n") } else { fmt.Printf("v is NOT nil\n") } Try online This prints: v is nil. However, this sometimes trips people up: type Foo struct { } var v interface{} var nilFoo *Foo // implicilty initialized by compiler to nil if nilFoo == nil { fmt.Printf("nilFoo is nil.") } else { fmt.Printf("nilFoo is NOT nil.") } v = nilFoo if v == nil { fmt.Printf("v is nil\n") } else { fmt.Printf("v is NOT nil\n") } Try online This prints: nilFoo is nil. v is NOT nil. On surface level, this is wrong: t is a nil. We assigned a nil to v but it doesn’t equal to nil? How to check if interface{} is nil of any pointer type? func isNilPointer(i interface{}) bool { if i == nil { return false // interface itself is nil } v := reflect.ValueOf(i) return v.Kind() == reflect.Ptr && v.IsNil() } type Foo struct { } var pf *Foo var v interface{} = pf if isNilPointer(v) { fmt.Printf("v is nil pointer\n") } else { fmt.Printf("v is NOT nil pointer\n") } Try online Why There’s a reason for this perplexing behavior. nil is an abstract value. If you come from C/C++ or Java/C#, you might think that this is equivalent of NULL pointer or null reference. It isn’t. nil is a symbol that represents a zero value of pointers, channels, maps, slices. Logically interface{} combines type and value. You can think of it as a tuple (type, value). An uninitialized value of interface{} is a tuple without a type and value (no type, no value). In Go uninitialized value is zero value and since nil is an abstract value representing zero value for several types, it makes sense to use it for zero value of interface{}. So: zero value of interface{} is nil which is (no type, no value). When we assigned nilFoo to v, the value is (*Foo, nil). Are you surprised that (no type, no value) is not the same as (*Foo, nil)? To understand this gotcha, you have to understand two things. One: nil is an abstract value that only has a meaning in context. Consider this: var ch chan (bool) var m map[string]bool if ch == m { fmt.Printf("ch is equal to m\n") } Try online This snippet doesn’t even compile: Error:./prog.go:8:11: invalid operation: ch == m (mismatched types chan bool and map[string]bool). Both ch and m are nil but you can’t compare them because they are of different types. nil != nil because nil is an abstract concept, not an actual value. Two: nil value of interface{} is (no type, no value). Once you understand the above, you’ll understand why nil doesn’t compare to (type, nil) e.g. (*Foo, nil) or (map[string]bool, nil) or (int, 0) or (string, ""). Bad design or inevitable consequence of previous decisions? Many claim it’s a bad design. No-one describes what a better design would look like. Let’s play act a Go language designer. You’ve already designed concrete types, you came up with notion of zero value and created nil to denote zero value for pointers, channels, maps, slices. You’re now designing interface{} as a logical tuple of (type, value). The zero value is obviously (no type, no value). You have to figure how to represent the zero value. A different symbol for interface{} zero value Instead of using nil you could create a different symbol e.g. zeroInteface. You could then write: var v interface{} var v2 interface{} = &Foo{nil} var v3 interface{} = int(0) if v == zeroInteface { // this is true } if v2 == nil { // tihs is true } if v3 == nil { // is it true or not? } Is this a better design? I don’t think so. We don’t have zeroPointer, zeroMap, zeroChanel etc. so this breaks consistency. It sticks out like a sore zeroInterface. And v == nil is subtle. Not all values wrapped in an interface{} have zero value of nil. What should happen if you compare to (int, 0) given that 0 is zero value of int? Damn the consistency, let’s do what user expects You could ditch the strict logic of nil values and special case the if v == nil for interface{} to do what people superficially expect to happen. You then have to answer the question below: what happens when you do if (int, 0) == nil? The biggest issue is that you’ve lost ability to distinguish between (no type, no value) and (type, nil). They both compare to nil so how would you test for (no type, no value) but not (type, nil)? It doesn’t seem like a better design either. Your proposal Now that you understand the problem and seen two ideas for how to fix it, it’s your turn to design a better solution. I tried and the above 2 are the only ideas I had. We are boxed by existing notions of zero values and using nil to represent them. We could explore designs that re-think those assumptions but would that be Go anymore? It’s easy to complain that something is a bad design. It’s much harder, often impossible, to design something better.

10 hours ago 1 votes
Are we the baddies?

I signed up for Hinge. Holy shit with the boosts. How does someone who works on this wake up every morning and feel okay about themselves? Similarly with the tip screens, Uber algorithm, all the zero sum bullshit using all the tricks of psychology to extract a little bit more from every interaction in society. Nudge. Nudge. NUDGE. Want to partake in normal society like buying a coffee, going on a date, getting a ride, paying a friend. Oh, there’s a middle man now. An evil ominous middleman using state of the art AI algorithms to extract just a little bit more from you. But eventually the market will fix this, right? People will feel sick of being manipulated and move elsewhere? Ahhh, but they see that coming long before you do. They have dashboards. Quick Jeeves, tune the AI to make people feel less manipulated. Give them a little bit more for now, we have to think about maximizing lifetime customer value here. Oh the AI already did this on its own? Jeeves you’ve been replaced! People perpetually on the edge. You want to opt out of this all you say? Good luck running a competitive business! Every metric is now a target. You better maximize engagement or you will lose engagement this is a red queen’s race we can’t afford to lose! Burn all the social capital, burn all your values, FEED IT ALL TO MOLOCH! Someday, people will have to realize we live in a society. What will it take? A complete self cannibalization to the point you can’t eat your own mouth? It sure as hell isn’t going to be people opting out, that’s a collective action problem you can’t solve. Democracy, haha, you think the algorithms will let you vote to kill them? Your vote is as decoupled from action as the amount Uber pays the driver is decoupled from the fare that you pay. There’s no reform here, there’s only revolution. Will it simply be a huge financial collapse? Or do we need World War 3? And even World War 3 is on a spectrum. Will mass starvation fix this? Or will the attitude of thinking it’s okay to manipulate others at scale persist even past that? He’s got his, and I’ve got mine… If you open a government S&P 500 account for everyone with $1,000 at birth that’ll pay their social security cause it like…goes up…wait who’s creating this value again? It’s not okay. Advertising is not okay. Price discrimination is not okay. Using big data, machine learning, and psychology to manipulate others at scale is not okay. But you aren’t going to learn this lesson until you have fed a huge majority of your customers to Moloch. Modern capitialism is wireheading. Release the hypnodrones. How many cans of Pepsi did you want them to consume an hour again?

yesterday 5 votes
Get in losers, we're moving to Linux!

I've never seen so many developers curious about leaving the Mac and giving Linux a go. Something has really changed in the last few years. Maybe Linux just got better? Maybe powerful mini PCs made it easier? Maybe Apple just fumbled their relationship with developers one too many times? Maybe it's all of it. But whatever the reason, the vibe shift is noticeable. This is why the future is so hard to predict! People have been joking about "The Year of Linux on the Desktop" since the late 90s. Just like self-driving cars were supposed to be a thing back in 2017. And now, in the year of our Lord 2025, it seems like we're getting both! I also wouldn't underestimate the cultural influence of a few key people. PewDiePie sharing his journey into Arch and Hyprland with his 110 million followers is important. ThePrimeagen moving to Arch and Hyprland is important. Typecraft teaching beginners how to build an Arch and Hyprland setup from scratch is important (and who I just spoke to about Omarchy). Gabe Newell's Steam Deck being built on Arch and pushing Proton to over 20,000 compatible Linux games is important. You'll notice a trend here, which is that Arch Linux, a notoriously "difficult" distribution, is at the center of much of this new engagement. Despite the fact that it's been around since 2003! There's nothing new about Arch, but there's something new about the circles of people it's engaging. I've put Arch at the center of Omarchy too. Originally just because that was what Hyprland recommended. Then, after living with the wonders of 90,000+ packages on the community-driven AUR package repository, for its own sake. It's really good! But while Arch (and Hyprland) are having a moment amongst a new crowd, it's also "just" Linux at its core. And Linux really is the star of the show. The perfect, free, and open alternative that was just sitting around waiting for developers to finally have had enough of the commercial offerings from Apple and Microsoft. Now obviously there's a taste of "new vegan sees vegans everywhere" here. You start talking about Linux, and you'll hear from folks already in the community or those considering the move too. It's easy to confuse what you'd like to be true with what is actually true. And it's definitely true that Linux is still a niche operating system on the desktop. Even among developers. Apple and Microsoft sit on the lion's share of the market share. But the mind share? They've been losing that fast. The window is open for a major shift to happen. First gradually, then suddenly. It feels like morning in Linux land!

3 days ago 6 votes