More from Letters of Note
The following letter was written exactly 100 years ago. It’s a remarkable piece of writing that never fails to move me. And because you deserve it, above the transcript I’ve included audio of the letter being read by the lovely Juliet Stevenson, taken from the Letters of Note: Music audiobook. Born in Alabama in 1880, Helen Keller
On this day in 1876, shortly after making history, Alexander Graham Bell wrote the following letter to his father. To read his diary entry from that same day, visit Diaries of Note. And don’t forget to sign up for the Letters of Note newsletter. Born in Scotland in 1847, Alexander Graham Bell hailed from a
Dear all, On 1st January of this year, after what feels like centuries of preparation, I launched Diaries of Note, and though incredibly difficult to plan, the premise is simple: every single day, on the Diaries of Note website, I am featuring a single diary entry from history, published on the same day and month
Although they only existed for seven years and released just three albums, Nirvana were a band of immeasurable influence in the music world thanks in no small part to Smells Like Teen Spirit, a single track on Nevermind, their second album. It was this song that brought them out into the open, going on to sell millions
Born in Cambridge in 1952, Douglas Adams was best known for creating The Hitchhiker’s Guide to the Galaxy, a wildly successful project that began in 1978 as a science-fiction comedy radio series and eventually evolved to become something much larger, in many formats and in many languages, adored by many millions of people around the world.
More in programming
New Logic for Programmers Release v0.10 is now available! This is a minor release, mostly focused on logic-based refactoring, with new material on set types and testing refactors are correct. See the full release notes at the changelog page. Due to conference pressure v0.11 will also likely be a minor release. AI is a gamechanger for TLA+ users TLA+ is a specification language to model and debug distributed systems. While very powerful, it's also hard for programmers to learn, and there's always questions of connecting specifications with actual code. That's why The Coming AI Revolution in Distributed Systems caught my interest. In the post, Cheng Huang claims that Azure successfully used LLMs to examine an existing codebase, derive a TLA+ spec, and find a production bug in that spec. "After a decade of manually crafting TLA+ specifications", he wrote, "I must acknowledge that this AI-generated specification rivals human work". This inspired me to experiment with LLMs in TLA+ myself. My goals are a little less ambitious than Cheng's: I wanted to see how LLMs could help junior specifiers write TLA+, rather than handling the entire spec automatically. Details on what did and didn't work below, but my takeaway is that LLMs are an immense specification force multiplier. All tests were done with a standard VSCode Copilot subscription, writing Claude 3.7 in Agent mode. Other LLMs or IDEs may be more or less effective, etc. Things Claude was good at Fixing syntax errors TLA+ uses a very different syntax than mainstream programming languages, meaning beginners make a lot of mistakes where they do a "programming syntax" instead of TLA+ syntax: NotThree(x) = \* should be ==, not = x != 3 \* should be #, not != The problem is that the TLA+ syntax checker, SANY, is 30 years old and doesn't provide good information. Here's what it says for that snippet: Was expecting "==== or more Module body" Encountered "NotThree" at line 6, column 1 That only isolates one error and doesn't tell us what the problem is, only where it is. Experienced TLA+ users get "error eyes" and can quickly see what the problem is, but beginners really struggle with this. The TLA+ foundation has made LLM integration a priority, so the VSCode extension naturally supports several agents actions. One of these is running SANY, meaning an agent can get an error, fix it, get another error, fix it, etc. Provided the above sample and asked to make it work, Claude successfully fixed both errors. It also fixed many errors in a larger spec, as well as figure out why PlusCal specs weren't compiling to TLA+. This by itself is already enough to make LLMs a worthwhile tool, as it fixes one of the biggest barriers to entry. Understanding error traces When TLA+ finds a violated property, it outputs the sequence of steps that leads to the error. This starts in plaintext, and VSCode parses it into an interactive table: Learning to read these error traces is a skill in itself. You have to understand what's happening in each step and how it relates back to the actually broken property. It takes a long time for people to learn how to do this well. Claude was successful here, too, accurately reading 20+ step error traces and giving a high-level explanation of what went wrong. It also could condense error traces: if ten steps of the error trace could be condensed into a one-sentence summary (which can happen if you're modeling a lot of process internals) Claude would do it. I did have issues here with doing this in agent mode: while the extension does provide a "run model checker" command, the agent would regularly ignore this and prefer to run a terminal command instead. This would be fine except that the LLM consistently hallucinated invalid commands. I had to amend every prompt with "run the model checker via vscode, do not use a terminal command". You can skip this if you're willing to copy and paste the error trace into the prompt. As with syntax checking, if this was the only thing LLMs could effectively do, that would already be enough1 to earn a strong recommend. Even as a TLA+ expert I expect I'll be using this trick regularly. Boilerplate tasks TLA+ has a lot of boilerplate. One of the most notorious examples is UNCHANGED rules. Specifications are extremely precise — so precise that you have to specify what variables don't change in every step. This takes the form of an UNCHANGED clause at the end of relevant actions: RemoveObjectFromStore(srv, o, s) == /\ o \in stored[s] /\ stored' = [stored EXCEPT ![s] = @ \ {o}] /\ UNCHANGED <<capacity, log, objectsize, pc>> Writing this is really annoying. Updating these whenever you change an action, or add a new variable to the spec, is doubly so. Syntax checking and error analysis are important for beginners, but this is what I wanted for myself. I took a spec and prompted Claude Add UNCHANGED <> for each variable not changed in an action. And it worked! It successfully updated the UNCHANGED in every action. (Note, though, that it was a "well-behaved" spec in this regard: only one "action" happened at a time. In TLA+ you can have two actions happen simultaneously, that each update half of the variables, meaning neither of them should have an UNCHANGED clause. I haven't tested how Claude handles that!) That's the most obvious win, but Claude was good at handling other tedious work, too. Some examples include updating vars (the conventional collection of all state variables), lifting a hard-coded value into a model parameter, and changing data formats. Most impressive to me, though, was rewriting a spec designed for one process to instead handle multiple processes. This means taking all of the process variables, which originally have types like Int, converting them to types like [Process -> Int], and then updating the uses of all of those variables in the spec. It didn't account for race conditions in the new concurrent behavior, but it was an excellent scaffold to do more work. Writing properties from an informal description You have to be pretty precise with your intended property description but it handles converting that precise description into TLA+'s formalized syntax, which is something beginners often struggle with. Things it is less good at Generating model config files To model check TLA+, you need both a specification (.tla) and a model config file (.cfg), which have separate syntaxes. Asking the agent to generate the second often lead to it using TLA+ syntax. It automatically fixed this after getting parsing errors, though. Fixing specs Whenever the ran model checking and discovered a bug, it would naturally propose a change to either the invalid property or the spec. Sometimes the changes were good, other times the changes were not physically realizable. For example, if it found that a bug was due to a race condition between processes, it would often suggest fixing it by saying race conditions were okay. I mean yes, if you say bugs are okay, then the spec finds that bugs are okay! Or it would alternatively suggest adding a constraint to the spec saying that race conditions don't happen. But that's a huge mistake in specification, because race conditions happen if we don't have coordination. We need to specify the mechanism that is supposed to prevent them. Finding properties of the spec After seeing how capable it was at translating my properties to TLA+, I started prompting Claude to come up with properties on its own. Unfortunately, almost everything I got back was either trivial, uninteresting, or too coupled to implementation details. I haven't tested if it would work better to ask it for "properties that may be violated". Generating code from specs I have to be specific here: Claude could sometimes convert Python into a passable spec, an vice versa. It wasn't good at recognizing abstraction. For example, TLA+ specifications often represent sequential operations with a state variable, commonly called pc. If modeling code that nonatomically retrieves a counter value and increments it, we'd have one action that requires pc = "Get" and sets the new value to "Inc", then another that requires it be "Inc" and sets it to "Done". I found that Claude would try to somehow convert pc into part of the Python program's state, rather than recognize it as a TLA+ abstraction. On the other side, when converting python code to TLA+ it would often try to translate things like sleep into some part of the spec, not recognizing that it is abstractable into a distinct action. I didn't test other possible misconceptions, like converting randomness to nondeterminism. For the record, when converting TLA+ to Python Claude tended to make simulators of the spec, rather than possible production code implementing the spec. I really wasn't expecting otherwise though. Unexplored Applications Things I haven't explored thoroughly but could possibly be effective, based on what I know about TLA+ and AI: Writing Java Overrides Most TLA+ operators are resolved via TLA+ interpreters, but you can also implement them in "native" Java. This lets you escape the standard language semantics and add capabilities like executing programs during model-checking or dynamically constrain the depth of the searched state space. There's a lot of cool things I think would be possible with overrides. The problem is there's only a handful of people in the world who know how to write them. But that handful have written quite a few overrides and I think there's enough there for Claude to work with. Writing specs, given a reference mechanism In all my experiments, the LLM only had my prompts and the occasional Python script as information. That makes me suspect that some of its problems with writing and fixing specs come down to not having a system model. Maybe it wouldn't suggest fixes like "these processes never race" if it had a design doc saying that the processes can't coordinate. (Could a Sufficiently Powerful LLM derive some TLA+ specification from a design document?) Connecting specs and code This is the holy grail of TLA+: taking a codebase and showing it correctly implements a spec. Currently the best ways to do this are by either using TLA+ to generate a test suite, or by taking logged production traces and matching them to TLA+ behaviors. This blog post discusses both. While I've seen a lot of academic research into these approaches there are no industry-ready tools. So if you want trace validation you have to do a lot of manual labour tailored to your specific product. If LLMs could do some of this work for us then that'd really amplify the usefulness of TLA+ to many companies. Thoughts Right now, agents seem good at the tedious and routine parts of TLA+ and worse at the strategic and abstraction parts. But, since the routine parts are often a huge barrier to beginners, this means that LLMs have the potential to make TLA+ far, far more accessible than it previously was. I have mixed thoughts on this. As an advocate, this is incredible. I want more people using formal specifications because I believe it leads to cheaper, safer, more reliable software. Anything that gets people comfortable with specs is great for our industry. As a professional TLA+ consultant, I'm worried that this obsoletes me. Most of my income comes from training and coaching, which companies will have far less demand of now. Then again, maybe this an opportunity to pitch "agentic TLA+ training" to companies! Anyway, if you're interested in TLA+, there has never been a better time to try it. I mean it, these tools handle so much of the hard part now. I've got a free book available online, as does the inventor of TLA+. I like this guide too. Happy modeling! Dayenu. ↩
Today we go over the recent announcement blog post of react-router where they talk about their open governance model and what the future will look like
I quite enjoyed this talk. Some of the technical details went over my head (I don’t know what “split 16-bit mask into two 8-bit LTUs” means) but I could still follow the underlying point. First off, Andreas has a great story at the beginning about how he has a friend with a browser bookmarklet that replaces every occurrence of the word “dependency” with the word “liability”. Can you imagine npm working that way? Inside package.json: { "liabilities": { "react": "^19.0.0", "typescript": "^5.0.0" }, "devLiabilities": {...} } But I digress, back to Andreas. He points out that the context of your problems and the context of someone else’s problems do not overlap as often as we might think. It’s so unlikely that someone else tried to solve exactly our same problem with exactly our same constraints that [their solution or abstraction] will be the most economical or the best choice for us. It might be ok, but it won’t be the best thing. So while we immediately jump to tools built by others, the reality is that their tools were built for their problems and therefore won’t overlap with our problems as much or as often as we’re led to believe. In Andreas’ example, rather than using a third-party library to parse JSON and turn it into something, he writes his own bespoke parser for the problem at hand. His parser ignores a whole swath of abstractions a more generalized parser solves for, and guess what? His is an order of magnitude faster! Solving problems in the wrong domain and then glueing things together is always much, much worse [in terms of performance] than solving for what you actually need to solve. It’s fun watching him step through the performance gains as he goes from a generalized solution to one more tailored to his own specific context. What really resonates in his step-by-step process is how, as problems present themselves, you see how much easier it is to deal with performance issues for stuff you wrote vs. stuff others wrote. Not only that, but you can debug way faster! (Just think of the last time you tried to debug a file 1) you wrote, vs. 2) one you vendored vs. 3) one you installed deep down in node_modules somewhere.) Andreas goes from 41MB/s throughput to 1.25GB/s throughput without changing the behavior of the program. He merely removed a bunch of generalized abstractions he wasn’t using and didn’t need. Surprise, surprise: not doing unnecessary things is faster! You should always consider the unique context of your situation and weigh trade-offs. A “generic” solution means a solution “not tuned for your use case”. Email · Mastodon · Bluesky
<![CDATA[DandeGUI now does graphics and this is what it looks like. Some text and graphics output windows created with DandeGUI on Medley Interlisp. In addition to the square root table text output demo, I created the other graphics windows with the newly implemented functionality. For example, this code draws the random circles of the top window: (DEFUN RANDOM-CIRCLES (&KEY (N 200) (MAX-R 50) (WIDTH 640) (HEIGHT 480)) (LET ((RANGE-X (- WIDTH ( 2 MAX-R))) (RANGE-Y (- HEIGHT ( 2 MAX-R))) (SHADES (LIST IL:BLACKSHADE IL:GRAYSHADE (RANDOM 65536)))) (DANDEGUI:WITH-GRAPHICS-WINDOW (STREAM :TITLE "Random Circles") (DOTIMES (I N) (DECLARE (IGNORE I)) (IL:FILLCIRCLE (+ MAX-R (RANDOM RANGE-X)) (+ MAX-R (RANDOM RANGE-Y)) (RANDOM MAX-R) (ELT SHADES (RANDOM 3)) STREAM))))) GUI:WITH-GRAPHICS-WINDOW, GUI:OPEN-GRAPHICS-STREAM, and GUI:WITH-GRAPHICS-STREAM are the main additions. These functions and macros are the equivalent for graphics of what GUI:WITH-OUTPUT-TO-WINDOW, GUI:OPEN-WINDOW-STREAM, and GUI:WITH-WINDOW-STREAM, respectively, do for text. The difference is the text facilities send output to TEXTSTREAM streams whereas the graphics facilities to IMAGESTREAM, a type of device-independent graphics streams. Under the hood DandeGUI text windows are customized TEdit windows with an associated TEXTSTREAM. TEdit is the rich text editor of Medley Interlisp. Similarly, the graphics windows of DandeGUI run the Sketch line drawing editor under the hood. Sketch windows have an IMAGESTREAM which Interlisp graphics primitives like IL:DRAWLINE and IL:DRAWPOINT accept as an output destination. DandeGUI creates and manages Sketch windows with the type of stream the graphics primitives require. In other words, IMAGESTREAM is to Sketch what TEXTSTREAM is to TEdit. The benefits of programmatically using Sketch for graphics are the same as TEdit windows for text: automatic window repainting, scrolling, and resizing. The downside is overhead. Scrolling more than a few thousand graphics elements is slow and adding even more may crash the system. However, this is an acceptable tradeoff. The new graphics functions and macros work similarly to the text ones, with a few differences. First, DandeGUI now depends on the SKETCH and SKETCH-STREAM library modules which it automatically loads. Since Sketch has no notion of a read-only drawing area GUI:OPEN-GRAPHICS-STREAM achieves the same effect by other means: (DEFUN OPEN-GRAPHICS-STREAM (&KEY (TITLE "Untitled")) "Open a new window and return the associated IMAGESTREAM to send graphics output to. Sets the window title to TITLE if supplied." (LET ((STREAM (IL:OPENIMAGESTREAM '|Untitled| 'IL:SKETCH '(IL:FONTS ,DEFAULT-FONT*))) (WINDOW (IL:\\SKSTRM.WINDOW.FROM.STREAM STREAM))) (IL:WINDOWPROP WINDOW 'IL:TITLE TITLE) ;; Disable left and middle-click title bar menu (IL:WINDOWPROP WINDOW 'IL:BUTTONEVENTFN NIL) ;; Disable sketch editing via right-click actions (IL:WINDOWPROP WINDOW 'IL:RIGHTBUTTONFN NIL) ;; Disable querying the user whether to save changes (IL:WINDOWPROP WINDOW 'IL:DONTQUERYCHANGES T) STREAM)) Only the mouse gestures and commands of the middle-click title bar menu and the right-click menu change the drawing area interactively. To disable these actions GUI:OPEN-GRAPHICS-STREAM removes their menu handlers by setting to NIL the window properties IL:BUTTONEVENTFN and IL:RIGHTBUTTONFN. This way only programmatic output can change the drawing area. The function also sets IL:DONTQUERYCHANGES to T to prevent querying whether to save the changes at window close. By design output to DandeGUI windows is not permanent, so saving isn't necessary. GUI:WITH-GRAPHICS-STREAM and GUI:WITH-GRAPHICS-WINDOW are straightforward: (DEFMACRO WITH-GRAPHICS-STREAM ((VAR STREAM) &BODY BODY) "Perform the operations in BODY with VAR bound to the graphics window STREAM. Evaluates the forms in BODY in a context in which VAR is bound to STREAM which must already exist, then returns the value of the last form of BODY." `(LET ((,VAR ,STREAM)) ,@BODY)) (DEFMACRO WITH-GRAPHICS-WINDOW ((VAR &KEY TITLE) &BODY BODY) "Perform the operations in BODY with VAR bound to a new graphics window stream. Creates a new window titled TITLE if supplied, binds VAR to the IMAGESTREAM associated with the window, and executes BODY in this context. Returns the value of the last form of BODY." `(WITH-GRAPHICS-STREAM (,VAR (OPEN-GRAPHICS-STREAM :TITLE (OR ,TITLE "Untitled"))) ,@BODY)) Unlike GUI:WITH-TEXT-STREAM and GUI:WITH-TEXT-WINDOW, which need to call GUI::WITH-WRITE-ENABLED to establish a read-only environment after every output operation, GUI:OPEN-GRAPHICS-STREAM can do this only once at window creation. GUI:CLEAR-WINDOW, GUI:WINDOW-TITLE, and GUI:PRINT-MESSAGE now work with graphics streams in addition to text streams. For IMAGESTREAM arguments GUI:PRINT-MESSAGE prints to the system prompt window as Sketch stream windows have no prompt area. The random circles and fractal triangles graphics demos round up the latest additions. #DandeGUI #CommonLisp #Interlisp #Lisp a href="https://remark.as/p/journal.paoloamoroso.com/adding-graphics-support-to-dandegui"Discuss.../a Email | Reply @amoroso@oldbytes.space !--emailsub--]]>
By the end of 1995, the web moved outward and into the hands of everyone. The post Exploring the web in 1995 appeared first on The History of the Web.