log4j: between a rock and a hard place 2021-12-11

What does backwards compatibility mean to me?
Backwards compatibility should not have forced log4j to keep LDAP/JNDI URLs
The other side of compatibility: being cautious adding features

There is more than enough written on the mechanics of and mitigations for the recent severe RCE in log4j. On prevention, this is the most interesting widely-reshared insight I have seen:

This is making the rounds because highly-profitable companies are using infrastructure they do not pay for. That is a worthy topic, but not the most interesting thing in this particular case, because it would not clearly have contributed to preventing this bug. It is the second statement in this tweet that is worthy of attention: the maintainers of log4j would have loved to remove this bad feature long ago, but could not because of the backwards compatibility promises they are held to. I am often heard to say that I love backwards compatibility, and that it is underrated....

More from David Crawshaw

How I program with LLMs

How I program with LLMs 2025-01-06

This document is a summary of my personal experiences using generative models while programming over the past year. It has not been a passive process. I have intentionally sought ways to use LLMs while programming to learn about them. The result has been that I now regularly use LLMs while working and I consider their benefits net-positive on my productivity. (My attempts to go back to programming without them are unpleasant.) Along the way I have found oft-repeated steps that can be automated, and a few of us are working on building those into a tool specifically for Go programming: sketch.dev. It’s very early but so far the experience has been positive.

Background

I am typically curious about new technology. It took very little experimentation with LLMs for me to want to see if I could extract practical value. There is an allure to a technology that can (at least some of the time) craft sophisticated responses to challenging questions. It is even more exciting to watch a computer attempt to write a piece of a program as requested, and make solid progress. The only technological shift I have experienced that feels similar to me happened in 1995, when we first configured our LAN with a usable default route. We replaced the shared computer in the other room running Trumpet Winsock with a machine that could route a dialup connection, and all at once I had The Internet on tap. Having the internet all the time was astonishing, and felt like the future. It probably meant far more to me in that moment than to many who had been on the internet longer at universities, because I was immediately dropped into high internet technology: web browsers, JPEGs, and millions of people. Access to a powerful LLM feels like that.

So I followed this curiosity, to see if a tool that can generate something mostly not wrong most of the time could be a net benefit in my daily work. The answer appears to be yes: generative models are useful for me when I program. It has not been easy to get to this point. My underlying fascination with the new technology is the only way I have managed to figure it out, so I am sympathetic when other engineers claim LLMs are “useless.” But as I have been asked more than once how I can possibly use them effectively, this post is my attempt to describe what I have found so far.

Overview

There are three ways I use LLMs in my day-to-day programming:

Autocomplete. This makes me more productive by doing a lot of the more-obvious typing for me. It turns out the current state of the art can be improved on here, but that’s a conversation for another day. Even the standard products you can get off the shelf are better for me than nothing. I convinced myself of that by trying to give them up. I could not go a week without getting frustrated by how much mundane typing I had to do before having a FIM model. This is the place to experiment first.

Search. If I have a question about a complex environment, say “how do I make a button transparent in CSS,” I will get a far better answer asking any consumer-based LLM (o1, sonnet 3.5, etc.) than I do using an old fashioned web search engine and trying to parse the details out of whatever page I land on. (Sometimes the LLM is wrong. So are people. The other day I put my shoe on my head and asked my two year old what she thought of my hat. She dealt with it and gave me a proper scolding. I can deal with LLMs being wrong sometimes too.)

Chat-driven programming. This is the hardest of the three. This is where I get the most value out of LLMs, but also the one that bothers me the most. It involves learning a lot and adjusting how you program, and on principle I don’t like that. It requires at least as much messing about to get value out of LLM chat as it does to learn to use a slide rule, with the added annoyance that it is a non-deterministic service that is regularly changing its behavior and user interface. Indeed, the long-term goal in my work is to replace the need for chat-driven programming, to bring the power of these models to a developer in a way that is not so off-putting. But as of now I am dedicated to approaching the problem incrementally, which means figuring out how to do best with what we have and improve it.

As this is about the practice of programming, this has been a fundamentally qualitative process that is hard to write about with quantitative rigor. The closest I will get to data is to say: it appears from my records that for every two hours of programming I do now, I accept more than 10 autocomplete suggestions, use an LLM for a search-like task once, and program in a chat session once.

The rest of this is about extracting value from chat-driven programming.

Why use chat at all?

Let me try to motivate this for the skeptical. A lot of the value I personally get out of chat-driven programming is that I reach a point in the day when I know what needs to be written, I can describe it, but I don’t have the energy to create a new file, start typing, then start looking up the libraries I need. (I’m an early-morning person, so this is usually any time after 11am for me, though it can also be any time I context-switch into a different language/framework/etc.) LLMs perform that service for me in programming. They give me a first draft, with some good ideas, with several of the dependencies I need, and often some mistakes. Often, I find fixing those mistakes is a lot easier than starting from scratch.

This means chat-based programming may not be for you. I am doing a particular kind of programming, product development, which could be roughly described as trying to bring programs to a user through a robust interface. That means I am building a lot, throwing away a lot, and bouncing around between environments. Some days I mostly write typescript, some days mostly Go. I spent a week in a C++ codebase last month exploring an idea, and just had an opportunity to learn the HTTP server-side events format. I am all over the place, constantly forgetting and relearning. If you spend more time proving your optimization of a cryptographic algorithm is not vulnerable to timing attacks than you do writing the code, I don’t think any of my observations here are going to be useful to you.

Chat-based LLMs do best with exam-style questions

Give an LLM a specific objective and all the background material it needs so it can craft a well-contained code review packet, and expect it to adjust as you question it. There are two major elements to this:

Avoid creating a situation with so much complexity and ambiguity that the LLM gets confused and produces bad results. This is why I have had little success with chat inside my IDE. My workspace is often messy, the repository I am working on is by default too large, and it is filled with distractions. One thing humans appear to be much better than LLMs at (as of January 2025) is not getting distracted. That is why I still use an LLM via a web browser, because I want a blank slate on which to craft a well-contained request.

Ask for work that is easy to verify. Your job as a programmer using an LLM is to read the code it produces, think about it, and decide if the work is good. You can ask an LLM to do things you would never ask a human to do. “Rewrite all of your new tests introducing an <intermediate concept designed to make the tests easier to read>” is an appalling thing to ask a human; you’re going to have days of tense back-and-forth about whether the cost of the work is worth the benefit. An LLM will do it in 60 seconds and not make you fight to get it done. Take advantage of the fact that redoing work is extremely cheap.

The ideal task for an LLM is one where it needs to use a lot of common libraries (more than a human can remember, so it is doing a lot of small-scale research for you), working to an interface you designed or producing a small interface you can verify as sensible quickly, and where it can write readable tests. Sometimes this means choosing the library for it, if you want something obscure (though with open source code LLMs are quite good at this).

You always need to pass an LLM’s code through a compiler and run the tests before spending time reading it. They all produce code that doesn’t compile sometimes. (Always making errors I find surprisingly human; every time I see one I think, there but for the grace of God go I.) The better LLMs are very good at recovering from their mistakes; often all they need is for you to paste the compiler error or test failure into the chat and they fix the code.

Extra code structure is much cheaper

There are vague tradeoffs we make every day around the cost of writing, the cost of reading, and the cost of refactoring code. Let’s take Go package boundaries as an example. The standard library has a package “net/http” that contains some fundamental types for dealing with wire format encoding, MIME types, etc. It contains an HTTP client, and an HTTP server. Should it be one package, or several? Reasonable people can disagree! So much so, I do not know if there is a correct answer today. What we have works; after 15 years of use it is still not clear to me that some other package arrangement would work better.

Advantages of a larger package include: centralized documentation for callers, easier initial writing, easier refactoring, easier sharing of helper code without devising robust interfaces for them (which often involves pulling the fundamental types of a package out into yet another leaf package filled with types). The disadvantages include the package being harder to read because many different things are going on (try reading the net/http client implementation without tripping up and finding yourself in the server code for a few minutes), or it being harder to use because there is too much going on in it. For example, I have a codebase that uses a C library in some fundamental types, but parts of the codebase need to be in a binary widely distributed to many platforms that does not technically need the C library, so I have more packages than you might expect in the codebase isolating the use of the C library to avoid cgo in the multi-platform binary.

There are no right answers here; instead we are trading off different types of work that an engineer will have to do (upfront and ongoing). LLMs influence those tradeoffs:

As LLMs do better with exam-style questions, more and smaller packages make it easier to give a complete and yet isolated context for a piece of work. This is true for humans too, which is why we use packages at all, but we trade off package size against the extra typing/plumbing/filing to make more readable code. With an LLM both doing and benefiting from a big chunk of that extra work, the tradeoff shifts. (As a bonus, we humans get more readable code!)

Smaller and more numerous packages can be compiled and tested independently of unrelated code. This is helpful for the LLM dev cycle as it means changes can be compiled and tested before other dependent packages are refactored, and the package has a simpler test environment.

An example

Let me work an example to combine a few of the discussed ideas:

Write a reservoir sampler for the quartiles of floats.

First off, package structure. Were I doing this before LLMs, I would have chosen to have some sort of streamstat package that contained several algorithms, maybe one per file. This does not seem to be a unique opinion; here is an open source quantile package following that model. Now, I want just this one algorithm in its own package. Other variants or related algorithms can have their own package.

Next up, what do we get from an LLM. The first pass is not bad. That prompt, with some details about wanting it in Go, got me quartile_sampler.go:

// QuartileSampler maintains a reservoir sample to estimate quartiles of a stream
// of float64 values. It uses reservoir sampling to maintain a representative
// sample when processing large streams of data.
type QuartileSampler struct {
	reservoir []float64
	seen      int
	maxSize   int
	rng       *rand.Rand
}

// NewQuartileSampler creates a new sampler that maintains up to maxSize samples.
// A larger maxSize provides more accurate estimates but uses more memory.
func NewQuartileSampler(maxSize int) *QuartileSampler { … }

// Add adds a new value to the sampler.
func (qs *QuartileSampler) Add(value float64) { … }

// Quartiles returns the estimated 25th, 50th (median), and 75th percentiles
// of the data seen so far. Returns (0,0,0) if no data has been added.
func (qs *QuartileSampler) Quartiles() (q1, median, q3 float64) {
	if len(qs.reservoir) == 0 {
		return 0, 0, 0
	}
	// Make a copy to avoid modifying the reservoir
	sorted := make([]float64, len(qs.reservoir))
	copy(sorted, qs.reservoir)
	slices.Sort(sorted)

	n := len(sorted)
	q1 = percentile(sorted, 0.25)
	median = percentile(sorted, 0.50)
	q3 = percentile(sorted, 0.75)
	return q1, median, q3
}

The core interface is good too. Great! There are also tests.

An aside: this may be the place to stop. Sometimes I use LLM codegen as a form of specialized search. E.g. I’m curious about reservoir sampling, but want to see how the algorithm would be applied under some surprising constraint, for example time-windowed sampling. Instead of doing a literature search I might amend my prompt for an implementation that tracks freshness. (I could also ask it to include references to the literature in the comments, which I could manually check to see if it’s making things up or if there’s some solid research to work from.) Often I spend 60 seconds reading some generated code, see an obvious trick I hadn’t thought of, then throw it away and start over. Now I know the trick is possible. This is why it is so hard to attribute value generated by LLMs. Yes, sometimes it makes bad code, gets stuck in a rut, makes up something impossible (it hallucinated a part of the monaco API I wish existed the other day) and wastes my time. It can also save me hours by pointing out something relevant I don’t know.

Back to the code. Fascinatingly, the initial code produced didn’t compile. In the middle of the Quartiles implementation there was the line:

n := len(sorted)

Which is a fine line; sorted is a slice defined a few lines earlier. But the value is never used, so gopls (and the Go compiler if you run go build) immediately says:

declared and not used: n

This is a very easy fix. If I paste the error back into the LLM it will correct it. Though in this case, as I’m reading the code, it’s quite clear to me that I can just delete the line myself, so I do.

Now the tests. I got what I expected. In quartile_sampler_test.go:

func TestQuartileSampler(t *testing.T) {
	tests := []struct {
		name    string
		data    []float64
		wantQ1  float64
		wantMed float64
		wantQ3  float64
		epsilon float64 // allowed error for floating point comparison
	}{
		{
			name:    "basic sequence",
			data:    []float64{1, 2, 3, 4, 5, 6, 7},
			wantQ1:  2,
			wantMed: 4,
			wantQ3:  6,
			epsilon: 0.1,
		},
		…
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			qs := NewQuartileSampler(100) // Use large reservoir for accurate testing
			for _, v := range tt.data {
				qs.Add(v)
			}
			q1, med, q3 := qs.Quartiles()
			if math.Abs(q1-tt.wantQ1) > tt.epsilon {
				t.Errorf("Q1 = %v, want %v (±%v)", q1, tt.wantQ1, tt.epsilon)
			}
			if math.Abs(med-tt.wantMed) > tt.epsilon {
				t.Errorf("Median = %v, want %v (±%v)", med, tt.wantMed, tt.epsilon)
			}
			if math.Abs(q3-tt.wantQ3) > tt.epsilon {
				t.Errorf("Q3 = %v, want %v (±%v)", q3, tt.wantQ3, tt.epsilon)
			}
		})
	}
}

Exactly the sort of thing I would write! I would run some cases through another implementation to generate expected outputs and copy them into a test like this. But there are two issues with this. The first is the LLM did not run these numbers through another implementation. (To the best of my knowledge. When using a sophisticated LLM service, it is hard to say for sure what is happening behind the scenes.) It made them up, and LLMs have a reputation for being weak at arithmetic. So this sort of test, while reasonable for a human to write because we base it on the output of another tool, or if we are particularly old-school do some arithmetic ourselves, is not great from an LLM.

The second issue with this is we can do better. I am happy we now live in a time when programmers write their own tests, but we do not hold ourselves to the same standards with tests as we do with production code. That is a reasonable tradeoff; there are only so many hours in the day. But what LLMs lack in arithmetical prowess, they make up for in enthusiasm. Let’s ask for an even better test.

In the tests, implement the simplest, most readable version of the standard code for quartiles over a fixed set of known values in a slice. Then pass the test cases through the standard code and the reservoir sampler and confirm they are within an epsilon of each other. Structure the comparison code such that it can be used in a fuzz test too.

This got us some new test code:

// referenceQuartiles calculates the exact quartiles for a slice of float64 values
// using linear interpolation, matching the behavior expected from the sampler.
func referenceQuartiles(data []float64) (q1, median, q3 float64) { … }

// compareQuartiles checks if two sets of quartiles are within epsilon of each other.
// Returns true if they match within the tolerance, false otherwise.
func compareQuartiles(q1a, meda, q3a, q1b, medb, q3b, epsilon float64) bool { … }

// checkQuartiles is a test helper that compares sampler output against the reference
// implementation and reports any differences.
func checkQuartiles(t *testing.T, data []float64, epsilon float64) {
	t.Helper()

	// Get reference values
	wantQ1, wantMed, wantQ3 := referenceQuartiles(data)

	// Get sampler values using a large reservoir for accuracy
	qs := NewQuartileSampler(1000)
	for _, v := range data {
		qs.Add(v)
	}
	gotQ1, gotMed, gotQ3 := qs.Quartiles()

	if !compareQuartiles(gotQ1, gotMed, gotQ3, wantQ1, wantMed, wantQ3, epsilon) {
		t.Errorf("Quartiles mismatch:\ngot (q1=%v, med=%v, q3=%v)\nwant (q1=%v, med=%v, q3=%v)\nepsilon=%v",
			gotQ1, gotMed, gotQ3, wantQ1, wantMed, wantQ3, epsilon)
	}
}

The original test from above has been reworked to use checkQuartiles, and we have something new:

func FuzzQuartileSampler(f *testing.F) {
	// Add some seed corpus
	f.Add([]float64{1, 2, 3, 4, 5})

	f.Fuzz(func(t *testing.T, data []float64) {
		// Use a larger epsilon for fuzzing since we might get more extreme values
		checkQuartiles(t, data, 0.2)
	})
}

This is fun, because it’s wrong. My running gopls tool immediately says:

fuzzing arguments can only have the following types: string, bool, float32, float64, int, int8, int16, int32, int64, uint, uint8, uint16, uint32, uint64, []byte

Pasting that error back into the LLM gets it to regenerate the fuzz test such that it is built around a func(t *testing.T, data []byte) function that uses math.Float64frombits to extract floats from the data slice. Interactions like this point us towards automating the feedback from tools: all it needed was the obvious error message to make solid progress towards something useful. I was not needed.

Doing a quick survey of the last few weeks of my LLM chat history shows (which, as I mentioned earlier, is not a proper quantitative analysis by any measure) that more than 80% of the time there is a tooling error, the LLM can make useful progress without me adding any insight. About half the time it can completely resolve the issue without me saying anything of note; I am just acting as the messenger.

Where are we going?

Better tests, maybe even less DRY

There was a programming movement some 25 years ago focused around the principle “don’t repeat yourself.” As is so often the case with short snappy principles taught to undergrads, it got taken too far. There is a lot of cost associated with abstracting out a piece of code so it can be reused: it requires creating intermediate abstractions that must be learned, and it requires adding features to the factored-out code to make it maximally useful to the maximum number of people, which means we depend on libraries filled with useless distracting features.

The past 10-15 years have seen a far more tempered approach to writing code, with many programmers understanding it is better to reimplement a concept if the cost of sharing the implementation is higher than the cost of implementing and maintaining separate code. It is far less common for me to write on a code review “this isn’t worth it, separate the implementations.” (Which is fortunate, because people really don’t want to hear things like that after they have done all the work.) Programmers are getting better at tradeoffs.

What we have now is a world where the tradeoffs have shifted. It is now easier to write more comprehensive tests. You can have the LLM write the fuzz test implementation you want but didn’t have the hours to build properly. You can spend a lot more time writing tests to be readable, because the LLM is not sitting there constantly thinking “it would be better for the company if I went and picked another bug off the issue tracker than doing this.” So the tradeoff shifts in favor of having more specialized implementations.

The place where I expect this to be most visible is language-specific REST API wrappers. Every major company API comes with dozens of these usually low-quality wrappers, written by people who aren’t actually using their implementations for a specific goal, but instead are trying to capture every nook and cranny of an API in a large and complex interface. Even when it is done well, I have found it easier to go to the REST documentation (usually a set of curl commands), and implement a language wrapper for the 1% of the API I actually care about. It cuts down the amount of the API I need to learn upfront, and it cuts down how much future programmers (myself) reading the code need to understand.

For example, as part of my recent work on sketch.dev I implemented a Gemini API wrapper in Go. Even though the official wrapper in Go has been carefully handcrafted by people who know the language well and clearly care, there is a lot to read to understand it:

$ go doc -all genai | wc -l
1155

My simplistic initial wrapper was 200 lines of code total, one method, three types. Reading the entire implementation is 20% of the work of reading the documentation of the official package, and if you decide to try digging into its implementation you will discover that it is a wrapper around another largely code-generated implementation with protos and grpc and the works. All I want is to cURL and parse a JSON object.

There obviously comes a point in a project, where Gemini is the foundation of the entire app, where nearly every feature is used, where building on gRPC aligns well with the telemetry system elsewhere in your organization, where you should use the large official wrapper. But most of the time it is so much more time consuming, both upfront and ongoing, to do so given we almost always want only some wafer-thin sliver of whatever API we need to use today, that custom clients, largely written by a GPU, are far more effective for getting work done. So I foresee a world with far more specialized code, with fewer generalized packages, and more readable tests. Reusable code will continue to thrive around small robust interfaces and otherwise will be pulled apart into specialized code. Depending how well this is done, it will lead to either better software or worse software. I would expect both, with a long-term trend towards better software by the metrics that matter.

Automating these observations: sketch.dev

As a programmer my instinct is to make computers do work for me. It is a lot of work getting value out of LLMs; how can a computer do it? I believe the key to solving a problem is not to overgeneralize. Solve a particular problem and then expand slowly. So instead of building a general-purpose UI for chat programming that is just as good at COBOL as it is for Haskell, we want to focus on one particular environment. The bulk of my programming is in Go, and so what I want is easy to imagine for a Go programmer:

something like the Go playground, built around editing a package and tests with a chat interface onto editable code
a little UNIX env where we can run go get and go test
goimports integration
gopls integration
automatic model feedback: on model edit run go get, go build, go test, feedback missing packages, compiler errors, test failures to the model to try and get them fixed automatically

A few of us have built an early prototype of this: sketch.dev. The goal is not a “Web IDE” but rather to challenge the notion that chat-based programming even belongs in what is traditionally called an IDE. IDEs are collections of tools arranged for people. It is a delicate environment where I know what is going on. I do not want an LLM spewing its first draft all over my current branch. While an LLM is ultimately a developer tool, it is one that needs its own IDE to get the feedback it needs to operate effectively. Put another way: we didn’t embed goimports into sketch for it to be used by humans, but to get Go code closer to compiling using automatic signals, so that the compiler can provide better error feedback to the LLM driving it. It might be better to think of sketch.dev as a “Go IDE for LLMs”.

This is all very recent work with a lot left to do, e.g. git integration so we can load existing packages for editing and drop the results on a branch. Better test feedback. More console control. (If the answer is to run sed, run sed. Be you the human or the LLM.) We are still exploring, but are convinced that focusing an environment for a particular kind of programming will give us better results than the generalized tool.
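As a rough illustration of the fix described above (my own sketch, not the article's actual regenerated code), a fuzz test built around func(t *testing.T, data []byte) and math.Float64frombits might look like this, reusing the checkQuartiles helper shown earlier; the byte-decoding details and NaN filtering are my assumptions:

// Assumes: import ("encoding/binary"; "math"; "testing") and the
// checkQuartiles helper from the article's test file.
func FuzzQuartileSampler(f *testing.F) {
	// Seed corpus is raw bytes; every 8 bytes decode to one float64.
	f.Add([]byte{1, 2, 3, 4, 5, 6, 7, 8})

	f.Fuzz(func(t *testing.T, data []byte) {
		var floats []float64
		for len(data) >= 8 {
			bits := binary.LittleEndian.Uint64(data[:8])
			data = data[8:]
			v := math.Float64frombits(bits)
			// Skip NaN/Inf, which make quartile comparison meaningless.
			if math.IsNaN(v) || math.IsInf(v, 0) {
				continue
			}
			floats = append(floats, v)
		}
		if len(floats) == 0 {
			t.Skip("not enough usable data")
		}
		checkQuartiles(t, floats, 0.2)
	})
}

This satisfies the fuzzer's restriction on argument types while still exercising the same comparison logic against the reference implementation.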

jsonfile: a quick hack for tinkering

jsonfile: a quick hack for tinkering 2024-02-06

The year is 2024. I am on vacation and dream up a couple of toy programs I would like to build. It has been a few years since I built a standalone toy; I have been busy. So instead of actually building any of the toys I think of, I spend my time researching if anything has changed since the last time I did it. Should I pick up new tools or techniques?

It turns out lots of things have changed! There’s some great stuff out there, including decent quorum-write regional cloud databases now. Oh, and the ability to have a fascinating hour-long novel conversation with transistors. But things are still awkward for small fast tinkering.

Going back in time, I struggled constantly rewriting the database for the prototype for Tailscale, so I ended up writing my in-memory objects out as a JSON file. It went far further than I planned. Somewhere in the intervening years I convinced myself it must have been a bad idea even for toys, given all the pain migrating away from it caused. But now that I find myself in an empty text editor wanting to write a little web server, I am not so sure. The migration was painful, and a lot of that pain was borne by others (which is unfortunate, I find handing a mess to someone else deeply unpleasant). Much of that pain came from the brittle design of the caching layers on top (also my doing), which came from not moving to an SQL system soon enough.

I suspect, considering the process in retrospect, a great deal of that pain can be avoided by committing to migrating directly to an SQL system the moment you need an index. You can pay down a lot of exploratory design work in a prototype before you need an index; while n is small, full scans are fine. But you don’t make it very far into production before one of your values of n crosses something around a thousand and you long for an index.

With a clear exit strategy for avoiding big messes, that means the JSON file as database is still a valid technique for prototyping. And having spent a couple of days remembering what a misery it is to write a unit test for software that uses postgresql (mocks? docker?? for a database program I first ran on a computer with less power than my 2024 wrist watch?) and struggling figuring out how to make my cgo sqlite cross-compile to Windows, I’m firmly back to thinking a JSON file can be a perfectly adequate database for a 200-line toy.

Consider your requirements!

Before you jump into this and discover it won’t work, or just as bad, dismiss the small and unscaling as always a bad idea, consider the requirements of your software. Using a JSON file as a database means your software:

Doesn’t have a lot of data. Keep it to a few megabytes.
The data structure is boring enough not to require indexes. You don’t need something interesting like full-text search.
You do plenty of reads, but writes are infrequent. Ideally no more than one every few seconds.

Programming is the art of tradeoffs. You have to decide what matters and what does not. Some of those decisions need to be made early, usually with imperfect information. You may very well need a powerful SQL DBMS from the moment you start programming, depending on the kind of program you’re writing!

A reference implementation

An implementation of jsonfile (which Brad called JSONMutexDB, which is cooler because it has an x in it, but requires more typing) can fit in about 70 lines of Go. But there are a couple of lessons we ran into in the early days of Tailscale that can be paid down relatively easily, growing the implementation to 85 lines. (More with comments!) I think it’s worth describing the interesting things we ran into, both in code and here.

You can find the implementation of jsonfile here: https://github.com/crawshaw/jsonfile/blob/main/jsonfile.go. The interface is:

type JSONFile[Data any] struct { … }

func New[Data any](path string) (*JSONFile[Data], error)
func Load[Data any](path string) (*JSONFile[Data], error)
func (p *JSONFile[Data]) Read(fn func(data *Data))
func (p *JSONFile[Data]) Write(fn func(*Data) error) error

There is some experience behind this design. In no particular order:

Transactions. One of the early pain points in the transition was figuring out the equivalent of when to BEGIN, COMMIT, and ROLLBACK. The first version exposed the mutex directly (which was later converted into a RWMutex). There is no advantage to paying this transition cost later. It is easy to box up read/write transactions with a callback. This API does that, and provides a great point to include other safety mechanisms.

Database corruption through partial writes. There are two forms of this. The first is if the write fn fails half-way through, having edited the db object in some way. To avoid this, the implementation first creates an entirely new copy of the DB before applying the edit, so the entire change set can be thrown away on error. Yes, this is inefficient. No, it doesn’t matter. Inefficiency in this design is dominated by the I/O required to write the entire database on every edit. If you are concerned about the duplicate-on-write cost, you are not modeling I/O cost appropriately (which is important, because if I/O matters, switch to SQL).

The second is from a full disk. The easy way to write a file in Go is to call os.WriteFile, which the first implementation did. But that means:

Truncating the database file
Making multiple system calls to write(2)
Calling close(2)

A failure can occur in any of those system calls, resulting in a corrupt DB. So this implementation creates a new file, loads the DB into it, and when that has all succeeded, uses rename(2). It is not a panacea; our operating systems do not make all the promises we wish they would about rename. But it is much better than the default.

Memory aliasing. A nasty issue I have run into twice is aliasing memory. This involves doing something like:

list := []int{1, 2, 3}
db.Write(func() {
	db.List = list
})
list[0] = 10 // editing the database!

Backups. An intermediate version of this code kept the previous database file on write. But there’s an easier and even more robust strategy: never rename the file back to the original. Always create a new file, mydb.json.<timestamp>. On starting, load the most recent file. Then when your data is worth backing up (if ever), have a separate program prune down the number of files and send them somewhere robust.

Some changes you may want to consider

Constant memory. Not in this implementation, but something you may want to consider, is removing the risk of a Read function editing memory. You can do that with View* types generated by the viewer tool. It’s neat, but more than quadruples the complexity of JSONFileDB, complicates the build system, and initially isn’t very important in the sorts of programs I write. I have found several memory aliasing bugs in all the code I’ve written on top of a JSON file, but have yet to accidentally write when reading. Still, for large code bases Views are quite pleasant and well worth considering about the point when a project should move to a real SQL.

There is some room for performance improvements too (using cloner instead of unmarshalling a fresh copy of the data for writing), though I must point out again that needing more performance is a good sign it is time to move on to SQLite, or something bigger.

A final thought

It’s a tiny library. Copy and edit as needed. It is an all-new implementation so I will be fixing bugs as I find them. (As a bonus: this was my first time using a Go generic! 👴 It went fine. Parametric polymorphism is ok.)

Why go out of my way to devise an inadequate replacement for a database? Most projects fail before they start. They fail because the activation energy is too high. Our dreams are big and usually too much, as dreams should be. But software is not building a house or traveling the world. You can realize a dream with the tools you have on you now, in a few spare hours. This is the great joy of it, you are free from physical and economic constraint. If you start. Be willing to compromise almost everything to start.
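To make the interface above concrete, here is a minimal usage sketch of my own (not from the post); the Data schema is invented, and I am assuming the module path matches the repository above and that New creates the file while Load opens an existing one:

package main

import (
	"fmt"
	"log"

	"github.com/crawshaw/jsonfile" // assumed module path, from the repo linked above
)

// Data is the entire "database": one Go struct serialized as JSON.
type Data struct {
	Users map[string]string // hypothetical example schema
}

func main() {
	db, err := jsonfile.New[Data]("app.json")
	if err != nil {
		log.Fatal(err)
	}

	// Writes go through a callback, standing in for BEGIN/COMMIT/ROLLBACK.
	err = db.Write(func(d *Data) error {
		if d.Users == nil {
			d.Users = make(map[string]string)
		}
		d.Users["alice"] = "alice@example.com"
		return nil // returning an error discards the whole change set
	})
	if err != nil {
		log.Fatal(err)
	}

	// Reads also use a callback; avoid letting references escape it
	// (the memory aliasing trap described above).
	db.Read(func(d *Data) {
		fmt.Println(d.Users["alice"])
	})
}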

new year, same plan

new year, same plan 2022-12-31

Some months ago, the bill from GCE for hosting this blog jumped from nearly nothing to far too much for what it is, so I moved provider and needed to write a blog post to test it all. I could have figured out why my current provider hiked the price. Presumably I was Holding It Wrong and with just a few grip adjustments I could get the price dropped. But if someone mysteriously starts charging you more money, and there are other people who offer the same service, why would you stay?

This has not been a particularly easy year, for a variety of reasons. But here I am at the end of it, and beyond a few painful mistakes that in retrospect I did not have enough information to get right, I made mostly the same decisions I would again. There were a handful of truly wonderful moments. So the plan for 2023 is the same: keep the kids intact, and make programming more fun.

There is also the question of Twitter. It took me a few years to develop the skin to handle the generally unpleasant environment. (I can certainly see why almost no old Twitter employees used their product.) The experience recently has degraded; there are still plenty of funny tweets, but far fewer moments of interesting content. Here is a recent exception, but it is notable because it's the first time in weeks I learned anything from twitter: https://twitter.com/lrocket/status/1608883621980704768. I now find more new ideas hiding in HN comments than on Twitter.

Many people I know have sort-of moved to Mastodon, but it has a pretty horrible UX that is just enough work that I, on the whole, don't enjoy it much. And the fascinating insights don't seem to be there yet, but I'm still reading and waiting.

On the writing side, it might be a good idea to lower the standards (and length) of my blog posts to replace writing tweets. But maybe there isn't much value in me writing short notes anyway; are my contributions as fascinating as the ones I used to sift through Twitter to read? Not really. So maybe the answer is to give up the format entirely. That might be something new for 2023.

Here is something to think about for the new year: http://www.shoppbs.pbs.org/now/transcript/transcriptNOW140_full.html

DAVID BRANCACCIO: There's a little sweet moment, I've got to say, in a very intense book– your latest– in which you're heading out the door and your wife says what are you doing? I think you say– I'm getting– I'm going to buy an envelope.
KURT VONNEGUT: Yeah.
DAVID BRANCACCIO: What happens then?
KURT VONNEGUT: Oh, she says well, you're not a poor man. You know, why don't you go online and buy a hundred envelopes and put them in the closet? And so I pretend not to hear her. And go out to get an envelope because I'm going to have a hell of a good time in the process of buying one envelope. I meet a lot of people. And, see some great looking babes. And a fire engine goes by. And I give them the thumbs up. And, and ask a woman what kind of dog that is. And, and I don't know. The moral of the story is, is we're here on Earth to fart around. And, of course, the computers will do us out of that. And, what the computer people don't realize, or they don't care, is we're dancing animals. You know, we love to move around. And, we're not supposed to dance at all anymore.

Software I’m thankful for

Software I’m thankful for 2021-11-25

A few of the things that come to mind, this thanksgiving.

open/read/write/close

Most Unix-ish APIs, from files to sockets, are a bit of a mess today. Endless poorly documented sockopts, unexpected changes in write semantics across FSs and OSes, good luck trying to figure out mtimes. But despite the mess, I can generally wrap my head around open/read/write/close. I can strace a binary and figure out the sequence and decipher what’s going on. Sprinkle in some printfs and state is quickly debuggable. Stack traces are useful!

Enormous effort has been spent on many projects to replace this style of I/O programming, for efficiency or aesthetics, often with an asynchronous bent. I am thankful for this old reliable standby of synchronous open/read/write/close, and hope to see it revived and reinvented throughout my career to be cleaner and simpler.

goroutines

Goroutines are coroutines with compiler/runtime optimized yielding, to make them behave like threads. This breathes new life into the previous technology I’m thankful for: simple blocking I/O. With goroutines it becomes cheap to write large-scale blocking servers without running out of OS resources (like heavy threads, on OSes where they’re heavy, or FDs). It also makes it possible to use blocking interfaces between “threads” within a process without paying the ever-growing price of a context switch in the post-spectre world.

Tailscale

This is the first year where the team working on Tailscale has outgrown and eclipsed me to the point where I can be thankful for Tailscale without feeling like I’m thanking myself. Many of the wonderful new features that let me easily wire machines together wherever they are, like userspace networking or MagicDNS, are not my doing. I’m thankful for the product, and the opportunity to work with the best engineering team I’ve ever had the privilege of being part of.

SQLite

Much like open/read/write/close, SQLite is an island of stability in a constantly changing technical landscape. Techniques I learned 10 or 15 years ago using SQLite work today. As a bonus, it does so much more than then: WAL mode for highly-concurrent servers, advanced SQL like window functions, excellent ATTACH semantics. It has done all of this while keeping the number of, in the project's own language, “goofy design” decisions to a minimum and holding true to its mission of being “lite”. I aspire to write such wonderful software.

JSON

JSON is the worst form of encoding — except for all the others that have been tried. It’s complicated, but not too complicated. It’s not easily read by humans, but it can be read by humans. It is possible to extend it in intuitive ways. When it gets printed onto your terminal, you can figure out what’s going on without going and finding the magic decoder ring of the week. It makes some things that are extremely hard with XML or INI easy, without introducing accidental Turing completeness or turning country codes into booleans. Writing software is better for it, and shows the immense effect carefully describing something can do for programming. JSON was everywhere in our JavaScript before the term was defined; the definition let us see it and use it elsewhere.

WireGuard

WireGuard is a great demonstration of why the total complexity of the implementation ends up affecting the UX of the product. In theory I could have been making tunnels between my devices for years with IPSec or TLS; in practice I’d completely given it up until something came along that made it easier. It didn’t make it easier by putting a slick UI over complex technology, it made the underlying technology simpler, so even I could (eventually) figure out the configuration. Most importantly, by not eating my entire complexity budget with its own internals, I could suddenly see it as a building block in larger projects. Complexity makes more things possible, and fewer things possible, simultaneously. WireGuard is a beautiful example of simplicity and I’m thankful for it.

The speed of the Go compiler

Before Go became popular, the fast programming language compilers of the 90s had mostly fallen by the wayside, to be replaced with a bimodal world of interpreters/JITs on one side and creaky slow compilers attempting to produce extremely optimal code on the other. The main Go toolchain found, or rediscovered, a new optimal point in the plane of tradeoffs for programming languages to sit: ahead of time compiled, but with a fast less-than-optimal compiler. It has managed to continue to hold that interesting, unstable equilibrium for a decade now, which is incredibly impressive. (E.g. I personally would love to improve its inliner, but know that it’s nearly impossible to get too far into that project without sacrificing a lot of the compiler’s speed.)

GCC

I’ve always been cranky about GCC: I find its codebase nearly impossible to modify, it’s slow, the associated ducks I need to line up to make it useful (binutils, libc, etc) blow out the complexity budget on any project I try to start before I get far, and it is associated with GNU, which I used to view as an oddity and now view as a millstone around the neck of an otherwise excellent software project. But these are all the sorts of complaints you only make when using something truly invaluable. GCC is invaluable. I would never have learned to program if a free C compiler hadn’t been available in the 90s, so I owe it my career. To this day, it vies neck-and-neck with LLVM for best performing object code. Without the competition between them, compiler technology would stagnate. And while LLVM now benefits from $10s or $100s of millions a year in Silicon Valley salaries working on it, GCC does it all with far less investment. I’m thankful it keeps on going.

vim

I keep trying to quit vim. I keep ending up inside a terminal, inside vim, writing code. Like SQLite, vim is an island of stability over my career. While I wish IDEs were better, I am extremely thankful for tools that work and respect the effort I have taken to learn them, decade after decade.

ssh

SSH gets me from here to there, and has done since ~1999. There is a lot about ssh that needs reinventing, but I am thankful for stable, reliable tools. It takes a lot of work to keep something like ssh working and secure, and if the maintainers are ever looking for someone to buy them a round they know where to find me.

The public web and search engines

How would I get anything done without all the wonderful information on the public web and search engines to find it? What an amazing achievement. Thanks everyone, for making computers so great.


More in programming

We'll always need junior programmers

We received over 2,200 applications for our just-closed junior programmer opening, and now we're going through all of them by hand and by human. No AI screening here. It's a lot of work, but we have a great team who take the work seriously, so in a few weeks, we'll be able to invite a group of finalists to the next phase. This highlights the folly of thinking that what it'll take to land a job like this is some specific list of criteria, though. Yes, you have to present a baseline of relevant markers to even get into consideration, like a great cover letter that doesn't smell like AI slop, promising projects or work experience or educational background, etc. But to actually get the job, you have to be the best of the ones who've applied! It sounds self-evident, maybe, but I see questions time and again about it, so it must not be. Almost every job opening is grading applicants on the curve of everyone who has applied. And the best candidate of the lot gets the job. You can't quantify what that looks like in advance. I'm excited to see who makes it to the final stage. I already hear early whispers that we got some exceptional applicants in this round. It would be great to help counter the narrative that this industry no longer needs juniors. That's simply retarded. However good AI gets, we're always going to need people who know the ins and outs of what the machine comes up with. Maybe not as many, maybe not in the same roles, but it's truly utopian thinking that mankind won't need people capable of vetting the work done by AI in five minutes.

Requirements change until they don't

Recently I got a question on formal methods [1]: how does it help to mathematically model systems when the system requirements are constantly changing? It doesn't make sense to spend a lot of time proving a design works, and then deliver the product and find out it's not at all what the client needs. As the saying goes, the hard part is "building the right thing", not "building the thing right".

One possible response: "why write tests"? You shouldn't write tests, especially lots of unit tests ahead of time, if you might just throw them all away when the requirements change. This is a bad response because we all know the difference between writing tests and formal methods: testing is easy and FM is hard. Testing requires low cost for moderate correctness, FM requires high(ish) cost for high correctness. And when requirements are constantly changing, "high(ish) cost" isn't affordable and "high correctness" isn't worthwhile, because a kinda-okay solution that solves a customer's problem is infinitely better than a solid solution that doesn't.

But eventually you get something that solves the problem, and what then? Most of us don't work for Google; we can't axe features and products on a whim. If the client is happy with your solution, you are expected to support it. It should work when your customers run into new edge cases, or migrate all their computers to the next OS version, or expand into a market with shoddy internet. It should work when 10x as many customers are using 10x as many features. It should work when you add new features that come into conflict. And just as importantly, it should never stop solving their problem. Canonical example: your feature involves processing requested tasks synchronously. At scale, this doesn't work, so to improve latency you make it asynchronous. Now it's eventually consistent, but your customers were depending on it being always consistent. Now it no longer does what they need, and has stopped solving their problems.

Every successful requirement met spawns a new requirement: "keep this working". That requirement is permanent, or close enough to decide our long-term strategy. It takes active investment to keep a feature behaving the same as the world around it changes. (Is this all a pretentious way of saying "software maintenance is hard"? Maybe!)

Phase changes

In physics there's a concept of a phase transition. To raise the temperature of a gram of liquid water by 1°C, you have to add 4.184 joules of energy. [2] This continues until you raise it to 100°C, then it stops. After you've added two thousand joules to that gram, it suddenly turns into steam. The energy of the system changes continuously but the form, or phase, changes discretely.

Software isn't physics but the idea works as a metaphor. A certain architecture handles a certain level of load, and past that you need a new architecture. Or a bunch of similar features are independently hardcoded until the system becomes too messy to understand, and you remodel the internals into something unified and extendable. Etc etc etc. It doesn't have to be a totally discrete phase transition, but there's definitely a "before" and "after" in the system form. Phase changes tend to lead to more intricacy/complexity in the system, meaning it's likely that a phase change will introduce new bugs into existing behaviors.

Take the synchronous vs asynchronous case. A very simple toy model of synchronous updates would be Set(key, val), which updates data[key] to val. [3] A model of asynchronous updates would be AsyncSet(key, val, priority), which adds a (key, val, priority, server_time()) tuple to a tasks set; another process then asynchronously pulls a tuple (ordered by highest priority, then earliest time) and calls Set(key, val). Here are some properties the client may need preserved as a requirement:

If AsyncSet(key, val, _, _) is called, then eventually db[key] = val (possibly violated if higher-priority tasks keep coming in)
If someone calls AsyncSet(key1, val1, low) and then AsyncSet(key2, val2, low), they should see the first update and then the second (linearizability, possibly violated if the requests go to different servers with different clock times)
If someone calls AsyncSet(key, val, _) and immediately reads db[key] they should get val (obviously violated, though the client may accept a slightly weaker property)

If the new system doesn't satisfy an existing customer requirement, it's prudent to fix the bug before releasing the new system. The customer doesn't notice or care that your system underwent a phase change. They'll just see that one day your product solves their problems, and the next day it suddenly doesn't.

This is one of the most common applications of formal methods. Both of those systems, and every one of those properties, is formally specifiable in a specification language. We can then automatically check that the new system satisfies the existing properties, and from there do things like automatically generate test suites. This does take a lot of work, so if your requirements are constantly changing, FM may not be worth the investment. But eventually requirements stop changing, and then you're stuck with them forever. That's where models shine.

[1] As always, I'm using formal methods to mean the subdiscipline of formal specification of designs, leaving out the formal verification of code. Mostly because "formal specification" is really awkward to say.
[2] Also called a "calorie". The US "dietary Calorie" is actually a kilocalorie.
[3] This is all directly translatable to a TLA+ specification, I'm just describing it in English to avoid paying the syntax tax.
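The post points at a TLA+ specification for these models; purely as my own illustration (in Go rather than TLA+, with made-up names), the two toy models above might be sketched like this, which makes the third property's violation easy to see: reading data[key] after AsyncSet but before the apply step runs returns the old value.

package toymodel

// task mirrors the (key, val, priority, server_time()) tuple in the post.
type task struct {
	key      string
	val      string
	priority int
	time     int // stand-in for server_time(): a logical counter
}

type store struct {
	data  map[string]string
	tasks []task
	clock int
}

func newStore() *store { return &store{data: make(map[string]string)} }

// Set is the synchronous model: the update is visible immediately.
func (s *store) Set(key, val string) { s.data[key] = val }

// AsyncSet only records the intent; applyOne makes it visible later.
func (s *store) AsyncSet(key, val string, priority int) {
	s.clock++
	s.tasks = append(s.tasks, task{key, val, priority, s.clock})
}

// applyOne pulls the highest-priority (then earliest) pending task and applies it.
func (s *store) applyOne() {
	if len(s.tasks) == 0 {
		return
	}
	best := 0
	for i, t := range s.tasks {
		b := s.tasks[best]
		if t.priority > b.priority || (t.priority == b.priority && t.time < b.time) {
			best = i
		}
	}
	t := s.tasks[best]
	s.tasks = append(s.tasks[:best], s.tasks[best+1:]...)
	s.Set(t.key, t.val)
}

A specification checker works through exactly this kind of behavioral difference mechanically, which is the point the article makes about preserving the listed properties across the phase change.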

How should Stripe deprecate APIs? (~2016)

While Stripe is a widely admired company for things like its creation of the Sorbet typer project, I personally think that Stripe’s most interesting strategy work is also among its most subtle: its willingness to significantly prioritize API stability. This strategy is almost invisible externally. Internally, discussions around it were frequent and detailed, but mostly confined to dedicated API design conversations. API stability isn’t just a technical design quirk, it’s a foundational decision in an API-driven business, and I believe it is one of the unsung heroes of Stripe’s business success.

This is an exploratory, draft chapter for a book on engineering strategy that I’m brainstorming in #eng-strategy-book. As such, some of the links go to other draft chapters, both published drafts and very early, unpublished drafts.

Reading this document

To apply this strategy, start at the top with Policy. To understand the thinking behind this strategy, read sections in reverse order, starting with Explore. More detail on this structure in Making a readable Engineering Strategy document.

Policy & Operation

Our policies for managing API changes are:

Design for long API lifetime. APIs are not inherently durable. Instead we have to design thoughtfully to ensure they can support change. When designing a new API, build a test application that doesn’t use this API, then migrate to the new API. Consider how integrations might evolve as applications change. Perform these migrations yourself to understand potential friction with your API. Then think about the future changes that we might want to implement on our end. How would those changes impact the API, and how would they impact the application you’ve developed? At this point, take your API to API Review for initial approval as described below. Following that approval, identify a handful of early adopter companies who can place additional pressure on your API design, and test with them before releasing the final, stable API.

All new and modified APIs must be approved by API Review. API changes may not be enabled for customers prior to API Review approval. Change requests should be sent to the api-review email group. For examples of prior art, review the api-review archive for prior requests and the feedback they received. All requests must include a written proposal. Most requests will be approved asynchronously by a member of API Review. Complex or controversial proposals will require live discussions to ensure API Review members have sufficient context before making a decision.

We never deprecate APIs without an unavoidable requirement to do so. Even if it’s technically expensive to maintain support, we incur that support cost. To be explicit, we define API deprecation as any change that would require customers to modify an existing integration. If such a change were to be approved as an exception to this policy, it must first be approved by the API Review, followed by our CEO. One example where we granted an exception was the deprecation of TLS 1.2 support due to PCI compliance obligations.

When significant new functionality is required, we add a new API. For example, we created /v1/subscriptions to support those workflows rather than extending /v1/charges to add subscriptions support. With the benefit of hindsight, a good example of this policy in action was the introduction of the Payment Intents APIs to maintain compliance with Europe’s Strong Customer Authentication requirements. Even in that case the charge API continued to work as it did previously, albeit only for non-European Union payments.

We manage this policy’s implied technical debt via an API translation layer. We release changed APIs into versions, tracked in our API version changelog. However, we only maintain one implementation internally, which is the implementation of the latest version of the API. On top of that implementation, a series of version transformations are maintained, which allow us to support prior versions without maintaining them directly. While this approach doesn’t eliminate the overhead of supporting multiple API versions, it significantly reduces complexity by enabling us to maintain just a single, modern implementation internally. All API modifications must also update the version transformation layers to allow the new version to coexist peacefully with prior versions.

In the future, SDKs may allow us to soften this policy. While a significant number of our customers have direct integrations with our APIs, that number has dropped significantly over time. Instead, most new integrations are performed via one of our official API SDKs. We believe that in the future, it may be possible for us to make more backwards incompatible changes because we can absorb the complexity of migrations into the SDKs we provide. That is certainly not the case yet today.

Diagnosis

Our diagnosis of the impact of API changes and deprecation on our business is:

If you are a small startup composed of mostly engineers, integrating a new payments API seems easy. However, for a small business without dedicated engineers—or a larger enterprise involving numerous stakeholders—handling external API changes can be particularly challenging. Even if this is only marginally true, we’ve modeled the impact of minimizing API changes on long-term revenue growth, and it has a significant impact, unlocking our ability to benefit from other churn reduction work.

While we believe API instability directly creates churn, we also believe that API stability directly retains customers by increasing the migration overhead even if they wanted to change providers. Without an API change forcing them to change their integration, we believe that hypergrowth customers are particularly unlikely to change payments API providers absent a concrete motivation like an API change or a payment plan change.

We are aware of relatively few companies that provide long-term API stability in general, and particularly few for complex, dynamic areas like payments APIs. We can’t assume that companies that make API changes are ill-informed. Rather it appears that they experience a meaningful technical debt tradeoff between the API provider and API consumers, and aren’t willing to consistently absorb that technical debt internally.

Future compliance or security requirements—along the lines of our upgrade from TLS 1.2 to TLS 1.3 for PCI—may necessitate API changes. There may also be new tradeoffs exposed as we enter new markets with their own compliance regimes. However, we have limited ability to predict these changes at this point.
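As a rough sketch of the "single implementation plus version transformations" idea described above (my own illustration, not Stripe's actual code; field names and version labels are invented), one way to structure such a layer is to upgrade a pinned request through every newer transform, call the one modern implementation, then downgrade the response back:

package apiversion

type Request map[string]any
type Response map[string]any

// transform adapts traffic between one released version and the next newer one.
type transform struct {
	version           string // the older version this transform upgrades from
	upgradeRequest    func(Request) Request
	downgradeResponse func(Response) Response
}

// transforms are ordered oldest to newest; the real handler only ever
// sees the latest request shape.
var transforms = []transform{
	{
		version: "2015-01-01", // hypothetical version label
		upgradeRequest: func(r Request) Request {
			// e.g. an old flat field becomes a nested object in newer versions
			if v, ok := r["card"]; ok {
				r["payment_method"] = map[string]any{"card": v}
				delete(r, "card")
			}
			return r
		},
		downgradeResponse: func(r Response) Response {
			// reverse mapping so old integrations keep seeing the old shape
			if pm, ok := r["payment_method"].(map[string]any); ok {
				r["card"] = pm["card"]
				delete(r, "payment_method")
			}
			return r
		},
	},
	// one entry per released version…
}

// Handle runs a request pinned to an older version up through every newer
// transform, calls the single latest implementation, then walks back down.
func Handle(pinned string, req Request, latest func(Request) Response) Response {
	start := 0
	for i, t := range transforms {
		if t.version == pinned {
			start = i
			break
		}
	}
	for _, t := range transforms[start:] {
		req = t.upgradeRequest(req)
	}
	resp := latest(req)
	for i := len(transforms) - 1; i >= start; i-- {
		resp = transforms[i].downgradeResponse(resp)
	}
	return resp
}

The design choice this illustrates is the one named in the policy: every API change adds a transform pair rather than a second implementation, so the maintenance cost of old versions is bounded to the translation code.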

Bike Brooklyn! zine

I've been biking in Brooklyn for a few years now! It's hard for me to believe it, but I'm now one of the people other bicyclists ask questions to. I decided to make a zine that answers the most common of those questions: Bike Brooklyn! is a zine that touches on everything I wish I knew when I started biking in Brooklyn. A lot of this information can be found in other resources, but I wanted to collect it in one place. I hope to update this zine as we get significantly more safe bike infrastructure in Brooklyn and as laws change to make streets safer for bicyclists (and everyone), but it's still important to note that each release will reflect a specific snapshot in time of bicycling in Brooklyn. All text and illustrations in the zine are my own. Thank you to Matt Denys, Geoffrey Thomas, Alex Morano, Saskia Haegens, Vishnu Reddy, Ben Turndorf, Thomas Nayem-Huzij, and Ryan Christman for suggestions for content and help with proofreading. This zine is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, so you can copy and distribute this zine for noncommercial purposes in unadapted form as long as you give credit to me. Check out the Bike Brooklyn! zine on the web or download PDFs to read digitally or print here!

Announcing Hotwire Native 1.2

We’ve just launched Hotwire Native v1.2 and it’s the biggest update since the initial launch last year. The update has several key improvements, bug fixes, and more API consistency between platforms. And we’ve created all new iOS and Android demo apps to show it off!

A web-first framework for building native mobile apps

Improvements

There are a few significant changes in v1.2 that are worth specifically highlighting.

Route decision handlers

Hotwire Native apps route internal URLs to screens in your app, and route external URLs to the device’s browser. Historically, though, it wasn’t straightforward to customize the default behavior for unique app needs. In v1.2, we’ve introduced the RouteDecisionHandler concept to iOS (formerly only on Android). Route decision handlers offer a flexible way to decide how to route URLs in your app. Out of the box, Hotwire Native registers these route decision handlers to control how URLs are routed:

AppNavigationRouteDecisionHandler: Routes all internal URLs on your app’s domain through your app.
SafariViewControllerRouteDecisionHandler: (iOS only) Routes all external http/https URLs to an SFSafariViewController in your app.
BrowserTabRouteDecisionHandler: (Android only) Routes all external http/https URLs to a Custom Tab in your app.
SystemNavigationRouteDecisionHandler: Routes all remaining external URLs (such as sms: or mailto:) through the device’s system navigation.

If you’d like to customize this behavior you can register your own RouteDecisionHandler implementations in your app. See the documentation for details.

Server-driven historical location URLs

If you’re using Ruby on Rails, the turbo-rails gem provides the following historical location routes. You can use these to manipulate the navigation stack in Hotwire Native apps (see the controller sketch at the end of this post).

recede_or_redirect_to(url, **options) — Pops the visible screen off of the navigation stack.
refresh_or_redirect_to(url, **options) — Refreshes the visible screen on the navigation stack.
resume_or_redirect_to(url, **options) — Resumes the visible screen on the navigation stack with no further action.

In v1.2 there is now built-in support for handling these “command” URLs with no additional path configuration necessary. We’ve also made improvements so they handle dismissing modal screens automatically. See the documentation for details.

Bottom tabs

When starting with Hotwire Native, one of the most common questions developers ask is how to support native bottom tab navigation in their apps. We finally have an official answer! We’ve introduced a HotwireTabBarController for iOS and a HotwireBottomNavigationController for Android. And we’ve updated the demo apps for both platforms to show you exactly how to set them up.

New demo apps

To better show off all the features in Hotwire Native, we’ve created new demo apps for iOS and Android. And there’s a brand new Rails web app for the native apps to leverage.

Hotwire Native demo app

Clone the GitHub repos to build and run the demo apps to try them out:

iOS repo
Android repo
Rails app

Huge thanks to Joe Masilotti for all the demo app improvements. If you’re looking for more resources, Joe even wrote a Hotwire Native for Rails Developers book!

Release notes

v1.2 contains dozens of other improvements and bug fixes across both platforms. See the full release notes to learn about all the additional changes:

iOS release notes
Android release notes

Take a look

If you’ve been curious about using Hotwire Native for your mobile apps, now is a great time to take a look.
We have documentation and guides available on native.hotwired.dev and we’ve created really great demo apps for iOS and Android to help you get started.
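As a concrete illustration of the server-driven historical location helpers described above, here is a rough sketch of what calling recede_or_redirect_to from a Rails controller can look like. The controller, models, and routes are invented for the example; only the helper itself comes from the turbo-rails gem.

```ruby
# Hypothetical Rails controller using turbo-rails' navigation helpers.
# In a Hotwire Native app the helper responds with the special "recede"
# historical location URL; in a regular browser it behaves like redirect_to.
class CommentsController < ApplicationController
  before_action :set_post

  def create
    @post.comments.create!(comment_params)

    # Native: pops the (modal) comment form screen off the navigation stack.
    # Web: a plain redirect back to the post.
    recede_or_redirect_to post_path(@post)
  end

  private

  def set_post
    @post = Post.find(params[:post_id])
  end

  def comment_params
    params.require(:comment).permit(:body)
  end
end
```

refresh_or_redirect_to and resume_or_redirect_to are used the same way, and with v1.2’s built-in handling no extra path configuration entries are needed on the native side for these URLs.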
