Full Width [alt+shift+f] Shortcuts [alt+shift+k]
Sign Up [alt+shift+s] Log In [alt+shift+l]
2
New Logic for Programmers Release! v0.11 is now available! This is over 20% longer than v0.10, with a new chapter on code proofs, three chapter overhauls, and more! Full release notes here. Software books I wish I could read I'm writing Logic for Programmers because it's a book I wanted to have ten years ago. I had to learn everything in it the hard way, which is why I'm ensuring that everybody else can learn it the easy way. Books occupy a sort of weird niche in software. We're great at sharing information via blogs and git repos and entire websites. These have many benefits over books: they're free, they're easily accessible, they can be updated quickly, they can even be interactive. But no blog post has influenced me as profoundly as Data and Reality or Making Software. There is no blog or talk about debugging as good as the Debugging book. It might not be anything deeper than "people spend more time per word on writing books than blog posts". I dunno. So here are some other...
an hour ago

Improve your reading experience

Logged in users get linked directly to articles resulting in a better reading experience. Please login for free, it takes less than 1 minute.

More from Computer Things

2000 words about arrays and tables

I'm way too discombobulated from getting next month's release of Logic for Programmers ready, so I'm pulling a idea from the slush pile. Basically I wanted to come up with a mental model of arrays as a concept that explained APL-style multidimensional arrays and tables but also why there weren't multitables. So, arrays. In all languages they are basically the same: they map a sequence of numbers (I'll use 1..N)1 to homogeneous values (values of a single type). This is in contrast to the other two foundational types, associative arrays (which map an arbitrary type to homogeneous values) and structs (which map a fixed set of keys to heterogeneous values). Arrays appear in PLs earlier than the other two, possibly because they have the simplest implementation and the most obvious application to scientific computing. The OG FORTRAN had arrays. I'm interested in two structural extensions to arrays. The first, found in languages like nushell and frameworks like Pandas, is the table. Tables have string keys like a struct and indexes like an array. Each row is a struct, so you can get "all values in this column" or "all values for this row". They're heavily used in databases and data science. The other extension is the N-dimensional array, mostly seen in APLs like Dyalog and J. Think of this like arrays-of-arrays(-of-arrays), except all arrays at the same depth have the same length. So [[1,2,3],[4]] is not a 2D array, but [[1,2,3],[4,5,6]] is. This means that N-arrays can be queried on any axis. ]x =: i. 3 3 0 1 2 3 4 5 6 7 8 0 { x NB. first row 0 1 2 0 {"1 x NB. first column 0 3 6 So, I've had some ideas on a conceptual model of arrays that explains all of these variations and possibly predicts new variations. I wrote up my notes and did the bare minimum of editing and polishing. Somehow it ended up being 2000 words. 1-dimensional arrays A one-dimensional array is a function over 1..N for some N. To be clear this is math functions, not programming functions. Programming functions take values of a type and perform computations on them. Math functions take values of a fixed set and return values of another set. So the array [a, b, c, d] can be represented by the function (1 -> a ++ 2 -> b ++ 3 -> c ++ 4 -> d). Let's write the set of all four element character arrays as 1..4 -> char. 1..4 is the function's domain. The set of all character arrays is the empty array + the functions with domain 1..1 + the functions with domain 1..2 + ... Let's call this set Array[Char]. Our compilers can enforce that a type belongs to Array[Char], but some operations care about the more specific type, like matrix multiplication. This is either checked with the runtime type or, in exotic enough languages, with static dependent types. (This is actually how TLA+ does things: the basic collection types are functions and sets, and a function with domain 1..N is a sequence.) 2-dimensional arrays Now take the 3x4 matrix i. 3 4 0 1 2 3 4 5 6 7 8 9 10 11 There are two equally valid ways to represent the array function: A function that takes a row and a column and returns the value at that index, so it would look like f(r: 1..3, c: 1..4) -> Int. A function that takes a row and returns that column as an array, aka another function: f(r: 1..3) -> g(c: 1..4) -> Int.2 Man, (2) looks a lot like currying! In Haskell, functions can only have one parameter. If you write (+) 6 10, (+) 6 first returns a new function f y = y + 6, and then applies f 10 to get 16. So (+) has the type signature Int -> Int -> Int: it's a function that takes an Int and returns a function of type Int -> Int.3 Similarly, our 2D array can be represented as an array function that returns array functions: it has type 1..3 -> 1..4 -> Int, meaning it takes a row index and returns 1..4 -> Int, aka a single array. (This differs from conventional array-of-arrays because it forces all of the subarrays to have the same domain, aka the same length. If we wanted to permit ragged arrays, we would instead have the type 1..3 -> Array[Int].) Why is this useful? A couple of reasons. First of all, we can apply function transformations to arrays, like "combinators". For example, we can flip any function of type a -> b -> c into a function of type b -> a -> c. So given a function that takes rows and returns columns, we can produce one that takes columns and returns rows. That's just a matrix transposition! Second, we can extend this to any number of dimensions: a three-dimensional array is one with type 1..M -> 1..N -> 1..O -> V. We can still use function transformations to rearrange the array along any ordering of axes. Speaking of dimensions: What are dimensions, anyway Okay, so now imagine we have a Row × Col grid of pixels, where each pixel is a struct of type Pixel(R: int, G: int, B: int). So the array is Row -> Col -> Pixel But we can also represent the Pixel struct with a function: Pixel(R: 0, G: 0, B: 255) is the function where f(R) = 0, f(G) = 0, f(B) = 255, making it a function of type {R, G, B} -> Int. So the array is actually the function Row -> Col -> {R, G, B} -> Int And then we can rearrange the parameters of the function like this: {R, G, B} -> Row -> Col -> Int Even though the set {R, G, B} is not of form 1..N, this clearly has a real meaning: f[R] is the function mapping each coordinate to that coordinate's red value. What about Row -> {R, G, B} -> Col -> Int? That's for each row, the 3 × Col array mapping each color to that row's intensities. Really any finite set can be a "dimension". Recording the monitor over a span of time? Frame -> Row -> Col -> Color -> Int. Recording a bunch of computers over some time? Computer -> Frame -> Row …. This is pretty common in constraint satisfaction! Like if you're conference trying to assign talks to talk slots, your array might be type (Day, Time, Room) -> Talk, where Day/Time/Room are enumerations. An implementation constraint is that most programming languages only allow integer indexes, so we have to replace Rooms and Colors with numerical enumerations over the set. As long as the set is finite, this is always possible, and for struct-functions, we can always choose the indexing on the lexicographic ordering of the keys. But we lose type safety. Why tables are different One more example: Day -> Hour -> Airport(name: str, flights: int, revenue: USD). Can we turn the struct into a dimension like before? In this case, no. We were able to make Color an axis because we could turn Pixel into a Color -> Int function, and we could only do that because all of the fields of the struct had the same type. This time, the fields are different types. So we can't convert {name, flights, revenue} into an axis. 4 One thing we can do is convert it to three separate functions: airport: Day -> Hour -> Str flights: Day -> Hour -> Int revenue: Day -> Hour -> USD But we want to keep all of the data in one place. That's where tables come in: an array-of-structs is isomorphic to a struct-of-arrays: AirportColumns( airport: Day -> Hour -> Str, flights: Day -> Hour -> Int, revenue: Day -> Hour -> USD, ) The table is a sort of both representations simultaneously. If this was a pandas dataframe, df["airport"] would get the airport column, while df.loc[day1] would get the first day's data. I don't think many table implementations support more than one axis dimension but there's no reason they couldn't. These are also possible transforms: Hour -> NamesAreHard( airport: Day -> Str, flights: Day -> Int, revenue: Day -> USD, ) Day -> Whatever( airport: Hour -> Str, flights: Hour -> Int, revenue: Hour -> USD, ) In my mental model, the heterogeneous struct acts as a "block" in the array. We can't remove it, we can only push an index into the fields or pull a shared column out. But there's no way to convert a heterogeneous table into an array. Actually there is a terrible way Most languages have unions or product types that let us say "this is a string OR integer". So we can make our airport data Day -> Hour -> AirportKey -> Int | Str | USD. Heck, might as well just say it's Day -> Hour -> AirportKey -> Any. But would anybody really be mad enough to use that in practice? Oh wait J does exactly that. J has an opaque datatype called a "box". A "table" is a function Dim1 -> Dim2 -> Box. You can see some examples of what that looks like here Misc Thoughts and Questions The heterogeneity barrier seems like it explains why we don't see multiple axes of table columns, while we do see multiple axes of array dimensions. But is that actually why? Is there a system out there that does have multiple columnar axes? The array x = [[a, b, a], [b, b, b]] has type 1..2 -> 1..3 -> {a, b}. Can we rearrange it to 1..2 -> {a, b} -> 1..3? No. But we can rearrange it to 1..2 -> {a, b} -> PowerSet(1..3), which maps rows and characters to columns with that character. [(a -> {1, 3} ++ b -> {2}), (a -> {} ++ b -> {1, 2, 3}]. We can also transform Row -> PowerSet(Col) into Row -> Col -> Bool, aka a boolean matrix. This makes sense to me as both forms are means of representing directed graphs. Are other function combinators useful for thinking about arrays? Does this model cover pivot tables? Can we extend it to relational data with multiple tables? Systems Distributed Talk (will be) Online The premier will be August 6 at 12 CST, here! I'll be there to answer questions / mock my own performance / generally make a fool of myself. Sacrilege! But it turns out in this context, it's easier to use 1-indexing than 0-indexing. In the years since I wrote that article I've settled on "each indexing choice matches different kinds of mathematical work", so mathematicians and computer scientists are best served by being able to choose their index. But software engineers need consistency, and 0-indexing is overall a net better consistency pick. ↩ This is right-associative: a -> b -> c means a -> (b -> c), not (a -> b) -> c. (1..3 -> 1..4) -> Int would be the associative array that maps length-3 arrays to integers. ↩ Technically it has type Num a => a -> a -> a, since (+) works on floats too. ↩ Notice that if each Airport had a unique name, we could pull it out into AirportName -> Airport(flights, revenue), but we still are stuck with two different values. ↩

a week ago 12 votes
Programming Language Escape Hatches

The excellent-but-defunct blog Programming in the 21st Century defines "puzzle languages" as languages were part of the appeal is in figuring out how to express a program idiomatically, like a puzzle. As examples, he lists Haskell, Erlang, and J. All puzzle languages, the author says, have an "escape" out of the puzzle model that is pragmatic but stigmatized. But many mainstream languages have escape hatches, too. Languages have a lot of properties. One of these properties is the language's capabilities, roughly the set of things you can do in the language. Capability is desirable but comes into conflicts with a lot of other desirable properties, like simplicity or efficiency. In particular, reducing the capability of a language means that all remaining programs share more in common, meaning there's more assumptions the compiler and programmer can make ("tractability"). Assumptions are generally used to reason about correctness, but can also be about things like optimization: J's assumption that everything is an array leads to high-performance "special combinations". Rust is the most famous example of mainstream language that trades capability for tractability.1 Rust has a lot of rules designed to prevent common memory errors, like keeping a reference to deallocated memory or modifying memory while something else is reading it. As a consequence, there's a lot of things that cannot be done in (safe) Rust, like interface with an external C function (as it doesn't have these guarantees). To do this, you need to use unsafe Rust, which lets you do additional things forbidden by safe Rust, such as deference a raw pointer. Everybody tells you not to use unsafe unless you absolutely 100% know what you're doing, and possibly not even then. Sounds like an escape hatch to me! To extrapolate, an escape hatch is a feature (either in the language itself or a particular implementation) that deliberately breaks core assumptions about the language in order to add capabilities. This explains both Rust and most of the so-called "puzzle languages": they need escape hatches because they have very strong conceptual models of the language which leads to lots of assumptions about programs. But plenty of "kitchen sink" mainstream languages have escape hatches, too: Some compilers let C++ code embed inline assembly. Languages built on .NET or the JVM has some sort of interop with C# or Java, and many of those languages make assumptions about programs that C#/Java do not. The SQL language has stored procedures as an escape hatch and vendors create a second escape hatch of user-defined functions. Ruby lets you bypass any form of encapsulation with send. Frameworks have escape hatches, too! React has an entire page on them. (Does eval in interpreted languages count as an escape hatch? It feels different, but it does add a lot of capability. Maybe they don't "break assumptions" in the same way?) The problem with escape hatches In all languages with escape hatches, the rule is "use this as carefully and sparingly as possible", to the point where a messy solution without an escape hatch is preferable to a clean solution with one. Breaking a core assumption is a big deal! If the language is operating as if its still true, it's going to do incorrect things. I recently had this problem in a TLA+ contract. TLA+ is a language for modeling complicated systems, and assumes that the model is a self-contained universe. The client wanted to use the TLA+ to test a real system. The model checker should send commands to a test device and check the next states were the same. This is straightforward to set up with the IOExec escape hatch.2 But the model checker assumed that state exploration was pure and it could skip around the state randomly, meaning it would do things like set x = 10, then skip to set x = 1, then skip back to inc x; assert x == 11. Oops! We eventually found workarounds but it took a lot of clever tricks to pull off. I'll probably write up the technique when I'm less busy with The Book. The other problem with escape hatches is the rest of the language is designed around not having said capabilities, meaning it can't support the feature as well as a language designed for them from the start. Even if your escape hatch code is clean, it might not cleanly integrate with the rest of your code. This is why people complain about unsafe Rust so often. It should be noted though that all languages with automatic memory management are trading capability for tractability, too. If you can't deference pointers, you can't deference null pointers. ↩ From the Community Modules (which come default with the VSCode extension). ↩

a week ago 16 votes
Maybe writing speed actually is a bottleneck for programming

I'm a big (neo)vim buff. My config is over 1500 lines and I regularly write new scripts. I recently ported my neovim config to a new laptop. Before then, I was using VSCode to write, and when I switched back I immediately saw a big gain in productivity. People often pooh-pooh vim (and other assistive writing technologies) by saying that writing code isn't the bottleneck in software development. Reading, understanding, and thinking through code is! Now I don't know how true this actually is in practice, because empirical studies of time spent coding are all over the place. Most of them, like this study, track time spent in the editor but don't distinguish between time spent reading code and time spent writing code. The only one I found that separates them was this study. It finds that developers spend only 5% of their time editing. It also finds they spend 14% of their time moving or resizing editor windows, so I don't know how clean their data is. But I have a bigger problem with "writing is not the bottleneck": when I think of a bottleneck, I imagine that no amount of improvement will lead to productivity gains. Like if a program is bottlenecked on the network, it isn't going to get noticeably faster with 100x more ram or compute. But being able to type code 100x faster, even with without corresponding improvements to reading and imagining code, would be huge. We'll assume the average developer writes at 80 words per minute, at five characters a word, for 400 characters a minute.What could we do if we instead wrote at 8,000 words/40k characters a minute? Writing fast Boilerplate is trivial Why do people like type inference? Because writing all of the types manually is annoying. Why don't people like boilerplate? Because it's annoying to write every damn time. Programmers like features that help them write less! That's not a problem if you can write all of the boilerplate in 0.1 seconds. You still have the problem of reading boilerplate heavy code, but you can use the remaining 0.9 seconds to churn out an extension that parses the file and presents the boilerplate in a more legible fashion. We can write more tooling This is something I've noticed with LLMs: when I can churn out crappy code as a free action, I use that to write lots of tools that assist me in writing good code. Even if I'm bottlenecked on a large program, I can still quickly write a script that helps me with something. Most of these aren't things I would have written because they'd take too long to write! Again, not the best comparison, because LLMs also shortcut learning the relevant APIs, so also optimize the "understanding code" part. Then again, if I could type real fast I could more quickly whip up experiments on new apis to learn them faster. We can do practices that slow us down in the short-term Something like test-driven development significantly slows down how fast you write production code, because you have to spend a lot more time writing test code. Pair programming trades speed of writing code for speed of understanding code. A two-order-of-magnitude writing speedup makes both of them effectively free. Or, if you're not an eXtreme Programming fan, you can more easily follow the The Power of Ten Rules and blanket your code with contracts and assertions. We could do more speculative editing This is probably the biggest difference in how we'd work if we could write 100x faster: it'd be much easier to try changes to the code to see if they're good ideas in the first place. How often have I tried optimizing something, only to find out it didn't make a difference? How often have I done a refactoring only to end up with lower-quality code overall? Too often. Over time it makes me prefer to try things that I know will work, and only "speculatively edit" when I think it be a fast change. If I could code 100x faster it would absolutely lead to me trying more speculative edits. This is especially big because I believe that lots of speculative edits are high-risk, high-reward: given 50 things we could do to the code, 49 won't make a difference and one will be a major improvement. If I only have time to try five things, I have a 10% chance of hitting the jackpot. If I can try 500 things I will get that reward every single time. Processes are built off constraints There are just a few ideas I came up with; there are probably others. Most of them, I suspect, will share the same property in common: they change the process of writing code to leverage the speedup. I can totally believe that a large speedup would not remove a bottleneck in the processes we currently use to write code. But that's because those processes are developed work within our existing constraints. Remove a constraint and new processes become possible. The way I see it, if our current process produces 1 Utils of Software / day, a 100x writing speedup might lead to only 1.5 UoS/day. But there are other processes that produce only 0.5 UoS/d because they are bottlenecked on writing speed. A 100x speedup would lead to 10 UoS/day. The problem with all of this that 100x speedup isn't realistic, and it's not obvious whether a 2x improvement would lead to better processes. Then again, one of the first custom vim function scripts I wrote was an aid to writing unit tests in a particular codebase, and it lead to me writing a lot more tests. So maybe even a 2x speedup is going to be speed things up, too. Patreon Stuff I wrote a couple of TLA+ specs to show how to model fork-join algorithms. I'm planning on eventually writing them up for my blog/learntla but it'll be a while, so if you want to see them in the meantime I put them up on Patreon.

2 weeks ago 18 votes
Logic for Programmers Turns One

I released Logic for Programmers exactly one year ago today. It feels weird to celebrate the anniversary of something that isn't 1.0 yet, but software projects have a proud tradition of celebrating a dozen anniversaries before 1.0. I wanted to share about what's changed in the past year and the work for the next six+ months. The Road to 0.1 I had been noodling on the idea of a logic book since the pandemic. The first time I wrote about it on the newsletter was in 2021! Then I said that it would be done by June and would be "under 50 pages". The idea was to cover logic as a "soft skill" that helped you think about things like requirements and stuff. That version sucked. If you want to see how much it sucked, I put it up on Patreon. Then I slept on the next draft for three years. Then in 2024 a lot of business fell through and I had a lot of free time, so with the help of Saul Pwanson I rewrote the book. This time I emphasized breadth over depth, trying to cover a lot more techniques. I also decided to self-publish it instead of pitching it to a publisher. Not going the traditional route would mean I would be responsible for paying for editing, advertising, graphic design etc, but I hoped that would be compensated by much higher royalties. It also meant I could release the book in early access and use early sales to fund further improvements. So I wrote up a draft in Sphinx, compiled it to LaTeX, and uploaded the PDF to leanpub. That was in June 2024. Since then I kept to a monthly cadence of updates, missing once in November (short-notice contract) and once last month (Systems Distributed). The book's now on v0.10. What's changed? A LOT v0.1 was very obviously an alpha, and I have made a lot of improvements since then. For one, the book no longer looks like a Sphinx manual. Compare! Also, the content is very, very different. v0.1 was 19,000 words, v.10 is 31,000.1 This comes from new chapters on TLA+, constraint/SMT solving, logic programming, and major expansions to the existing chapters. Originally, "Simplifying Conditionals" was 600 words. Six hundred words! It almost fit in two pages! The chapter is now 2600 words, now covering condition lifting, quantifier manipulation, helper predicates, and set optimizations. All the other chapters have either gotten similar facelifts or are scheduled to get facelifts. The last big change is the addition of book assets. Originally you had to manually copy over all of the code to try it out, which is a problem when there are samples in eight distinct languages! Now there are ready-to-go examples for each chapter, with instructions on how to set up each programming environment. This is also nice because it gives me breaks from writing to code instead. How did the book do? Leanpub's all-time visualizations are terrible, so I'll just give the summary: 1180 copies sold, $18,241 in royalties. That's a lot of money for something that isn't fully out yet! By comparison, Practical TLA+ has made me less than half of that, despite selling over 5x as many books. Self-publishing was the right choice! In that time I've paid about $400 for the book cover (worth it) and maybe $800 in Leanpub's advertising service (probably not worth it). Right now that doesn't come close to making back the time investment, but I think it can get there post-release. I believe there's a lot more potential customers via marketing. I think post-release 10k copies sold is within reach. Where is the book going? The main content work is rewrites: many of the chapters have not meaningfully changed since 1.0, so I am going through and rewriting them from scratch. So far four of the ten chapters have been rewritten. My (admittedly ambitious) goal is to rewrite three of them by the end of this month and another three by the end of next. I also want to do final passes on the rewritten chapters; as most of them have a few TODOs left lying around. (Also somehow in starting this newsletter and publishing it I realized that one of the chapters might be better split into two chapters, so there could well-be a tenth technique in v0.11 or v0.12!) After that, I will pass it to a copy editor while I work on improving the layout, making images, and indexing. I want to have something worthy of printing on a dead tree by 1.0. In terms of timelines, I am very roughly estimating something like this: Summer: final big changes and rewrites Early Autumn: graphic design and copy editing Late Autumn: proofing, figuring out printing stuff Winter: final ebook and initial print releases of 1.0. (If you know a service that helps get self-published books "past the finish line", I'd love to hear about it! Preferably something that works for a fee, not part of royalties.) This timeline may be disrupted by official client work, like a new TLA+ contract or a conference invitation. Needless to say, I am incredibly excited to complete this book and share the final version with you all. This is a book I wished for years ago, a book I wrote because nobody else would. It fills a critical gap in software educational material, and someday soon I'll be able to put a copy on my bookshelf. It's exhilarating and terrifying and above all, satisfying. It's also 150 pages vs 50 pages, but admittedly this is partially because I made the book smaller with a larger font. ↩

4 weeks ago 20 votes

More in programming

HTML is Dead, Long Live HTML

Rethinking DOM from first principles Browsers are in a very weird place. While WebAssembly has succeeded, even on the server, the client still feels largely the same as it did 10 years ago. Enthusiasts will tell you that accessing native web APIs via WASM is a solved problem, with some minimal JS glue. But the question not asked is why you would want to access the DOM. It's just the only option. So I'd like to explain why it really is time to send the DOM and its assorted APIs off to a farm somewhere, with some ideas on how. I won't pretend to know everything about browsers. Nobody knows everything anymore, and that's the problem. The 'Document' Model Few know how bad the DOM really is. In Chrome, document.body now has 350+ keys, grouped roughly like this: This doesn't include the CSS properties in document.body.style of which there are... 660. The boundary between properties and methods is very vague. Many are just facades with an invisible setter behind them. Some getters may trigger a just-in-time re-layout. There's ancient legacy stuff, like all the onevent properties nobody uses anymore. The DOM is not lean and continues to get fatter. Whether you notice this largely depends on whether you are making web pages or web applications. Most devs now avoid working with the DOM directly, though occasionally some purist will praise pure DOM as being superior to the various JS component/templating frameworks. What little declarative facilities the DOM has, like innerHTML, do not resemble modern UI patterns at all. The DOM has too many ways to do the same thing, none of them nice. connectedCallback() { const shadow = this.attachShadow({ mode: 'closed' }), template = document.getElementById('hello-world') .content.cloneNode(true), hwMsg = `Hello ${ this.name }`; Array.from(template.querySelectorAll('.hw-text')) .forEach(n => n.textContent = hwMsg); shadow.append(template); } Web Components deserve a mention, being the web-native equivalent of JS component libraries. But they came too late and are unpopular. The API seems clunky, with its Shadow DOM introducing new nesting and scoping layers. Proponents kinda read like apologetics. The achilles heel is the DOM's SGML/XML heritage, making everything stringly typed. React-likes do not have this problem, their syntax only looks like XML. Devs have learned not to keep state in the document, because it's inadequate for it. For HTML itself, there isn't much to critique because nothing has changed in 10-15 years. Only ARIA (accessibility) is notable, and only because this was what Semantic HTML was supposed to do and didn't. Semantic HTML never quite reached its goal. Despite dating from around 2011, there is e.g. no <thread> or <comment> tag, when those were well-established idioms. Instead, an article inside an article is probably a comment. The guidelines are... weird. There's this feeling that HTML always had paper-envy, and couldn't quite embrace or fully define its hypertext nature, and did not trust its users to follow clear rules. Stewardship of HTML has since firmly passed to WHATWG, really the browser vendors, who have not been able to define anything more concrete as a vision, and have instead just added epicycles at the margins. Along the way even CSS has grown expressions, because every templating language wants to become a programming language. Editability of HTML remains a sad footnote. While technically supported via contentEditable, actually wrangling this feature into something usable for applications is a dark art. I'm sure the Google Docs and Notion people have horror stories. Nobody really believes in the old gods of progressive enhancement and separating markup from style anymore, not if they make apps. Most of the applications you see nowadays will kitbash HTML/CSS/SVG into a pretty enough shape. But this comes with immense overhead, and is looking more and more like the opposite of a decent UI toolkit. The Slack input box Off-screen clipboard hacks Lists and tables must be virtualized by hand, taking over for layout, resizing, dragging, and so on. Making a chat window's scrollbar stick to the bottom is somebody's TODO, every single time. And the more you virtualize, the more you have to reinvent find-in-page, right-click menus, etc. The web blurred the distinction between UI and fluid content, which was novel at the time. But it makes less and less sense, because the UI part is a decade obsolete, and the content has largely homogenized. CSS is inside-out CSS doesn't have a stellar reputation either, but few can put their finger on exactly why. Where most people go wrong is to start with the wrong mental model, approaching it like a constraint solver. This is easy to show with e.g.: <div> <div style="height: 50%">...</div> <div style="height: 50%">...</div> </div> <div> <div style="height: 100%">...</div> <div style="height: 100%">...</div> </div> The first might seem reasonable: divide the parent into two halves vertically. But what about the second? Viewed as a set of constraints, it's contradictory, because the parent div is twice as tall as... itself. What will happen instead in both cases is the height is ignored. The parent height is unknown and CSS doesn't backtrack or iterate here. It just shrink-wraps the contents. If you set e.g. height: 300px on the parent, then it works, but the latter case will still just spill out. Outside-in and inside-out layout modes Instead, your mental model of CSS should be applying two passes of constraints, first going outside-in, and then inside-out. When you make an application frame, this is outside-in: the available space is divided, and the content inside does not affect sizing of panels. When paragraphs stack on a page, this is inside-out: the text stretches out its containing parent. This is what HTML wants to do naturally. By being structured this way, CSS layouts are computationally pretty simple. You can propagate the parent constraints down to the children, and then gather up the children's sizes in the other direction. This is attractive and allows webpages to scale well in terms of elements and text content. CSS is always inside-out by default, reflecting its document-oriented nature. The outside-in is not obvious, because it's up to you to pass all the constraints down, starting with body { height: 100%; }. This is why they always say vertical alignment in CSS is hard. Use flex grow and shrink for spill-free auto-layouts with completely reasonable gaps The scenario above is better handled with a CSS3 flex box (display: flex), which provides explicit control over how space is divided. Unfortunately flexing muddles the simple CSS model. To auto-flex, the layout algorithm must measure the "natural size" of every child. This means laying it out twice: first speculatively, as if floating in aether, and then again after growing or shrinking to fit: This sounds reasonable but can come with hidden surprises, because it's recursive. Doing speculative layout of a parent often requires full layout of unsized children. e.g. to know how text will wrap. If you nest it right, it could in theory cause an exponential blow up, though I've never heard of it being an issue. Instead you will only discover this when someone drops some large content in somewhere, and suddenly everything gets stretched out of whack. It's the opposite of the problem on the mug. To avoid the recursive dependency, you need to isolate the children's contents from the outside, thus making speculative layout trivial. This can be done with contain: size, or by manually setting the flex-basis size. CSS has gained a few constructs like contain or will-transform, which work directly with the layout system, and drop the pretense of one big happy layout. It reveals some of the layer-oriented nature underneath, and is a substitute for e.g. using position: absolute wrappers to do the same. What these do is strip off some of the semantics, and break the flow of DOM-wide constraints. These are overly broad by default and too document-oriented for the simpler cases. This is really a metaphor for all DOM APIs. The Good Parts? That said, flex box is pretty decent if you understand these caveats. Building layouts out of nested rows and columns with gaps is intuitive, and adapts well to varying sizes. There is a "CSS: The Good Parts" here, which you can make ergonomic with sufficient love. CSS grids also work similarly, they're just very painfully... CSSy in their syntax. But if you designed CSS layout from scratch, you wouldn't do it this way. You wouldn't have a subtractive API, with additional extra containment barrier hints. You would instead break the behavior down into its component facets, and use them à la carte. Outside-in and inside-out would both be legible as different kinds of containers and placement models. The inline-block and inline-flex display models illustrate this: it's a block or flex on the inside, but an inline element on the outside. These are two (mostly) orthogonal aspects of a box in a box model. Text and font styles are in fact the odd ones out, in hypertext. Properties like font size inherit from parent to child, so that formatting tags like <b> can work. But most of those 660 CSS properties do not do that. Setting a border on an element does not apply the same border to all its children recursively, that would be silly. It shows that CSS is at least two different things mashed together: a system for styling rich text based on inheritance... and a layout system for block and inline elements, nested recursively but without inheritance, only containment. They use the same syntax and APIs, but don't really cascade the same way. Combining this under one style-umbrella was a mistake. Worth pointing out: early ideas of relative em scaling have largely become irrelevant. We now think of logical vs device pixels instead, which is a far more sane solution, and closer to what users actually expect. SVG is natively integrated as well. Having SVGs in the DOM instead of just as <img> tags is useful to dynamically generate shapes and adjust icon styles. But while SVG is powerful, it's neither a subset nor superset of CSS. Even when it overlaps, there are subtle differences, like the affine transform. It has its own warts, like serializing all coordinates to strings. CSS has also gained the ability to round corners, draw gradients, and apply arbitrary clipping masks: it clearly has SVG-envy, but falls very short. SVG can e.g. do polygonal hit-testing for mouse events, which CSS cannot, and SVG has its own set of graphical layer effects. Whether you use HTML/CSS or SVG to render any particular element is based on specific annoying trade-offs, even if they're all scalable vectors on the back-end. In either case, there are also some roadblocks. I'll just mention three: text-ellipsis can only be used to truncate unwrapped text, not entire paragraphs. Detecting truncated text is even harder, as is just measuring text: the APIs are inadequate. Everyone just counts letters instead. position: sticky lets elements stay in place while scrolling with zero jank. While tailor-made for this purpose, it's subtly broken. Having elements remain unconditionally sticky requires an absurd nesting hack, when it should be trivial. The z-index property determines layering by absolute index. This inevitably leads to a z-index-war.css where everyone is putting in a new number +1 or -1 to make things layer correctly. There is no concept of relative Z positioning. For each of these features, we got stuck with v1 of whatever they could get working, instead of providing the right primitives. Getting this right isn't easy, it's the hard part of API design. You can only iterate on it, by building real stuff with it before finalizing it, and looking for the holes. Oil on Canvas So, DOM is bad, CSS is single-digit X% good, and SVG is ugly but necessary... and nobody is in a position to fix it? Well no. The diagnosis is that the middle layers don't suit anyone particularly well anymore. Just an HTML6 that finally removes things could be a good start. But most of what needs to happen is to liberate the functionality that is there already. This can be done in good or bad ways. Ideally you design your system so the "escape hatch" for custom use is the same API you built the user-space stuff with. That's what dogfooding is, and also how you get good kernels. A recent proposal here is HTML in Canvas, to draw HTML content into a <canvas>, with full control over the visual output. It's not very good. While it might seem useful, the only reason the API has the shape that it does is because it's shoehorned into the DOM: elements must be descendants of <canvas> to fully participate in layout and styling, and to make accessibility work. There are also "technical concerns" with using it off-screen. One example is this spinny cube: To make it interactive, you attach hit-testing rectangles and respond to paint events. This is a new kind of hit-testing API. But it only works in 2D... so it seems 3D-use is only cosmetic? I have many questions. Again, if you designed it from scratch, you wouldn't do it this way! In particular, it's absurd that you'd have to take over all interaction responsibilities for an element and its descendants just to be able to customize how it looks i.e. renders. Especially in a browser that has projective CSS 3D transforms. The use cases not covered by that, e.g. curved re-projection, will also need more complicated hit-testing than rectangles. Did they think this through? What happens when you put a dropdown in there? To me it seems like they couldn't really figure out how to unify CSS and SVG filters, or how to add shaders to CSS. Passing it thru canvas is the only viable option left. "At least it's programmable." Is it really? Screenshotting DOM content is 1 good use-case, but not what this is sold as at all. The whole reason to do "complex UIs on canvas" is to do all the things the DOM doesn't do, like virtualizing content, just-in-time layout and styling, visual effects, custom gestures and hit-testing, and so on. It's all nuts and bolts stuff. Having to pre-stage all the DOM content you want to draw sounds... very counterproductive. From a reactivity point-of-view it's also a bad idea to route this stuff back through the same document tree, because it sets up potential cycles with observers. A canvas that's rendering DOM content isn't really a document element anymore, it's doing something else entirely. Canvas-based spreadsheet that skips the DOM entirely The actual achilles heel of canvas is that you don't have any real access to system fonts, text layout APIs, or UI utilities. It's quite absurd how basic it is. You have to implement everything from scratch, including Unicode word splitting, just to get wrapped text. The proposal is "just use the DOM as a black box for content." But we already know that you can't do anything except more CSS/SVG kitbashing this way. text-ellipsis and friends will still be broken, and you will still need to implement UIs circa 1990 from scratch to fix it. It's all-or-nothing when you actually want something right in the middle. That's why the lower level needs to be opened up. Where To Go From Here The goals of "HTML in Canvas" do strike a chord, with chunks of HTML used as free-floating fragments, a notion that has always existed under the hood. It's a composite value type you can handle. But it should not drag 20 years of useless baggage along, while not enabling anything truly novel. The kitbashing of the web has also resulted in enormous stagnation, and a loss of general UI finesse. When UI behaviors have to be mined out of divs, it limits the kinds of solutions you can even consider. Fixing this within DOM/HTML seems unwise, because there's just too much mess inside. Instead, new surfaces should be opened up outside of it. WebGPU-based box model My schtick here has become to point awkwardly at Use.GPU's HTML-like renderer, which does a full X/Y flex model in a fraction of the complexity or code. I don't mean my stuff is super great, no, it's pretty bare-bones and kinda niche... and yet definitely nicer. Vertical centering is easy. Positioning makes sense. There is no semantic HTML or CSS cascade, just first-class layout. You don't need 61 different accessors for border* either. You can just attach shaders to divs. Like, that's what people wanted right? Here's a blueprint, it's mostly just SDFs. Font and markup concerns only appear at the leaves of the tree, where the text sits. It's striking how you can do like 90% of what the DOM does here, with a fraction of the complexity of HTML/CSS/SVG, if you just reinvent that wheel. Done by 1 guy. And yes, I know about the second 90% too. The classic data model here is of a view tree and a render tree. What should the view tree actually look like? And what can it be lowered into? What is it being lowered into right now, by a giant pile of legacy crud? Alt-browser projects like Servo or Ladybird are in a position to make good proposals here. They have the freshest implementations, and are targeting the most essential features first. The big browser vendors could also do it, but well, taste matters. Good big systems grow from good small ones, not bad big ones. Maybe if Mozilla hadn't imploded... but alas. Platform-native UI toolkits are still playing catch up with declarative and reactive UI, so that's that. Native Electron-alternatives like Tauri could be helpful, but they don't treat origin isolation as a design constraint, which makes security teams antsy. There's a feasible carrot to dangle for them though, namely in the form of better process isolation. Because of CPU exploits like Spectre, multi-threading via SharedArrayBuffer and Web Workers is kinda dead on arrival anyway, and that affects all WASM. The details are boring but right now it's an impossible sell when websites have to have things like OAuth and Zendesk integrated into them. Reinventing the DOM to ditch all legacy baggage could coincide with redesigning it for a more multi-threaded, multi-origin, and async web. The browser engines are already multi-process... what did they learn? A lot has happened since Netscape, with advances in structured concurrency, ownership semantics, FP effects... all could come in handy here. * * * Step 1 should just be a data model that doesn't have 350+ properties per node tho. Don't be under the mistaken impression that this isn't entirely fixable.

16 hours ago 4 votes
Omarchy is on the move

Omarchy has been improving at a furious pace. Since it was first released on June 26, I've pushed out 18(!) new releases together with a rapidly growing community of collaborators, users, and new-to-Linux enthusiasts. We have about 3,500 early adopters on the Omarchy Discord, 250 pull requests processed, and one heck of an awesome Arch + Hyprland Linux environment to show for it! The latest release is 1.11.0, and it brings an entirely overhauled control menu to the experience. Now everything is controlled through a single, unified system that makes it super fast to operate Omarchy's settings and options through the keyboard. It's exactly the kind of hands-off-the-mouse operation that I've always wanted, and with Linux, I've been able to build it just to my tastes. It's a delight. There's really something special going on in Linux at the moment. Arch has been around for twenty years, but with Hyprland on top, it's been catapulted in front of an entirely new audience. Folks who'd never thought that open source could be able to deliver a desktop experience worth giving up Windows or macOS for. Of course, Linux isn't for everyone. It's still an adventure! An awesome, teach-you-about-computers adventure, but not everyone is into computer adventures. Plenty of people are content with a computer appliance where they never have to look under the hood. All good. Microsoft and Apple have those people covered. But the world is a big place! And in that big place, there are a growing number of computer enthusiasts who've grown very disillusioned with both Microsoft and Apple. Folks who could be enticed to give Linux a look, if the barrier was a little lower and the benefits a little clearer. Those are the folks I'm building Omarchy for.

22 hours ago 3 votes
Digital hygiene: Notifications

Take back your attention.

5 hours ago 2 votes
We Are Still the Web

Twenty years ago, Kevin Kelly wrote an absolutely seminal piece for Wired. This week is a great opportunity to look back at it. The post We Are Still the Web appeared first on The History of the Web.

yesterday 4 votes