More from the singularity is nearer
This is not going to be a cakewalk like self driving cars. Most of comma’s competition is now out of business, taking billions and billions of dollars with it. Re: Tesla and FSD, we always expected Tesla to have the lead, but it’s not a winner take all market, it will look more like iOS vs Android. comma has been around for 10 years, is profitable, and is now growing rapidly. In self driving, most of the competition wasn’t even playing the right game. This isn’t how it is for ML frameworks. tinygrad’s competition is playing the right game, open source, and run by some quite smart people. But this is my second startup, so hopefully taking a bit more risk is appropriate. For comma to win, all it would take is people in 2016 being wrong about LIDAR, mapping, end to end, and hand coding, which hopefully we all agree now that they were. For tinygrad to win, it requires something much deeper to be wrong about software development in general. As it stands now, tinygrad is 14556 lines. Line count is not a perfect proxy for complexity, but when you have differences of multiple orders of magnitude, it might mean something. I asked ChatGPT to estimate the lines of code in PyTorch, JAX, and MLIR. JAX = 400k MLIR = 950k PyTorch = 3300k They range from one to two orders of magnitude off. And this isn’t even including all the libraries and drivers the other frameworks rely on, CUDA, cuBLAS, Triton, nccl, LLVM, etc…. tinygrad includes every single piece of code needed to drive an AMD RDNA3 GPU except for LLVM, and we plan to remove LLVM in a year or two as well. But so what? What does line count matter? One hypothesis is that tinygrad is only smaller because it’s not speed or feature competitive, and that if and when it becomes competitive, it will also be that many lines. But I just don’t think that’s true. tinygrad is already feature competitive, and for speed, I think the bitter lesson also applies to software. When you look at the machine learning ecosystem, you realize it’s just the same problems over and over again. The problem of multi machine, multi GPU, multi SM, multi ALU, cross machine memory scheduling, DRAM scheduling, SRAM scheduling, register scheduling, it’s all the same underlying problem at different scales. And yet, in all the current ecosystems, there are completely different codebases and libraries at each scale. I don’t think this stands. I suspect there is a simple formulation of the problem underlying all of the scheduling. Of course, this problem will be in NP and hard to optimize, but I’m betting the bitter lesson wins here. The goal of the tinygrad project is to abstract away everything except the absolute core problem in the cleanest way possible. This is why we need to replace everything. A model for the hardware is simple compared to a model for CUDA. If we succeed, tinygrad will not only be the fastest NN framework, but it will be under 25k lines all in, GPT-5 scale training job to MMIO on the PCIe bus! Here are the steps to get there: Expose the underlying search problem spanning several orders of magnitude. Due to the execution of neural networks not being data dependent, this problem is very amenable to search. Make sure your formulation is simple and complete. Fully capture all dimensions of the search space. The optimization goal is simple, run faster. Apply the state of the art in search. Burn compute. Use LLMs to guide. Use SAT solvers. Reinforcement learning. It doesn’t matter, there’s no way to cheat this goal. Just see if it runs faster. If this works, not only do we win with tinygrad, but hopefully people begin to rethink software in general. Of course, it’s a big if, this isn’t like comma where it was hard to lose. But if it wins… The main thing to watch is development speed. Our bet has to be that tinygrad’s development speed is outpacing the others. We have the AMD contract to train LLaMA 405B as fast as NVIDIA due in a year, let’s see if we succeed.
I signed up for Hinge. Holy shit with the boosts. How does someone who works on this wake up every morning and feel okay about themselves? Similarly with the tip screens, Uber algorithm, all the zero sum bullshit using all the tricks of psychology to extract a little bit more from every interaction in society. Nudge. Nudge. NUDGE. Want to partake in normal society like buying a coffee, going on a date, getting a ride, paying a friend. Oh, there’s a middle man now. An evil ominous middleman using state of the art AI algorithms to extract just a little bit more from you. But eventually the market will fix this, right? People will feel sick of being manipulated and move elsewhere? Ahhh, but they see that coming long before you do. They have dashboards. Quick Jeeves, tune the AI to make people feel less manipulated. Give them a little bit more for now, we have to think about maximizing lifetime customer value here. Oh the AI already did this on its own? Jeeves you’ve been replaced! People perpetually on the edge. You want to opt out of this all you say? Good luck running a competitive business! Every metric is now a target. You better maximize engagement or you will lose engagement this is a red queen’s race we can’t afford to lose! Burn all the social capital, burn all your values, FEED IT ALL TO MOLOCH! Someday, people will have to realize we live in a society. What will it take? A complete self cannibalization to the point you can’t eat your own mouth? It sure as hell isn’t going to be people opting out, that’s a collective action problem you can’t solve. Democracy, haha, you think the algorithms will let you vote to kill them? Your vote is as decoupled from action as the amount Uber pays the driver is decoupled from the fare that you pay. There’s no reform here, there’s only revolution. Will it simply be a huge financial collapse? Or do we need World War 3? And even World War 3 is on a spectrum. Will mass starvation fix this? Or will the attitude of thinking it’s okay to manipulate others at scale persist even past that? He’s got his, and I’ve got mine… If you open a government S&P 500 account for everyone with $1,000 at birth that’ll pay their social security cause it like…goes up…wait who’s creating this value again? It’s not okay. Advertising is not okay. Price discrimination is not okay. Using big data, machine learning, and psychology to manipulate others at scale is not okay. But you aren’t going to learn this lesson until you have fed a huge majority of your customers to Moloch. Modern capitialism is wireheading. Release the hypnodrones. How many cans of Pepsi did you want them to consume an hour again?
“For example, if one believes that affirmative action is good for black people, does it make sense to demand affirmative action in hostile or dogmatic terms? Obviously it would be more productive to take a diplomatic and conciliatory approach that would make at least verbal and symbolic concessions to white people who think that affirmative action discriminates against them. But leftist activists do not take such an approach because it would not satisfy their emotional needs.” – Unabomber Manifesto To date, the Trump administration has been an absolute tragedy. It has been the acting out of emotions. There are no adults in the room. I’m not saying there would have been adults in the room with the Kamala regime either, but I had some hopes for positive change with the Trump tech-bro alliance and now they are gone. At least truths are being laid bare versus heads being buried in the sands of joy, but I think there was a much better way. For example, I don’t support America funding the war in Ukraine. But the way Zelensky was treated is just dumb. See the Unabomber quote above, between this and the Munich speech, Mr. JD Vance, I hope your emotional needs are being met (at the expense of the good will of our allies). It's the economy, stupid Regardless of anyone’s long-term objectives in the US, be they decoupling from China, bringing manufacturing to the US, bringing lifestyle improvements to US citizens; I think it’s unquestionable that uncertainty about the future was needlessly increased. And unless the uncertainty was the goal, I can’t figure out why things were done the way they were. And if the uncertainty was the goal…uhhh…is our government captured by Russian or Chinese agents? Because that’s who benefits. I don’t trust the news very much. I have no idea if the guy in the El Salavdor prison had a fair trial, if the students being deported are criminals, or even if they are being deported at all. It’s really hard and time consuming to get to the truth about any of these things. However, when markets crash. that is obviously real. With the news, there’s usually no way to trade on it being real or fake, there’s nobody to take the other side. But with big public markets, there’s very deep liquidity if you think they are priced wrong. In addition to the 10% the market is down, the dollar is also down 10%. Considering the market is priced in dollars, it’s closer to 20% down. And even worse on top of all of this, prices are going up due to the tariffs. Was crashing the economy the goal? A side-effect of a greater plan? Because given how this was executed, a 3-week old LLM could have told you shit was gonna crash. And it’s not going to bring manufacturing back. I have done manufacturing in America for years, and anyone with any experience could have told you that this wouldn’t work, manufacturing requires long term investment and long term investment requires stability. What was the real goal here? Elon, you need to reconcile with your daughter Andrew Callaghan did a good piece on Elon’s radicalization. I get it, we have all been there. For me it was Gamergate (which still has a terrible wikipedia page that doesn’t explain what it was). But this doesn’t have to be you forever. You are the closest thing to an adult in any room in America. When you compare America to China, it’s really more like comparing Elon to China. ULA is a little joke compared to CNSA. And look into what percent of US car exports are Teslas. The man is singlehandedly beating the rest of the US combined. If you want any hope of standing against China, your political coalition better include him. Elon has been pretty politically quiet lately. I’m sure he knew exactly what would happen with the tariffs, but he couldn’t stop them. I got fooled too, thought it could be different this time. But it’s no different from 2017. (btw, we are finally beating climate change thanks to cheap solar panels from China) I know the idea of PR is against a lot of what you believe in, but you need to heads down put together a large scale PR campaign, distance yourself from this train wreck, denounce stupid fake right wing conspiracy theories, reconcile with your daughter (from a reader of sci-fi and The Culture, is the trans thing that hard to understand?), resolve your stupid beef with OpenAI (we are all disappointed, but you don’t have a great track record for open source either), and start building a new political party. Pro large scale legal immigration, not a single illegal border crosser. Pro choice (within reason), and also pro gun (within reason). Inclusive and diverse, with an unwavering focus on merit. Anti crime, with an understanding that victimless crime is not crime. Expose higher education and the medical system to the free market (watch how fast prices fall) Free market and trade, but not an unregulated market. Markets require regulation to be free. It’s probably the only shot we have against China. The current Republicans and Democrats are just far too stupid; the Chinese are watching this tariff drama and laughing their asses off. Their plans are measured in centuries. America, do you want to be a protectionist backwater? If so, and all the thymos is gone, then there’s no place for me there. If this is really the way things are going, the only thing for anyone to do is leave. We’ll see how it shapes up in the next few years. But if the racists or the other racists are still running the show, we really are just cooked. Enjoy your handouts to black people and your handouts to white people in a poverty stricken shithole.
You know about Critical Race Theory, right? It says that if there’s an imbalance in, say, income between races, it must be due to discrimination. This is what wokism seems to be, and it’s moronic and false. The right wing has invented something equally stupid. Introducing Critical Trade Theory, stolen from this tweet. If there’s an imbalance in trade between countries, it must be due to unfair practices. (not due to the obvious, like one country is 10x richer than the other) There’s really only one way the trade deficits will go away, and that’s if trade goes to zero (or maybe if all these countries become richer than America). Same thing with the race deficits, no amount of “leg up” bullshit will change them. Why are all the politicians in America anti-growth anti-reality idiots who want to drive us into the poor house? The way this tariff shit is being done is another stupid form of anti-merit benefits to chosen groups of people, with a whole lot of grift to go along with it. Makes me just not want to play.
More in programming
Am I a good programmer? The short answer is: I don’t know what that means. I have been programming for 52 years now, having started in a public high school class in 1973, which is pretty rare because few high schools offered such an opportunity back then. I
The world is waking to the fact that talk therapy is neither the only nor the best way to cure a garden-variety petite depression. Something many people will encounter at some point in their lives. Studies have shown that exercise, for example, is a more effective treatment than talk therapy (and pharmaceuticals!) when dealing with such episodes. But I'm just as interested in the role building competence can have in warding off the demons. And partly because of this meme: I've talked about it before, but I keep coming back to the fact that it's exactly backwards. That signing up for an educational quest into Linux, history, or motorcycle repair actually is an incredibly effective alternative to therapy! At least for men who'd prefer to feel useful over being listened to, which, in my experience, is most of them. This is why I find it so misguided when people who undertake those quests sell their journey short with self-effacing jibes about how much an unattractive nerd it makes them to care about their hobby. Mihaly Csikszentmihalyi detailed back in 1990 how peak human happiness arrives exactly in these moments of flow when your competence is stretched by a difficult-but-doable challenge. Don't tell me those endorphins don't also help counter the darkness. But it's just as much about the fact that these pursuits of competence usually offer a great opportunity for community as well that seals the deal. I've found time and again that people are starved for the kind of topic-based connections that, say, learning about Linux offers in spades. You're not just learning, you're learning with others. That is a time-tested antidote to depression: Forming and cultivating meaningful human connections. Yes, doing so over the internet isn't as powerful as doing it in person, but it's still powerful. It still offers community, involvement, and plenty of invitation to carry a meaningful burden. Open source nails this trifecta of motivations to a T. There are endless paths of discovery and mastery available. There are tons of fellow travelers with whom to connect and collaborate. And you'll find an unlimited number of meaningful burdens in maintainerships open for the taking. So next time you see that meme, you should cheer that the talk therapy table is empty. Leave it available for the severe, pathological cases that exercise and the pursuit of competence can't cure. Most people just don't need therapy, they need purpose, they need competence, they need exercise, and they need community.
The excellent-but-defunct blog Programming in the 21st Century defines "puzzle languages" as languages were part of the appeal is in figuring out how to express a program idiomatically, like a puzzle. As examples, he lists Haskell, Erlang, and J. All puzzle languages, the author says, have an "escape" out of the puzzle model that is pragmatic but stigmatized. But many mainstream languages have escape hatches, too. Languages have a lot of properties. One of these properties is the language's capabilities, roughly the set of things you can do in the language. Capability is desirable but comes into conflicts with a lot of other desirable properties, like simplicity or efficiency. In particular, reducing the capability of a language means that all remaining programs share more in common, meaning there's more assumptions the compiler and programmer can make ("tractability"). Assumptions are generally used to reason about correctness, but can also be about things like optimization: J's assumption that everything is an array leads to high-performance "special combinations". Rust is the most famous example of mainstream language that trades capability for tractability.1 Rust has a lot of rules designed to prevent common memory errors, like keeping a reference to deallocated memory or modifying memory while something else is reading it. As a consequence, there's a lot of things that cannot be done in (safe) Rust, like interface with an external C function (as it doesn't have these guarantees). To do this, you need to use unsafe Rust, which lets you do additional things forbidden by safe Rust, such as deference a raw pointer. Everybody tells you not to use unsafe unless you absolutely 100% know what you're doing, and possibly not even then. Sounds like an escape hatch to me! To extrapolate, an escape hatch is a feature (either in the language itself or a particular implementation) that deliberately breaks core assumptions about the language in order to add capabilities. This explains both Rust and most of the so-called "puzzle languages": they need escape hatches because they have very strong conceptual models of the language which leads to lots of assumptions about programs. But plenty of "kitchen sink" mainstream languages have escape hatches, too: Some compilers let C++ code embed inline assembly. Languages built on .NET or the JVM has some sort of interop with C# or Java, and many of those languages make assumptions about programs that C#/Java do not. The SQL language has stored procedures as an escape hatch and vendors create a second escape hatch of user-defined functions. Ruby lets you bypass any form of encapsulation with send. Frameworks have escape hatches, too! React has an entire page on them. (Does eval in interpreted languages count as an escape hatch? It feels different, but it does add a lot of capability. Maybe they don't "break assumptions" in the same way?) The problem with escape hatches In all languages with escape hatches, the rule is "use this as carefully and sparingly as possible", to the point where a messy solution without an escape hatch is preferable to a clean solution with one. Breaking a core assumption is a big deal! If the language is operating as if its still true, it's going to do incorrect things. I recently had this problem in a TLA+ contract. TLA+ is a language for modeling complicated systems, and assumes that the model is a self-contained universe. The client wanted to use the TLA+ to test a real system. The model checker should send commands to a test device and check the next states were the same. This is straightforward to set up with the IOExec escape hatch.2 But the model checker assumed that state exploration was pure and it could skip around the state randomly, meaning it would do things like set x = 10, then skip to set x = 1, then skip back to inc x; assert x == 11. Oops! We eventually found workarounds but it took a lot of clever tricks to pull off. I'll probably write up the technique when I'm less busy with The Book. The other problem with escape hatches is the rest of the language is designed around not having said capabilities, meaning it can't support the feature as well as a language designed for them from the start. Even if your escape hatch code is clean, it might not cleanly integrate with the rest of your code. This is why people complain about unsafe Rust so often. It should be noted though that all languages with automatic memory management are trading capability for tractability, too. If you can't deference pointers, you can't deference null pointers. ↩ From the Community Modules (which come default with the VSCode extension). ↩
I wrote a lot of blog posts over my time at Parse, but they all evaporated after Facebook killed the product. Most of them I didn’t care about (there were, ahem, a lot of status updates and “service reliability announcements”, but I was mad about losing this one in particular, a deceptively casual retrospective of […]
It's only been two months since I discovered the power and joy of this new generation of mini PCs. My journey started out with a Minisforum UM870, which is a lovely machine, but since then, I've come to really appreciate the work of Beelink. In a crowded market for mini PCs, Beelink stands out with their superior build quality, their class-leading cooling and silent operation, and their use of fully Linux-compatible components (the UM870 shipped with a MediaTek bluetooth/wifi card that doesn't work with Linux!). It's the complete package at three super compelling price points. For $289, you can get the EQR5, which runs an 8-core AMD Zen3 5825U that puts out 1723/6419 in Geekbench, and comes with 16GB RAM and 500GB NVMe. I've run Omarchy on it, and it flies. For me, the main drawback was the lack of a DisplayPort, which kept me from using it with an Apple display, and the fact that the SER8 exists. But if you're on a budget, and you're fine with HDMI only, it's a wild bargain. For $499, you can get the SER8. That's the price-to-performance sweet spot in the range. It uses the excellent 8-core AMD Zen4 8745HS that puts out 2595/12985 in Geekbench (~M4 multi-core numbers!), and runs our HEY test suite with 30,000 assertions almost as fast as an M4 Max! At that price, you get 32GB RAM + 1TB NVMe, as well as a DisplayPort, so it works with both the Apple 5K Studio Display and the Apple 6K XDR Display (you just need the right cable). Main drawback is limited wifi/bluetooth range, but Beelink tells me there's a fix on the way for that. For $929, you can get the SER9 HX370. This is the top dog in this form factor. It uses the incredible 12-core AMD Zen5 HX370 that hits 2990/15611 in Geekbench, and runs our HEY test suite faster than any Apple M chip I've ever tested. The built-in graphics are also very capable. Enough to play a ton of games at 1080p. It also sorted the SER8's current wifi/bluetooth range issue. I ran the SER8 as my main computer for a while, but now I'm using the SER9, and I just about never feel like I need anything more. Yes, the Framework Desktop, with its insane AMD Max 395+ chip, is even more bonkers. It almost cuts the HEY test suite time in half(!), but it's also $1,795, and not yet generally available. (But preorders are open for the ballers!). Whichever machine fits your budget, it's frankly incredible that we have this kind of performance and efficiency available at these prices with all of these Beelinks drawing less than 10 watt at idle and no more than 100 watt at peak! So it's no wonder that Beelink has been selling these units like hotcakes since I started talking about them on X as the ideal, cheap Omarchy desktop computers. It's such a symbiotic relationship. There are a ton of programmers who have become Linux curious, and Beelink offers no-brainer options to give that a try at a bargain. I just love when that happens. The perfect intersection of hardware, software, and timing. That's what we got here. It's a Beelink, baby! (And no, before you ask, I don't get any royalties, there's no affiliate link, and I don't own any shares in Beelink. I just love discovering great technology and seeing people start their Linux journey with an awesome, affordable computer!)