More from the singularity is nearer
Intel is sitting on a huge amount of card inventory they can’t move, largely because of bad software. Most of this is a summary of the public #intel-hardware channel in the tinygrad discord. Intel is currently sitting on 15,000 Gaudi 2 cards (with baseboards) and 5,100 Intel Data Center GPU Max 1450s (without baseboards). If you were Intel, what would you do with them? First, starting with the Gaudi cards. The open source repo needed to control them was archived on Feb 4, 2025. There’s a closed source version of this that’s maybe still maintained, but eww closed source and do you think it’s really maintained? The architecture is kind of tragic, and that’s likely why they didn’t open source it. Unlike every other accelerator I have seen, the MMEs, which are where all the FLOPS are, are not controllable by the TPCs. While the TPCs have an LLVM port, the MME is not documented. After some poking around, I found the spec. It’s highly fixed function and looks very similar to the Apple ANE. But that’s not even the real problem with it. The problem is that it is controlled by queues, not by the TPCs. Unpacking habanalabs-dkms-1.19.2-32.all.deb you can find the queues. There is some way to push a command stream to the device so you don’t actually have to deal with the host itself for the queues. But that doesn’t prevent you having to decompose the network you are trying to run into something you can put on this fixed function block. Programmability is on a spectrum, ranging from CPUs being the easiest, to GPUs, to things like the Qualcomm DSP / Google TPU (where at least you drive the MME from the program), to this and the Apple ANE being the hardest. While it’s impressive that they actually got on MLPerf Training v4.0 training GPT3, I suspect it’s all hand coded, and if you can even deviate from the trodden path you’ll get almost no perf. 
Accelerators like this are okay for low power inference where you can adjust the model architecture for the target, Apple does a great job of this. But this will never be acceptable for a training chip. Then there’s the Data Center GPU Max 1450. Intel actually sent us a few of these. You quickly run into a problem…how do you plug them in? They need OAM sockets, 48V power, and a cooling solution that can sink 600W. As far as I can tell, they were only ever deployed in two systems, the Aurora Supercomputer and the Dell XE9640. It’s hard to know, but I really doubt many of these Dell systems were sold. Intel then sent us this carrier board. In some ways it’s helpful, but in other ways it’s not at all. It still doesn’t solve cooling or power, and you need to buy 16x MCIO cables (cheap in quantity, but expensive and hard to find off the shelf). Also, I never got a straight answer, but I really doubt Intel has many of these boards. And that board doesn’t look cheap to manufacture more of. The connectors alone, which you need two of per GPU, cost $26 each. That’s $104 for just the OAM connectors. tiny corp was in discussions to buy these GPUs. How much would you pay for one of these on a PCIe card? The specs look great. 839 TFLOPS, 128 GB of RAM, 3.3 TB/s of bandwidth. However…read this article. Even in simple synthetic benchmarks, the chip doesn’t get anywhere near its max performance, and it looks to be for fundamental reasons like memory latency. We estimate we could sell PCIe versions of these GPUs for $1,000; I don’t think most people know how hard it is to move non-NVIDIA hardware. Before you say you’d pay more, ask yourself, do you really want to deal with the software? An adapter card has four pieces. A PCB for the card, a 12->48V voltage converter, a heatsink, and a fan. My quote from the guy who makes an OAM adapter board was $310 for 10+ PCBs and $75 for the voltage converter. 
A heatsink that can handle 600W (heat pipes + vapor chamber) is going to cost $100, then maybe $20 more for the fan. That’s $505, and you still need to assemble and test them, oh and now there’s tariffs. Maybe you can get this down to $400 in ~1000 quantity. So $200 for the GPU, $400 for the adapter, $100 for shipping/fulfillment/returns (more if you use Amazon), and 30% profit if you sell at $1k. tiny would net $1M on this, which has to cover NRE and you have risk of unsold inventory. We offered Intel $200 per GPU (a $680k wire) and they said no. They wanted $600. I suspect that unless a supercomputer person who already uses these GPUs wants to buy more, they will ride it to zero. tl;dr: there’s 5100 of these GPUs with no simple way to plug them in. It’s unclear if they are worth the cost of the slot they go in. I bet they end up shredded, or maybe dumped on eBay for $50 each in a year like the Xeon Phi cards. If you buy one, good luck plugging it in! The reason Meta and friends buy some AMD is as a hedge against NVIDIA. Even if it’s not usable, AMD has progressed on a solid steady roadmap, with a clear continuation from the 2018 MI50 (which you can now buy for 99% off), to the MI325X which is a super exciting chip (AMD is king of chiplets). They are even showing signs of finally investing in software, which makes me bullish. If NVIDIA stumbles for a generation, this is AMD’s game. The ROCm “copy each NVIDIA repo” strategy actually works if your competition stumbles. They can win GPUs with slow and steady improvement + competition stumbling, that’s how AMD won server CPUs. With these Intel chips, I’m not sure who they would appeal to. Ponte Vecchio is cancelled. There’s no point in investing in the platform if there’s not going to be a next generation, and therefore nobody can justify the cost of developing software, therefore there won’t be software, therefore they aren’t worth plugging in. Where does this leave Intel’s AI roadmap? 
The successor to Ponte Vecchio was Rialto Bridge, but that was cancelled. The successor to that was Falcon Shores, but that was also cancelled. Intel claims the next GPU will be “Jaguar Shores”, but fool me once… To quote JazzLord1234 from reddit “No point even bothering to listen to their roadmaps anymore. They have squandered all their credibility.” Gaudi 3 is a flop due to “unbaked software”, but as much as I usually do blame software, nothing has changed from Gaudi 2 and it’s just a really hard chip to program for. So there’s no future there either. I can’t say that “Jaguar Shores” square instills confidence. It didn’t inspire confidence for “Joseph B.” on LinkedIn either. From my interactions with Intel people, it seems there’s no individuals with power there, it’s all committee-like leadership. The problem with this is there’s nobody who can say yes, just many people who can say no. Hence all the cancellations and the nonsense strategy. AMD’s dysfunction is different. From the beginning they had leadership that can do things (Lisa Su replied to my first e-mail), they just didn’t see the value in investing in software until recently. They sort of had a point if they were only targeting hyperscalers, but it seems like SemiAnalysis got through to them that hyperscalers aren’t going to deal with bad software either. It remains to be seen if they can shift culture to actually deliver good software, but there’s movement in that direction, and if they succeed AMD is so undervalued. Their hardware is good. With Intel, until that committee-style leadership is gone, there’s 0 chance for success. Committee leadership is fine if you are trying to maintain, but Intel’s AI situation is even more hopeless than AMD’s, and you’d need something major to turn it around. At least with AMD, you can try installing ROCm and be frustrated when there are bugs. 
Every time I have tried Intel’s software I can’t even recall getting the import to work, and the card wasn’t powerful enough that I cared. Intel needs actual leadership to turn this around, or there’s 0 future in Intel AI.
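The adapter-card arithmetic in the Intel post can be checked in a few lines of Python. This is only a sketch of the numbers quoted there: the function names are made up for illustration, and reading the $680k wire as 3,400 GPUs at $200 each (which is what makes the "$1M" net work out) is an inference, not something the post states outright.

```python
# Quoted prices from the post (USD per unit); names are illustrative only.
ADAPTER_PARTS = {
    "pcb": 310,                # OAM adapter PCB, 10+ quantity quote
    "voltage_converter": 75,   # 12V -> 48V converter
    "heatsink": 100,           # 600W heat pipes + vapor chamber
    "fan": 20,
}


def adapter_bom(parts=ADAPTER_PARTS):
    """Sum the low-quantity bill of materials for one adapter card."""
    return sum(parts.values())


def unit_economics(gpu_cost=200, adapter_cost=400, fulfillment=100, price=1000):
    """Return (total cost, profit, profit margin) for one PCIe card.

    adapter_cost defaults to the ~1000-quantity estimate from the post.
    """
    cost = gpu_cost + adapter_cost + fulfillment
    profit = price - cost
    return cost, profit, profit / price


def gpus_for_wire(wire=680_000, gpu_cost=200):
    """Assumption: the offered wire buys wire / gpu_cost GPUs."""
    return wire // gpu_cost


if __name__ == "__main__":
    cost, profit, margin = unit_economics()
    count = gpus_for_wire()
    print(f"adapter BOM at low quantity: ${adapter_bom()}")
    print(f"per card: cost ${cost}, profit ${profit}, margin {margin:.0%}")
    print(f"{count} GPUs -> ${count * profit:,} before NRE and unsold inventory")
```

Under those assumptions the per-card profit is $300 on a $1,000 sale, and the total lands just over $1M, which still has to absorb NRE, assembly, testing, and tariffs.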
AMD is sending us the two MI300X boxes we asked for. They are in the mail. It took a bit, but AMD passed my cultural test. I now believe they aren’t going to shoot themselves in the foot on software, and if that’s true, there’s absolutely no reason they should be worth 1/16th of NVIDIA. CUDA isn’t really the moat people think it is, it was just an early ecosystem. tiny corp has a fully sovereign AMD stack, and soon we’ll port it to the MI300X. You won’t even have to use tinygrad proper, tinygrad has a torch frontend now. Either NVIDIA is super overvalued or AMD is undervalued. If the petaflop gets commoditized (tiny corp’s mission), the current situation doesn’t make any sense. The hardware is similar, AMD even got the double throughput Tensor Cores on RDNA4 (NVIDIA artificially halves this on their cards, soon they won’t be able to). I’m betting on AMD being undervalued, and that the demand for AI has barely started. With good software, the MI300X should outperform the H100. In for a quarter million. Long term. It can always dip short term, but check back in 5 years.
This is a map of primary trading partners, US vs China, and how it has evolved over the last 20 years. Think about it, and realize this probably reflects your experience. I know there was a similar panic about Japan in the 80s, but Japan by population has always been 3x smaller than the US, whereas China is 3x larger. In addition, we had and have military bases in Japan. This is not the same situation. The US, since I have been born, has been coasting. The main product made by the US is the dollar, and it used those manufactured dollars to outsource everything. Most jobs in the US are now basically fake. It’s basically an economy in which five people stick a pipe in the ground, but that pipe is the fed and the oil was the good will built up over 1870-1970. In 2008, with the bailouts, it was made clear that the US has no interest in reform. The next decade, in perhaps a spitting in your face move, the fed made the interest rate 0. Known as ZIRP, this had never been done before. This led to insane perversions. When I got into business, I didn’t understand that business in America was mostly a total scam. Sure, you might look at a single business, and be like, oh, that sounds reasonable, but then you zoom out and look at the entire system, and it doesn’t really make sense. It’s scams feeding other scams. Wanna each start a business, pass dollars back and forth over and over again, and drive both our revenues super high? Sure, we don’t produce anything, but we have companies with high revenues and we can raise money based on those revenues. We’ll both be rich! Let’s do it with a bunch of extra steps so people don’t catch on though. They’ll only see it reflected in the lack of movement of real macro metrics. You see, the US is a “developed” country, which means real growth is over? You do understand that guns and boats are made of steel, right? Oh, airplanes aren’t, they are made of aluminum. Oh…right, yea, it’s not just steel it is absolutely everything. 
The future is chips you say? All the good chips are made in the Republic of China you say? This 2021 article lays it out clearly, and it also explains why nothing I saw in Silicon Valley made any sense. I’m not going to go into the personal stories, but I just had an underlying assumption that the goal was growth and value production. It isn’t. It’s self licking ice cream cone scams, and any growth or value is incidental to that. It isn’t until you understand this that people’s behavior starts to make sense. America really is at a fork in the road. In one world, they abandon all hopes of being an empire, becoming a regional power with highly protectionist economics. This happened before, and it’s called Europe. I know it’s hard to believe now, but Europe used to be the seat of power for the whole world. The sun never set on the British empire. Now they put you in jail for memes. Protectionist America is a boring place and not somewhere I want to be. It kicks the can further down the road of poverty, basically embraces socialism, is stagnant, is stale, is a museum…etc, again there’s a contemporary example of this. When I said on Lex they were gonna nationalize NVIDIA, look at the AI Diffusion Framework, and notice how Trump hasn’t repealed it. It allows export of GPUs to only 18 countries. Nationalization with American characteristics. It tells the other 177 countries that they should plan on purchasing their AI infrastructure from China. The other path, which is the exciting path, is the attempt to maintain an empire. An empire has to compete on its merits. There’s two simple steps to restore American greatness: 1) Brain drain the world. Work visas for every person who can produce more than they consume. I’m talking doubling the US population, bringing in all the factory workers, farmers, miners, engineers, literally anyone who produces value. Can we raise the average IQ of America to be higher than China? 
2) Back the dollar by gold (not socially constructed crypto), and bring major crackdowns to finance to tie it to real world value. Trading is not a job. Passive income is not a thing. Instead, go produce something real and exchange it for gold. The first will bring the value of “American” labor in line with its global market value. It is a particularly unique advantage of the US over China, the US has a potentially much larger pool of talent. Non ironically, diversity is our strength. Unfortunately, there’s a lot of resistance to American labor finding its market value. The second will prevent a lot of the scams. The reason the banking industry is so big is that it is close to the source of the made up dollars. If currency is gold backed, you could imagine something similar happening to the mining industry instead. However, the mining industry is real! It uses steel and aluminum to build physical things. And imagine when we start to mine space. That’s a way better reward function than scamming politicians out of fake dollars. Unfortunately, I doubt either will happen. They very much both can, but people haven’t been demoralized enough yet.
A lot of smooth brains on Hacker News about the last post. I’m sorry if you spent your whole life worshipping money, but hey, the Bible warned you about false idols, don’t shoot the messenger. “It’s easier to imagine the end of the world than the end of capitalism” – Mark Fisher It’s actually very easy to imagine the end of capitalism. Imagine capitalism as a game of sharks, where eventually the biggest shark ends up gobbling up all the fish, and that one shark is the last player left standing with all the money. When one person (or company) has all the money, do you see how the money would be worthless? I’ll spell this out clearly. Money is a map, it is not a territory. Please understand what I mean by this before continuing to read. You can erase the mountains from the map, but you still have to climb over them in real life, and even worse, now you don’t have a map! “Everything around you that you call ‘life’ was made up by people who were no smarter than you” – Steve Jobs So, if money is the map, what territory is it attempting to capture? Presumably something having to do with value, but increasingly, as we are buying and selling baskets of derivatives of memecoins, nothing. A map that doesn’t accurately capture a territory is not a Schelling point. It’s not a useful map. And maps are only as good as their usefulness. Useless maps die out. Do you agree or disagree that money is supposed to be a map of value? If you disagree, that’s an ought and I can’t use logic to convince you otherwise, I can just call you a moron who refuses to burn paper $100 bills for warmth on a deserted island. Many capitalists I meet are as stupid as communists, trying to give a moral justification for their system. This is my money, I deserve it. I should be able to passively deploy my capital into the markets and live off the returns. “Moral victories are for minor league coaches.” – JAY-Z An economic system is only good insofar as it effectively deploys capital for real growth. 
If real economic growth is only 3 percent, any time you are earning beyond that, somebody else is losing. And yet somehow, today, you can put your money in money market accounts and earn a “risk-free” 5 percent…hmm something doesn’t make sense. Who is losing? You will eventually be unable to squeeze the productive people any further. The worst was an e-mail I got from someone who supposedly agreed with me. “Value creation (for all stakeholders) is at the core of the organization/ business model I am putting together…Anyway I wanted to let you know others out there who share your vision.” – anon email Fuck your stakeholders. Fuck your business model. You don’t understand me at all. Stop worrying so much about the distribution of the pie. Start thinking about how to make the pie bigger. With exponential (what 3 percent year over year is) growth, the latter outstrips the former by so much. The right distribution is simply: From each according to his ability, to each according to his ability to effectively deploy capital to achieve real economic growth. Communism is dumb cause it goes to the poor (who routinely demonstrate that they poorly deploy capital). Capitalism is dumb cause it goes to the rent-seekers (who frequently deploy capital to increase their moat). Acceleration is the way.
More in programming
Everyone wants the software they work on to produce quality products, but what does that mean? In addition, how do you know when you have it? This is the longest single blog post I have ever written. I spent four decades writing software used by people (most of the server
My April Cools is out! Gaming Games for Non-Gamers is a 3,000 word essay on video games worth playing if you've never enjoyed a video game before. Patreon notes here. (April Cools is a project where we write genuine content on non-normal topics. You can see all the other April Cools posted so far here. There's still time to submit your own!) April Cools' Club
The Ware for March 2025 is shown below. I was just taking this thing apart to see what went wrong, and thought it had some merit as a name that ware. But perhaps more interestingly, I was also experimenting with my cross-polarized imaging setup. This is a technique a friend of mine told me about […]
Picasso got it right: Great artists steal. Even if he didn’t actually say it, and we all just repeat the quote because Steve Jobs used it. Because it strikes at the heart of creativity: None of it happens in a vacuum. Everything is inspired by something. The best ideas, angles, techniques, and tones are stolen to build everything that comes after the original. Furthermore, the way to learn originality is to set it aside while you learn to perfect a copy. You learn to draw by imitating the masters. I learned photography by attempting to recreate great compositions. I learned to program by aping the Ruby standard library. Stealing good ideas isn’t a detour on the way to becoming a master — it’s the straight route. And it’s nothing to be ashamed of. This, by the way, doesn’t just apply to art but to the economy as well. Japan became an economic superpower in the 80s by first poorly copying Western electronics in the decades prior. China is now following exactly the same playbook to even greater effect. You start with a cheap copy, then you learn how to make a good copy, and then you don’t need to copy at all. AI has sped through the phase of cheap copies. It’s now firmly established in the realm of good copies. You’re a fool if you don’t believe originality is a likely next step. In all likelihood, it’s a matter of when, not if. (And we already have plenty of early indications that it’s actually already here, on the edges.) Now, whether that’s good is a different question. Whether we want AI to become truly creative is a fair question — albeit a theoretical or, at best, moral one. Because it’s going to happen if it can happen, and it almost certainly can (or even has). Ironically, I think the peanut gallery disparaging recent advances — like the Ghibli fever — over minor details in the copying effort will only accelerate the quest toward true creativity. AI builders, like the Japanese and Chinese economies before them, are eager to demonstrate an ability to exceed. 
All that is to say that AI is in the "Good Copy" phase of its creative evolution. Expect "The Great Artist" to emerge at any moment.