More from the singularity is nearer
You know about Critical Race Theory, right? It says that if there’s an imbalance in, say, income between races, it must be due to discrimination. This is what wokism seems to be, and it’s moronic and false. The right wing has invented something equally stupid. Introducing Critical Trade Theory, stolen from this tweet. If there’s an imbalance in trade between countries, it must be due to unfair practices. (not due to the obvious, like one country is 10x richer than the other) There’s really only one way the trade deficits will go away, and that’s if trade goes to zero (or maybe if all these countries become richer than America). Same thing with the race deficits, no amount of “leg up” bullshit will change them. Why are all the politicians in America anti-growth anti-reality idiots who want to drive us into the poor house? The way this tariff shit is being done is another stupid form of anti-merit benefits to chosen groups of people, with a whole lot of grift to go along with it. Makes me just not want to play.
Intel is sitting on a huge amount of card inventory they can't move, largely because of bad software. Most of this is a summary of the public #intel-hardware channel in the tinygrad discord.

Intel is currently sitting on:
- 15,000 Gaudi 2 cards (with baseboards)
- 5,100 Intel Data Center GPU Max 1450s (without baseboards)

If you were Intel, what would you do with them?

First, starting with the Gaudi cards. The open source repo needed to control them was archived on Feb 4, 2025. There's a closed source version of this that's maybe still maintained, but eww, closed source, and do you think it's really maintained? The architecture is kind of tragic, and that's likely why they didn't open source it. Unlike every other accelerator I have seen, the MMEs, which is where all the FLOPS are, are not controllable by the TPCs. While the TPCs have an LLVM port, the MME is not documented. After some poking around, I found the spec: it's highly fixed function and looks very similar to the Apple ANE.

But that's not even the real problem with it. The problem is that it is controlled by queues, not by the TPCs. Unpacking habanalabs-dkms-1.19.2-32.all.deb you can find the queues. There is some way to push a command stream to the device so you don't actually have to deal with the host itself for the queues. But that doesn't prevent you from having to decompose the network you are trying to run into something you can put on this fixed function block.

Programmability is on a spectrum, ranging from CPUs being the easiest, to GPUs, to things like the Qualcomm DSP / Google TPU (where at least you drive the MME from the program), to this and the Apple ANE being the hardest. While it's impressive that they actually got on MLPerf Training v4.0 training GPT-3, I suspect it's all hand coded, and if you deviate at all off the trodden path you'll get almost no perf.
Accelerators like this are okay for low power inference where you can adjust the model architecture for the target; Apple does a great job of this. But this will never be acceptable for a training chip.

Then there's the Data Center GPU Max 1450. Intel actually sent us a few of these. You quickly run into a problem: how do you plug them in? They need OAM sockets, 48V power, and a cooling solution that can sink 600W. As far as I can tell, they were only ever deployed in two systems, the Aurora Supercomputer and the Dell XE9640. It's hard to know, but I really doubt many of these Dell systems were sold.

Intel then sent us this carrier board. In some ways it's helpful, but in other ways it's not at all. It still doesn't solve cooling or power, and you need to buy 16x MCIO cables (cheap in quantity, but expensive and hard to find off the shelf). Also, I never got a straight answer, but I really doubt Intel has many of these boards. And that board doesn't look cheap to manufacture more of. The connectors alone, which you need two of per GPU, cost $26 each. That's $104 for just the OAM connectors.

tiny corp was in discussions to buy these GPUs. How much would you pay for one of these on a PCIe card? The specs look great: 839 TFLOPS, 128 GB of RAM, 3.3 TB/s of bandwidth. However…read this article. Even in simple synthetic benchmarks, the chip doesn't get anywhere near its max performance, and it looks to be for fundamental reasons like memory latency.

We estimate we could sell PCIe versions of these GPUs for $1,000; I don't think most people know how hard it is to move non-NVIDIA hardware. Before you say you'd pay more, ask yourself: do you really want to deal with the software?

An adapter card has four pieces: a PCB for the card, a 12V→48V voltage converter, a heatsink, and a fan. My quote from the guy who makes an OAM adapter board was $310 for the PCB at 10+ quantity and $75 for the voltage converter.
A heatsink that can handle 600W (heat pipes + vapor chamber) is going to cost $100, then maybe $20 more for the fan. That's $505, and you still need to assemble and test them, oh, and now there's tariffs. Maybe you can get this down to $400 in ~1000 quantity. So: $200 for the GPU, $400 for the adapter, $100 for shipping/fulfillment/returns (more if you use Amazon), and 30% profit if you sell at $1k. tiny would net $1M on this, which has to cover NRE, and you carry the risk of unsold inventory.

We offered Intel $200 per GPU (a $680k wire) and they said no. They wanted $600. I suspect that unless a supercomputer person who already uses these GPUs wants to buy more, they will ride it to zero.

tl;dr: there are 5,100 of these GPUs with no simple way to plug them in. It's unclear if they're worth the cost of the slot they go in. I bet they end up shredded, or maybe dumped on eBay for $50 each in a year like the Xeon Phi cards. If you buy one, good luck plugging it in!

The reason Meta and friends buy some AMD is as a hedge against NVIDIA. Even if it's not usable, AMD has progressed on a solid, steady roadmap, with a clear continuation from the 2018 MI50 (which you can now buy for 99% off) to the MI325X, which is a super exciting chip (AMD is king of chiplets). They are even showing signs of finally investing in software, which makes me bullish. If NVIDIA stumbles for a generation, this is AMD's game. The ROCm "copy each NVIDIA repo" strategy actually works if your competition stumbles. They can win GPUs with slow and steady improvement + competition stumbling; that's how AMD won server CPUs.

With these Intel chips, I'm not sure who they would appeal to. Ponte Vecchio is cancelled. There's no point in investing in the platform if there's not going to be a next generation, and therefore nobody can justify the cost of developing software, therefore there won't be software, therefore they aren't worth plugging in. Where does this leave Intel's AI roadmap?
The successor to Ponte Vecchio was Rialto Bridge, but that was cancelled. The successor to that was Falcon Shores, but that was also cancelled. Intel claims the next GPU will be "Jaguar Shores", but fool me once… To quote JazzLord1234 from reddit: "No point even bothering to listen to their roadmaps anymore. They have squandered all their credibility." Gaudi 3 is a flop due to "unbaked software", but as much as I usually do blame software, nothing has changed from Gaudi 2 and it's just a really hard chip to program for. So there's no future there either. I can't say that the "Jaguar Shores" slide instills confidence. It didn't inspire confidence for "Joseph B." on LinkedIn either.

From my interactions with Intel people, it seems there are no individuals with power there; it's all committee-like leadership. The problem with this is that there's nobody who can say yes, just many people who can say no. Hence all the cancellations and the nonsense strategy.

AMD's dysfunction is different. From the beginning they had leadership that can do things (Lisa Su replied to my first e-mail), they just didn't see the value in investing in software until recently. They sort of had a point if they were only targeting hyperscalers, but it seems like SemiAnalysis got through to them that hyperscalers aren't going to deal with bad software either. It remains to be seen if they can shift culture to actually deliver good software, but there's movement in that direction, and if they succeed AMD is so undervalued. Their hardware is good.

With Intel, until that committee-style leadership is gone, there's 0 chance for success. Committee leadership is fine if you are trying to maintain, but Intel's AI situation is even more hopeless than AMD's, and you'd need something major to turn it around. At least with AMD, you can try installing ROCm and be frustrated when there are bugs.
Every time I have tried Intel’s software I can’t even recall getting the import to work, and the card wasn’t powerful enough that I cared. Intel needs actual leadership to turn this around, or there’s 0 future in Intel AI.
If you give some monkeys a slice of cucumber each, they are all pretty happy. Then you give one monkey a grape, and nobody is happy with their cucumber any more. They might even throw the slices back at the experimenter. He got a god damned grape, this is bullshit, I don't want a cucumber anymore! Nobody was in absolute terms worse off, but that doesn't prevent the monkeys from being upset. And this isn't unique to monkeys, I see this same behavior on display when I hear about billionaires. It's not about what I have, they got a grape.

The tweet is here. What do you do about this? Of course, you can fire this woman, but what percent of people in American society feel the same way? How much of this can you tolerate and still have a functioning society? What's particularly absurd about the critique in the video is that it hasn't been thought through very far. If that house and its friends stopped "ordering shit", the company would stop making money and she wouldn't have that job. There's nothing preventing her from quitting today and getting the same outcome for herself. But of course, that isn't what it's about, because then somebody else would be delivering the packages. You see, that house got a grape.

So how do we get through this? I'll propose something, but it's sort of horrible. Bring people to power based on this feeling. Let everyone indulge fully in their resentment. Kill the bourgeois. They got grapes, kill them all! Watch the situation not improve. Realize that this must be because there's still counterrevolutionaries in the mix, still a few grapefuckers. Some billionaire is trying to hide his billions! Let the purge continue! And still, things are not improving. People are starving. The economy isn't even tracked anymore. Things are bad. Millions are dead. The demoralization is complete. Starvation and real poverty are more powerful emotions than resentment. It was bad when people were getting grapes, but now there aren't even cucumbers anymore.
In the face of true poverty for all, the resentment fades. Society begins to heal. People are grateful to have food, they are grateful for what they have. Expectations are back in line with market value. You have another way to fix this? Cause this is what seems to happen in history, and it takes a generation. The demoralization is just beginning.
AMD is sending us the two MI300X boxes we asked for. They are in the mail. It took a bit, but AMD passed my cultural test. I now believe they aren’t going to shoot themselves in the foot on software, and if that’s true, there’s absolutely no reason they should be worth 1/16th of NVIDIA. CUDA isn’t really the moat people think it is, it was just an early ecosystem. tiny corp has a fully sovereign AMD stack, and soon we’ll port it to the MI300X. You won’t even have to use tinygrad proper, tinygrad has a torch frontend now. Either NVIDIA is super overvalued or AMD is undervalued. If the petaflop gets commoditized (tiny corp’s mission), the current situation doesn’t make any sense. The hardware is similar, AMD even got the double throughput Tensor Cores on RDNA4 (NVIDIA artificially halves this on their cards, soon they won’t be able to). I’m betting on AMD being undervalued, and that the demand for AI has barely started. With good software, the MI300X should outperform the H100. In for a quarter million. Long term. It can always dip short term, but check back in 5 years.
More in programming
This article was originally commissioned by Luca Rossi (paywalled) for refactoring.fm, on February 11th, 2025. Luca edited a version of it that emphasized the importance of building "10x engineering teams". It was later picked up by IEEE Spectrum (!!!), who scrapped most of the teams content and published a different, shorter piece on March […]
I was working on Edna, my open source note-taking application that is a cross between Obsidian and Notational Velocity. It's written in Svelte 5. While it runs in the browser, it's more like a desktop app than a web page.

A useful UI element for desktop-like apps is a nested context menu. I couldn't find a really good implementation of nested menus for Svelte. The only good quality library I found is shadcn-svelte, but it's a big dependency, so I decided to implement a decent menu myself. This article describes my implementation, clocking in at around 550 lines of code.

Indecent implementations

Most menu implementations have flaws:
- verbose syntax
- no nested sub-menus
- sub-menus hard to access (they go away before you can move the mouse over them)
- shown partially offscreen, i.e. not fully visible
- no keyboard navigation

My implementation doesn't have those flaws, but I wouldn't go as far as saying it's a really, really good implementation. It's decent and works well for Edna.

Avoiding the flaws

Verbose syntax

I find the following to be very verbose:

```svelte
<Menu.Root>
  <Menu.Items>
    <Menu.Item>
      <Menu.Text>Open</Menu.Text>
      <Menu.Shortcut>Ctrl + P</Menu.Shortcut>
    </Menu.Item>
  </Menu.Items>
</Menu.Root>
```

That's just for one menu item.

My syntax

My syntax is much more compact. Here's a partial menu from Edna:

```javascript
const contextMenu = [
  ["Open note\tMod + P", kCmdOpenNote],
  ["Create new note", kCmdCreateNewNote],
  ["Create new scratch note\tAlt + N", kCmdCreateScratchNote],
  ["This Note", menuNote],
  ["Block", menuBlock],
  ["Notes storage", menuStorage],
];

const menuNote = [
  ["Rename", kCmdRenameCurrentNote],
  ["Delete", kCmdDeleteCurrentNote],
];
```

contextMenu is a prop passed to the Menu.svelte component.
The rule is simple:
- a menu is an array of [text, idOrSubMenu] elements
- text is the menu text with an optional shortcut, separated by a tab \t
- the second element is either a unique integer that identifies the menu item or an array for a nested sub-menu

To ensure menu ids are unique I use an nmid() (next menu id) function:

```javascript
let nextMenuID = 1000;
function nmid() {
  nextMenuID++;
  return nextMenuID;
}
export const kCmdOpenNote = nmid();
export const kCmdCreateNewNote = nmid();
```

Disabling / removing items

Depending on the current state of the application, some menu items should be disabled and some should not be shown at all. My implementation allows that via a menuItemStatus function, passed as a prop to the menu component. This function is called for every menu item before showing the menu. The argument is the [text, idOrSubMenu] menu item definition and it returns the current menu item state: kMenuStatusNormal, kMenuStatusRemoved or kMenuStatusDisabled.

Acting on a menu item

When the user clicks a menu item, we call the onmenucmd(commandID) prop function. We can define a complex menu and handle menu actions with a very compact definition and only 2 functions.

Representing the menu at runtime

Before we address the other flaws, I need to describe how we represent the menu inside the component, because this holds the secret to solving those flaws. The menu definition passed to the menu component is converted to a tree represented by the MenuItem class:

```javascript
class MenuItem {
  text = "";
  shortcut = "";
  /** @type {MenuItem[]} */
  children = null;
  cmdId = 0;
  /** @type {HTMLElement} */
  element = null;
  /** @type {HTMLElement} */
  submenuElement = null;
  /** @type {MenuItem} */
  parent = null;
  zIndex = 30;
  isSeparator = false;
  isRemoved = false;
  isDisabled = false;
  isSubMenu = false;
  isSelected = $state(false);
}
```

A top-level menu is an array of MenuItem. text, shortcut and cmdId are extracted from the menu definition. isRemoved and isDisabled are based on calling the menuItemStatus() prop function. children is null for a plain menu item or an array if this item is a trigger for a sub-menu.
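As an illustrative sketch of what that conversion involves (this is not the actual Menu.svelte code; buildMenu is a hypothetical name, and it builds plain objects rather than the MenuItem class), the compact definition can be turned into a tree like so:

```javascript
// Hypothetical sketch: convert the compact [text, idOrSubMenu] definition
// into a tree of plain objects. Not the actual Menu.svelte implementation.
function buildMenu(def, parent = null) {
  return def.map(([text, idOrSub]) => {
    // "\t" separates the label from the optional shortcut
    const [label, shortcut = ""] = text.split("\t");
    const item = { text: label, shortcut, cmdId: 0, children: null, parent, isSubMenu: false };
    if (Array.isArray(idOrSub)) {
      item.isSubMenu = true;
      item.children = buildMenu(idOrSub, item); // recurse into nested sub-menu
    } else {
      item.cmdId = idOrSub; // unique integer command id
    }
    return item;
  });
}

const kCmdRename = 1001, kCmdDelete = 1002;
const tree = buildMenu([
  ["Open note\tMod + P", 1000],
  ["This Note", [["Rename", kCmdRename], ["Delete", kCmdDelete]]],
]);
console.log(tree[0].shortcut);        // "Mod + P"
console.log(tree[1].children.length); // 2
```

Because each node keeps a parent pointer, walking upward from any item (as keyboard navigation and selection need to do) is trivial.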
isSubMenu could be derived as children != null, but the code reads better with an explicit bool. parent helps us navigate the tree upwards. zIndex exists so that we can ensure sub-menus are shown over their parents. element is assigned via bind:this during rendering. submenuElement is set for sub-menu triggers and represents the element for the sub-menu (as opposed to the element that triggers showing it).

And finally we have isSelected, the only reactive attribute we need. It represents the selected state of a given menu item. It's set either from the mouseover event or via keyboard navigation. A selected menu item is shown highlighted. Additionally, for menu items that are sub-menu triggers, it also causes the sub-menu to be shown.

Implementing nesting

A non-nested dropdown is easy to implement:
- the dropdown trigger element is position: relative
- the child, i.e. the dropdown content (the menu), is position: absolute, starts invisible (display: none) and is toggled visible either via CSS (:hover) or via JavaScript

Implementing nesting is substantially more difficult. For nesting we need to keep several sub-menus shown at the same time (as many as the nesting goes deep). Some menu implementations render sub-menus as peer elements. In my implementation it's a single element, so I have only one onmouseover handler on the top-level parent element of the menu. There I find the menu item by finding the element with role menuitem. I know it corresponds to MenuItem.element, so I scan the whole menu tree to find the matching MenuItem object. To select the menu item I use a trick to simplify the logic: I unselect all menu items, then select the one under the mouse and all its parents.

Selecting a MenuItem happens by setting its isSelected reactive value. It causes the item to re-render and sets the is-selected css class to match the isSelected reactive value, which highlights the item by changing the background color.
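The "unselect all, then select the item and its parents" trick can be sketched in a few lines (hypothetical helper names, not the actual component code; in the real component isSelected is a reactive $state value, here it's a plain property):

```javascript
// Hypothetical sketch of the selection trick described above.
// Clear every item's selection, then walk up from the hovered item,
// selecting it and all of its ancestors so the whole sub-menu chain stays open.
function selectItem(allItems, target) {
  for (const it of allItems) it.isSelected = false;
  for (let it = target; it != null; it = it.parent) it.isSelected = true;
}

// Flatten the menu tree so we can iterate every item in one pass.
function flatten(items, out = []) {
  for (const it of items) {
    out.push(it);
    if (it.children) flatten(it.children, out);
  }
  return out;
}
```

Selecting a sub-menu's parents as well is what keeps every ancestor sub-menu visible while the mouse is deep in the tree.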
Making sub-menus easy to access

The following behavior is common and very frustrating:
- you hover over a sub-menu trigger, which causes the sub-menu to show up
- you try to move the mouse towards that sub-menu, but the mouse leaves the trigger element, causing the sub-menu to disappear

There are some clever solutions to this problem, but I found it can be solved quite simply by:
- delaying hiding of the sub-menu by 300 milliseconds. That gives the user enough time to reach the sub-menu before it disappears
- showing the sub-menu partially on top of its parent. Most implementations show sub-menus to the right of the parent; overlapping reduces the distance to reach the sub-menu

Specifically, my formula for the default sub-menu position is:

```css
.sub-menu-wrapper {
  left: calc(80% - 8px);
}
```

so it's moved left by 20% of the parent's width + 8px.

Making menus always visible

A context menu is shown where the mouse click happened. If the click happened near the edge of the window, the menu would be partially offscreen. To fix that I have an ensurevisible(node) function which checks the position and size of the element and, if necessary, repositions the node to be fully visible by setting the left and top css properties. I use it as an action for the top-level menu element and call it manually on sub-menu elements when showing them. For this to work, the element must have position: absolute.

Implementing keyboard navigation

To implement keyboard navigation I handle the keydown event on the top-level menu element, and on ArrowUp, ArrowDown, ArrowLeft and ArrowRight I select the right MenuItem based on the currently selected menu items. Tab works the same as ArrowDown and selects the next menu item. Enter triggers the menu command.

Recursive rendering with snippets

This is actually my second Svelte menu implementation. The first one was in Svelte 4, made for notepad2. A nested menu is a tree, which led me to recursive rendering via the <svelte:self> tag. However, this splits the information about the menu between multiple run-time components.
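The core math of that repositioning can be sketched as a pure function (hypothetical name; the real ensurevisible reads the element's rect and the window size from the DOM, but the clamping logic is the same idea):

```javascript
// Hypothetical sketch of the ensurevisible math: clamp an element's
// left/top so its rectangle stays fully inside the viewport.
function clampToViewport(rect, viewport) {
  let { left, top } = rect;
  // shift left/up if the element spills past the right/bottom edge
  if (left + rect.width > viewport.width) left = viewport.width - rect.width;
  if (top + rect.height > viewport.height) top = viewport.height - rect.height;
  // never push the element past the top-left corner
  return { left: Math.max(0, left), top: Math.max(0, top) };
}

// A 200x300 menu opened 50px from the right edge of a 1280x720 window:
console.log(clampToViewport(
  { left: 1230, top: 100, width: 200, height: 300 },
  { width: 1280, height: 720 },
)); // { left: 1080, top: 100 }
```

In the real component the returned values would be written back as left and top css properties, which is why the element must be position: absolute.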
Keyboard navigation is hard to implement without access to the global state of all menu items, which is why I didn't implement keyboard navigation there. With Svelte 5 we can mount a single Menu component and render sub-menus with recursive snippets.

Keyboard shortcuts

Menu.svelte only shows keyboard shortcuts; you have to ensure that the shortcuts work somewhere else in the app. This is just as well, because in Edna some keyboard shortcuts are handled by CodeMirror, so it wouldn't always be right to have the menu register for keyboard events and originate those commands.

You can use it

I didn't extract the code into a stand-alone re-usable component, but you can copy Menu.svelte and the few utility functions it depends on into your own project. I use tailwindcss for CSS, which you can convert to regular CSS if needed. And then you can change how you render menu items and sub-menus.

Potential improvements

The component meets all the needs of Edna, but more features are always possible. It could support on/off items with checkmarks on the left. It could support groups of radio items and the ability to render fancier menu items. It could support icons on the left. It could support keyboard selection similar to how Windows does it: you can mark a letter in the menu text with & and it becomes a keyboard shortcut for the item, shown as underlined. Figma used to have a search box at the top of the context menu for type-down find of menu items; I see they changed it to just triggering the command palette.

References

Edna is a note-taking application for programmers and power users, written in Svelte 5. You can see the full implementation (and use it in your own projects).
How a wild side-quest became the source of many of the articles you’ve read—and have come to expect—in this publication
Watch now | Privilege levels, syscall conventions, and how assembly code talks to the Linux kernel
Learn how disposable objects solve test cleanup problems in flat testing. Use TypeScript's using keyword to ensure reliable resource disposal in tests.