More from ntietz.com blog - technically a blog
Every time I run into endianness, I have to look it up. Which way do the bytes go, and what does that mean? Something about it breaks my brain, and makes me feel like I can't tell which way is up and down, left and right. This is the blog post I've needed every time I run into this. I hope it'll be the post you need, too. What is endianness? The term comes from Gulliver's travels, referring to a conflict over cracking boiled eggs on the big end or the little end[1]. In computers, the term refers to the order of bytes within a segment of data, or a word. Specifically, it only refers to the order of bytes, as those are the smallest unit of addressable data: bits are not individually addressable. The two main orderings are big-endian and little-endian. Big-endian means you store the "big" end first: the most-significant byte (highest value) goes into the smallest memory address. Little-endian means you store the "little" end first: the least-significant byte (smallest value) goes into the smallest memory address. Let's look at the number 168496141 as an example. This is 0x0A0B0C0D in hex. If we store 0x0A at address a, 0x0B at a+1, 0x0C at a+2, and 0x0D at a+3, then this is big-endian. And then if we store it in the other order, with 0x0D at a and 0x0A at a+3, it's little-endian. And... there's also mixed-endianness, where you use one kind within a word (say, little-endian) and a different ordering for words themselves (say, big-endian). If our example is on a system that has 2-byte words (for the sake of illustration), then we could order these bytes in a mixed-endian fashion. One possibility would be to put 0x0B in a, 0x0A in a+1, 0x0D in a+2, and 0x0C in a+3. There are certainly reasons to do this, and it comes up on some ARM processors, but... it feels so utterly cursed. Let's ignore it for the rest of this! For me, the intuitive ordering is big-ending, because it feels like it matches how we read and write numbers in English[2]. If lower memory addresses are on the left, and higher on the right, then this is the left-to-right ordering, just like digits in a written number. So... which do I have? Given some number, how do I know which endianness it uses? You don't, at least not from the number entirely by itself. Each integer that's valid in one endianness is still a valid integer in another endianness, it just is a different value. You have to see how things are used to figure it out. Or you can figure it out from the system you're using (or which wrote the data). If you're using an x86 or x64 system, it's mostly little-endian. (There are some instructions which enable fetching/writing in a big-endian format.) ARM systems are bi-endian, allowing either. But perhaps the most popular ARM chips today, Apple silicon, are little-endian. And the major microcontrollers I checked (AVR, ESP32, ATmega) are little-endian. It's thoroughly dominant commercially! Big-endian systems used to be more common. They're not really in most of the systems I'm likely to run into as a software engineer now, though. You are likely to run into it for some things, though. Even though we don't use big-endianness for processor math most of the time, we use it constantly to represent data. It comes back in networking! Most of the Internet protocols we know and love, like TCP and IP, use "network order" which means big-endian. This is mentioned in RFC 1700, among others. Other protocols do also use little-endianness again, though, so you can't always assume that it's big-endian just because it's coming over the wire. So... which you have? For your processor, probably little-endian. For data written to the disk or to the wire: who knows, check the protocol! Why do we do this??? I mean, ultimately, it's somewhat arbitrary. We have an endianness in the way we write, and we could pick either right-to-left or left-to-right. Both exist, but we need to pick one. Given that, it makes sense that both would arise over time, since there's no single entity controlling all computer usage[3]. There are advantages of each, though. One of the more interesting advantages is that little-endianness lets us pretend integers are whatever size we like, within bounds. If you write the number 26[4] into memory on a big-endian system, then read bytes from that memory address, it will represent different values depending on how many bytes you read. The length matters for reading in and interpreting the data. If you write it into memory on a little-endian system, though, and read bytes from the address (with the remaining ones zero, very important!), then it is the same value no matter how many bytes you read. As long as you don't truncate the value, at least; 0x0A0B read as an 8-bit int would not be equal to being read as a 16-bit ints, since an 8-bit int can't hold the entire thing. This lets you read a value in the size of integer you need for your calculation without conversion. On the other hand, big-endian values are easier to read and reason about as a human. If you dump out the raw bytes that you're working with, a big-endian number can be easier to spot since it matches the numbers we use in English. This makes it pretty convenient to store values as big-endian, even if that's not the native format, so you can spot things in a hex dump more easily. Ultimately, it's all kind of arbitrary. And it's a pile of standards where everything is made up, nothing matters, and the big-end is obviously the right end of the egg to crack. You monster. The correct answer is obviously the big end. That's where the little air pocket goes. But some people are monsters... ↩ Please, please, someone make a conlang that uses mixed-endian inspired numbers. ↩ If ever there were, maybe different endianness would be a contentious issue. Maybe some of our systems would be using big-endian but eventually realize their design was better suited to little-endian, and then spend a long time making that change. And then the government would become authoritarian on the promise of eradicating endianness-affirming care and—Oops, this became a metaphor. ↩ 26 in hex is 0x1A, which is purely a coincidence and not a reference to the First Amendment. This is a tech blog, not political, and I definitely stay in my lane. If it were a reference, though, I'd remind you to exercise their 1A rights[5] now and call your elected officials to ensure that we keep these rights. I'm scared, and I'm staring down the barrel of potential life-threatening circumstances if things get worse. I expect you're scared, too. And you know what? Bravery is doing things in spite of your fear. ↩ If you live somewhere other than the US, please interpret this as it applies to your own country's political process! There's a lot of authoritarian movement going on in the world, and we all need to work together for humanity's best, most free[6] future. ↩ I originally wrote "freest" which, while spelled correctly, looks so weird that I decided to replace it with "most free" instead. ↩
If you manage a team, who are your teammates? If you're a staff software engineer embedded in a product team, who are your teammates? The answer to the question comes down to who your main responsibility lies with. That's not the folks you're managing and leading. Your responsibility lies with your fellow leaders, and they're your teammates. The first team mentality There's a concept in leadership called the first team mentality. If you're a leader, then you're a member of a couple of different teams at the same time. Using myself as an example, I'm a member of the company's leadership team (along with the heads of marketing, sales, product, etc.), and I'm also a member of the engineering department's leadership team (along with the engineering directors and managers and the CTO). I'm also sometimes embedded into a team for a project, and at one point I was running a 3-person platform team day-to-day. So I'm on at least two teams, but often three or more. Which of these is my "first" team, the one which I will prioritize over all the others? For my role, that's ultimately the company leadership. Each department is supposed to work toward the company goals, and so if there's an inter-department conflict you need to do what's best for the company—helping your fellow department heads—rather than what's best for your department. (Ultimately, your job is to get both of these into alignment; more on that later.) This applies across roles. If you're an engineering manager, your teammates are not the people who report to you. Your teammates are the other engineering managers and staff engineers at your level. You all are working together toward department goals, and sometimes the team has to sacrifice to make that happen. Focus on the bigger goals One of the best things about a first team mentality is that it comes with a shift in where your focus is. You have to focus on the broader goals your group is working in service of, instead of focusing on your group's individual work. I don't think you can achieve either without the other. When you zoom out from the team you lead or manage and collaborate with your fellow leaders, you gain context from them. You see what their teams are working on, and you can contextualize your work with theirs. And you also see how your work impacts theirs, both positively and negatively. That broader context gives you a reminder of the bigger, broader goals. It can also show you that those goals are unclear. And if that's the case, then the work you're doing in your individual teams doesn't matter, because no one is going in the same direction! What's more important there is to focus on figuring out what the bigger goals should be. And once those are done, then you can realign each of your groups around them. Conflicts are a lens Sometimes the first team mentality will result in a conflict. There's something your group wants or needs, which will result in a problem for another group. Ultimately, this is your work to resolve, and the conflict is a lens you can use to see misalignment and to improve the greater organization. You have to find a way to make sure that your group is healthy and able to thrive. And you also have to make sure that your group works toward collective success, which means helping all the groups achieve success. Any time you run into a conflict like this, it means that something went wrong in alignment. Either your group was doing something which worked against its own goal, or it was doing something which worked against another group's goal. If the latter, then that means that the goals themselves fundamentally conflicted! So you go and you take that conflict, and you work through it. You work with your first team—and you figure out what the mismatch is, where it came from, and most importantly, what we do to resolve it. Then you take those new goals back to your group. And you do it with humility, since you're going to have to tell them that you made a mistake. Because that alignment is ultimately your job, and you have to own your failures if you expect your team to be able to trust you and trust each other.
Code ownership is a popular concept, but it emphasizes the wrong thing. It can bring out the worst in a person or a team: defensiveness, control-seeking, power struggles. Instead, we should be focusing on stewardship. How code ownership manifests Code ownership as a concept means that a particular person or team "owns" a section of the codebase. This gives them certain rights and responsibilities: They control what goes into the code, and can approve or deny changes They are responsible for fixing bugs in that part of the code They are responsible for maintaining and improving that part of the code There are tools that help with these, like the CODEOWNERS file on GitHub. This file lets you define a group or list of individuals who own a section of the repository. Then you can require reviews/approvals from them before anything gets merged. These are all coming from a good place. We want our code to be well-maintained, and we want to make sure that someone is responsible for its direction. It really helps to know who to go to with questions or requests. Without these, changes can grind to a halt, mired in confusion and tech debt. But the concept in practice brings challenges. If you've worked on a team using code ownership before, you've probably run into: that engineer who guards the code against anyone else's changes, wanting all the credit for themselves that engineer who refuses to add anything else to their codebase, because they don't want to maintain it that engineer who tries to gain code ownership over more areas, to control more of the code and more of the company I've done certainly acted badly due to code ownership, without realizing what I was doing or or why I was doing it at the time. There are almost endless ways that code ownership can bring out the worst in people. And it all makes sense. We can do better by shifting to stewardship instead of ownership. Stewardship is about service We are all stewards of things we own or are responsible for. I have stewardship over the house I live in with my family, for example. I also have stewardship over the espresso machine I use every day: It's a big piece of machinery, and it's my responsibility to take good care of it and to ensure that as long as it's mine, it operates well and lasts a long time. That reduces expense, reduces waste, and reduces impact on the world—but it also means that the object (an espresso machine) is serving its purpose to bring joy and connection. Code is no different. By focusing on stewardship rather than ownership, we are focusing on the responsible, sustainable maintenance of the code. We focus on taking good care of that which we're entrusted with. A steward doesn't jealously guard, or struggle to gain more power. A steward watches what her responsibilities are, ensuring enough to contribute but not so many as to burn out. And she nurtures and cares for the code, to make sure that it continues to serve its purpose. Instead of an adversarial relationship, stewardship promotes partnership: It promotes working with others to figure out how to make the best use of resources, instead of hoarding them for yourself. Stewardship can solve many of the same problems that code ownership does: It gives you someone who's a main point of contact for some code It grants someone responsibility for bug fixes and maintenance of that code And in some ways, they look alike. You're going to do a lot of the same things, controlling what goes in or out. But they are very different in the focus. Owners are concerned with the value of what they own. Stewards are concerned with how well it can serve the group. And this makes all the difference in producing better outcomes.
After I wrote YARR (Yet Another Rust Resource, with requisite pirate mentions), one of my friends tried it out. He gave me some really useful insights as he went through it, letting me see what was hard about learning Rust from a newcomer's perspective. Unsurprisingly, lifetimes are a challenge—and seeing him go through it helped me understand why they're hard to learn. Here are a few of the challenges he ran into. I don't think that these are necessarily problems, but they're perhaps opportunities to improve educational materials. They don't map 100% to how long a variable is in memory My friend gave me an example he's seen a few times when people explain lifetimes. fn longest<'a>(x: &'a str, y: &'a str) -> &'a str { if x.len() > y.len() { x } else { y } } And for many newcomers, you see this and you expect it is saying that x and y both have the lifetime 'a, so they live the same amount of time. But the following is valid: fn print_longest(x: &'static str) { let y = "local"; let a = longest(x, y); println!("{a}"); drop(a); drop(y); println!("y is gone"); } In this example, x and y live for different amounts of time. y doesn't even survive to the end of the function, whereas x should be valid for the entire duration of the program. That's because lifetimes are talking about a bound on the time something can live. There's some lifetime 'a during which we can say that x and y are both certainly valid. But x and y can both live longer than 'a. Lifetimes don't change the runtime behavior Most code we write changes what the program does at runtime. Types can be different, because sometimes you're giving the compiler information about what something is. But most type information can change the runtime behavior! The simplest example is when you have an integer. You can declare one without a type. let x = 10; This has an inferred type, and if you set a different type, like u8, you'll get different behavior at run time. let x: u8 = 10; In contrast, lifetimes are only used by the compiler to ensure that borrows are all valid. The compiler can reject your program if invalid borrows are performed, but the binary output should not be affected by the lifetimes of the variables. It's a different kind of type system We're used to seeing types in our programming languages, and these type systems are usually pretty similar. Rust's lifetimes are different, though. The borrow checker uses a linear type system to do its work. These are super cool, and something that I don't understand particularly well. I'm familiar with how to use the borrow checker, but I don't know any of the theory behind them. The premise, as I understand it, is that objects can be used exactly once, allowing you to safely deallocate it after use (since it won't be used again). This prevents multiple concurrent uses (yay, data race protection!) or use-after-free (yay, segfault protection!). The coolness is why we have it, but it's still pretty tough to understand. You have to learn this whole new type system that's pretty different from everything else you've touched. And most of the resources1 out there don't even mention that it's a different kind of type system! They share syntax with generics Another challenge is that the syntax is shared with generics. Even though lifetimes are very different in behavior and type system from generics, they sit inside very similar looking syntax. This is probably unavoidable—lifetimes are related to all the other types in your code—but it certainly makes things harder to learn. When you see something like this, you expect that it's generic over a type. fn something_generic<T>(arg: T) { ... } And you're right that it is! But then you have something that looks very similar, like this. And you might expect it to also be generic over a type. fn something_generic<'a>(arg: &'a str) { ... } But it's not, in the normal sense. Instead it's generic over a lifetime. And that's a little confusing that those sit in the same spot, especially when it's not called out as a potential gotcha in learning materials. * * * Lifetimes have some inherent complexity. The borrow checker is a very valuable tool, and it's great we have it! But with that power and complexity can come challenges in learning, and teaching, the underlying concepts. I think the current difficulty in learning Rust is due to a lot of things. One aspect is certainly some inherent complexity. But another aspect is that many resources aren't really geared toward the kind of programmer coming to Rust without this background knowledge, and there is room for improvement. We can make explanations of lifetimes and the borrow checker better and less confusing. Or we can at least make them more empathetic, projecting that it's expected to be confused because there are some good reasons it's hard to understand. And that you'll get there, eventually. Thank you, Ryan, for generously sharing your thoughts as you went through learning Rust. Our conversations were instrumental in writing this post. 1 I suppose, as the author of YARR, I can fix this in at least one instance.
Last week, I finally got verified on LinkedIn. Now there's a little badge next to my name that says "yes, she's a human who is legally named Nicole." Their marketing for verification says that I should now expect 60% more profile views and 50% more comments and reactions. For a writer like me, that seems great. More people viewing my content means more people can learn from me, or be entertained by me. And all that for free? There's a problem, of course. Nicole is my legal name, so I was able to get verified as a result. But many people don't go by their legal name. Other names are common So who doesn't go by their legal name? I didn't, for years after my wife and I got married. I went by an alias—our hyphenated last name—without legally changing it. I wasn't eligible to get verified then, since that name was not on my ID. I didn't, when I came out as transgender. It takes time to change your name and update your documents. Until that was complete, I would have had to go by my deadname or lose verification. And what about women who change their name when they get married, but go by their previous name professionally? This is an alias, and it is their name even though it's not what the government knows them as. But they would lose verification for doing this. Anyone who goes by a nickname or alias is ineligible. I have many friends who go by a different name than what their ID shows. This isn't fraudulent—it just reflects who they are. Penalized for being yourself And yet, if you fall into any case where you cannot get verified, then you can't get the benefits. You can't get your extra profile views and extra comments. Or put another way: you're penalized for not being verified. You'll get about 40% fewer profile views and 33% fewer comments/reactions than people who are verified. You get marginalized, unable to reap the full benefits of the platform, if you don't conform to a very particular outlook on what a name is (the official sequence of letters on your ID). To be clear, the problem isn't the verification process itself1. That process (and its associated benefits) may be in place to deal with bot traffic. I can sympathize with this, and I do want lower bot traffic—it makes platforms much more pleasant to use. Let us use our names The problem is that you have to show your legal name to everyone. There should be a process for being verified without it being your legal name on display. This process doesn't have to be scalable if the group that would utilize it is small—since surely they'd only forget about small populations2. It can be as simple as filing a help ticket and allowing a human to approve it based on some evidence. Is your name consistent across your public profiles? And you're a human being? Cool, verified. I believe that LinkedIn can, and should, do better. This feature as implemented is harming marginalized folks who are not able to get the same visibility when they cannot get verified. It reduces the exposure that marginalized creators can get. You should keep this in mind in the products you make, too. Don't require people to display their legal names. And before you even collect that data, think about what problem you're trying to solve. Do you need to collect legal names to solve that? (Probably not.) If so, do you need to store them after processing once? (Probably not.) And if so, do you need to display them publicly? (Probably not.) Names are so much more than what the government knows us by. Let us be our true selves and verify us with our true names. 1 It's not free of problems, though: I'd like to have a way to achieve the same result ("she's a human! she's generally the internet person she claims she is!") without showing my government identity documents to a third party. 2 Though, companies have been known to marginalize large groups of people. This is a rhetorical point that they are either harming a lot of people or they could solve the problem for a low cost.
More in programming
Japan already stipulates that employers must offer the option of reduced working hours to employees with children under three. However, the Child Care and Family Care Leave Act was amended in May 2024, with some of the new provisions coming into effect April 1 or October 1, 2025. The updates to the law address: Remote work Flexible start and end times Reduced hours On-site childcare facilities Compensation for lost salary And more Legal changes are one thing, of course, and social changes are another. Though employers are mandated to offer these options, how many employees in Japan actually avail themselves of these benefits? Does doing so create any stigma or resentment? Recent studies reveal an unsurprising gender disparity in accepting a modified work schedule, but generally positive attitudes toward these accommodations overall. The current reduced work options Reduced work schedules for employees with children under three years old are currently regulated by Article 23(1) of the Child Care and Family Care Leave Act. This Article stipulates that employers are required to offer accommodations to employees with children under three years old. Those accommodations must include the opportunity for a reduced work schedule of six hours a day. However, if the company is prepared to provide alternatives, and if the parent would prefer, this benefit can take other forms—for example, working seven hours a day or working fewer days per week. Eligible employees for the reduced work schedule are those who: Have children under three years old Normally work more than six hours a day Are not employed as day laborers Are not on childcare leave during the period to which the reduced work schedule applies Are not one of the following, which are exempted from the labor-management agreement Employees who have been employed by the company for less than one year Employees whose prescribed working days per week are two days or less Although the law requires employers to provide reduced work schedules only while the child is under three years old, some companies allow their employees with older children to work shorter hours as well. According to a 2020 survey by the Ministry of Health, Labor and Welfare, 15.8% of companies permit their employees to use the system until their children enter primary school, while 5.7% allow it until their children turn nine years old or enter third grade. Around 4% offer reduced hours until children graduate from elementary school, and 15.4% of companies give the option even after children have entered middle school. If, considering the nature or conditions of the work, it is difficult to give a reduced work schedule to employees, the law stipulates other measures such as flexible working hours. This law has now been altered, though, to include other accommodations. Updates to The Child Care and Family Care Leave Act Previously, remote work was not an option for employees with young children. Now, from April 1, 2025, employers must make an effort to allow employees with children under the age of three to work remotely if they choose. From October 1, 2025, employers are also obligated to provide two or more of the following measures to employees with children between the ages of three and the time they enter elementary school. An altered start time without changing the daily working hours, either by using a flex time system or by changing both the start and finish time for the workday The option to work remotely without changing daily working hours, which can be used 10 or more days per month Company-sponsored childcare, by providing childcare facilities or other equivalent benefits (e.g., arranging for babysitters and covering the cost) 10 days of leave per year to support employees’ childcare without changing daily working hours A reduced work schedule, which must include the option of 6-hour days How much it’s used in practice Of course, there’s always a gap between what the law specifies, and what actually happens in practice. How many parents typically make use of these legally-mandated accommodations, and for how long? The numbers A survey conducted by the Ministry of Health, Labor and Welfare in 2020 studied uptake of the reduced work schedule among employees with children under three years old. In this category, 40.8% of female permanent employees (正社員, seishain) and 21.6% of women who were not permanent employees answered that they use, or had used, the reduced work schedule. Only 12.3% of male permanent employees said the same. The same survey was conducted in 2022, and researchers found that the gap between female and male employees had actually widened. According to this second survey, 51.2% of female permanent employees and 24.3% of female non-permanent employees had reduced their hours, compared to only 7.6% of male permanent employees. Not only were fewer male employees using reduced work programs, but 41.2% of them said they did not intend to make use of them. By contrast, a mere 15.6% of female permanent employees answered they didn’t wish to claim the benefit. Of those employees who prefer the shorter schedule, how long do they typically use the benefit? The following charts, using data from the 2022 survey, show at what point those employees stop reducing their hours and return to a full-time schedule. Female permanent employees Female non-permanent employees Male permanent employees Male non-permanent employees Until youngest child turns 1 13.7% 17.9% 50.0% 25.9% Until youngest child turns 2 11.5% 7.9% 14.5% 29.6% Until youngest child turns 3 23.0% 16.3% 10.5% 11.1% Until youngest child enters primary school 18.9% 10.5% 6.6% 11.1% Sometime after the youngest child enters primary school 22.8% 16.9% 6.5% 11.1% Not sure 10% 30.5% 11.8% 11.1% From the companies’ perspectives, according to a survey conducted by the Cabinet Office in 2023, 65.9% of employers answered that their reduced work schedule system is fully used by their employees. What’s the public perception? Some fear that the number of people using the reduced work program—and, especially, the number of women—has created an impression of unfairness for those employees who work full-time. This is a natural concern, but statistics paint a different picture. In a survey of 300 people conducted in 2024, 49% actually expressed a favorable opinion of people who work shorter hours. Also, 38% had “no opinion” toward colleagues with reduced work schedules, indicating that 87% total don’t negatively view those parents who work shorter hours. While attitudes may vary from company to company, the public overall doesn’t seem to attach any stigma to parents who reduce their work schedules. Is this “the Mommy Track”? Others are concerned that working shorter hours will detour their career path. According to this report by the Ministry of Health, Labour and Welfare, 47.6% of male permanent employees indicated that, as the result of working fewer hours, they had been changed to a position with less responsibility. The same thing happened to 65.6% of male non-permanent employees, and 22.7% of female permanent employees. Therefore, it’s possible that using the reduced work schedule can affect one’s immediate chances for advancement. However, while 25% of male permanent employees and 15.5% of female permanent employees said the quality and importance of the work they were assigned had gone down, 21.4% of male and 18.1% of female permanent employees said the quality had gone up. Considering 53.6% of male and 66.4% of female permanent employees said it stayed the same, there seems to be no strong correlation between reducing one’s working hours, and being given less interesting or important tasks. Reduced work means reduced salary These reduced work schedules usually entail dropping below the originally-contracted work hours, which means the employer does not have to pay the employee for the time they did not work. For example, consider a person who normally works 8 hours a day reducing their work time to 6 hours a day (a 25% reduction). If their monthly salary is 300,000 yen, it would also decrease accordingly by 25% to 225,000 yen. Previously, both men and women have avoided reduced work schedules, because they do not want to lose income. As more mothers than fathers choose to work shorter hours, this financial burden tends to fall more heavily on women. To address this issue, childcare short-time employment benefits (育児時短就業給付) will start from April 2025. These benefits cover both male and female employees who work shorter hours to care for a child under two years old, and pay a stipend equivalent to 10% of their adjusted monthly salary during the reduced work schedule. Returning to the previous example, this stipend would grant 10% of the reduced salary, or 22,500 yen per month, bringing the total monthly paycheck to 247,500 yen, or 82.5% of the normal salary. This additional stipend, while helpful, may not be enough to persuade some families to accept shorter hours. The childcare short-time employment benefits are available to employees who meet the following criteria: The person is insured, and is working shorter hours to care for a child under two years old. The person started a reduced work schedule immediately after using the childcare leave covered by childcare leave benefits, or the person has been insured for 12 months in the two years prior to the reduced work schedule. Conclusion Japan’s newly-mandated options for reduced schedules, remote work, financial benefits, and other childcare accommodations could help many families in Japan. However, these programs will only prove beneficial if enough employees take advantage of them. As of now, there’s some concern that parents who accept shorter schedules could look bad or end up damaging their careers in the long run. Statistically speaking, some of the news is good: most people view parents who reduce their hours either positively or neutrally, not negatively. But other surveys indicate that a reduction in work hours often equates to a reduction in responsibility, which could indeed have long-term effects. That’s why it’s important for more parents to use these accommodations freely. Not only will doing so directly benefit the children, but it will also lessen any negative stigma associated with claiming them. This is particularly true for fathers, who can help even the playing field for their female colleagues by using these perks just as much as the mothers in their offices. And since the state is now offering a stipend to help compensate for lost income, there’s less and less reason not to take full advantage of these programs.
The night is dark and full of errors—and durable Rust software is not only ready for them, but handles them sensibly. Let’s see how, by returning to our line-counter project.
Every time I run into endianness, I have to look it up. Which way do the bytes go, and what does that mean? Something about it breaks my brain, and makes me feel like I can't tell which way is up and down, left and right. This is the blog post I've needed every time I run into this. I hope it'll be the post you need, too. What is endianness? The term comes from Gulliver's travels, referring to a conflict over cracking boiled eggs on the big end or the little end[1]. In computers, the term refers to the order of bytes within a segment of data, or a word. Specifically, it only refers to the order of bytes, as those are the smallest unit of addressable data: bits are not individually addressable. The two main orderings are big-endian and little-endian. Big-endian means you store the "big" end first: the most-significant byte (highest value) goes into the smallest memory address. Little-endian means you store the "little" end first: the least-significant byte (smallest value) goes into the smallest memory address. Let's look at the number 168496141 as an example. This is 0x0A0B0C0D in hex. If we store 0x0A at address a, 0x0B at a+1, 0x0C at a+2, and 0x0D at a+3, then this is big-endian. And then if we store it in the other order, with 0x0D at a and 0x0A at a+3, it's little-endian. And... there's also mixed-endianness, where you use one kind within a word (say, little-endian) and a different ordering for words themselves (say, big-endian). If our example is on a system that has 2-byte words (for the sake of illustration), then we could order these bytes in a mixed-endian fashion. One possibility would be to put 0x0B in a, 0x0A in a+1, 0x0D in a+2, and 0x0C in a+3. There are certainly reasons to do this, and it comes up on some ARM processors, but... it feels so utterly cursed. Let's ignore it for the rest of this! For me, the intuitive ordering is big-ending, because it feels like it matches how we read and write numbers in English[2]. If lower memory addresses are on the left, and higher on the right, then this is the left-to-right ordering, just like digits in a written number. So... which do I have? Given some number, how do I know which endianness it uses? You don't, at least not from the number entirely by itself. Each integer that's valid in one endianness is still a valid integer in another endianness, it just is a different value. You have to see how things are used to figure it out. Or you can figure it out from the system you're using (or which wrote the data). If you're using an x86 or x64 system, it's mostly little-endian. (There are some instructions which enable fetching/writing in a big-endian format.) ARM systems are bi-endian, allowing either. But perhaps the most popular ARM chips today, Apple silicon, are little-endian. And the major microcontrollers I checked (AVR, ESP32, ATmega) are little-endian. It's thoroughly dominant commercially! Big-endian systems used to be more common. They're not really in most of the systems I'm likely to run into as a software engineer now, though. You are likely to run into it for some things, though. Even though we don't use big-endianness for processor math most of the time, we use it constantly to represent data. It comes back in networking! Most of the Internet protocols we know and love, like TCP and IP, use "network order" which means big-endian. This is mentioned in RFC 1700, among others. Other protocols do also use little-endianness again, though, so you can't always assume that it's big-endian just because it's coming over the wire. So... which you have? For your processor, probably little-endian. For data written to the disk or to the wire: who knows, check the protocol! Why do we do this??? I mean, ultimately, it's somewhat arbitrary. We have an endianness in the way we write, and we could pick either right-to-left or left-to-right. Both exist, but we need to pick one. Given that, it makes sense that both would arise over time, since there's no single entity controlling all computer usage[3]. There are advantages of each, though. One of the more interesting advantages is that little-endianness lets us pretend integers are whatever size we like, within bounds. If you write the number 26[4] into memory on a big-endian system, then read bytes from that memory address, it will represent different values depending on how many bytes you read. The length matters for reading in and interpreting the data. If you write it into memory on a little-endian system, though, and read bytes from the address (with the remaining ones zero, very important!), then it is the same value no matter how many bytes you read. As long as you don't truncate the value, at least; 0x0A0B read as an 8-bit int would not be equal to being read as a 16-bit ints, since an 8-bit int can't hold the entire thing. This lets you read a value in the size of integer you need for your calculation without conversion. On the other hand, big-endian values are easier to read and reason about as a human. If you dump out the raw bytes that you're working with, a big-endian number can be easier to spot since it matches the numbers we use in English. This makes it pretty convenient to store values as big-endian, even if that's not the native format, so you can spot things in a hex dump more easily. Ultimately, it's all kind of arbitrary. And it's a pile of standards where everything is made up, nothing matters, and the big-end is obviously the right end of the egg to crack. You monster. The correct answer is obviously the big end. That's where the little air pocket goes. But some people are monsters... ↩ Please, please, someone make a conlang that uses mixed-endian inspired numbers. ↩ If ever there were, maybe different endianness would be a contentious issue. Maybe some of our systems would be using big-endian but eventually realize their design was better suited to little-endian, and then spend a long time making that change. And then the government would become authoritarian on the promise of eradicating endianness-affirming care and—Oops, this became a metaphor. ↩ 26 in hex is 0x1A, which is purely a coincidence and not a reference to the First Amendment. This is a tech blog, not political, and I definitely stay in my lane. If it were a reference, though, I'd remind you to exercise their 1A rights[5] now and call your elected officials to ensure that we keep these rights. I'm scared, and I'm staring down the barrel of potential life-threatening circumstances if things get worse. I expect you're scared, too. And you know what? Bravery is doing things in spite of your fear. ↩ If you live somewhere other than the US, please interpret this as it applies to your own country's political process! There's a lot of authoritarian movement going on in the world, and we all need to work together for humanity's best, most free[6] future. ↩ I originally wrote "freest" which, while spelled correctly, looks so weird that I decided to replace it with "most free" instead. ↩
A tiny tool to calculate when your baby might arrive
Intel is sitting on a huge amount of card inventory they can’t move, largely because of bad software. Most of this is a summary of the public #intel-hardware channel in the tinygrad discord. Intel currently is sitting on: 15,000 Gaudi 2 cards (with baseboards) 5,100 Intel Data Center GPU Max 1450s (without baseboards) If you were Intel, what would you do with them? First, starting with the Gaudi cards. The open source repo needed to control them was archived on Feb 4, 2025. There’s a closed source version of this that’s maybe still maintained, but eww closed source and do you think it’s really maintained? The architecture is kind of tragic, and that’s likely why they didn’t open source it. Unlike every other accelerator I have seen, the MMEs, which is where all the FLOPS are, are not controllable by the TPCs. While the TPCs have an LLVM port, the MME is not documented. After some poking around, I found the spec: It’s highly fixed function, looks very similar to the Apple ANE. But that’s not even the real problem with it. The problem is that it is controlled by queues, not by the TPCs. Unpacking habanalabs-dkms-1.19.2-32.all.deb you can find the queues. There is some way to push a command stream to the device so you don’t actually have to deal with the host itself for the queues. But that doesn’t prevent you having to decompose the network you are trying to run into something you can put on this fixed function block. Programmability is on a spectrum, ranging from CPUs being the easiest, to GPUs, to things like the Qualcomm DSP / Google TPU (where at least you drive the MME from the program), to this and the Apple ANE being the hardest. While it’s impressive that they actually got on MLPerf Training v4.0 training GPT3, I suspect it’s all hand coded, and if you even can deviate off the trodden path you’ll get almost no perf. Accelerators like this are okay for low power inference where you can adjust the model architecture for the target, Apple does a great job of this. But this will never be acceptable for a training chip. Then there’s the Data Center GPU Max 1450. Intel actually sent us a few of these. You quickly run into a problem…how do you plug them in? They need OAM sockets, 48V power, and a cooling solution that can sink 600W. As far as I can tell, they were only ever deployed in two systems, the Aurora Supercomputer and the Dell XE9640. It’s hard to know, but I really doubt many of these Dell systems were sold. Intel then sent us this carrier board. In some ways it’s helpful, but in other ways it’s not at all. It still doesn’t solve cooling or power, and you need to buy 16x MCIO cables (cheap in quantity, but expensive and hard to find off the shelf). Also, I never got a straight answer, but I really doubt Intel has many of these boards. And that board doesn’t look cheap to manufacturer more of. The connectors alone, which you need two of per GPU, cost $26 each. That’s $104 for just the OAM connectors. tiny corp was in discussions to buy these GPUs. How much would you pay for one of these on a PCIe card? The specs look great. 839 TFLOPS, 128 GB of ram, 3.3 TB/s of bandwidth. However…read this article. Even in simple synthetic benchmarks, the chip doesn’t get anywhere near its max performance, and it looks to be for fundamental reasons like memory latency. We estimate we could sell PCIe versions of these GPUs for $1,000; I don’t think most people know how hard it is to move non NVIDIA hardware. Before you say you’d pay more, ask yourself, do you really want to deal with the software? An adapter card has four pieces. A PCB for the card, a 12->48V voltage converter, a heatsink, and a fan. My quote from the guy who makes an OAM adapter board was $310 for 10+ PCBs and $75 for the voltage converter. A heatsink that can handle 600W (heat pipes + vapor chamber) is going to cost $100, then maybe $20 more for the fan. That’s $505, and you still need to assemble and test them, oh and now there’s tariffs. Maybe you can get this down to $400 in ~1000 quantity. So $200 for the GPU, $400 for the adapter, $100 for shipping/fulfillment/returns (more if you use Amazon), and 30% profit if you sell at $1k. tiny would net $1M on this, which has to cover NRE and you have risk of unsold inventory. We offered Intel $200 per GPU (a $680k wire) and they said no. They wanted $600. I suspect that unless a supercomputer person who already uses these GPUs wants to buy more, they will ride it to zero. tl;dr: there’s 5100 of these GPUs with no simple way to plug them in. It’s unclear if they worth the cost of the slot they go in. I bet they end up shredded, or maybe dumped on eBay for $50 each in a year like the Xeon Phi cards. If you buy one, good luck plugging it in! The reason Meta and friends buy some AMD is as a hedge against NVIDIA. Even if it’s not usable, AMD has progressed on a solid steady roadmap, with a clear continuation from the 2018 MI50 (which you can now buy for 99% off), to the MI325X which is a super exciting chip (AMD is king of chiplets). They are even showing signs of finally investing in software, which makes me bullish. If NVIDIA stumbles for a generation, this is AMD’s game. The ROCm “copy each NVIDIA repo” strategy actually works if your competition stumbles. They can win GPUs with slow and steady improvement + competition stumbling, that’s how AMD won server CPUs. With these Intel chips, I’m not sure who they would appeal to. Ponte Vecchio is cancelled. There’s no point in investing in the platform if there’s not going to be a next generation, and therefore nobody can justify the cost of developing software, therefore there won’t be software, therefore they aren’t worth plugging in. Where does this leave Intel’s AI roadmap? The successor to Ponte Vecchio was Rialto Bridge, but that was cancelled. The successor to that was Falcon Shores, but that was also cancelled. Intel claims the next GPU will be “Jaguar Shores”, but fool me once… To quote JazzLord1234 from reddit “No point even bothering to listen to their roadmaps anymore. They have squandered all their credibility.” Gaudi 3 is a flop due to “unbaked software”, but as much as I usually do blame software, nothing has changed from Gaudi 2 and it’s just a really hard chip to program for. So there’s no future there either. I can’t say that “Jaguar Shores” square instills confidence. It didn’t inspire confidence for “Joseph B.” on LinkedIn either. From my interactions with Intel people, it seems there’s no individuals with power there, it’s all committee like leadership. The problem with this is there’s nobody who can say yes, just many people who can say no. Hence all the cancellations and the nonsense strategy. AMD’s dysfunction is different. from the beginning they had leadership that can do things (Lisa Su replied to my first e-mail), they just didn’t see the value in investing in software until recently. They sort of had a point if they were only targeting hyperscalars. but it seems like SemiAnalysis got through to them that hyperscalars aren’t going to deal with bad software either. It remains to be seem if they can shift culture to actually deliver good software, but there’s movement in that direction, and if they succeed AMD is so undervalued. Their hardware is good. With Intel, until that committee style leadership is gone, there’s 0 chance for success. Committee leadership is fine if you are trying to maintain, but Intel’s AI situation is even more hopeless than AMDs, and you’d need something major to turn it around. At least with AMD, you can try installing ROCm and be frustrated when there are bugs. Every time I have tried Intel’s software I can’t even recall getting the import to work, and the card wasn’t powerful enough that I cared. Intel needs actual leadership to turn this around, or there’s 0 future in Intel AI.