One of the neat things about the PCG random number generator by Melissa O’Neill is its use of instruction-level parallelism: the PCG state update can run in parallel with its output permutation. However, PCG only has a limited amount of ILP, about 3 instructions. Its overall speed is limited by the rate at which a CPU can run a sequence where the output of one multiply-add feeds into the next multiply-add. … Or is it? With some linear algebra and some AVX512, I can generate random numbers from a single instance of pcg32 at 200 Gbit/s on a single core. This is the same sequence of random numbers generated in the same order as normal pcg32, but more than 4x faster. You can look at the benchmark in my pcg-dxsm repository.

skip ahead

One of the slightly weird features that PCG gets from its underlying linear congruential generator is “seekability”: you can skip ahead k steps in the stream of random numbers in log(k) time. The PCG...
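
The log(k) skip works because one LCG step is an affine map, x -> a*x + c (mod 2^64), and composing affine maps gives another affine map. Below is a rough sketch of that jump-ahead using pcg32's default 64-bit LCG constants. This is my own illustration of the idea, not the AVX512 code from the pcg-dxsm repository; the vectorised version uses the same kind of precomputed constants so that several lanes of the one stream can advance in lockstep.

    #include <stdint.h>
    #include <stdio.h>

    /* Sketch of LCG jump-ahead (not the pcg-dxsm benchmark code).
     * Skipping `delta` steps composes the map x -> a*x + c with itself
     * `delta` times, which takes O(log delta) squarings of the map. */
    static uint64_t lcg_skip(uint64_t state, uint64_t a, uint64_t c, uint64_t delta)
    {
        uint64_t acc_a = 1, acc_c = 0;       /* identity map */
        while (delta > 0) {
            if (delta & 1) {                 /* fold in the current power of the map */
                acc_a *= a;
                acc_c = acc_c * a + c;
            }
            c = (a + 1) * c;                 /* square the map: (a,c) twice = (a*a, a*c + c) */
            a *= a;
            delta >>= 1;
        }
        return acc_a * state + acc_c;
    }

    int main(void)
    {
        /* pcg32's default multiplier and default (odd) stream increment */
        const uint64_t A = 6364136223846793005ULL, C = 1442695040888963407ULL;
        uint64_t s = 0x853c49e6748fea9bULL, t = s;

        for (int i = 0; i < 1000; i++) t = A * t + C;   /* 1000 single steps... */
        printf("%s\n", t == lcg_skip(s, A, C, 1000) ? "match" : "bug");   /* ...equal one skip of 1000 */
        return 0;
    }

With a skip of 8 you get per-lane constants: lane i holds the state advanced by i steps, every lane then advances by 8 per iteration, and the lanes together emit the ordinary pcg32 sequence in its usual order, which is roughly the shape of the trick the post describes.
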
6 days ago


More from Tony Finch's blog

obfuscated C revisited

The International Obfuscated C Code Contest has a newly revamped web site, and the Judges have announced the 28th contest, to coincide with its 40th anniversary. (Or 41st?) The Judges have also updated the archive of past winners so that as many of them as possible work on modern systems. Accordingly, I took a look at my 1998 winner to see how much damage time hath wrought.

When it is built, my program needs to go through the C preprocessor twice. There are a few reasons: It’s part of coercing the C compiler into compiling OFL, an obfuscated functional language. OFL has keywords l and b, short for let and be, so for example the function for constructing a pair is defined as

    l pair b (BB (B (B K)) C CI)

In a less awful language that might be written

    let pair = λx λy λf λg (f x y)

Anyway, the first pass of the C preprocessor turns a l (let) declaration into a macro

    #define pair b (BB (B (B K)) C CI)

And the second pass expands the macros. (There’s a joke in the README that the OFL compiler has one optimization, function inlining (which is actually implemented by cpp macro expansion) but in fact inlining harms the performance of OFL.)

The smaller the OFL interpreter, the more space there is for the program written in OFL. In the 1998 IOCCC rules, #define cost 7 characters, whereas l cost only one. I think the modern rules don’t count C or cpp keywords so there’s less reason to use this stupid trick to save space.

Running the program through cpp twice is a horrible abuse of C and therefore just the kind of joke that the IOCCC encourages. (In fact the Makefile sends the program through cpp three times, twice explicitly and once as part of compiling to machine code. This is deliberately gratuitous, INABIAF.)

There were a couple of ways this silliness caused problems.

Modern headers are sensitive to which version of the C standard is in effect, wrt things like restrict keywords in standard library function declarations. The extra preprocessor invocations needed to be fixed to use consistent -std= options so that the final compilation doesn’t encounter language features from the future.

Newer gcc emits #line directives around macro expansions. This caused problems for the declaration

    l ef E(EOF)

which defines ef as a primitive value equal to EOF. After preprocessing this became

    #define ef E(
    #line 1213 "stdio.h"
    (-1)
    #line 69 "fanf.c"
    )

so the macro definition got truncated. The fix was to process the #include directives in the second preprocessor pass rather than the first. I vaguely remember some indecision when writing the program about whether to #include in the first or second pass, in particular whether preprocessing the headers twice would lead to trouble. First pass #include seemed to work and was shorter so that was what the original submission did.

There’s one further change. The IOCCC Judges are trying to avoid compiler warnings about nonstandard arguments to main. To save a few characters, my entry had

    int main(int c) { ... }

but the argument c isn’t used so I just removed it.

The build commands still print “This may take some time to complete”, because in the 1990s if you tried to compile with optimization you would have been waiting a long time, if it completed at all. The revamped Makefile uses -O3, which takes gcc over 30 seconds and half a gigabyte of RAM. Quite a lot for less than 2.5 KiB of C!
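
Back to the double-preprocessing trick near the top: for anyone wondering how a bare l can turn into a #define at all, here is a guess at the shape of it, a hedged reconstruction rather than the actual entry. In an object-like macro the # is just an ordinary token, so the first cpp pass can emit text that merely looks like a directive, and only the second pass treats it as one.

    #define l #define

    l pair b (BB (B (B K)) C CI)

    /* after pass 1 the line above has become the directive
     *     #define pair b (BB (B (B K)) C CI)
     * which pass 2 installs and uses to expand later occurrences of pair
     * (with b presumably handled by a macro of its own in that pass). */
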

a month ago 45 votes
nsnotifyd-2.3 released

D’oh, I lost track of a bug report that should have been fixed in nsnotifyd-2.2. Thus, hot on the heels of the previous release, here’s nsnotifyd-2.3. Sorry for causing extra work to my uncountably many users!

The nsnotifyd daemon monitors a set of DNS zones and runs a command when any of them change. It listens for DNS NOTIFY messages so it can respond to changes promptly. It also uses each zone’s SOA refresh and retry parameters to poll for updates if nsnotifyd does not receive NOTIFY messages more frequently. It comes with a client program nsnotify for sending notify messages.

This nsnotifyd-2.3 release includes some bug fixes:

When nsnotifyd received a SIGINT or SIGTERM while running the command, it failed to handle it correctly. Now it exits promptly. Many thanks to Athanasius for reporting the bug!

Miscellaneous minor code cleanup and compiler warning suppression.

Thanks also to Dan Langille who sent me a lovely appreciation: Now that I think of it, nsnotifyd is in my favorite group of software. That group is software I forget I’m running, because they just run and do the work. For years. I haven’t touched, modified, or configured nsnotifyd and it just keeps doing the job.
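
For the curious, the SOA refresh/retry fallback mentioned above amounts to a loop along these lines. This is a simplified sketch, not nsnotifyd's source: get_soa_serial() and run_command() are hypothetical stubs standing in for a real SOA query and for running the configured script, and the real daemon waits on its NOTIFY socket with a timeout rather than sleeping blindly.

    #include <stdbool.h>
    #include <stdint.h>
    #include <unistd.h>

    bool get_soa_serial(const char *zone, uint32_t *serial);   /* hypothetical stub */
    void run_command(const char *zone, uint32_t serial);       /* hypothetical stub */

    void poll_zone(const char *zone, unsigned refresh, unsigned retry)
    {
        uint32_t last = 0;
        bool have_last = false;

        for (;;) {
            uint32_t serial;
            if (!get_soa_serial(zone, &serial)) {
                sleep(retry);               /* lookup failed: wait the SOA retry time */
                continue;
            }
            if (!have_last || serial != last) {
                run_command(zone, serial);  /* serial changed: the zone was updated */
                last = serial;
                have_last = true;
            }
            sleep(refresh);                 /* all quiet: wait the SOA refresh time */
        }
    }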

2 months ago 48 votes
nsnotifyd-2.2 released

I have made a new release of nsnotifyd, a tiny DNS server that just listens for NOTIFY messages and runs a script when one of your zones changes. This nsnotifyd-2.2 release includes a new feature: nsnotify can now send NOTIFY messages from a specific source address. Thanks to Adam Augustine for the suggestion. I like receiving messages that say things like, “Thanks for making this useful tool available for free.”

2 months ago 48 votes
petnames and Zooko's fan

Recently the Spritely Institute published an introduction to Petnames, A humane approach to secure, decentralized naming. I have long been a fan of petnames, and graph naming systems in general. I first learned about them in the context of Mark Miller’s E programming language which was a distributed object capability system on the JVM. I gather Spritely are working on something similar, but with a more webby lisp/scheme flavour – and in fact Mark Miller is the second author on this article. An interesting thing that the article sort of hints at is that these kinds of systems have a fairly smooth range of operating points from decentralized petnames all the way to centralized globally unique names. Zooko’s triangle is actually more like a fan when the “human friendly” point is fixed: there’s an arc describing the tradeoffs, from personal through local to global. As a first step away from petnames, I like to use the term “nickname” for nearby instances of what they call “edge names” in the article: a nickname is not as personal as a petname, it’s a name you share with others. A bit further along the arc is the article’s example of the “bizdir” local business directory. It’s a trusted (and hopefully trustworthy) naming authority, but in a more local / federated fashion rather than centralized. How can a petname system function at the global+centralized point, so it could replace the DNS? It needs to pass the “billboard test”: I type in a name I see on a billboard and I get to the right place. (It might be a multipart name like a postal address or DNS name, with an extra “edge name” or two to provide enough disambiguating context.) I imagine that an operating system supplier might provide a few preconfigured petnames (it probably includes its own petname so the software can update itself securely), a lot like its preconfigured PKIX CA certificates. These petnames would refer to orgs like the “bizdir”, or Verisign, or Nominet, that act as nickname registries. Your collection of petnames is in effect your personal root zone, and the preconfigured petnames are in effect the default TLDs. When I was thinking about how a decentralized graph naming system could be made user-friendly and able to pass the billboard test I realised that there are organizational structures that are hard to avoid. I expect there would inevitably be something like the CA/Browser forum to mediate between OS suppliers and nickname registries: a petname ICANN. I wrote an older iteration of these ideas over a decade ago. Those notes suffer from too many DNS brainworms, but looking back, there’s some curious anti-DNS discussion. How might it be useful to be able to reach a name by multiple paths? Could you use that for attestation? What might it look like to have a shared context for names that is not global but is national or regional or local?
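
To make the "personal root zone" picture concrete, here is a toy sketch of multipart-name resolution. It is purely my own illustration, with made-up names, not Spritely's design or anything the article specifies: your petnames act as the root, and a registry you hold a petname for publishes the edge names (nicknames) that disambiguate the rest of a billboard-style name.

    #include <stdio.h>
    #include <string.h>

    /* Toy illustration only: petnames map human-chosen labels to keys, and a
     * multipart name resolves by hopping from a personal petname to the edge
     * names published by the registry that the petname refers to. */
    struct entry { const char *name; const char *target; };

    /* my personal root zone: petnames, including preconfigured registries */
    static const struct entry my_root[] = {
        { "bizdir", "key:local-business-directory" },          /* hypothetical */
        { "mum",    "key:mums-phone" },                        /* hypothetical */
    };

    /* the table the bizdir registry publishes: its edge names */
    static const struct entry bizdir_edges[] = {
        { "corner-cafe", "key:cafe-on-the-high-street" },      /* hypothetical */
    };

    static const char *lookup(const struct entry *tab, size_t n, const char *name)
    {
        for (size_t i = 0; i < n; i++)
            if (strcmp(tab[i].name, name) == 0)
                return tab[i].target;
        return NULL;
    }

    int main(void)
    {
        /* Resolve "bizdir corner-cafe": hop 1 is a petname in my root zone,
         * hop 2 is an edge name; a real system would fetch the second table
         * from whatever the first hop's key identifies. */
        const char *registry = lookup(my_root, 2, "bizdir");
        const char *target = registry ? lookup(bizdir_edges, 1, "corner-cafe") : NULL;
        printf("bizdir corner-cafe -> %s\n", target ? target : "(not found)");
        return 0;
    }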

2 months ago 49 votes

More in programming

My Top 15 OS Books: From Theory and Implementation to Systems Programming

A personal guide to the most useful books for understanding operating systems

11 hours ago 4 votes
Aspect Ratio Changes With CSS View Transitions

So here I am playing with CSS view transitions (again). I’ve got Dave Rupert’s post open in one tab, which serves as my recurring reference for the question, “How do you get these things to work again?” I’ve followed Dave’s instructions for transitioning the page generally and am now working on individual pieces of UI specifically. I feel like I’m 98% of the way there, I’ve just hit a small bug.

It’s small. Many people might not even notice it. But I do and it’s bugging me. When I transition from one page to the next, I expect this “active page” outline to transition nicely from the old page to the new one. But it doesn’t. Not quite. Did you notice it? It’s subtle and fast, but it’s there. I have to slow my ::view-transition-old() animation timing waaaaay down to catch it. The outline grows proportionally in width but not in height as it transitions from one element to the next.

I kill myself trying to figure out what this bug is. Dave mentions in his post how he had to use fit-content to fix some issues with container changes between pages. I don’t fully understand what he’s getting at, but I think maybe that’s where my issue is? I try sticking fit-content on different things but none of it works. I ask AI and it’s totally worthless, synthesizing disparate topics about CSS into an answer that seems right on the surface but is totally wrong. So I sit and think about it. What’s happening almost looks like some kind of screwy side effect of a transform: scale() operation. Perhaps it’s something about how the default user agent styles for these things are animating the before/after state? No, that can’t be it… Honestly, I have no idea. I don’t know much about CSS view transitions, but I know enough to know that I don’t know enough to even formulate the right set of keywords for a decent question.

I feel stuck. I consider reaching out on the socials for help, but at the last minute I somehow stumble on this perfectly wonderful blog post from Jake Archibald: “View transitions: Handling aspect ratio changes” and he’s got a one-line fix in my hands in seconds! The article is beautiful. It not only gives me an answer, but it provides really wonderful visuals that help describe why the problem I’m seeing is a problem in the first place. It really helps fill out my understanding of how this feature works. I absolutely love finding writing like this on the web.

So now my problem is fixed — no more weirdness! If you’re playing with CSS view transitions these days, Jake’s article is a must read to help shape your understanding of how the feature works. Go give it a read.

2 days ago 2 votes
Europe's impotent rage

Europe has become a third-rate power economically, politically, and militarily, and the price for this slowly building predicament is now due all at once.

First, America is seeking to negotiate peace in Ukraine directly with Russia, without even inviting Europe to the table. Decades of underfunding the European military have led us here. The never-ending ridicule of America, for spending the supposedly "absurd sum" of 3.4% of its GDP to maintain its might, coming home to roost.

Second, mass immigration in Europe has become the central political theme driving the surge of right-wing parties in countries across the continent. Decades of blind adherence to a naive multi-cultural ideology have produced an abject failure to assimilate culturally-incompatible migrants. Rather than respond to this growing public discontent, mainstream parties all over Europe run the same playbook of calling anyone with legitimate concerns "racist", and attempting to disparage or even ban political parties advancing these topics.

Third, the decline of entrepreneurship in Europe has led to a dearth of new major companies, and an accelerated brain drain to America. The European economy lost parity with the American after 2008, and now the net-zero nonsense has led Europe's old manufacturing powerhouse, Germany, to commit financial harakiri. Shutting its nuclear power plants, over-investing in solar and wind, and rendering its prized car industry noncompetitive on the global market. The latter has left European bureaucrats in the unenviable position of having to both denounce Trump on his proposed tariffs while imposing their own on the Chinese.

A single failure in any of these three crucial realms would have been painful to deal with. But failure in all three at the same time is a disaster, and it's one of Europe's own making. Worse still is that Europeans at large still appear to be stuck in the early stages of grief. Somewhere between "anger" and "bargaining". Leaving us with "depression" before we arrive at "acceptance".

Except this isn't destiny. Europe is not doomed to impotent outrage or repressive anger. Europe has the people, the talent, and the capital to choose a different path. What it currently lacks is the will.

I'm a Dane. Therefore, I'm a European. I don't summarize the sad state of Europe out of spite or ill will or from a lack of standing. I don't want Europe to become American. But I want Europe to be strong, confident, and successful. Right now it's anything but. The best time for Europe to make a change was twenty years ago. The next best time is right now. Forza Europe! Viva Europe!

2 days ago 4 votes
The Challenges Faced by Multinational Teams and Japanese Companies

It’s a fact that Japan needs more international developers. That doesn’t mean integrating those developers into Japanese companies, as well as Japanese society, is a simple process. But what are the most common challenges encountered by these companies with multinational teams? To find out, TokyoDev interviewed a number of Japanese companies with international employees. In addition to discussing the benefits of hiring overseas, we also wanted to learn more about what challenges they had faced, and how they had overcome them.

The companies interviewed included:

Autify, which provides an AI-based software test automation platform
Beatrust, which has created a search platform that automatically structures profile information
Cybozu, Japan’s leading groupware provider
DeepX, which automates heavy equipment machinery
Givery, which scouts, hires, and trains world-class engineers
Shippio, which digitalizes international trading and is Japan’s first digital forwarding company
Yaraku, which offers web-based translation management

According to those companies, the issues they experienced fell into two categories: addressing the language barrier, and helping new hires come to Japan.

The language barrier

Language issues are by far the most universal problem faced by Japanese companies with multinational teams. As a result, all of the companies we spoke to have evolved their own unique solutions.

AI translation

To help improve English-Japanese communication, Yaraku has turned to AI and its own translation tool, YarakuZen. With these they’ve reduced comprehension issues down to verbal communication alone. Since their engineering teams primarily communicate via text anyway, this has solved the majority of their language barrier issue, and engineers feel that they can now work smoothly together.

Calling on bilinguals

While DeepX employs engineers from over 20 countries, English is the common language between them. Documentation is written in English, and even Japanese departments still write minutes in English so colleagues can check them later. Likewise, explanations of company-wide meetings are delivered in both Japanese and English. Still, a communication gap exists. To overcome it, DeepX assigns Japanese project managers who can also speak English well. English skills weren’t previously a requirement, but once English became the official language of the engineering team, bilingualism was an essential part of the role. These project managers are responsible for taking requests from clients and communicating them accurately to the English-only engineers. In addition, DeepX is producing more bilingual employees by offering online training in both Japanese and English. The online lessons have proven particularly popular with international employees who have just arrived in Japan.

Beatrust has pursued a similar policy of encouraging employees to learn and speak both languages. Dr. Andreas Dippon, the Vice President of Engineering at Beatrust, feels that bilingual colleagues are absolutely necessary to business.

I think the biggest mistake you can make is just hiring foreigners who speak only English and assuming all the communication inside engineering is just English and that’s fine. You need to understand that business communication with [those] engineers will be immensely difficult . . . You need some almost bilingual people in between the business side and the engineering side to make it work.

Similar to DeepX, Beatrust offers its employees a stipend for language learning.

“So nowadays, it’s almost like 80 percent of both sides can speak English and Japanese to some extent, and then there are like two or three people on each side who cannot speak the other language,” Dippon said. “So we have like two or three engineers who cannot speak Japanese at all, and we have two or three business members who cannot speak English at all.” But in the engineering team itself “is 100 percent English. And the business team is almost 100 percent Japanese.”

“Of course the leaders try to bridge the gap,” Dippon explained. “So I’m now joining the business meetings that are in Japanese and trying to follow up on that and then share the information with the engineering team, and [it’s] also the same for the business lead, who is joining some engineering meetings and trying to update the business team on what’s happening inside engineering.”

“Mixed language”

Shippio, on the other hand, encountered negative results when they leaned too hard on their bilingual employees. Initially they asked bilinguals to provide simultaneous interpretation at meetings, but quickly decided that the burden on them was too great and not sustainable in the long term. Instead, Shippio has adopted a policy of “mixed language,” or combining Japanese and English together. The goal of mixed language is simple: to “understand each other.” Many employees who speak one language also know a bit of the other, and Shippio has found that by fostering a culture of flexible communication, employees can overcome the language barrier themselves. For example, a Japanese engineer might forget an English word, in which case he’ll do his best to explain the meaning in Japanese. If the international engineer can understand a bit of Japanese, he’ll be able to figure out what his coworker intended to say, at which point they will switch back to English. This method, while idiosyncratic to every conversation, strikes a balance between the stress of speaking another language and consideration for the other person. The most important thing, according to Shippio, is that the message is conveyed in any language.

Meeting more often

Another method these companies use is creating structured meeting schedules designed to improve cross-team communication. Givery teams hold what they call “win sessions” and “sync-up meetings” once or twice a month, to ensure thorough information-sharing within and between departments. These two types of meetings have different goals:

A “win session” reviews business or project successes, with the aim of analyzing and then repeating that success in future.
A “sync-up meeting” helps teams coordinate project deadlines. They report on their progress, discuss any obstacles that have arisen, and plan future tasks.

In these meetings employees often speak Japanese, but the meetings are translated into English, and sometimes supplemented with additional English messages and explanations. By building these sort of regular, focused meetings into the company’s schedule, Givery aims to overcome language difficulties with extra personal contact.

Beatrust takes a similarly structured, if somewhat more casual, approach. They tend to schedule most meetings on Friday, when engineers are likely to come to the office. However, in addition to the regular meetings, they also hold the “no meeting hour” for everyone, including the business team. “One of the reasons is to just let people talk to each other,” Dippon explained. “Let the engineers talk to business people and to each other.”

This kind of interaction, we don’t really care if it’s personal stuff or work stuff that they talk about. Just to be there, talking to each other, and getting this feeling of a team [is what’s important]. . . . This is hugely beneficial, I think.

Building Bonds

Beatrust also believes in building team relationships through regular off-site events. “Last time we went to Takaosan, the mountain area,” said Dippon. “It was nice, we did udon-making. . . . This was kind of a workshop for QRs, and this was really fun, because even the Japanese people had never done it before by themselves. So it was a really great experience. After we did that, we had a half-day workshop about team culture, company culture, our next goals, and so on.”

Dippon in particular appreciates any chance to learn more about his fellow employees.

Like, ‘Why did you leave your country? Why did you come to Japan? What are the problems in your country? What’s good in your country?’ You hear a lot of very different stories.

DeepX also hopes to deepen the bonds between employees with different cultures and backgrounds via family parties, barbecues, and other fun, relaxing events. This policy intensified after the COVID-19 pandemic, during which Japan’s borders were closed and international engineers weren’t able to immigrate. When the borders opened and those engineers finally did arrive, DeepX organized in-house get-togethers every two weeks, to fortify the newcomers’ relationships with other members of the company.

Sponsoring visas

Not every company that hires international developers actually brings them to Japan—quite a few prefer to hire foreign employees who are already in-country. However, for those willing to sponsor new work visas, there is universal consensus on how best to do it: hire a professional.

Cybozu has gone to the extent of bringing those professionals in-house. The first international member they hired was an engineer living in the United States. Though he wanted to work in Japan, at that time they didn’t have any experience in acquiring a work visa or relocating an employee, so they asked him to work for their US subsidiary instead. But as they continued hiring internationally, Cybozu realized that quite a few engineers were interested in physically relocating to Japan. To facilitate this, the company set up a new support system for their multinational team, for the purpose of providing their employees with work visas.

Other companies prefer to outsource the visa process. DeepX, for example, has hired a certified administrative scrivener corporation to handle visa applications on behalf of the company. Autify also goes to a “dedicated, specialized” lawyer for immigration procedures. Thomas Santonja, VP of Engineering at Autify, feels that sponsoring visas is a necessary cost of business and that the advantages far outweigh the price.

We used to have fully remote, long-term employees outside of Japan, but we stopped after we noticed that there is a lot of value in being able to meet in person and join in increased collaboration, especially with Japanese-speaking employees that are less inclined to make an effort when they don’t know the people individually.

“It’s kind of become a requirement, in the last two years,” he concluded, “to at least be capable of being physically here.”

However, Autify does prevent unnecessary expenses by having a new employee work remotely from their home country for a one month trial period before starting the visa process in earnest. So far, the only serious issues they encountered were with an employee based in Egypt; the visa process became so complicated, Autify eventually had to give up. But Autify also employs engineers from France, the Philippines, and Canada, among other countries, and has successfully brought their workers over many times.

Helping employees adjust

Sponsoring a visa is only the beginning of bringing an employee to Japan. The next step is providing special support for international employees, although this can look quite different from company to company. DeepX points out that just working at a new company is difficult enough; also beginning a new life in a new country, particularly when one doesn’t speak the language, can be incredibly challenging. That’s why DeepX not only covers the cost of international flights, but also implements other support systems for new arrivals.

To help them get started in Japan, DeepX provides a hired car to transport them from the airport, and a furnished monthly apartment for one month. Then they offer four days of special paid leave to complete necessary procedures: opening a bank account, signing a mobile phone contract, finding housing, etc. The company also introduces real estate companies that specialize in helping foreigners find housing, since that can sometimes be a difficult process on its own.

Dippon at Beatrust believes that international employees need ongoing support, not just at the point of entry, and that it’s best to have at least one person in-house who is prepared to assist them.

I think that one trap many companies run into is that they know all about Japanese laws and taxes and so on, and everybody grew up with that, so they are all familiar. But suddenly you have foreigners who have basically no idea about the systems, and they need a lot of support, because it can be quite different.

Santonja at Autify, by contrast, has had a different experience helping employees get settled. “I am extremely tempted to say that I don’t have any challenges. I would be extremely hard pressed to tell you anything that could be remotely considered difficult or, you know, require some organization or even extra work or thinking.”

Most people we hire look for us, right? So they are looking for an opportunity to move to Japan and be supported with a visa, which is again a very rare occurrence. They tend to be extremely motivated to live and make it work here. So I don’t think that integration in Japan is such a challenge.

Conclusion

To companies unfamiliar with the process, the barriers to hiring internationally may seem high. However, there are typically only two major challenges when integrating developers from other countries. The first, language issues, has a variety of solutions ranging from the technical to the cultural. The second, attaining the correct work visa, is best handled by trained professionals, whether in-house or through contractors. Neither of these difficulties is insurmountable, particularly with expert assistance.

In addition, Givery in particular has stressed that it’s not necessary to figure out all the details in advance of hiring: in fact, it can benefit a company to introduce international workers early on, before its own internal policies are overly fixed.

This information should also benefit international developers hoping to work in Japan. Since this article reflects the top concerns of Japanese companies, developers can work to proactively relieve those worries. Learning even basic Japanese helps reduce the language barrier, while becoming preemptively familiar with Japanese society reassures employers that you’re capable of taking care of yourself here.

If you’d like to learn more about the benefits these companies enjoy from hiring international developers, see part one of this article series here.

Want to find a job in Japan? Check out the TokyoDev job board. If you want to know more about multinational teams, moving to Japan, or Japanese work life in general, see our extensive library of articles. If you’d like to continue the conversation, please join the TokyoDev Discord.

2 days ago 4 votes
Can I ethically use LLMs?

The title is not a rhetorical question, and I'm not going to bury an answer. I don't have an answer. This post is my exploration of the question, and why I think it is a question1.

Important things up front: what's my relationship with LLMs today?

I don't use any LLMs regularly. I do have access to GitHub Copilot through my employer. I have it available on a hotkey, I think, and I cannot remember how to trigger it since I do not use it.

I've explored using LLMs in the past. I used to be a regular Copilot user, and I explored ChatGPT, Claude, etc. to see what their capabilities were. I have done trainings for my coworkers on how to use them effectively, though I would not feel comfortable doing so now.

My employer's product uses LLMs. I don't want to link to my employer, but yeah, I guess my paycheck depends on being okay with integrating them in? It's complicated (a refrain). (This post is, obviously, my opinions and does not reflect my employer.)

I don't think using or not using them is a moral failing. There is a lot of moralizing around LLM usage. I'm not doing any of that here. I have my own beliefs (or my own questions), but I don't think people using LLMs are immoral (or vice versa).

So, you can see I have used them and I'm not absolutist, but I don't use them today. Why not? Why did I stop using them in spite of the advances, where they're more capable than ever? It's because of these questions and issues. Where I have undecided ethical questions, I lean toward the more conservative2 choice of not using them until I have clarity on the ethics. (Note: I am not inviting folks to email me with answers to this question.)

Energy usage

Another technology that uses a lot of energy is blockchains. I think using public blockchains is almost universally unethical since there are other, better, less harmful options. Part of the harm from blockchains is an absurd amount of energy usage. LLMs also use a lot of energy. This can be split into training and inference energy usage. These vary based on the model.

Some models can run locally on Apple silicon, and those are lower energy usage—their upper bound is running your computer full tilt, and an M4 Mac mini's max power consumption is 65 watts. This is roughly equivalent to one incandescent light bulb, or 8 LED light bulbs. It's good to turn off unnecessary lights, but doing so isn't going to solve the climate crisis; we need bigger, more sweeping reforms. I don't think that local models are going to significantly alter the climate crisis in either direction.

Other models are massive and run in data centers on lots of power-hungry GPUs3. These data centers also require construction, and that comes with its own environmental impact. An article from Tom's Guide last year showed that "a single query on ChatGPT-4 can use up to 3 bottles of water, that a year of queries uses enough electricity to power over nine houses". A lot of the cost comes from new data centers being built. The demand for LLMs has led to more demand for power generation and more demand for data centers. And this new power generation is coming from gas-fired plants instead of sustainable, clean energy sources, because that's all we can build fast enough.

A lot of attention is given to the training side. The numbers for training are large and shocking: Llama 3 used 500 MWh and GPT-3 training used 1,287 MWh—even more if you include the cost of training failed models which preceded these, the experiments that made the models possible.

The listed figures are high, and 500 MWh is about the cost of a large jet flying for 7 hours. But we do it once per foundational model, and then the cost is spread across all the remaining usage. I don't think that the training side is significantly shifting the equation on climate change. We'd have a much larger impact on improving the climate crisis by advocating for remote work—reducing vehicles on the road, making many flights unnecessary—than by not training models. Overall it feels to me like local models have a clearly acceptable impact, and data center models have higher energy usage but still probably do not change the situation very much.

Training data

The training data for LLMs has largely been lifted without the consent of the people who generated that data. This is a lot of writing, music, videos, visual art, all of it. There are some attempts out there at using licensed data only, but the majority of models, and the most popular ones, use unlicensed data.

Now the question is: is this an ethical problem? I know there are opinions on both sides of this. Some say that this data is publicly visible on the internet (though some of the data was not on the internet), and so it's fair game. Others say that this use isn't one that people consented to, and it should require that consent.

My thought experiment is this: If we made search engines today, would people have this same objection to a search engine using their data without their express consent? I think most people would ultimately support search engines. They are different than LLMs, because they (mostly) serve results that point you to the original source, rather than create new content for you to replace the original sources with. But maybe people would reject search engines. And maybe consistency between these two isn't necessary, or maybe it's a false consistency—there could be other differentiating details that lead to different answers.

Where I come down is that I think we need a robust mechanism to opt out of use of data for training, but that it's probably fine to train with publicly available data on the internet. What you do with the trained model is another question entirely. When you try to replace people making original works instead of creating an entirely new function, that's where you get questions.

"Replacing" people

There's a lot of LLM usage that is just trying to replace people in entire jobs. I mean, it feels like all of it. We see LLMs that are meant to replace writers and editors, artists and illustrators, musicians and songwriters. It doesn't say that directly—it says you're empowered to create things yourself. But what it means is that people should be able to press a few buttons and, with no artist involved, get out a beautiful artwork. Sounds like replacing to me.

This is something we've long done with technology. We make technology that puts people out of jobs, and that was the whole industrial revolution. That's what happened with shipping: the containerization of ports put many dockworkers out of jobs. Replacing people in the abstract is not unethical. The problem is if we fail to deal with the harm created from replacing people. And people will be harmed, because losing your job or the value of it going down has a serious impact on quality of life. When we put people out of work, we—both society and technologists—have an ethical responsibility to ensure there's a plan to mitigate the harm from that.

Maybe that means grants for living expenses while people switch fields, if put out of work in an LLM-heavy industry. Maybe that means a universal basic income. But it certainly doesn't mean doing nothing.

Incorrect information and bias

One of the major well-known problems of LLMs is a tendency to "hallucinate," or to confidently state facts that are made up from whole cloth. They also have an unknown amount of bias, with unknown mitigations in place, due to being closed systems. This is a big problem! We're not good at seeing what information is incorrect in something that's generated to look like it's the most likely string of tokens. If there is incorrect information in there, we'll just miss it. This means that people can make poor decisions on the basis of what an LLM tells them. They can have lost income due to its mistakes. And the bias? That's a huge problem, because it means that we don't know if this system will reinforce existing problematic norms4. We don't know what it will reinforce, because the training data is closed and there's not a lot of public evaluation on bias. So ultimately, we're left with an unknown harm of unknown magnitude to an unknown population5.

Concentrating power (with the wealthy elite)

A big ethical concern for me is also what this will do to our entire society. Many technologies are heralded as "democratizing" things. Spotify "democratized" music by making it so that anyone can get listens—but, y'all, it ended up flattening the tail and making the popular artists more money while making small artists less money. Will LLMs do the same?

We know that the big models need large data centers to run their training and inference. And even small models need beefy hardware to run inference, let alone training! We have some access to models which can run locally, which is a good step. But the problem is that we can only run other people's models. They'll have those people's decisions baked in, decisions on which data to include or to exclude, decisions on how to approach questions of bias and abuse. And when the hardware to run the biggest models is only accessible to a few companies, that means that those companies really get a lot more powerful.

OpenAI and Anthropic and Google and Meta all have the ability to run really large models. I certainly don't, though. This means that a technology that many are heralding as making things more accessible to everyone is controlled by a small handful of people. A small handful of people can decide how the models are trained, and set policies on how they're used. In a time when the US government is trying to get any paper retracted that mentioned queer people, and erasing trans people from the Stonewall monument, it feels self-evident that letting a small group of people control this technology imperils the future of many people.

* * *

Ultimately, I want robots to do the things I don't want to do. I want them to do my dishes and my laundry. I don't want them to play music instead of me, write code instead of me, write words instead of me. I am not sure whether or not using LLMs is unethical. There are certainly ways of using them which are unquestionably unethical—as is true with every technology. And there are ways of developing LLMs which are unethical—as is true with every technology. But the problems with them are large. I think it is unethical to use them without addressing the ethical questions above. If you're not working on mitigating the harms from LLMs (which do exist), then you might be doing something unethical.

1 I've been in the interesting situation of having anti-LLM people think I'm pro-LLM and vice versa. It's a very weird feeling, and makes me a little nervous of posting this! But this is an important question and an important conversation.

2 Footnoting out of an abundance of caution: I don't mean conservative-like-Republicans, because, no, I'd like to keep my rights thank you very much. I just mean in terms of minimizing risk. Let's please stop attacking trans rights, immigrants, Palestinians, and you know, everyone else that I've forgotten because the whole world seems to be on fire.

3 If we ever achieve artificial sentience, this sentence may acquire a second, more sinister, meaning.

4 It will.

5 My guess is a large harm to all underrepresented groups, but who asked the trans woman?

2 days ago 5 votes