Car trouble

Some time ago (I’m not sure when exactly) my car started rattling. It would only rattle when the engine was on, sitting idle, or when accelerating with just the right amount of throttle. This rattle, I did not like it. It sounded like a tiny spoon in a garbage disposal. Which can’t be good, can it? But I exist only in the world of ideas and couldn’t summon the executive function to do anything about it. Eventually, the future Dynomight biologist rode in the car, and we had this conversation: Dynomight biologist: What’s that sound? Dynomight: Rattling! Dynomight biologist: (Pause.) Huh. (In the “Huh”, I could sense overtones of, “How interesting that you would choose to live like this.”) Time went by. I kept reminding myself that selfhood doesn’t exist and therefore we all have a moral responsibility to be kind to our future selves and that future me wouldn’t be any more enthusiastic having this rattle situation dumped on them than I was. So I spent many irreplaceable hours reading about...

a month ago


More from DYNOMIGHT

Limits of smart

Take me. Now take someone with the combined talents of Von Neumann, Archimedes, Ramanujan, and Mozart. Now take someone smarter again by the same margin and repeat that a few times. Say this Being is created and has an IQ of 300. Let’s also say it can think at 10,000× normal speed. But it only has access to the same resources you do. Now what? Let’s assume it would quickly solve all our problems in math and programming and philosophy. (To the extent they’re solvable.) That’s plausible since progress in these fields only requires thinking. What about other fields? Other fields How good would it be at predicting the weather? We’re constantly getting better at predicting weather, because: We have faster supercomputers to run simulations. We have better data from new satellites, weather stations, and radar. We use machine learning and statistics to exploit patterns in all that data. The Being could surely design better algorithms for simulations or machine learning. But still: There’s only so much you can do with a given supercomputer or a given amount of data. Weather is a chaotic system. If you want to predict further in the future, you’ll eventually need more FLOPs and better knowledge of starting conditions. Those require bigger supercomputers and better satellites. Just being smart doesn’t (immediately) cause those things to exist. Best guess: A bit better. Would it have known that Donald Trump would win the 2024 election? I don’t think this was knowable. Take all the available polling, economic data, and lessons from history. If you looked at these on Nov 2, 2024, I doubt they provide enough signal to predict the winner with confidence. The truth was out there in people’s voting intentions, but they were buried in the brains of millions of people. I’m sure the Being would give better predictions. If you let it bet in prediction markets, it would probably make tons of money. But it wouldn’t be able to give geopolitical events 0% or 100% probabilities. 
It wouldn’t be psychic. Best guess: No. Would it beat current chess engines? The top current chess engine has an Elo of 3625. This is insane. It’s 750 Elo higher than ever achieved by a human. Anyway, the old hated Levitt Equation says that after years of study, a person can achieve an Elo of around (10×IQ)+1000. This suggests the 300 IQ Being would manage an Elo of 4000. If you trust that calculation, and the Being played our current best engine, it would win 81.09% of games, draw 18.88%, and lose 0.03%. But we shouldn’t trust that calculation. Obviously, the Levitt Equation isn’t accurate even for normal IQs. And I suspect the Being would lose to modern chess engines in complex endgames. Because it turns out that complex endgames in chess aren’t really solved with “intelligence”. Chess engines do incredibly deep searches of trees of possible moves and countermoves. The best move is the thing that comes out of that tree search. There is no other explanation. We assumed the Being could think 10,000x faster than a normal human, and that would allow it to do some searching of its own, but it still wouldn’t approach the 100,000,000 positions chess engines might evaluate per second. But maybe that’s wrong? Or maybe the Being could find some way to avoid complex endgames? (Of course, if the Being had its own computer, it would reprogram it and crush us.) Best guess: Unsure. Would it solve “creativity”? Would the Being be able to create better novels or music or jokes? It would surely be amazing. Since we included Mozart, this is basically true by definition. But there are reasons to think normal-person art would remain valuable. One is that if you accept an extreme version of Bourdieu, then taste is fake and the only reason we “like” anything is so that we can play class games and oppress each other. If so, then it doesn’t matter how “good” the Being’s books are. 
The upper class will just continue finding ways to demonstrate their cultural capital to keep their less privileged competitors in their place. Alternatively, maybe you find that life is strange and cruel and beautiful, and sometimes you feel things that seem important but you can’t understand, but sometimes someone else feels the same things and they create something that transcends the gap between your minds and just for a moment you feel that you’re part of some universal story and you don’t feel so alone. Just because the Being is smart doesn’t mean it knows what it’s like to walk in your shoes. Best guess: It would be great, but if art is born from experience, normal-person art will still have a place. Would it solve physics? If you were sufficiently smart, could you look at all our current experiments and see some underlying pattern? Is there some mathematical trick or idea that will make all the pieces fall into place? Maybe! Or maybe that’s impossible. Maybe there are just too many rulesets consistent with the observations we have. After all, no one predicted quantum physics. Starting around 1900 we observed strange things, and then we invented quantum physics to make peace with those observations. If it’s impossible, then all the Being could do in the short term would be to help design new experiments: “Go build this kind of super collider, or this kind of space telescope, please.” Best guess: Probably not. Would it cure cancer? I’ve asked many biologists this question. The universal answer is “no”. The idea seems to be that biology isn’t really limited at the moment by our intelligence, but by our experimental knowledge. Many people don’t realize just how cumulative modern biology research is. Take these two mental models: Biology is a giant pool of phenomena, with people picking random things to investigate. Biology is a “hard onion” that needs to be peeled away layer-by-layer. 
We use our current knowledge to invent new tools, then use those tools to do experiments, gain new knowledge from those experiments, and then invent new tools. The truth is a mixture of both, perhaps a bit more like the first. But there’s a lot of the second, too! Modern biology concerns many very small things that we can’t just pick up and manipulate. So instead we build tools to build tools (like TALENs or molecular beacons or phage display or ChIP-seq or bioluminescence imaging or prime editing) to see and manipulate them. So why might the Being be unable to cure cancer? Because perhaps it’s not possible to cure cancer right now. New knowledge and new tools are needed, and both of these depend on the other. Probably the best the Being could do is accelerate that invention loop. Best guess: Probably not. Would it solve persuasion? Would the Being be able to convince anyone of anything? Would it be the best diplomat in history? Let’s just assume that the Being has the best logic, the best rhetoric, the most convincing emotional appeals, etc., and all calibrated based on who it’s speaking to. Fine. But at the same time, for what fraction of people do you think there exist words that would change their mind about Trump or abortion, or the wars in Israel or Ukraine? I suspect that if you decided to be open-minded, then the Being would probably be extremely persuasive. But I don’t think it’s very common to do that. On the contrary, most of us live most of our lives with strong “defenses” activated. Would the Being be so good that defenses don’t matter? Would it convince enough people to start a social movement? Would everyone respond by refusing to listen to anything? I have no idea. Best guess: No idea. Themes There are a few repeated themes above. To do many things requires new fundamental knowledge (e.g. the results of physical experiments, how molecular biology works). 
The Being might eventually be able to acquire this knowledge, but it wouldn’t happen automatically because it requires physical experiments. To do other things requires situational knowledge (e.g. the voting intentions of millions of people, the temperature and humidity at every position in Earth’s atmosphere, which particular cells in your body have become cancerous as a result of what mutations). Getting this knowledge requires creating and maintaining a complex infrastructure. To do most things requires moving molecules around. There are lots of feedback loops. Maybe the Being could run its own experiments. But to do that would require building new machines. Which would require moving lots of molecules around. Which would require new machines and new knowledge and new experiments… Finally, there is chaos/complexity. Many things that are predictable in principle (e.g. chess, the weather, possibly psychology or social movements) aren’t predictable in practice because the underlying dynamics are too complicated to be understood or simulated. Looking back I often think to myself, “Hey self, if super-intelligent AI is invented in a few years, you’ll almost certainly look back on 2025 and feel really stupid for not predicting many things that will seem obvious in retrospect. What are those things? xox, Self.” (Usually the first thought this prompts is, “Computer security is going to be really important, can we please for the love of god keep our critical systems simple and isolated from the internet?” But let’s put that aside.) The second thought this prompts is, “Maybe the first-order consequences wouldn’t be that big?” Perhaps it would solve math and programming and overturn all creative industries, and maybe… that’s “all”, at first? A super-intelligence wouldn’t be a god. I would expect a super-intelligence to be better than humans at creating better super-intelligences. But physics still exists! To do most things, you need to move molecules around. 
And humans would still be needed to do that, at least at first. So here’s one plausible future: (1) Super-intelligent AI is invented. (2) At first, existing robots cannot replace humans for most tasks. It doesn’t matter how brilliantly it’s programmed. There simply aren’t enough robots and the hardware isn’t good enough. (3) In order to make better robots, lots of research is needed. Humans are needed to move molecules around to build factories and to do that research. (4) So there’s a feedback loop between more/better research, robotics, energy, factories, and hardware to run the AI on. (5) Gradually that loop goes faster and faster. (6) Until one day the loop can continue without the need for humans. That’s still rather terrifying. But it seems likely that there’s a substantial delay between step 1 and step 6. Factories and power plants take years to build (for humans). So maybe the best initial mental model is as a “multiplier on economic growth” like all the economists have been insisting all along. Odds and ends How quickly could simulations (of, e.g., biological systems) replace physical experiments? I suspect simulations will be limited by the same feedback loops because (1) simulations are limited by available hardware and (2) new fundamental and/or situational knowledge is needed to set the simulations up. Would the Being actually solve all the problems in math? It’s not clear, because, as you get smarter and smarter, is there more interesting math to be done, forever? And does that math keep getting more and more unreasonably effective, forever? Or is an IQ of 300 still actually quite stupid in the grand scheme of things? If you want to comment but don’t like Substack, I’ve created a forum on lemmy. (I tried this a year ago with kbin, and 2 weeks later kbin died forever. Hopefully that won’t happen again?)
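
As a rough check on the chess estimate above, here is a minimal Python sketch. It assumes the Levitt rule of thumb exactly as quoted (Elo ≈ 10×IQ + 1000) and the standard logistic Elo expected-score formula. The function names are made up for illustration. The standard formula gives only an expected score (win = 1, draw = 0.5), not a win/draw/loss split, so the 81.09% / 18.88% / 0.03% figures presumably come from some other model that handles draws. Those figures imply a score of about 0.905, while the plain Elo formula gives about 0.896, which is close but not identical.

```python
def levitt_elo(iq: float) -> float:
    """Levitt rule of thumb as quoted in the post: Elo ~ 10 * IQ + 1000."""
    return 10 * iq + 1000


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A (win = 1, draw = 0.5)."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


ENGINE_ELO = 3625  # top engine rating quoted in the post

being_elo = levitt_elo(300)
score = expected_score(being_elo, ENGINE_ELO)

# Expected score implied by the post's win/draw/loss split, for comparison.
post_score = 0.8109 + 0.5 * 0.1888

print(f"Being Elo (Levitt): {being_elo:.0f}")
print(f"Expected score vs engine (standard Elo): {score:.3f}")
print(f"Expected score implied by the post's split: {post_score:.3f}")
```

Either way, the conclusion is only as good as the Levitt rule, which the post already says not to trust.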

4 days ago • 9 votes

Rewarding ideas

If you were in South America 12,000 years ago and you discovered where a bunch of glyptodonts were hiding or you figured out a better glyptodont hunting method, you could tell your tribal band and later they would say, thank you for helping us kill these delicious glyptodonts we now think you are cool and now will treat you slightly better. And that was that. There was no other reward for producing information. Nowadays, we have new tricks. If you write a book or patent a drug and someone starts selling copies without your permission, you can ask the government to take their money or put them in prison. If you’re a scientist, you can ask the government to give you money so you can do science and then give it away. Why do these things exist? Well, information is cool because it’s cheap to copy. But for the same reason, it tends to be undersupplied. Say that if I worked hard I could find some new fact, e.g. that ultrasonic humidifiers are bad. This only helps me a little, since they’re not that bad. If I got even 5% of the extra lifespan gained by each person who kills their humidifier, I would spend all day every day looking for such facts. But I don’t, so I don’t. (Also no one believes me.) Patents and copyrights and science grants feel inevitable and boring. But take a step back. How close do these things get us to “optimal”, to rewarding someone with $500 when they create information that provides society with $1000 of value? The answer is not close at all. Because: Yes, we want socially optimal information production. But also, restricting what words people are allowed to say to each other is impossible and tyrannical. Our tricks are a messy patchwork that try to bridge the yawning chasm between those two realities. We reward information production, but only in a few limited cases where it’s easy to enforce without intruding too much on basic liberties. 
In this post, I’ll argue that our existing tricks ingeniously allow “facts” to flow freely (yay liberty) while also creating indirect subsidies for finding new facts (yay information production). This only works because of certain coincidental facts about the world. And AI is in the process of changing those facts. So, why do we have the tricks we have? What makes them work? Will they still work in a post-AI world? How could they be changed? We have ways of making you talk Roughly speaking, we have five main tricks to reward information production today. First, you can copyright creative works, like books or music or code. This lasts for your life plus ~70 years. Second, you can patent new inventions, like drugs or machines or algorithms. This lasts ~20 years. Third, you can create trade secrets. These aren’t just secrets! If you run a business and you discover basically any useful information, then as long as you make “reasonable efforts” to keep it secret, it’s a crime for someone to steal your secret. Even if everything they do is otherwise legal, just obtaining the information is “economic espionage”. This protection lasts forever. Fourth, you can get direct subsidies. Journalism is increasingly funded by philanthropy. The government gives money to scientists so they can do science and make the resulting knowledge freely available (to some for-profit publisher who then charges the public $30/article for the same science they already paid for with taxes). Finally, social norms are as important today as ever. I’m often tempted to take How Much Would You Need to be Paid to Live on a Deserted Island for 1.5 Years and Do Nothing but Kill Seals? and re-post it like I’d written it, but I don’t because I fear that word would spread that I’m a big thieving loser. More prosaically, if one journalist makes a big discovery, it’s totally legal for others to re-report the facts without giving them credit. But journalists have a culture where credit is expected. 
These ways seem weird At first glance, our system seems obvious and inevitable. At second glance, though, it seems very strange. But at third glance, that strangeness can be seen as society having made some shrewd calculations to manage the tradeoff between (a) rewarding information, and (b) not creating dubious restrictions on speech. So let’s go through those second and third glances. Why is it that copyright lasts for life plus ~70 years, while patents last for ~20 years? Perhaps because artistic work usually has lots of substitutes. If I write a book, then I have a monopoly and I’m free to charge $25,000 for it. But if I did that, everyone else would just buy some $25 book instead. My monopoly doesn’t give me that much pricing power. Whereas if I invent a new drug for pancreatic cancer, I can probably charge people $100,000 per treatment. For drugs, a 20 year term still provides plenty of reward. Why do patents require filing a complicated application and paying gigantic fees, while copyright and trade secrets are automatic? Probably because it’s easy to determine who wrote a book, but hard to prove who came up with an invention. Why do patents require publishing how your invention works, while if you create a song, you don’t have to share your pre-mastered multi-track audio? Probably because creative works don’t have as much “secret sauce”. You can read a book and figure out how it was made much more easily than you can look at a new engine and understand the engineering principles. Forcing people filing patents to publish their ideas helps good ideas spread more quickly. Also, many more creative works are created each year. It’s just not worth it. Why does copyright only cover artistic aspects? Well, imagine that when Don Daglow created Utopia in 1981, he didn’t just get a copyright on the art and characters and code, but also on the idea of a real-time strategy game. Starcraft wouldn’t exist. The horror. 
Why is it that even conspiring to steal a trade secret is illegal? Say you and I decide to steal Coca-Cola’s secret formula, so we high-five and drive to Atlanta, but then we realize we’re idiots and go home. Believe it or not, we are now guilty of conspiracy to commit economic espionage and could theoretically be imprisoned for up to 15 years. Weird, huh? While this seems ridiculous, I guess prosecutors find (as with regular espionage) that it’s hard to prove actual espionage and only use this power in egregious cases. And why do trade secrets at all? Why make it illegal for someone to steal them only if you make “reasonable efforts” to keep them secret? Why is it legal to discover trade secrets through reverse engineering, but illegal to discover them by getting engineers drunk? Why does this protection apply to basically any form of information, and why does it last forever? I think the idea is that people will keep secrets no matter what. But without laws, people would spend vast sums trying to steal and/or secure secrets, and this arms race would have no social benefit. Meanwhile, trying to make it illegal to “steal” things that aren’t really secrets at all would trample on basic freedoms and be impossible to enforce. Why can you patent inventions but not discoveries? Why can you patent “algorithms” but not “math”? Well, imagine you could patent math. Would we be like the Pythagoreans and punish anyone who mentioned the wrong theorem? It’s far easier to judge if someone is using math, which is sort of the definition of an algorithm. Better to just pay mathematicians out of taxes and let the math be free. So, I view these tricks as a very clever and highly evolved solution to a difficult problem. If you take all the possible ways you could reward information and then sort them by (ease of implementation) × (how much ideas are rewarded), you’d probably end up with something close to our current system. 
Intermission: No really, they’re weird While I think our tricks are clever, we shouldn’t forget that they’re highly imperfect and lead to lots of perversity. In case you’re a perversity aficionado, I’ve collected here some favorites: Say you suspect that some type of fungus might cure cancer, so you spend $50 billion checking each of the 144,000 known fungal species. And say you actually find one that works. Too bad! The fungus already existed, so that’s a “discovery”, not an “invention”. You might be able to patent some extract or something, but if you’re charging $100,000 per cure, people will find ways around the patent. Better not to spend that $50 billion in the first place. Information production is sometimes rewarded through bundling. But for decades, local newspapers made half their revenue from classified ads. This was a great business, since printing classified ads costs almost nothing. But because economics is weird, they found it was profitable to also pay tons of money for reporters who would create news, and then bundle the news and classified ads together. Then Craigslist was invented and now most of those newspapers are dead. Related: You’ve surely noticed that recipe sites have thousands of words of inane blabbering before they show the actual recipe. That’s partly to manipulate search engines and to have more space for ads. But it’s also because inane blabbering is copyrightable but recipes are not. Many people find it strange that you can patent “business methods”. But did you know this was already happening in France in 1792? As some people were arresting Louis XVI, others were filing patents for financial inventions. (Though these were later deemed invalid.) Believe it or not, you can patent methods for reducing taxes. I’m unsure what public interest is served by rewarding such inventions. You’ve probably heard that map makers sometimes add fake “trap cities” or “trap streets”. 
The idea is that if your traps appear on another map, you’ve got them dead to rights on copyright violation, right? Turns out: Nope. Locations of cities and streets are facts. A fake fact isn’t a form of creative expression and still isn’t copyrightable. One of the central concepts of patents is that you must publish your invention. But companies don’t like telling their competitors about their inventions. So many, particularly pharmaceutical companies, pay lawyers gigantic sums to write patents in a way that’s legally valid but impossible for a normal person to read. In response, their competitors pay their lawyers gigantic sums to decode the legal gibberish, and sometimes get patents translated from other countries (often Japan) that are less tolerant of such chicanery. When Don Daglow created Utopia in 1981, he couldn’t have copyrighted the idea of a real-time strategy game. But he might have been able to patent some version of that idea. I’m glad he didn’t. About those indirect subsidies We’re 1900 words in. What is this post about again? Oh yeah: Our current tricks for rewarding information production rely on coincidental facts about how the world works. Artificial intelligence is changing those facts. ??? What are these coincidental facts? Well, “facts” and “discoveries” are important. We want more of them! But legally protecting these things seems terrible, because that requires you to police what words people can say to each other. But copyright still cleverly provides indirect subsidies for creating facts and making discoveries. I’ll highlight two. First, while you can’t copyright facts, you can copyright “presentations” of facts. If I write a book, you are free to steal all my facts and write your own book. But, for humans, doing that is hard. If you go way back to, like, three years ago, “unbundling” the facts from their presentation took a ton of work. Now, AI can do this instantly and for near-zero cost. 
Second, I suspect that a lot of creation is driven by our old tribal band instincts. Like why am I writing this? Probably because some part of me hopes that after I post it, glowing pixels will show up on my screen which my brain will interpret as meaning that people love me or whatever. I’m pretty sure this doesn’t provide concrete benefits that will increase my chances of passing on my genes, but my brain is still operating on some confused heuristics where “more glowing pixels” means “more sexual opportunities” or “more friends to provide resources in times of need” or something. AI is overturning both of these. A few months back I did some research into how well LLMs can play chess. My most surprising finding was that if you asked chat-based LLMs to regurgitate the full sequence of moves before choosing a new one, that greatly improved performance. I’m pretty sure I was the first person to show this. A few weeks later, I asked Google “can llms play chess”. No credit. Beyond the idea of replaying moves, several quotes were taken from me almost verbatim. If a human did this, most people would think it was rude. But at least I’d know that they paid a tax on their time, and hypothetically some people might think less of them for having done it. AI does it instantly, for free, and does not care about social approval. Now, I don’t mean to beat up on poor Google. I actually think they deserve credit for exceptionally good behavior. Many AI companies offer you a way to signal that you don’t want your content used for AI, but then they ignore their own signal and if you try to block them they switch IP addresses. I expected Google to say, “Sure, you can block our AI, we love you, just add the same signal that completely removes your website from our search engine.” But no. They offer a different signal just for AI, and they actually respect it. I added it, and when I checked a few weeks later, Google no longer provided any AI summary at all. 
So, good on Google for giving creators some control. But I’m skeptical that this is what the future will look like. Some people I respect say that they now write for AI. I admire this zen-like detachment from earthly concerns. But really? You’re happy to spend your time creating information just to feed it into a training set, so it can be used for purposes you might hate, and without giving you any reward, neither money nor recognition? For better or for worse, I don’t feel that way. What could be done? Option: Do nothing This is a strong choice! As I see it, our historical compromise was to try to reward information production, but to take a light touch. Only do it in a few places, where it can be done at low cost, with unambiguous rules that don’t involve degrading intrusions on basic human liberty. Maybe AI changes how well that compromise works. But that doesn’t imply that we should change anything. After all, when Craigslist debundled the news from classified ads, we didn’t make Craigslist illegal. We just left newspapers to their fate. This meant less news, particularly local news in smaller markets, and I think this has had some bad effects. But I tend to think it was the right choice. Even if you ignore “freedom”, people have saved billions of dollars, and people trade far more goods now than when newspapers had a monopoly. And we shouldn’t forget that AI also has (possibly enormous) positive effects on information production, by making it easier and cheaper. Change how copyright and/or patents and/or trade secrets work? I have a couple (possibly bad) ideas for minor tinkering at the margins, below. In principle, we could make some kind of dramatic change. But I don’t see many options without huge problems. Can you think of anything? This all seems under-theorized. Increase non-market incentives for information production? After Craigslist killed newspapers, some started adding paywalls. Paywalls are, theoretically, a market-based solution. 
But I suspect that a lot of people who subscribe to these newspapers (and blogs (not me, money bad)) do it not just out of self-interest, but because they want to support them. They think what they’re doing is good for the world and they want to encourage it. There are also many charities that fund journalism. And OK, maybe AI decreases the incentives for internet randos to do research and share it with the world. How much information is really lost by this? How hard would it be to provide more grants (or post-hoc “awards”) to make up for what’s lost? Clarify the meaning of a derivative work. Say you create a game. You write code and sell me an executable. Then I take the executable, decompile it, replace all your art with new versions, and then re-compile it for some other operating system and start selling it. Can you go to court and take my money? Yes, because your source code was copyrighted, and my new executable would be a “derivative work”. But say you write a book, and I get an AI to re-write it. Can you take my money? The legal standard here is “substantial similarity” which is just as confusing as it sounds. Courts talk about “total concept and feel test” and “comprehensive non-literal similarity” versus “fragmented literal similarity”. As far as I can tell, this is an incredibly blurry boundary that we’ve only gotten away with because cases are relatively rare. AI will force us to find a clearer boundary, one that doesn’t require judges to listen to individual pieces of music. But I’m not sure a clear boundary would do much to incentivize creators. If we had a magic box that perfectly decided what’s infringing and what isn’t, I don’t expect that the response will be for AI companies to pay creators. Rather, they’ll probably just tune their AIs to run right up to that boundary. Instead of re-writing one book, rewrite N books for whatever value of N is legal. Create a legal opt-out. Many companies theoretically offer an opt-out. 
It’s a different opt out for each company, and many of them seem to just ignore it. In principle, governments could create a legal mandate for this. It could even be fine-grained. Then, companies might compete with each other to make creators happy. It’s quite possible that such a mandate would be a disaster. For one thing, it would be a headache to enforce. Would Federal AI inspectors demand to inspect the training data for all AI companies? And for any even moderately popular blogger, lots of people steal their articles and re-post them without credit. (Google and Bing usually de-list these sites, but you can find them with Yandex.) If they have different opt-out headers, how are AI companies supposed to know which one is correct? Most of all, I worry that this mandate would just hurt “good” companies and/or companies in jurisdictions that actually enforce the mandate. If the effect was to hand AI leadership over to That Other Country, that seems bad. Go Xanadu Or maybe we could develop technology that would solve this problem using existing laws. A while back, I mentioned Project Xanadu, The Original Hypertext Project. I was mostly attracted by their attitude. (“It is a continual war over software politics and paradigms. With ideas which are still radical, WE FIGHT ON.”) But I couldn’t really understand what the hell it actually was. But then I read Jason Crawford’s The lessons of Xanadu and WIRED magazine’s 1995 piece The Curse of Xanadu. I now understand that Xanadu was (is?) intended to be a system for interlinked documents. But it also included a crazy “transclusion” feature. This was some kind of distributed copyright scheme where authors could link and copy each other and royalties were somehow apportioned to all upstream documents. Maybe we don’t need new laws. Maybe a solution exists that uses a combination of technology and existing contract law. There’s a very large space of possibilities, and I don’t pretend to have the answer. 
But at a high level, there could be some system where people (and AIs?) put their creations. In order to access the system, you have to agree to distribute royalties according to some formula, and to treat anything learned through the system as a trade secret. In principle, it seems like essentially any combination of technology and laws could be implemented this way? And there could be a competition to find the best one? And we don’t have to rely on the Leviathan? And it might be better than our current crazy hacks? Seems hard, but it’s the best idea I’ve got. TLDR Weird situation, evolving, new ideas needed.

a week ago • 9 votes

My 16-month theanine self-experiment

The internet loves theanine. This is an amino acid analog that’s naturally found in tea, but now sold as a nutritional supplement for anxiety or mood or memory. Many people try theanine and report wow or great for ADHD or cured my (social) anxiety or changing my life. And it’s not just the placebo enthusiast community. This Hacker News thread is full of positive reports, and gwern uses it regularly. But does it really work? Biologically speaking, it’s plausible. Theanine is structurally related to the neurotransmitter glutamate (theanine = C₇H₁₄N₂O₃, glutamate = C₅H₈NO₄-). For some reason, everyone is obsessed with stupid flashy dopamine and serotonin, and no one cares about glutamate. But it’s the most common neurotransmitter, and theanine is both metabolized into glutamate and seems to itself have various complicated effects on glutamate receptors. Of course, there are lots of supplements that could act on the brain, but are useless when taken orally. That’s because your brain is isolated from your circulatory system by a thin layer of cells that are extremely picky about what they let through. But it appears that theanine can get through these cells and into the brain. So that sounds good. But do these low-level effects actually lead to changes in mood in real humans? When I looked into the academic research, I was surprised by how weak it was. Personally, on these kinds of issues, I find the European Food Safety Authority to be the single most trustworthy scientific body.
They did an assessment in 2011 and found:

Claim | Result
Improvement of cognitive function | cause and effect relationship has not been established
Alleviation of psychological stress | cause and effect relationship has not been established
Maintenance of normal sleep | cause and effect relationship has not been established
Reduction of menstrual discomfort | cause and effect relationship has not been established

Examine is an independent website that’s respected for summarizing the scientific literature on health and supplements. They looked into whether theanine helped with various things, like alertness, anxiety, and attention. In all cases, they found low quality evidence for near zero effect. A 2020 review of eight randomized double-blind placebo controlled trials found that theanine might help with stress and anxiety. While this review seems generally good, I found it to be insufficiently paranoid. One study they review found that theanine worked better than alprazolam (xanax) for acute anxiety. The correct response would be, “That’s impossible, and the fact that normal scientific practices could lead to such a conclusion casts doubt on everything.” But the review sort of takes it at face value and moves on. After 2020, the only major trial I could find was this 2021 study that took 52 healthy older Japanese people and gave them theanine (or placebo) for 12 weeks. They tested for improvements in a million different measures of cognitive functioning and mostly found nothing. Why I did this I’ve long found that tea makes me much less nervous than coffee, even with equal caffeine. Many people have suggested theanine as the explanation, but I’m skeptical. Most tea only has ~5 mg of theanine per cup, while when people supplement, they take 100-400 mg. Apparently grassy shade-grown Japanese teas are particularly high in theanine. And I do find those teas particularly calming. But they still only manage ~25 mg per cup. (Maybe it’s because tea is better than coffee?)
Still, I’ve supplemented theanine on and off for more than 10 years, and it seems helpful. So after seeing the weak scientific evidence, I thought: Why not do a self-experiment? Theanine seems ideal because it’s a supplement with short term effects. So you can test it against placebo. (Try that with meditation.) And you can build up a large sample using a single human body without waiting weeks for it to build up in the body before each measurement. Everyone agrees theanine is safe. It’s biologically plausible. While academic studies haven’t proven a benefit, they haven’t disproven one either. Given the vast anecdotal evidence, I saw a chance to stick it to the stodgy scientific establishment, to show the power of internet people and give the first rigorous evidence that theanine really works. Stockholm, prepare thyself. What I did First, I needed placebos. This was super annoying. The obvious way to create them would be to buy some empty capsules and fill some with theanine and others with some inert substance. But that doesn’t sound fun. Isn’t the whole idea of modernity that we’re supposed to replace labor with capital? So I went searching for a pair of capsules I could buy off the shelf, subject to the following constraints: Capsule A contains 200 mg of theanine. Capsule B contains something with minimal effects on anxiety, stress, memory, concentration, etc. Capsule B contains something I don’t mind putting into my body. Both capsules are exactly the same size and weight. Both capsules are almost but not quite the same color. Both capsules are made by some company with a history of making at least a modest effort to sell supplements that contain what they say they contain, and that don’t have terrifying levels of heavy metals. The capsules themselves aren’t made from the skin and bones and connective tissues of dead animals (personal preference). 
After a ludicrous amount of searching, I found that NOW® sells these veggie capsules: Capsule A: 200 mg L-Theanine Capsule B: 25 mcg (1,000 IU) Vitamin D These are exactly the same size, exactly the same weight, exactly the same texture, and very close in color. They’re so close in color that under warm lighting, they’re indistinguishable. But under cold/blue lighting, the vitamin D capsules are slightly more yellow. For dosing, I decided to take a capsule whenever I was feeling stressed or anxious. Some people worry this invalidates the results. Not so! I’m still choosing randomly, and this better reflects how people use theanine in practice. Theanine is often recommended for reducing anxiety from caffeine. While I didn’t explicitly take caffeine as part of this experiment, I had almost always taken some anyway. Statistically, it would have been best to randomize so I had a 50% chance of taking theanine and a 50% chance of taking vitamin D. But I decided that would be annoying, since I was taking these capsules when stressed. So I decided to randomize so I got theanine ⅔ of the time and vitamin D ⅓ of the time. Randomization was very easy: I took two theanine capsules and one vitamin D capsule and put them into a little cup. I then closed my eyes, shook the cup around a bit and took one. I then covered the cup with a card. This picture shows one vitamin D capsule (top) and two theanine capsules. For each trial, I recorded my subjective starting stress level on a scale of 1-5, then set an alarm for an hour, which is enough to reach near-peak concentrations in the blood. After the alarm sounded (or occasionally later, if I missed it) I recorded the end time, my end stress level, and my percentage prediction that what I’d taken was actually theanine. Then, and only then, I looked into the cup. If the two remaining pills were different colors, I’d taken theanine. If not, it was vitamin D. After ~14 months, I got frustrated by how slowly data was coming in. 
This was the first time in my life I’ve had too much chill. At that point, I decided to start taking the capsules once or twice a day, even if I wasn’t stressed. I’ll show the transition point in the graphs below. Ultimately, I collected 94 data points, which look like this:

Date | Start time | Start stress | End time | End stress | Prediction | Result
Nov 18, 2023 | 9:38 AM | 3.5 | 10:45 AM | 2.2 | 80% | T
Nov 19, 2023 | 9:40 AM | 2.8 | 10:41 AM | 2.9 | 75% | T
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮
Feb 28, 2025 | 4:58 PM | 2.1 | 5:58 PM | 1.8 | 75% | D
Jan 3, 2025 | 6:12 PM | 2.1 | 7:12 PM | 2.0 | 61% | T

What are the results? Bad. Here are the raw stress levels. Each line shows one trial, with the start marked with a tiny horizontal bar. Note the clear change when I started dosing daily: Alternatively, here’s the difference in stress (end - start) as a function of time. If “Δ Stress” is negative, that means stress went down. Here are the start and end stress levels for each trial, ignoring time. The dotted line shows equal stress levels, so anything below that line means stress went down: Finally, here are the probabilities I gave that each capsule was in fact theanine. Thoughts Ooof. My stress level did usually go down, at least provided I was stressed at the start. But it went down regardless of if I took theanine or not. And I was terrible at guessing what I’d taken. Why did my stress decrease when I took vitamin D? Maybe it’s the placebo effect. But I suspect it’s mostly reversion to the mean: If you mark down the times in your life when you’re most stressed, on average you’ll be less stressed an hour later. You can see evidence for this in the fact that stress tended to decrease more when it started at a higher level. So, eyeballing the above figures, theanine doesn’t appear to do anything. (We can argue about statistics below.) Why? I think these are the possibilities: Theanine works, but I got fake theanine. Theanine works, but vitamin D works equally well. Theanine works, but I was unlucky.
Theanine works, but I’m disembodied and unable to report my internal states. Theanine works on some people, but not me. Theanine doesn’t work. It’s hard to disprove the idea that theanine works. But I tell you this: I expected it to work. And I really tried. For almost 100 trials over 16 months, I paid attention to what I was feeling and tried to detect any sign that I’d taken theanine, even if it wasn’t a change in stress. I could detect nothing. Even after months of failure, I’d often feel confident that this time I could tell, only to be proven wrong. So, cards on the table, here are my made-up probabilities for each of the possible explanations:

Explanation | Belief
Fake theanine | 3%
D equally good | 1%
Unlucky | 6%
Disembodied | 15%
Not on me | 20%
Doesn’t work | 55%

Should I have been surprised by these results? Well, the scientific literature on theanine hasn’t found much of an effect. And the only other good self-experiment on theanine I’ve found is by Niplav, who found it did slightly worse than chance and declared it a “hard pass”. What about other blinded self-experiments with other substances? They’re surprisingly scarce, but here’s what I could find:

Author | Substance | Result
Niplav | caffeine | positive
Gwern | amphetamines | positive
Gwern | lithium | no effect
Gwern | LSD microdose | no effect
Gwern | ZMA | inconclusive
Slatestarcodex | sleep support | no effect

Stimulants work! But for everything else… I particularly encourage you to read the sleep support post. He was confident it worked, he’d recommended it to lots of friends, but it totally failed when put to the test. I’ve seen many other self-experiments (including for theanine), but they’re non-blinded and I’d be doing you a disservice if I linked to them. People often mention that hypothetically this means the results aren’t scientific, but treat it like a small niggling technicality. It’s not. So I propose a new rule: Blind trial or GTFO. I know many people reading this probably use and like theanine. Maybe it works for you!
But given the weak academic results, and given the fact that I actually did a blinded experiment, I think you now have the burden of proof. Doing this kind of test isn’t hard. If you’re sure theanine (or anything else) works, prove it. Appendix: OK fine let’s argue about statistics Do you demand p-values? Are you outraged I just plotted the data and then started talking about it qualitatively? I think faith in statistics follows a U-shaped curve. By default, people don’t trust them. If you learn a little statistics, they seem great. (Particularly if you’re part of a community that’s formed a little cult around one set of statistical practices and convinced each other that they’re more reliable than they are.) But if you learn a lot of statistics, then you realize all the assumptions that are needed and all the ways things can go wrong and you become very paranoid. If you want p-values, I’ll give you p-values. But first let me point out a problem. While I was blinded during each trial, I saw the theanine/D result when I wrote it down. Over time I couldn’t help but notice that my stress dropped even when I took vitamin D, and that I was terrible at predicting what I’d taken. So while this experiment is randomized and blinded, the data isn’t independent or identically distributed. If I did this again, I’d make sure I couldn’t see any outcomes until the end, perhaps by making 100 numbered envelopes, putting three capsules in each, and only looking at what was left at the end. But if you want to compute p-values anyway, OK! Here are the basic numbers for the trials when I took theanine:

Variable | Substance | Mean | 95% C.I. | p
start stress | theanine | 2.480 | (2.361, 2.599) |
end stress | theanine | 2.181 | (2.104, 2.258) |
Δ stress | theanine | -0.299 | (-0.392, -0.205) | 2.00×10⁻⁸
Predicted T | theanine | 68.4% | (66.2%, 70.5%) |

Stress went down, p < .0000001. But here are the numbers for vitamin D:

Variable | Substance | Mean | 95% C.I. | p
start stress | vitamin D | 2.350 | (2.173, 2.526) |
end stress | vitamin D | 2.025 | (1.936, 2.114) |
Δ stress | vitamin D | -0.325 | (-0.453, -0.197) | 2.44×10⁻⁵
Predicted T | vitamin D | 72.9% | (69.7%, 76.1%) |

Stress also went down. Finally, here’s the difference between theanine and vitamin D, computed with a two-sided t-test with unequal variance:

Variable | Substance | Mean | 95% C.I. | p
start stress | theanine - D | 0.130 | (-0.095, 0.354) | 0.254
end stress | theanine - D | 0.156 | (0.0165, 0.296) | 0.029
Δ stress | theanine - D | -0.026 | (-0.201, 0.148) | 0.764
Predicted T | theanine - D | -4.5% | (-8.5%, -0.5%) | 0.029

Technically, I did find two significant results. But the second row says that end stress was slightly higher with theanine than with vitamin D, and the last row says that I gave slightly higher probabilities that I’d taken theanine when I’d actually taken vitamin D. Of course, I don’t think this means I’ve proven theanine is harmful. I just think this confirms my general paranoia. To a first approximation, if it ain’t visible in the raw data, I ain’t going. Speaking of raw data, you can download mine here.
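If you’d rather check the theanine-versus-vitamin-D comparison yourself, here is a minimal sketch of the unequal-variance (Welch) calculation in plain Python. The function name `welch_t` and the numbers in the two lists are placeholders invented for illustration, not values from the experiment; you would swap in the Δ stress columns from the downloaded data. It returns the difference in means, the t statistic, and the Welch-Satterthwaite degrees of freedom, and leaves the p-value lookup to a stats library or a t-table.

```python
import math
import statistics


def welch_t(a, b):
    """Return (difference in means, t statistic, degrees of freedom)
    for a two-sample comparison that does not assume equal variances."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    # Squared standard error of each mean, from the sample variances.
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    diff = mean_a - mean_b
    t = diff / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return diff, t, df


# Placeholder numbers, made up for illustration only (NOT the real data).
theanine_delta = [-0.5, -0.2, 0.1, -0.6, -0.3]
vitamin_d_delta = [-0.4, -0.1, -0.5, -0.3]

diff, t, df = welch_t(theanine_delta, vitamin_d_delta)
print(f"difference in means = {diff:.3f}, t = {t:.2f}, df = {df:.1f}")
```

A t statistic near zero, like the one behind the Δ stress row above, means the two groups moved by about the same amount. None of this fixes the independence problem described earlier.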

3 weeks ago • 16 votes

Bayes is not a phase

People make fun of techie/rationalist/effective-altruist types for many weird obsessions, like stimulants or meditation or polyamory or psychedelics or seed oils or air quality or re-deriving all of philosophy from scratch. Some of these seem fair to me, or at least understandable. But the single most common point of mockery is surely the obsession with “Bayesian” reasoning. Many people seem to see this as some screwy hipster fad, some alternate mode of logic that all these weirdos have decided to trust instead of normal human thinking. This drives me crazy. Because everyone uses Bayesian reasoning all the time, even if they don’t think of it that way. Arguably, we’re born Bayesian and do it instinctively. It’s normal and natural and, I daresay, almost boring. “Bayesian reasoning” is just a slight formalization of everyday thought. It’s not a trend. It’s forever. But it’s forever like arithmetic is forever: Strange to be obsessed with it, but really strange to make fun of someone for using it. Here, I’ll explain what Bayesian reasoning is, why it’s so fundamental, why people argue about it, and why much of that controversy is ultimately a boring semantic debate of no interest to an enlightened person like yourself. Then, for the haters, I’ll give some actually good reasons to be skeptical about how useful it is in practice. I won’t use any equations. That’s not because I don’t think you can take it, but because Bayesian reasoning isn’t math. It’s a concept. The typical explanations use lots of math and kind of gesture around the concept, but never seem to get to the core of it, which I think leads people to miss the forest for the trees. Examples Let’s get our intuition flowing with a few examples. Bored one day, you convince a friend to give you an antinuclear antibody test for lupus. (It beats watching TV.) To your shock, the test is positive. After seeing that it’s 90% accurate, you sink into existential terror.
But then you remember that only 0.5% of people actually have lupus, so if you gave this test to 2000 random people, you’d get ~199 false positives and only ~9 true positives. Then you feel less scared. You take a penny and flip it 20 times, getting 16 heads. You plug this into a calculator which tells you that with 95% confidence, the true bias is between 57.6% and 92.9%. This calculator isn’t making a mistake. But still, this was a normal penny. So you’re pretty sure the bias is very close to 50% and you just got 16 heads by chance. You wonder if artificial superintelligence will be created in the next five years. That sounds weird, so you figure the chance is 1%. But then you notice that AI seems to be getting better at a suspicious rate. And you see that lots of expert forecasters give higher numbers. So you raise your estimate to 5%. You wonder if plants are conscious. You decide there’s a 0.05% chance. Probably you find some of these situations more objectionable than others. But what’s really happening here? What is “Bayesian reasoning”? You may have heard of something called “Bayes’ equation”. Forget it. It’s a distraction. Everyone uses that equation, including people that hate Bayesian reasoning. The core of Bayesian reasoning is a concept which cannot be translated into math. Here’s how I like to put it: Mixing aleatoric uncertainty and epistemic uncertainty: Good. I’m very sorry to define a fancy word using two even-fancier ones. But “aleatoric” and “epistemic” get at something important. Consider these two statements: My favorite U-235 atom has a 0.0000000985% probability it will decay in the next year. There’s an 82% probability I’m taller than you. The word “probability” appears in both these statements. But it means completely different things. U-235 decays due to random quantum fluctuations. The decay probability has a clear physical meaning.
If you get a few quadrillion U-235 atoms together (not too close, please) then after a year, something very close to 0.0000000985% will have decayed. That’s “aleatoric” uncertainty. Meanwhile, I’m either taller than you or I’m not. You may not know, but it’s a fixed fact about the universe that either is true or isn’t. That 82% probability is a feeling that exists in your brain. That’s “epistemic” uncertainty. Why are people obsessed with it? So why would you mix those two types of uncertainty together? And why does Bayesian thinking almost seem like a religion to some people? I think the strongest reason is that it gives optimal decision procedures. Say you come to dinner at my house. After eating, you were hoping for brandy and cigars, but instead I bring out a jar. Inside it is one gold coin worth $1000 and 4 worthless fake coins. They all look the same, except the fake coins have heads on both sides. Now, I offer you a deal: If you give me $125, then I’ll pick a random coin and give it to you. Should you accept? This is pretty trivial, but let’s go through the logic. There are five possible outcomes. In one you get something worth $1000, while in the others, you get something worth $0. Since each outcome is equally likely, on average you get something worth $200. You should accept my wager. (At least, assuming you have enough liquidity that you’re risk neutral. If you need that $125 to buy food for the next week, don’t gamble it.) Easy. But say instead I draw a random coin and flip it onto the table. It happens to land heads. Then I point to it and ask: Would you like to pay $125 for that coin? If you’re Bayesian, you’d reason like this: There are five coins, each of which has two sides. So there are 10 equally likely outcomes: But since you saw heads, one of those outcomes is impossible. If you start with 10 equally likely outcomes and then rule one out, the other 9 are still equally likely.
So on average, in this situation, you’ll get something worth $1000/9 ≈ $111.11. Since that’s less than $125, you should refuse my offer. That’s the one true correct way to play this game: You calculate the “probability” that the coin is gold and then you use that “probability” to make decisions. Even though the coin on the table is fixed, you act like it’s random if it’s gold or not. If you make decisions in any other way, you’ll lose money over time, either by accepting losing bets, or missing out on winning bets. And if you’re non-Bayesian, how do you play this game? Exactly the same way. Well, except that you can’t talk about the “probability” that the coin is gold. So you’ll have to stumble around with “expected utilities” or whatever. But you’d better end up with something equivalent to the probabilistic calculation, because anything else is leaving money on the table. Why this is controversial I hope you’re now convinced that it’s useful to act as if non-aleatoric probabilities are “real” probabilities. And that thinking about them as probabilities makes this easier. Still, I stress that it’s somewhat debatable if the above coin scenario is really “Bayesian”. If it counts at all, it’s certainly the least controversial kind, and not representative of how Bayesian reasoning is typically used in the real world. In practice, you face a situation where you don’t know how many coins are gold vs. fake. Maybe I bring out the jar and you ask me what fraction of the coins are gold and I vaguely say, “A decent number.” Then you need to stare into my eyes and decide what kind of person you think I am. Alice might decide I’m super chill and estimate that 50% of coins are gold. Bob might decide I’m a huge jerk and only 1% are. That will lead Alice and Bob to completely different “probabilities”. Here’s a more realistic example. Say you make a new MRNA vaccine for the flu. 
You test it on a handful of patients and you seem to get good results, so you go to investors to try to get money to run a big trial so you can sell it. What’s the “probability” this drug would get FDA approval if funded? If potential investors are Bayesian, they will mentally weigh the data from your handful of patients with the base rate for how often trials succeed. But what kind of trials should be compared against? All trials? Vaccine trials? Flu drug trials? MRNA trials? Trials in the last 20 years? These lead to different probabilities, and there’s no single right answer. And it gets much worse than that. I can calculate a “probability” that artificial superintelligence will be invented in five years by making up a “prior” and then adjusting it based on what superforecasters say or how fast AI seems to be improving. But that prior and the adjustments will basically just be things I made up. So in the real world, Bayesian probabilities are on a spectrum. In some situations, like the gold coin example, they’re very hard to argue with, and being non-Bayesian seems stubborn and pedantic. But in other situations, the “probabilities” that come out are very squishy and subjective. This is why people say they aren’t “real”. The boring debate People have argued about Bayesian reasoning for decades. If I calculate that there’s a 4.3% probability I have lupus, is that a “real” probability? I think a lot of the arguments ultimately boil down to semantics. You could imagine a world where we used “a-probability” for strict aleatoric uncertainty, and “b-probability” for Bayesian probabilities. Then that debate wouldn’t exist. That dissolves the debate in the abstract. If we carefully marked everything Bayesian as “b-probabilities”, then we could argue about specific situations. How justifiable are the assumptions that were used? Some “b-probabilities” are much squishier than others. The argument against using squishy probabilities is obvious: They’re totally subjective.
The argument in favor of squishy probabilities is more subtle. It’s that in life you have to make decisions. You either buy the coin from me or you don’t. You either fund the vaccine trial or you don’t. Making subjective assumptions is uncomfortable, but too bad, life requires hard decisions. So why be formal about it? Why not just rely on vibes? Well, while I think we’re all born Bayesian, we’re not great Bayesians. We have all sorts of predictable biases like base rate neglect and anchoring. The way to eliminate these is to state your assumptions formally and reason formally. Bayesian reasoning is also very legible. If we get different numbers for the probability of AGI in the next five years, we can compare our calculations and maybe learn something. Why not to be Bayesian So why not be Bayesian? I think there are two main reasons. First, outside of very simple situations, Bayesian reasoning requires using slow and unreliable algorithms. In general, Bayesian reasoning is in a complexity class slightly worse than the famous NP-complete class. Second, outside of very simple situations, creating formal probabilistic models is hard. You need to learn the intricacies of Wishart distributions, skew normal distributions, and kurtosis. And even if you know those things, creating probabilistic models is incredibly dangerous: if you accidentally set some parameter somewhere incorrectly, you can easily get crazy results. Compared to the general population, I think I’m comfortably in the top 0.1% in terms of my mastery of Bayesian stuff. (That’s probably true for anyone who’s ever built a non-trivial model for a real problem.) And yet, here I have a blog where I’ve examined if seed oil is bad for you and if alien aircraft are visiting earth and if it’s a good idea to take statins or use air purifiers or get colonoscopies or eat aspartame or practice gratitude or use an ultrasonic humidifier. And I have used formal Bayesian models never. Why?
The answer is that the real world is messy and creating a formal model that would integrate all the available information would be really, really, really difficult. If I had an infinite amount of time, I do think that would be the best approach. But I’d be incredibly paranoid that one parameter set incorrectly anywhere could lead to disaster. To create a model that I actually trusted for any of these situations would probably take me months. Meanwhile, my brain meat has been optimized for millions of years to combine information. It’s very far from optimal, but it usually doesn’t make insane mistakes. And I can still get some of the benefits of Bayesian reasoning by keeping it in mind. Conclusions “Aleatoric” probabilities are different from “Bayesian” probabilities. It’s silly to argue about which is “real”. Just say which one you’re talking about. Some Bayesian probabilities are much squishier than others. When you see one, make sure you understand what assumptions went into it. Life involves lots of hard choices with messy information. Theoretically, if you can formalize your assumptions, then Bayesian reasoning is the “optimal” way to make decisions. But in practice, formalizing assumptions is both hard and dangerous. For most people in most situations, it’s probably safer to use normal human judgement, but keep Bayesian reasoning in mind as a guide. P.S. The mentoring applications were so unbelievably impressive that I decided to pick winners randomly. I tried very hard to email everyone who applied but kept getting blocked as spam, even when trying to send out notifications in small batches. (What a web we weave.) So I’m very sorry if you didn’t get my email. I read every application and I was humbled by the amazing things you are all doing. I wish I could have accepted everyone.
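For anyone who wants to check the arithmetic in the lupus and gold coin examples, here is a small sketch in Python that does it by plain counting. The function names are made up for this illustration, and two things are assumptions rather than statements from the text: that “90% accurate” means the test is right 90% of the time for both sick and healthy people, and that the gold coin has one head and one tail.

```python
from fractions import Fraction


def lupus_counts(people=2000, base_rate=Fraction(1, 200), accuracy=Fraction(9, 10)):
    """Count true and false positives if everyone in the group is tested.

    Assumes the stated accuracy applies equally to sick and healthy people.
    Returns (true positives, false positives, share of positives that are real).
    """
    sick = people * base_rate
    healthy = people - sick
    true_pos = sick * accuracy
    false_pos = healthy * (1 - accuracy)
    return true_pos, false_pos, true_pos / (true_pos + false_pos)


def coin_value(observed_side=None):
    """Average dollar value of the drawn coin, optionally given the side
    that landed face up ("H" or "T")."""
    # One gold coin (assumed to have a head and a tail) and four fakes
    # with heads on both sides.
    coins = [(1000, "HT")] + [(0, "HH")] * 4
    # Every (coin, face-up side) pair is equally likely.
    outcomes = [(value, side) for value, sides in coins for side in sides]
    if observed_side is not None:
        outcomes = [(v, s) for v, s in outcomes if s == observed_side]
    return Fraction(sum(v for v, _ in outcomes), len(outcomes))


true_pos, false_pos, p_lupus = lupus_counts()
print(true_pos, false_pos, float(p_lupus))
print(coin_value(), float(coin_value("H")))
```

Under those assumptions this reproduces the numbers used in the examples: 9 true positives against 199 false positives (about 4.3%), $200 on average for the blind draw, and $1000/9 ≈ $111.11 once you have seen heads. The counting is the easy part. The squishy part is deciding which numbers to feed in.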

a month ago • 14 votes

The first RCT for GLP-1 drugs and alcoholism isn’t what we hoped

GLP-1 drugs are a miracle for diabetes and obesity. There are rumors that they might also be a miracle for addiction to alcohol, drugs, nicotine, and gambling. That would be good. We like miracles. We just got the first good trial and, despite what you might have heard, it’s not very encouraging. Semaglutide (aka Wegovy / Ozempic) is a GLP-1 agonist. This means it binds to the same receptors the glucagon-like peptide-1 hormone normally binds to. Similar drugs include dulaglutide, exenatide, liraglutide, lixisenatide, and tirzepatide. These were originally investigated for diabetes, on the theory that GLP-1 increases insulin and thus decreases blood sugar. But GLP-1 seems to have lots of other effects, like preventing glucose from entering the bloodstream, slowing digestion, and making you feel full longer. It was found to cause sharp decreases in body mass, which is why supposedly 12% of Americans had tried one of these drugs by mid 2024. (I’m skeptical of that 12% number, but a different survey in late 2024 found that 10% of Americans were currently taking one of these drugs. I know Americans take more drugs than anyone on the planet, but still…) Anyway, there are vast reports from people taking these drugs that they help with various addictions. Many people report stopping drinking or smoking without even trying. This is plausible enough. We don’t know which of the many effects of these drugs is really helping with obesity. Maybe it’s not the effects on blood sugar that matter, but these drugs have some kind of generalized “anti-addiction” effect on the brain? Or maybe screwing around with blood sugar changes willpower? Or maybe when people get thinner, that changes how the brain works? Who knows. Beyond anecdotes, there are some observational studies and animal experiments suggesting they might help with addiction (OKeefe et al. 2024). We are so desperate for data that some researchers have even resorted to computing statistics based on what people say on reddit.
So while it seems plausible these drugs might help with other addictions, there’s limited data and no clear story for why this should happen biologically. This makes the first RCT, which came out last week, very interesting. This paper contains this figure, about which everyone is going crazy: I admit this looks good. This is indeed a figure in which the orange bar is higher than the blue bar. However: This figure does not mean what you think it means. Despite the label, this isn’t actually the amount of alcohol people consumed. What’s shown is a regression coefficient, which was calculated on a non-random subset of subjects. There are other figures. Why isn’t anyone talking about the other figures? What they did This trial gathered 48 participants. They selected them according to the DSM-5 definition of “alcohol use disorder” which happens to be more than 14 drinks per week for men and 7 drinks per week for women, plus at least 2 heavy drinking episodes. Perhaps because of this lower threshold, 34 of the subjects were women. The trial lasted 9 weeks. During it, half of the subjects were given weekly placebo injections. The other half were given weekly injections of increasing amounts of semaglutide: 0.25 mg for 4 weeks, then 0.5 mg for 4 weeks, and then 0.5 or 1 mg in the last week, depending on a doctor’s judgement. Outcome 1: Drinking The first outcome was to simply ask people to record how much they drank in daily life. Here are the results: If I understand correctly, at some point 6 out of the 24 subjects in the placebo group stopped providing these records, and 3 out of 24 in the semaglutide group. I believe the above shows the data for whatever subset of people were still cooperating on each week. It’s not clear to me what bias this might produce. When I first saw that figure, I thought it looked good. The lines are going down, and the semaglutide line is lower. But then I checked the appendix. (Protip: Always check the appendix.) 
This contains the same data, but stratified by whether people were obese or not:
Now it looks like semaglutide isn’t doing anything. It’s just that among the non-obese, the semaglutide group happened to start at a lower baseline.
How to reconcile this with the earlier figure? Well, if you look carefully, it doesn’t really show any benefit to semaglutide either. There’s a difference between the two curves, but it was there from the beginning. Over time, there’s no difference in the difference, which is what we’d expect to see if semaglutide was helping.
The paper provides other measurements like “changes in drinking days” and “changes in heavy drinking days” and “changes in drinks per drinking day”, but it’s the same story: Either no benefit or no difference.
So… This is a small sample. It only lasted nine weeks, and subjects spent many of them on pretty small doses. But this is far from the miracle we hoped for. Some effect might be hiding in the noise, but what these results most look like is zero effect.
Outcome 2: Delayed drinking
There are also lab experiments. They did these at both the start and end of the study.
In the first experiment, they basically set each subject’s favorite alcoholic drink in front of them and said to them, “For each minute you wait before drinking this, we will pay you, up to a maximum of 50 minutes.”
How much were they paid, you ask? Oddly, that’s not specified in the paper. It’s also not specified in the supplemental information. It’s also not specified in the 289-page application they made to the FDA to be able to do this study. (Good times!) But there is a citation for a different paper in which people were paid $0.24 / minute, decreasing by $0.01 / minute every five minutes. If they used the same amounts here, then the maximum subjects could earn was $9.75.
Anyway, here are the results:
So… basically nothing? Because almost everyone waited the full 50 minutes? And they did this for only $9.75? Seems weird.
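The $9.75 figure can be checked with a few lines of arithmetic. This is only a sketch under assumptions: it supposes the rate starts at $0.24 for each of the first five minutes, drops by $0.01 at every five-minute boundary, and is paid per whole minute waited. The function name `max_payout` and its parameters are made up for illustration; the paper itself does not state the amounts.

```python
from decimal import Decimal


def max_payout(total_minutes=50,
               start_rate=Decimal("0.24"),
               step=Decimal("0.01"),
               block_minutes=5):
    """Sum the per-minute payments for waiting `total_minutes`.

    Assumed schedule (taken from the cited paper, not this one): the rate
    is `start_rate` per minute and falls by `step` after every
    `block_minutes` minutes.
    """
    total = Decimal("0")
    for minute in range(total_minutes):
        # Number of completed five-minute blocks decides the current rate.
        rate = start_rate - step * (minute // block_minutes)
        total += rate
    return total


print(max_payout())  # 9.75
```

Under that schedule the last five minutes pay $0.15 each, so waiting the full 50 minutes earns ten blocks of five minutes at rates from $0.24 down to $0.15.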
I don’t really see this as evidence against semaglutide. Rather, I think this didn’t end up proving much in either direction.
Outcome 3: Laboratory drinking
So what’s with that initial figure? Well, after the delayed drinking experiment was over, the subjects were given 2 hours to drink as much as they wanted, up to some kind of safe limit. This is what led to the figure everyone is so excited about:
When I first saw this, I too thought it looked good. I thought it looked so good that I started writing this post, eager to share the good news. But at some point I read the caption more carefully and my Spidey sense started tingling.
There are two issues here. First of all, subjects were free to skip this part of the experiment, and a lot did. Only 12 of the 24 subjects in the placebo group and 13 of 24 in the semaglutide group actually did it.
This means the results are non-randomized. I mean, the people who declined to do this experiment would probably have drunk different amounts than those who agreed, right? So if semaglutide had any influence on people’s decision to participate (e.g. because it changed their relationship with alcohol, which is the hypothesis of this research) then the results would be biased. That bias could potentially go in either direction. But basically this means we’re sort of working with observational data.
The second issue is that what’s being shown in this plot is not data. I know it looks like data, but what’s shown are numbers derived from regression coefficients. In the appendix, you can find this table:
Basically, they fit a regression to predict how much people drank in this experiment at the end of the study (“g-EtOH”) based on (a) how much they drank during the same experiment at the start of the study (“Baseline”), (b) their sex, and (c) whether they got semaglutide or not (“Condition”). Those coefficients are in the B column.
How exactly they got from these coefficients to the numbers in the figure isn’t entirely clear to me.
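For orientation, the regression just described can be written out. This is a sketch of the general form implied by the description, not the paper’s stated procedure. The symbols $\beta_0,\dots,\beta_3$ are notation introduced here (with $\beta_3$ standing for the “Condition” entry in the B column), and coding Condition as 1 for semaglutide and 0 for placebo is an assumption.

```latex
\begin{align*}
\text{g-EtOH}_{\text{end}}
  &= \beta_0 + \beta_1\,\text{Baseline} + \beta_2\,\text{Sex}
     + \beta_3\,\text{Condition} + \varepsilon \\[4pt]
\bar{\hat{y}}_{\text{sema}} - \bar{\hat{y}}_{\text{placebo}}
  &= \beta_3
     + \beta_1\bigl(\overline{\text{Baseline}}_{\text{sema}}
                    - \overline{\text{Baseline}}_{\text{placebo}}\bigr)
     + \beta_2\bigl(\overline{\text{Sex}}_{\text{sema}}
                    - \overline{\text{Sex}}_{\text{placebo}}\bigr)
\end{align*}
```

The second line is the gap between the two groups’ average predicted values under a linear model like this one. If both groups had the same average baseline drinking and the same sex mix, it would reduce to $\beta_3$ alone.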
But using a plot digitizer I found that the figure shows ~59.9 g for the placebo group and ~33.3 g for the semaglutide group, for a difference of 26.6 g. I believe that difference comes from the regression coefficient for “Condition” (-25.32) plus some adjustments for the fact that sex and baseline consumption vary a bit between the two groups.
So… that’s not nothing! This is some evidence in favor of semaglutide being helpful. But it’s still basically just a regression coefficient computed on a non-randomized sample. Which is sad, since the point of RCTs is to avoid resorting to regression coefficients on non-randomized samples. Thus, I put much more faith in outcome #1.
Discussion
To summarize, the most reliable outcome of this paper was how much people reported drinking in daily life. No effect was observed there. The laboratory experiment suggests some effect, but the evidence is much weaker. When you combine the two, the results of this paper are quite bad, at least relative to my (high) hopes.
Obviously, just because the results are disappointing does not mean the research was bad. The measure of science is the importance of the questions, not what the answers happen to be. It’s unfortunate that a non-randomized sample participated in the final drinking experiment, but what were they supposed to do, force them? This experiment involved giving a synthetic hormone and an addictive substance to people with a use disorder. If you have any doubts about the amount of work necessary to bring that to reality, I strongly encourage you to look at the FDA application.
OK, fine, I admit that I do feel this paper “hides the bodies” slightly too effectively, in a way that could mislead people who aren’t experts or who don’t read the paper carefully. I think I’m on firm ground with that complaint, since in the discussions I’ve seen, 100% of people were in fact misled.
But I’m sympathetic to the reality that most reviewers don’t share my enlightened views about judging science, and that a hypothetical paper written with my level of skepticism would never be published. (People think the problem with science is that it’s too woke. While I don’t really disagree, I still think the bigger problem is screwed up incentives that force everyone to oversell everything, because that’s what you have to do to survive. But that’s a story for another time.)
Anyway, despite these results, I’m still hopeful that GLP-1 drugs might help with addiction. This is a relatively small study, and it only lasted 9 weeks. I don’t think we can dismiss the huge number of anecdotes yet. And the laboratory experiment was at least a little promising. Given how destructive addictions can be, I vote for more research in this direction. Fortunately, given the billions of dollars to be made, that’s sure to happen.
But given just how miraculous semaglutide is for obesity, and given the miraculous anecdotes, I don’t see how to spin this paper as anything but a letdown. It provides weak evidence for any effect and comes close to excluding the possibility of another miracle.
If you’ve forgotten what miracles look like, here is the figure for body weight:

a month ago • 16 votes
