Many machine learning researchers worry about risks from building artificial intelligence (AI). This includes me -- I think AI has the potential to change the world in both wonderful and terrible ways, and we will need to work hard to get to the wonderful outcomes. Part of that hard work involves doing our best to experimentally ground and scientifically evaluate potential risks. One popular AI risk centers on [AGI misalignment](https://en.wikipedia.org/wiki/AI_alignment). It posits that we will build a superintelligent, super-capable, AI, but that the AI's objectives will be misspecified and misaligned with human values. If the AI is powerful enough, and pursues its objectives inflexibly enough, then even a subtle misalignment might pose an existential risk to humanity. For instance, if an AI is...
over a year ago


More from Jascha’s blog

Neural network training makes beautiful fractals

My five-year-old daughter came home from kindergarten a few months ago, and told my partner and me that math was stupid (!). We have since been working (so far successfully) to make her more excited about all things math, and more proud of her math accomplishments. One success we've had is that she is now very interested in fractals in general, and in particular enjoys watching deep zoom videos into [Mandelbrot](https://youtu.be/8cgp2WNNKmQ?si=PD7W2q4qDNY9AgzD) and [Mandelbulb](https://youtu.be/BLmAV6O_ea0?si=4iyAFMgzde0mTmsq) fractal sets, and eating [romanesco broccoli](https://en.wikipedia.org/wiki/Romanesco_broccoli).

My daughter's interest has made me think a lot about fractals, and about the ways in which fractals relate to a passion of mine, which is artificial neural networks. I've realized that there are similarities between the way in which many fractals are generated, and the way in which we train neural networks. Both involve repeatedly applying a function to its own output. In both cases, that function has hyperparameters that control its behavior. In both cases the repeated function application can produce outputs that either diverge to infinity or remain happily bounded depending on those hyperparameters. Fractals are often defined by the boundary between hyperparameters where function iteration diverges or remains bounded.

Motivated by these similarities, I looked for fractal structure in the hyperparameter landscapes of neural network training. And I found it! The boundary between hyperparameters for which neural network training succeeds or fails has (gorgeous, organic) fractal structure.
Details, and beautiful videos, below. For a more technical presentation, see the short paper [*The boundary of neural network trainability is fractal*](https://arxiv.org/abs/2402.06184).

# Neural network training and hyperparameters

In order to train an artificial neural network, we iteratively update its parameters to make it perform better. We often do this by performing [gradient descent](https://en.wikipedia.org/wiki/Gradient_descent) steps on a loss function. The loss function is a measure of the neural network's performance. By descending the loss by gradient descent, we find values of the parameters for which the neural network performs well.

Training depends on *hyperparameters*, which specify details about how parameter update steps should be performed and how the network should be initialized. For instance, one common hyperparameter is the learning rate, which sets the magnitude of the update we make to the model’s parameters at every training step.

If the learning rate is too large, then the parameter update steps are too large. This causes the parameters to diverge (grow towards infinity) during training, and as a result causes the training loss to become very bad. If the learning rate is too small, the training steps are too short, and it takes a very large number of training steps to train the neural network. Requiring a very large number of training steps makes training slow and expensive. In practice, we often want to make the learning rate as large as possible, without making it so large that the parameters diverge.

# Visualizing the hyperparameter landscape

We can visualize how adjusting hyperparameters (like the learning rate) affects how quickly a neural network either trains or diverges. In the following image, each pixel corresponds to training the same neural network from the same initialization on the same data -- but with *different hyperparameters*.
Blue-green colors mean that training *converged* for those hyperparameters, and the network successfully trained. Red-yellow colors mean that training *diverged* for those hyperparameters. The paler the color, the faster the convergence or divergence.

The neural network I used in this experiment is small and simple; it consists of an input layer, a $\operatorname{tanh}$ nonlinearity, and an output layer[^netdetails]. In the image, the x-coordinate changes the learning rate for the input layer’s parameters, and the y-coordinate changes the learning rate for the output layer’s parameters.

![Figure [p_ml]: **Hyperparameter landscape: A visualization of how neural network training success depends on learning rate hyperparameters.** Each pixel corresponds to a training run with the specified input and output layer learning rates. Training runs shown in blue-green converged, while training runs shown in red-yellow diverged.[^saturation] Hyperparameters leading to the best performance (lightest blue-green) are typically very close to hyperparameters for which training diverges, so the boundary region is of particular interest.](/assets/fractal/zoom_sequence_width-16_depth-2_datasetparamratio-1.0_minibatch-None_nonlinearity-tanh_phasespace-lr_vs_lr_step-0.png width="444px" border="1")

The best performing hyperparameters -- those that are shown with the palest blue-green shade, and for which the neural network trains the most quickly -- are near the boundary between hyperparameters for which training converges and for which it diverges. This is a general property. The best hyperparameters for neural network training are usually very near the edge of stability. For instance, as suggested above, the best learning rate in a grid search is typically the largest learning rate for which training converges rather than diverges.
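To make the setup concrete, here is a rough sketch of the kind of experiment described above: a small bias-free two-layer $\operatorname{tanh}$ network trained by full-batch gradient descent, with one learning rate for the input layer and another for the output layer, evaluated on a coarse grid. This is not the code from the colab. The function names (`make_problem`, `final_loss`, `landscape`), the scalar output, the initialization scale, the number of steps, the divergence threshold, and the grid range are all assumptions picked for illustration.

```python
import numpy as np


def make_problem(width=16, seed=0):
    """Fixed random data and initial weights for a bias-free two-layer tanh network."""
    rng = np.random.default_rng(seed)
    n_params = width * width + width  # input matrix plus output vector (scalar output is an assumption)
    X = rng.standard_normal((n_params, width))  # one datapoint per free parameter
    y = rng.standard_normal(n_params)
    W0 = rng.standard_normal((width, width)) / np.sqrt(width)  # assumed init scale
    V0 = rng.standard_normal(width) / np.sqrt(width)
    return X, y, W0, V0


def final_loss(lr_in, lr_out, problem, steps=200, blowup=1e6):
    """Full-batch gradient descent with per-layer learning rates.

    Returns the final mean squared error, or np.inf if the loss became
    non-finite or exceeded `blowup` (an assumed divergence threshold).
    """
    X, y, W0, V0 = problem
    W, V = W0.copy(), V0.copy()
    n = X.shape[0]
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(steps):
            H = np.tanh(X @ W)                 # hidden activations, shape (n, width)
            err = H @ V - y                    # residuals, shape (n,)
            loss = np.mean(err ** 2)
            if not np.isfinite(loss) or loss > blowup:
                return np.inf
            grad_V = (2.0 / n) * (H.T @ err)   # gradient w.r.t. output weights
            grad_pre = (2.0 / n) * np.outer(err, V) * (1.0 - H ** 2)
            grad_W = X.T @ grad_pre            # gradient w.r.t. input weights
            W -= lr_in * grad_W                # input layer uses its own learning rate
            V -= lr_out * grad_V               # output layer uses its own learning rate
        loss = np.mean((np.tanh(X @ W) @ V - y) ** 2)
    return float(loss) if np.isfinite(loss) and loss <= blowup else np.inf


def landscape(lrs_in, lrs_out, problem, **kwargs):
    """Boolean grid, True where training stayed bounded. Rows index lr_out, columns lr_in."""
    grid = np.zeros((len(lrs_out), len(lrs_in)), dtype=bool)
    for i, lr_out in enumerate(lrs_out):
        for j, lr_in in enumerate(lrs_in):
            grid[i, j] = np.isfinite(final_loss(lr_in, lr_out, problem, **kwargs))
    return grid


if __name__ == "__main__":
    problem = make_problem()
    lrs = np.logspace(-3, 3, 7)  # coarse, assumed range of learning rates
    grid = landscape(lrs, lrs, problem)
    for row in grid[::-1]:       # print with the largest output learning rate at the top
        print("".join("o" if ok else "x" for ok in row))
```

Each grid cell reruns training from the same initialization on the same data, so only the two learning rates differ between cells. A much finer grid, plus coloring by how quickly the loss falls or blows up rather than a yes/no flag, would be needed to approach the images shown here.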
# The boundary of neural network trainability is fractal

Because it is where we find the best hyperparameters, the boundary between hyperparameters that lead to converging or diverging training is of particular interest to us. Let’s take a closer look at it. Play the following video (I recommend playing it full screen, and increasing the playback resolution):

As we zoom into the boundary between hyperparameter configurations where training succeeds (blue) and fails (red), we find intricate structure at every scale. The boundary of neural network trainability is fractal! 🤯

(If you watched the video to the end, you saw it turn blocky in the last frames. During network training I used the $\operatorname{float64}$ numeric type, which stores numbers with around 16 decimal digits of precision. The blockiness is what happens when we zoom in so far that we need more than 16 digits of precision to tell pixels apart.)

This behavior is general. We see fractals if we change the data, change the architecture, or change the hyperparameters we look at. The fractals look qualitatively different for different choices though. Network and training design decisions also have artistic consequences!

![Figure [paper]: **Neural network training produces fractals in all of the experimental configurations I tried.** The figure is taken from the [companion paper](https://arxiv.org/abs/2402.06184), and shows a region of the fractal resulting from each experimental condition. Experimental conditions changed the nonlinearity in the network, changed the dataset size, changed between minibatch and full batch training, and changed the hyperparameters we look at.](/assets/fractal/fractal_tiles_midres.png width="444px" border="1")

Here are the remaining fractal zoom videos for the diverse configurations summarized in Figure [paper].
You can find code for these experiments in [this colab](https://colab.research.google.com/github/Sohl-Dickstein/fractal/blob/main/the_boundary_of_neural_network_trainability_is_fractal.ipynb)[^beware].

- **Change the activation function to the identity function:** i.e. the network is a deep linear network, with no nonlinearity.
- **Change the activation function to $\operatorname{ReLU}$:** This is a neat fractal, since the piecewise linear structure of the $\operatorname{ReLU}$ is visually apparent in the straight lines dividing regions of the fractal.
- **Train with a dataset size of 1:** i.e. only train on a single datapoint. Other experiments have a number of training datapoints which is the same as the free parameter count of the model.
- **Train with a minibatch size of 16:** Other experiments use full batch training.
- **Look at different hyperparameters:** I add a hyperparameter which sets the mean value of the neural network weights at initialization. I visualize training success in terms of this weight initialization hyperparameter (*x-axis*) and a single learning rate hyperparameter (*y-axis*). Other experiments visualize training success in terms of learning rate hyperparameters for each layer. This fractal is **extra pretty** -- I like how it goes through cycles where what seems like noise is resolved to be structure at a higher resolution.

# This isn’t so strange after all

Now that I’ve shown you something surprising and beautiful, let me tell you why we should have expected it all along. In an academic paper I would put this section first, and tell the story as if I knew fractals would be there -- but of course I didn't know what I would find until I ran the experiment!

## Fractals result from repeated iteration of a function

One common way to make a fractal is to iterate a function repeatedly, and identify boundaries where the behavior of the iterated function changes.
We can refer to these boundaries as bifurcation boundaries of the iterated function; the dynamics bifurcate at this boundary, in that function iteration leads to dramatically different sequences on either side of the boundary.

For instance, to generate the Mandelbrot set, we iterate the function $f( z; c ) = z^2 + c$ over and over again. The Mandelbrot fractal is the bifurcation boundary between the values of $c$ in the complex plane for which this iterated function diverges, and for which it remains bounded. The parameter $c$ is a (hyper)parameter of the function $f( z; c )$, similarly to how learning rates are hyperparameters for neural network training.

![Figure [mandelbrot fractal]: **The Mandelbrot fractal is generated by iterating a simple function, similar to the way in which update steps are iterated when training a neural network.** The image is color coded by whether iterations started at a point diverge (red-yellow colors) or remain bounded (blue-green colors). The boundary between the diverging and bounded regions is fractal. This image was generated by [this colab](https://colab.research.google.com/github/Sohl-Dickstein/fractal/blob/main/the_boundary_of_neural_network_trainability_is_fractal.ipynb).](/assets/fractal/mandelbrot_midres.png width="444px" border="1")

Other examples of fractals which are formed by bifurcation boundaries include [magnet fractals](https://paulbourke.net/fractals/magnet/), [Lyapunov fractals](https://en.wikipedia.org/wiki/Lyapunov_fractal), the [quadratic Julia set](https://mathworld.wolfram.com/JuliaSet.html), and the [Burning Ship fractal](https://en.wikipedia.org/wiki/Burning_Ship_fractal).

## Fractals can result from optimization

One particularly relevant class of bifurcation fractals is [Newton fractals](https://en.wikipedia.org/wiki/Newton_fractal). These are generated by iterating Newton's method to find the roots of a polynomial. [Newton's method is an optimization algorithm](https://en.wikipedia.org/wiki/Newton%27s_method_in_optimization).
Newton fractals are thus a proof of principle that fractals can result from iterating steps of an optimization algorithm.

![Figure [newton fractal]: **Newton fractals, like the one shown, are formed by iterating Newton's method to find roots of a polynomial, and color coding initial conditions by the specific root the iterates converge to.** Newton fractals are a proof of principle that optimization can generate a fractal, since Newton's method is an optimization procedure. They motivate the idea of fractal behavior resulting from training (i.e. optimizing) a neural network.](/assets/fractal/Julia_set_for_the_rational_function.png width="444px" border="1")

## Artificial neural networks are trained by repeatedly iterating a function

When we train a neural network by iterating steps of gradient descent, we are iterating a fixed function, the same as for Mandelbrot, Newton, and other fractals. Like for Newton fractals, this fixed function corresponds to an optimization algorithm. Specifically, when we train a neural network using steepest gradient descent with a constant learning rate, we iterate the fixed function $f(\theta; \eta ) = \theta - \eta\, g( \theta )$. Here $\eta$ is the learning rate hyperparameter, $\theta$ are the parameters of the neural network, and $g( \theta )$ is the gradient of the loss function.

There are many differences between neural network training and traditional fractal generation. The fractals I just discussed all involve iterating a function of a single (complex valued) number. The equation defining the iterated function is short and simple, and takes less than a line of text to write down. On the other hand, neural network training iterates a function for all the parameters in the neural network. Some neural networks have trillions of parameters, which means the input and output of the iterated function is described with *trillions* of numbers, one for each parameter.
The equation for a neural network training update is similarly far more complex than the function which is iterated for traditional fractals; it would require many lines, or possibly many pages, to write down the parameter update equations for a large neural network.

Nonetheless, training a neural network can be seen as a scaled up version of the type of iterative process that generates traditional fractals. We should not be surprised that it produces fractals in a similar way to simpler iterative processes.[^symmetry]

# Closing thoughts

## Meta-learning is hard

Meta-learning is a research area that I believe will transform AI over the next several years. In meta-learning we *learn* aspects of AI pipelines which are traditionally hand designed. For instance, we might meta-train functions to initialize, [optimize](https://github.com/google/learned_optimization/tree/main/learned_optimization/research/general_lopt), or regularize neural networks. If deep learning has taught us one thing, it's that with enough compute and data, trained neural networks can outperform and replace hand-designed heuristics; in meta-learning, we apply the same lesson to replace the hand-designed heuristics we use to train the neural networks themselves. Meta-learning is the reason I became interested in hyperparameter landscapes.

The fractal hyperparameter landscapes we saw above help us understand some of the challenges we face in meta-learning. The process of meta-training usually involves optimizing hyperparameters (or meta-parameters) by gradient descent. The loss function we perform meta-gradient-descent on is called the meta-loss. The fractal landscapes we have been visualizing are also meta-loss landscapes; we are visualizing how well training succeeds (or fails) as we change hyperparameters.

In practice, we often find the meta-loss atrocious to work with.
It is often *chaotic* in the hyperparameters, which makes it [very difficult to descend](https://arxiv.org/abs/1810.10180)[^meta-descent]. Our results suggest a more nuanced and also more general perspective; meta-loss landscapes are chaotic because they are fractal. At every length scale, small changes in the hyperparameters can lead to large changes in training dynamics.

![Figure [meta landscape]: **Chaotic meta-loss landscapes make meta-learning challenging.** The image shows an example meta-loss landscape for a learned optimizer, with darker colors corresponding to better meta-loss. The two axes correspond to two of the meta-parameters of the learned optimizer (similar to the visualization in Figure [p_ml], where axes correspond to two hyperparameters). See [this paper](https://arxiv.org/abs/1810.10180) for details. This meta-loss landscape is difficult to meta-train on, since steepest gradient descent will become stuck in valleys or local minima, and because the gradients of the rapidly changing meta-loss function are exceptionally high variance.](/assets/fractal/meta-loss-landscape.png width="444px" border="1")

## Fractals are beautiful and relaxing

Recent AI projects I have collaborated on have felt freighted with historical significance. We are building tools that will change people's lives, and maybe bend the arc of history, for both [better and worse](/2023/09/10/diversity-ai-risk.html). This is incredibly exciting! But it is often also stressful. This project on the other hand ... was just fun. I started the project because my daughter thought fractals were mesmerizing, and I think the final results are gorgeous. I hope you enjoy it in the same spirit!

-----

# Acknowledgements

Thank you to Maika Mars Miyakawa Sohl-Dickstein for inspiring the original idea, and for detailed feedback on the generated fractals. Thank you to Asako Miyakawa for providing feedback on a draft of this post.
[^netdetails]: In more detail, the baseline neural network architecture, design, and training configuration is as follows:

- Two layer fully connected neural network, with 16 units in the input and hidden layers, and with no bias parameters. The only parameters are the input layer weight matrix, and the output layer weight matrix.
- $\operatorname{tanh}$ nonlinearity in the single hidden layer
- Mean square error loss
- Fixed random training dataset, with number of datapoints the same as the number of free parameters in the network
- Full batch steepest descent training, with a constant learning rate
- **A different learning rate for each layer.** That is, rather than training the input and output layer weight matrices with the same learning rate, each weight matrix has its own learning rate hyperparameter.

All experiments change one aspect of this configuration, except for the baseline experiment, which follows this configuration without change. If you want even more detail, see the [arXiv note](https://arxiv.org/abs/2402.06184) or the [colab notebook I used for all experiments](https://colab.research.google.com/github/Sohl-Dickstein/fractal/blob/main/the_boundary_of_neural_network_trainability_is_fractal.ipynb).

[^saturation]: The discerning reader may have noticed that training diverges when the output learning rate is made large, but that if the input learning rate is made large, performance worsens but nothing diverges. This is due to the $\operatorname{tanh}$ nonlinearity saturating. When the input learning rate is large, the input weights become large, the hidden layer pre-activations become large, and the $\operatorname{tanh}$ units saturate (their outputs grow very close to either -1 or 1). The output layer can still train on the (essentially frozen) $[-1, 1]$ activations from the first layer, and so some learning can still occur.

[^beware]: Like the fractals, the research code in the colab has vibes of layered organic complexity ... user beware!
[^symmetry]: Many fractals are generated by iterating simple functions, such as low order polynomials, or ratios of low order polynomials. Iterating these simple functions often generates simple symmetries, that are visually obvious when looking at the resulting fractals. The fractals resulting from neural networks are more organic, with fewer visually obvious symmetries. This is likely due to the higher complexity of the iterated functions themselves, as well as the many random parameters in the function definitions, stemming from the random initialization of the neural network and random training data.

[^meta-descent]: My collaborators and I have done more research into how to optimize a chaotic meta-loss. Especially see the papers: [*Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies*](https://icml.cc/virtual/2021/poster/10175), and [*Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution Strategies*](https://openreview.net/forum?id=VhbV56AJNt).

a year ago 44 votes
Brain dump on the diversity of AI risk

AI has the power to change the world in both wonderful and terrible ways. We should try to make the wonderful outcomes more likely than the terrible ones. Towards that end, here is a brain dump of my thoughts about how AI might go wrong, in rough outline form. I am not the first person to have any of these thoughts, but collecting and structuring these risks was useful for me. Hopefully reading them will be useful for you.

My top fears include targeted manipulation of humans, autonomous weapons, massive job loss, AI-enabled surveillance and subjugation, widespread failure of societal mechanisms, extreme concentration of power, and loss of human control. I want to emphasize -- I expect AI to lead to far more good than harm, but part of achieving that is thinking carefully about risk.

# Warmup: Future AI capabilities and evaluating risk

1. Over the last several years, AI has developed remarkable new capabilities. These include [writing software](https://github.com/features/copilot), [writing essays](https://www.nytimes.com/2023/08/24/technology/how-schools-can-survive-and-maybe-even-thrive-with-ai-this-fall.html), [passing the bar exam](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4389233), [generating realistic images](https://imagen.research.google/), [predicting how proteins will fold](https://www.deepmind.com/research/highlighted-research/alphafold), and [drawing unicorns in TikZ](https://arxiv.org/abs/2303.12712). (The last one is only slightly tongue in cheek. Controlling 2d images after being trained only on text is impressive.)
1. AI will continue to develop remarkable new capabilities.
    * Humans aren't irreplicable.
There is no fundamental barrier to creating machines that can accomplish anything a group of humans can accomplish (excluding tasks that rely in their definition on being performed by a human).
        * For intellectual work, AI will become cheaper and faster than humans
        * For physical work, we are likely to see a sudden transition, from expensive robots that do narrow things in very specific situations, to cheap robots that can be repurposed to do many things.
            * The more capable and adaptable the software controlling a robot is, the cheaper, less reliable, and less well calibrated the sensors, actuators, and body need to be.
            * Scaling laws teach us that AI models can be improved by scaling up training data. I expect a virtuous cycle where somewhat general robots become capable enough to be widely deployed, enabling collection of much larger-scale diverse robotics data, leading to more capable robots.
        * The timeline for broadly human-level capabilities is hard to [predict](https://bounded-regret.ghost.io/scoring-ml-forecasts-for-2023/). My guess is more than 4 years and less than 40.
    * AI will do things that no human can do.
        * Operate faster than humans.
        * Repeat the same complex operation many times in a consistent and reliable way.
        * Tap into broader capabilities than any single human can tap into. e.g. the same model can [pass a medical exam](https://arxiv.org/abs/2303.13375), answer questions about [physics](https://benathi.github.io/blogs/2023-03/gpt4-physics-olympiad/) and [cosmology](https://www.linkedin.com/pulse/asking-gpt-4-cosmology-gabriel-altay/), [perform mathematical reasoning](https://blog.research.google/2022/06/minerva-solving-quantitative-reasoning.html?m=1), read [every human language](https://www.reddit.com/r/OpenAI/comments/13hvqfr/native_bilinguals_is_gpt4_equally_as_impressive/) ... and make unexpected connections between these fields.
        * Go deeper in a narrow area of expertise than a human could. e.g.
an AI can read every email and calendar event you've ever received, web page you've looked at, and book you've read, and remind you of past context whenever anything -- person, topic, place -- comes up that's related to your past experience. Even the most dedicated personal human assistant would be unable to achieve the same degree of familiarity.
        * Share knowledge or capabilities directly, without going through a slow and costly teaching process. If an AI model gains a skill, that skill can be shared by copying the model's parameters. Humans are unable to gain new skills by copying patterns of neural connectivity from each other.
1. AI capabilities will have profound effects on the world.
    * Those effects have the possibility of being wonderful, terrible, or (most likely) some complicated mixture of the two.
    * There is not going to be just one consequence from advanced AI. AI will produce lots of different profound side effects, **all at once**. The fears below should not be considered as competing scenarios. You should rather imagine the chaos that will occur when variants of many of the below fears materialize simultaneously. (see the concept of [polycrisis](https://www.weforum.org/agenda/2023/03/polycrisis-adam-tooze-historian-explains/))
1. When deciding what AI risks to focus on, we should evaluate:
    * **probability:** How likely are the events that lead to this risk?
    * **severity:** If this risk occurs, how large is the resulting harm? (Different people will assign different severities based on different value systems. This is OK. I expect better outcomes if different groups focus on different types of risk.)
    * **cascading consequences:** Near-future AI risks could lead to the disruption of the social and institutional structures that enable us to take concerted rational action. If this risk occurs, how will it impact our ability to handle later AI risks?
    * **comparative advantage:** What skills or resources do I have that give me unusual leverage to understand or mitigate this particular risk?
1. We should take *social disruption* seriously as a negative outcome. This can be far worse than partisans having unhinged arguments in the media. If the mechanisms of society are truly disrupted, we should expect outcomes like violent crime, kidnapping, fascism, war, rampant addiction, and unreliable access to essentials like food, electricity, communication, and firefighters.
1. Mitigating most AI-related risks involves tackling a complex mess of overlapping social, commercial, economic, religious, political, geopolitical, and technical challenges. I come from an ML science + engineering background, and I am going to focus on suggesting mitigations in the areas where I have expertise. *We desperately need people with diverse interdisciplinary backgrounds working on non-technical mitigations for AI risk.*

# Specific risks and harms stemming from AI

1. The capabilities and limitations of present day AI are already causing or exacerbating harms.
    * Harms include: generating socially biased results; generating (or failing to recognize) toxic content; generating bullshit and lies (current large language models are poorly grounded in the truth even when used and created with the best intents); causing addiction and radicalization (through gamification and addictive recommender systems).
    * These AI behaviors are already damaging lives. e.g. see the use of racially biased ML to [recommend criminal sentencing](https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing)
    * I am not going to focus on this class of risk, despite its importance. These risks are already a topic of research and concern, though more resources are needed. I am going to focus on future risks, where less work is (mostly) being done towards mitigations.
1. AI will do most jobs that are currently done by humans.
    * This is likely to lead to massive unemployment.
    * This is likely to lead to massive social disruption.
    * I'm unsure in what order jobs will be supplanted. The tasks that are hard or easy for an AI are different from the tasks that are hard or easy for a person. We have terrible intuition for this difference.
        * Five years ago I would have guessed that generating commissioned art from a description would be one of the last, rather than one of the first, human tasks to be automated.
    * Most human jobs involve a diversity of skills. We should expect many jobs to [transform as parts of them are automated, before they disappear](https://www.journals.uchicago.edu/doi/full/10.1086/718327).
    * Most of the mitigations for job loss are social and political.
        * [Universal basic income](https://en.wikipedia.org/wiki/Universal_basic_income).
    * Technical mitigations:
        * Favor research and product directions that seem likely to be more complementary and enabling, and less competitive, with human job roles. Almost everything will have a little of both characters ... but the balance between enabling vs. competing with humans is a question we should be explicitly thinking about when we choose projects.
1. AI will enable extremely effective targeted manipulation of humans.
    * Twitter/X currently uses *primitive* machine learning models, and chooses a sequence of *pre-existing* posts to show me. This is enough to make me spend hours slowly scrolling a screen with my finger, receiving little value in return.
    * Future AI will be able to dynamically generate the text, audio, and video stimuli that are predicted to be most compelling to me personally, based upon the record of my past online interactions.
    * Stimuli may be designed to:
        * cause addictive behavior, such as compulsive app use
        * promote a political agenda
        * promote a religious agenda
        * promote a commercial agenda -- advertising superstimuli
    * Thought experiments
        * Have you ever met someone, and had an instant butterfly-in-the-stomach can't-quite-breathe feeling of attraction? Imagine if every time you load a website, there is someone who makes specifically you feel that way, telling you to drink Coca-Cola.
        * Have you ever found yourself obsessively playing an online game, or obsessively scrolling a social network or news source? Imagine if the intermittent rewards were generated based upon a model of your mental state, to be as addictive as possible to your specific brain at that specific moment in time.
        * Have you ever crafted an opinion to try to please your peers? Imagine that same dynamic, but where the peer feedback is artificial and chosen by an advertiser.
        * Have you ever listened to music, or looked at art, or read a passage of text, and felt like it was created just for you, and touched something deep in your identity? Imagine if every political ad made you feel that way.
    * I believe the social effects of this will be much, much more powerful and qualitatively different than current online manipulation. (*"[More is different](https://www.jstor.org/stable/pdf/1734697.pdf?casa_token=GDThS0md5IsAAAAA:cnx_fNDcb477G6-zU5qu0qC1tbKmgAhnIj_QecjFNwwYi3pge7vEWiaxIm4mAJqsatKbKnyMu-6ettZAtUDxysDPeFzAM736jpKJq-alTnjB4kCBAFrX3g)"*, or *"quantity has a quality all its own"*, depending on whether you prefer to quote P.W. Anderson or Stalin)
        * If our opinions and behavior are controlled by whoever pipes stimuli to us, then it breaks many of the basic mechanisms of democracy. Objective truth and grounding in reality will be increasingly irrelevant to societal decisions.
        * If the addictive potential of generated media is similar to or greater than that of hard drugs ... there are going to be a lot of addicts.
* Class divides will grow worse, between people that are privileged enough to protect themselves from manipulative content, and those that are not. * Feelings of emotional connection or beauty may become vacuous, as they are mass produced. (see [parasocial relationships](https://en.wikipedia.org/wiki/Parasocial_interaction) for a less targeted present day example) * non-technical mitigations: * Advocate for laws that restrict stimuli and interaction dynamics which produce anomalous effects on human behavior. * Forbid apps on the Google or Apple storefront that produce anomalous effects on human behavior. (this will include forbidding extremely addictive apps -- so may be difficult to achieve given incentives) * Technical mitigations: * Develop tools to identify stimuli which will produce anomalous effects on human behavior, or anomalous affective response. * Protective filter: Develop models that rewrite stimuli (text or images or other modalities) to contain the same denoted information, but without the associated manipulative subtext. That is, rewrite stimuli to contain the parts you want to experience, but remove aspects which would make you behave in a strange way. * Study the ways in which human behavior and/or perception can be manipulated by optimizing stimuli, to better understand the problem. * I have done some work -- in a collaboration led by Gamaleldin Elsayed -- where we showed that adversarial attacks which cause image models to make incorrect predictions also bias the perception of human beings, even when the attacks are nearly imperceptible. See the Nature Communications paper [*Subtle adversarial image manipulations influence both human and machine perception*](https://www.nature.com/articles/s41467-023-40499-0). * Research scaling laws between model size, training compute, training data from an individual and from a population, and ability to influence a human. 1. AI will enable new weapons and new types of violence. * Autonomous weapons, i.e. 
weapons that can fight on their own, without requiring human controllers on the battlefield. * Autonomous weapons are difficult to attribute to a responsible group. No one can prove whose drones committed an assassination or an invasion. We should expect increases in deniable anonymous violence. * Removal of social cost of war -- if you invade a country with robots, none of your citizens die, and none of them see atrocities. Domestic politics may become more accepting of war. * Development of new weapons * e.g. new biological, chemical, cyber, or robotic weapons * AI will enable these weapons to be made more capable + deadly than if they were created solely by humans. * AI may lower the barriers to access, so smaller + less resourced groups can make them. * Technical mitigations: * Be extremely cautious of doing research which is dual use. Think carefully about potential violent or harmful applications of a capability, during the research process. * When training and releasing models, include safeguards to prevent them being used for violent purposes. e.g. large language models should refuse to provide instructions for building weapons. Protein/DNA/chemical design models should refuse to design molecules which match characteristics of bio-weapons. This should be integrated as much as possible into the entire training process, rather than tacked on via fine-tuning. 1. AI will enable qualitatively new kinds of surveillance and social control. * AI will have the ability to simultaneously monitor all electronic communications (email, chat, web browsing, ...), cameras, and microphones in a society. It will be able to use that data to build a personalized model of the likely motivations, beliefs, and actions of every single person. Actionable intelligence on this scale, and with this degree of personalization, is different from anything previously possible. 
* This domestic surveillance data will be useful and extremely tempting even in societies which aren't currently authoritarian. e.g. detailed surveillance data could be used to prevent crime, stop domestic abuse, watch for the sale of illegal drugs, or track health crises. * Once a society starts using this class of technology, it will be difficult to seek political change. Organized movements will be transparent to whoever controls the surveillance technology. Behavior that is considered undesirable will be easily policed. * This class of data can be used for commercial as well as political ends. The products that are offered to you may become hyper-specialized. The jobs that are offered to you may become hyper-specific and narrowly scoped. This may have negative effects on social mobility, and on personal growth and exploration. * Political mitigations: * Offer jobs in the US to all the AI researchers in oppressive regimes!! We currently make it hard for world class talent from countries with which we have a bad relationship to immigrate. We should instead be making it easy for the talent to defect. * Technical mitigations: * Don't design the technologies that are obviously best suited for a panopticon. * Can we design behavioral patterns that are adversarial examples, and will mislead surveillance technology? * Can we use techniques e.g. from differential privacy to technically limit the types of information available in aggregated surveillance data? 1. AI will catalyze failure of societal mechanisms through increased efficiency. I wrote a [blog post on this class of risk](https://sohl-dickstein.github.io/2022/11/06/strong-Goodhart.html). * Many, many parts of our society rely on people and organizations pursuing proxy goals that are aligned with true goals that are good for society. * For instance, in American democracy presidential candidates pursue the proxy goal of getting the majority of electoral votes. 
Our democracy's healthy functioning relies on that proxy goal being aligned with an actual goal of putting people in power who act in the best interest of the populace. * When we get very efficient at pursuing a proxy goal, we *overfit* to the proxy goal, and this often makes the true goal grow *much worse*. * For instance, in American democracy we begin selecting narrowly for candidates that are best at achieving 270 electoral votes. Focusing on this leads to candidates lying, sabotaging beneficial policies of competitors, and degrading the mechanics of the electoral system. * AI is a tool that can make almost anything much more efficient. When it makes pursuit of a proxy goal more efficient, it will often make the true goal get worse. * AI is going to make pursuit of many, many proxy goals more efficient, *all at once*. We should expect all kinds of unexpected parts of society, which rely on inefficient pursuit of proxy goals, to break, *all at once*. * This is likely to lead to societal disruption, in unexpected ways. * Technical mitigations: * Study the mechanisms behind overfitting, and generalize our understanding of overfitting beyond optimization of machine learning models. * Find mitigations for overfitting that apply to social systems. (see [blog post](https://sohl-dickstein.github.io/2022/11/06/strong-Goodhart.html) again) 1. AI will lead to concentration of power. * AI will create massive wealth, and may provide almost unimaginable (god-like?) power to manipulate the world. * If the most advanced AI is controlled by a small group, then the personal quirks, selfish interests, and internal politics of that small group may have massive (existential?) impact on the rest of the world. * Examples of small groups include the leadership of OpenAI, Anthropic, Alphabet, or China. * This is likely to be a strongly negative outcome for everyone not in the controlling group. 
*"Power tends to corrupt and absolute power corrupts absolutely."* * Even if AI is available to a larger group, there may be dramatic disparities in access and control. These will lead to dramatic disparities in wealth and quality of life between AI haves and have-nots. * Technical mitigations: * Release AI models as open source. But this comes with its own set of misuse risks that need to be balanced against the benefits! I have no idea if this is a good idea in general. * Improve AI efficiency, both at inference and training, so that there aren't cost barriers to providing AI tools to the entire world. As in the last point though, AI that is too cheap to meter and widely distributed will increase many other AI risks. It's unclear what the right balance is. * As a researcher, try to work for the most responsible organizations. Try also to work for organizations that will diversify the set of *responsible* players, so that there isn't just one winner of the AI race. As with open source though, diversifying the set of organizations with cutting edge AI introduces its own risks! 1. AI will create a slippery slope, where humans lose control of our society. * AI will become better and more efficient at decision making than humans. We will outsource more and more critical tasks that are currently performed by humans. e.g.: * corporations run and staffed by AIs * government agencies run and staffed by AIs * AIs negotiating international trade agreements and regulation with other AIs * AIs identifying crimes, providing evidence of guilt, recommending sentencing * AIs identifying the most important problems to spend research and engineering effort on * AIs selecting the political candidates most likely to win elections, and advising those candidates on what to say and do * As a result, less and less decision making will be driven by human input. Humans will eventually end up as passive passengers in a global society driven by AIs. 
* It’s not clear whether this is a dystopia. In many ways, it could be good for humanity! But I like our agency in the world, and would find this an unfortunate outcome. * If society moves in a bad or weird direction, humans will find themselves disempowered to do anything about it. * Legal mitigations: * Require that humans be an active part of the decision making loop for a broad array of tasks. These are likely to feel like silly jobs though, and may also put the jurisdiction that requires them at an economic disadvantage. * Technical mitigations: * Value alignment! If AIs are going to be making all of our decisions for us, we want to make sure they are doing so in a way that aligns with our ethics and welfare. It will be important to make this alignment to societal values, rather than individual values. (take home assignment: write out a list of universally accepted societal values we should align our AI to.) * Augment humans. Find ways to make humans more effective or smarter, so that we remain relevant agents. 1. AI will cause disaster by superhuman pursuit of an objective that is misaligned with human values * This category involves an AI becoming far more intelligent than humans, and pursuing some goal that is misaligned with human intention ... leading to the superintelligent AI doing things like destroying the Earth or enslaving all humans as an [instrumental sub-goal](https://en.wikipedia.org/wiki/Instrumental_convergence) to achieve its misaligned goal. * This is a popular and actively researched AI risk in technical circles. I think its popularity is because it's the unique AI risk which seems solvable just by thinking hard about the problem and doing good research. All the other problems are at least as much social and political as technical. * I think the probability of this class of risk is low. But the severity is potentially high. It is worth thinking about and taking seriously. 
* I have a blog post arguing for a [hot mess theory of AI misalignment](https://sohl-dickstein.github.io/2023/03/09/coherence.html) -- as AIs become smarter, I believe they will become less coherent in their behavior (i.e., more of a hot mess), rather than engage in monomaniacal pursuit of a slightly incorrect objective. That is, I believe we should be more worried about the kind of alignment failure where AIs simply behave in unpredictable ways that don't pursue any consistent objective. 1. AI will lead to unexpected harms. * The actual way in which the future plays out will be different from anyone's specific predictions. AI is a transformative and disruptive, but still *unpredictable*, technology. Many of the foundational capabilities and behaviors AI systems will exhibit are still unclear. It is also unclear how those capabilities and behaviors will interact with society. * Depending on the types of AI we build, and the ethics we choose, we may decide that AI has moral standing. If this happens, we will need to consider harm done to, as well as enabled by, AI. The types of harms an AI might experience are difficult to predict, since they will be unlike harms experienced by humans. (I don't believe near-future AI systems will have significant moral standing.) * Some of the greatest risks are likely to be things we haven't even thought of yet. We should prioritize identifying new risks. # Parting thoughts 1. If AI produces profound social effects, AI developers may be blamed. * This could lead to attacks on AI scientists and engineers, and other elites. This is especially likely if the current rule of law is one of the things disrupted by AI. (The Chinese cultural revolution and the Khmer Rouge regime are examples of cultural disruption that was not good for intellectual elites.) * It is in our own direct, as well as enlightened, self-interest to make the consequences of our technology as positive as possible. 1. 
Mitigating existential risks requires solving intermediate risks. * Many non-existential, intermediate time-scale, risks would damage our society's ability to act in the concerted thoughtful way required to solve later risks. * If you think existential risks like extinction or permanent dystopia are overriding, it is important to also work to solve earlier risks. If we don't solve the earlier risks, we won't achieve the level of cooperation required to solve the big ones. 1. It is important that we ground our risk assessments in experiment and theory. * Thinking carefully about the future is a valuable exercise, but is not enough on its own. Fields which are not grounded in experiments or formal validation [make silently incorrect conclusions](https://sohl-dickstein.github.io/2023/03/09/coherence.html#endnote-compneuro). * Right now, we are almost certainly making many silently incorrect conclusions about the shape of AI risk, because we base most of our AI risk scenarios on elaborate verbal arguments, without experimental validation. It is dangerous for us to be silently wrong about AI risks. * As we work to mitigate AI risk, we must try hard to validate the risks themselves. It is difficult -- but possible! -- to validate risks posed by technology that doesn't exist yet. We must work to find aspects of risk scenarios we can measure now or formally prove. 1. We have a lot of leverage, and we should use it to make the future we want. * AI will bend the arc of history, and we are early in the process of creating it. Small interventions at the beginning of something huge have enormous consequences. We can make small choices now that will make the future much better, or much worse. * AI has the potential to unlock astounding wealth, and do awesome (in the original sense of the word) good in the world. 
It can provide a personal tutor for every student, eliminate traffic accidents, solve cancer, solve aging, provide enough excess resources to easily feed the 700+ million people who live in hunger, make work an optional recreational activity, propel us to the planets and the stars, and more. * Building AI is also the most fascinating scientific endeavor of my lifetime. * We have a unique opportunity to build the future we want to live in. Thinking about how to avoid bad outcomes, and achieve good outcomes, is a necessary step in building it. # Acknowledgements Thank you to Asako Miyakawa, Meredith Ringel Morris, Noah Fiedel, Fernando Diaz, Rif, Sebastian Farquhar, Peter Liu, Dave Orr, Lauren Wilcox, Simon Kornblith, Gamaleldin Elsayed, and Toby Shevlane for valuable feedback on ideas in this post!

Too much efficiency makes everything worse: overfitting and the strong version of Goodhart’s law

Increased efficiency can sometimes, counterintuitively, lead to worse outcomes. This is true almost everywhere. We will name this phenomenon the strong version of [Goodhart's law](https://en.wikipedia.org/wiki/Goodhart%27s_law). As one example, more efficient centralized tracking of student progress by standardized testing seems like such a good idea that well-intentioned laws [mandate it](https://en.wikipedia.org/wiki/No_Child_Left_Behind_Act). However, testing also incentivizes schools to focus more on teaching students to test well, and less on teaching broadly useful skills. As a result, it can cause overall educational outcomes to become worse. Similar examples abound, in politics, economics, health, science, and many other fields.

This same counterintuitive relationship between efficiency and outcome occurs in machine learning, where it is called overfitting. Overfitting is heavily studied, somewhat theoretically understood, and has well known mitigations. This connection between the strong version of Goodhart's law in general, and overfitting in machine learning, provides a new lens for understanding bad outcomes, and new ideas for fixing them.

Overfitting and Goodhart's law
==========================

In machine learning (ML), **overfitting** is a pervasive phenomenon. We want to train an ML model to achieve some goal. We can't directly fit the model to the goal, so we instead train the model using some proxy which is *similar* to the goal.

![](/assets/cartoon-conversation.png width="300px" border="1")

For instance, as an occasional computer vision researcher, my goal is sometimes to prove that my new image classification model works well. I accomplish this by measuring its accuracy, after asking it to label images (is this image a cat or a dog or a frog or a truck or a ...)
from a standardized [test dataset of images](https://paperswithcode.com/dataset/cifar-10). I'm not allowed to train my model on the test dataset though (that would be cheating), so I instead train the model on a *proxy* dataset, called the training dataset. I also can't directly target prediction accuracy during training[^accuracytarget], so I instead target a *proxy* objective which is only related to accuracy. So rather than training my model on the goal I care about -- classification accuracy on a test dataset -- I instead train it using a *proxy objective* on a *proxy dataset*.

At first everything goes as we hope -- the proxy improves, and since the goal is similar to the proxy, it also improves.

![](/assets/cartoon-early.png width="444px" border="1")

As we continue optimizing the proxy though, we eventually exhaust the useable similarity between proxy and goal. The proxy keeps on getting better, but the goal stops improving. In machine learning we call this overfitting, but it is also an example of Goodhart's law.

![](/assets/cartoon-mid.png width="444px" border="1")

[Goodhart's law](https://en.wikipedia.org/wiki/Goodhart%27s_law) states that, *when a measure becomes a target, it ceases to be a good measure*[^strathern]. Goodhart proposed this in the context of monetary policy, but it applies far more broadly. In the context of overfitting in machine learning, it describes how the proxy objective we optimize ceases to be a good measure of the objective we care about.

The strong version of Goodhart's law: as we become too efficient, the thing we care about grows worse
==========================

If we keep on optimizing the proxy objective, even after our goal stops improving, something more worrying happens. The goal often starts getting *worse*, even as our proxy objective continues to improve. Not just a little bit worse either -- often the goal will diverge towards infinity.
This is an [extremely](https://www.cs.princeton.edu/courses/archive/spring16/cos495/slides/ML_basics_lecture6_overfitting.pdf) [general](https://www.cs.mcgill.ca/~dprecup/courses/ML/Lectures/ml-lecture02.pdf) [phenomenon](https://scholar.google.com/scholar?hl=en&q=overfitting) in machine learning. It mostly doesn't matter what our goal and proxy are, or what model architecture we use[^overfittinggenerality]. If we are very efficient at optimizing a proxy, then we make the thing it is a proxy for grow worse.

![](/assets/cartoon-late.png width="444px" border="1")

Though this phenomenon is often discussed, it doesn't seem to be named[^notoverfitting]. Let's call it **the strong version of Goodhart's law**[^strongunintended]. We can state it as:

> *When a measure becomes a target,
> if it is effectively optimized,
> then the thing it is designed to measure will grow worse.*

Goodhart's law says that if you optimize a proxy, eventually the goal you care about will stop improving. The strong version of Goodhart's law differs in that it says that as you over-optimize, the goal you care about won't just stop improving, but will instead grow much worse than if you had done nothing at all.

Goodhart's law applies well beyond economics, where it was originally proposed. Similarly, the strong version of Goodhart's law applies well beyond machine learning. I believe it can help us understand failures in economies, governments, and social systems.

Increasing efficiency and overfitting are happening everywhere
==========================

Increasing efficiency is permeating almost every aspect of our society. If the thing that is being made more efficient is beneficial, then the increased efficiency makes the world a better place (overall, the world [seems to be becoming a better place](https://ourworldindata.org/a-history-of-global-living-conditions-in-5-charts)).
If the thing that is being made more efficient is socially harmful, then the consequences of greater efficiency are scary or depressing (think mass surveillance, or robotic weapons).

What about the most common case though -- where the thing we are making more efficient is related, but not identical, to beneficial outcomes? What happens when we get better at something which is merely correlated with outcomes we care about? In that case, we can overfit, the same as we do in machine learning. The outcomes we care about will improve for a while ... and then they will grow dramatically worse.

Below are a few, possibly facile, examples applying this analogy.

> **Goal:** Educate children well
> **Proxy:** [Measure student and school performance](https://en.wikipedia.org/wiki/No_Child_Left_Behind_Act) on standardized tests
> **Strong version of Goodhart's law leads to:** Schools narrowly focus on teaching students to answer questions like those on the test, at the expense of the underlying skills the test is intended to measure

> **Goal:** Rapid progress in science
> **Proxy:** Pay researchers a [cash bonus for every publication](https://www.science.org/content/article/cash-bonuses-peer-reviewed-papers-go-global)
> **Strong version of Goodhart's law leads to:** Publication of incorrect or incremental results, collusion between reviewers and authors, research paper mills

> **Goal:** A well-lived life
> **Proxy:** Maximize the reward pathway in the brain
> **Strong version of Goodhart's law leads to:** Substance addiction, gambling addiction, days lost to doomscrolling Twitter

> **Goal:** Healthy population
> **Proxy:** Access to nutrient-rich food
> **Strong version of Goodhart's law leads to:** Obesity epidemic

> **Goal:** Leaders that act in the best interests of the population
> **Proxy:** Leaders that have the most support in the population
> **Strong version of Goodhart's law leads to:** Leaders whose expertise and passions center narrowly around manipulating public opinion at the expense
of social outcomes

> **Goal:** An informed, thoughtful, and involved populace
> **Proxy:** The ease with which people can share and find ideas
> **Strong version of Goodhart's law leads to:** Filter bubbles, conspiracy theories, parasitic memes, escalated tribalism

> **Goal:** Distribution of labor and resources based upon the needs of society
> **Proxy:** Capitalism
> **Strong version of Goodhart's law leads to:** Massive wealth disparities (with incomes ranging from hundreds of dollars per year to hundreds of dollars per second), with [more than a billion](https://hdr.undp.org/en/2020-MPI) people living in poverty

> **Goal:** The owners of Paperclips Unlimited, LLC, become wealthy
> **Proxy:** Number of paperclips made by the AI-run manufacturing plant
> **Strong version of Goodhart's law leads to:** The entire solar system, including the company owners, being [converted to paperclips](https://www.lesswrong.com/tag/paperclip-maximizer)

As an exercise for the reader, you can think about how the strong version of Goodhart's law would apply to other efficiencies, like the ones in this list:

~~~ none
telepresence and virtual reality
personalized medicine
gene therapy
tailoring marketing messages to the individual consumers or voters who will find them most actionable
predicting the outcome of elections
writing code
artificial intelligence
reducing slack in supply chains
rapidly disseminating ideas
generating entertainment
identifying new products people will buy
raising livestock
trading securities
extracting fish from the ocean
constructing cars
~~~
[Listing [greater-efficiency]: Some additional diverse things we are getting more efficient at. For most of these, initial improvements were broadly beneficial, but getting too good at them could cause profound negative consequences.]

How do we mitigate the problems caused by overfitting and the strong version of Goodhart's law?
==========================

If overfitting is useful as an analogy, it will be because some of the approaches that improve it in machine learning also transfer to other domains. Below, I review some of the most effective techniques from machine learning, and share some thoughts about how they might transfer.

+ **Mitigation: Better align proxy goals with desired outcomes.** In machine learning this often means carefully collecting training examples which are as similar as possible to the situation at test time. Outside of machine learning, this means changing the proxies we have control over -- e.g. laws, incentives, and social norms -- so that they directly encourage behavior that better aligns with our goals. This is the standard approach used to (try to) engineer social systems.
+ **Mitigation: Add regularization penalties to the system.** In machine learning, this is often performed by [penalizing the squared magnitude of parameters](https://developers.google.com/machine-learning/crash-course/regularization-for-simplicity/l2-regularization), so that they stay small. Importantly, regularization doesn't need to directly target undesirable behavior. Almost anything that penalizes deviations of a model from typicality works well. Outside of machine learning, anything that penalizes complexity, or adds friction or extra cost to a system, can be viewed as regularization. Some example ideas:
    + Add a billing mechanism to SMTP, so there's a small cost for every email.
    + Use a progressive tax code, so that unusual success is linked to disproportionately greater cost
    + Charge a court fee proportional to the squared (exponentiated?) number of lawsuits initiated by an organization, so that unusual use of the court system leads to unusual expenses
    + Tax the number of bits of information stored about users
+ **Mitigation: Inject noise into the system.** In machine learning, this involves adding random jitter to the inputs, parameters, and internal state of a model.
The unpredictability resulting from this noise makes overfitting far more difficult. Here are some ideas for how to improve outcomes by injecting noise outside of machine learning:
    + Stack rank all the candidates for a highly competitive school or job. Typically, offers would be made to the top-k candidates. Instead, make offers probabilistically, with probability proportional to $\left(\right.$[approx # top tier candidates] $+$ [candidate's stack rank]$\left.\right)^{-1}$. Benefits include: greater diversity of accepted candidates; less ridiculous resources spent by the candidates tuning their application, and by application reviewers reviewing the applications, since small changes in assessed rank only have a small effect on outcome probabilities; occasionally you will draw a longshot candidate that is more likely to fail, but also more likely to succeed in an unconventional and unusually valuable way.
    + Randomly time quizzes and tests in a class, rather than giving them on pre-announced dates, so that students study to understand the material more, and cram (i.e., overfit) for the test less.
    + Require securities exchanges to add random jitter to the times when they process trades, with a standard deviation of about a second. (An efficient market is great. Building a global financial system out of a chaotic nonstationary dynamical system with a characteristic timescale more than six orders of magnitude faster than human reaction time is just asking for trouble.)
    + Randomize details of the electoral system on voting day, in order to prevent candidates from overfitting to incidental details of the current electoral system (e.g. by taking unreasonable positions that appeal to a pivotal minority). For instance randomly select between ranked choice or first past the post ballots, or randomly rescale the importance of votes from different districts. (I'm not saying all of these are *good* ideas. Just ... ideas.)
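The stack-rank idea above can be made concrete. The following is a minimal Python sketch, assuming that rank 1 is the strongest candidate and that offers are drawn one at a time without replacement. Both of those choices, and the names `offer_probabilities`, `draw_offers`, and `approx_top_tier`, are assumptions made here for illustration and are not specified in the text.

```python
import random


def offer_probabilities(num_candidates, approx_top_tier):
    """Normalized probabilities proportional to 1 / (approx_top_tier + rank).

    Ranks run from 1 (strongest) to num_candidates (an assumed convention).
    """
    weights = [1.0 / (approx_top_tier + rank)
               for rank in range(1, num_candidates + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def draw_offers(num_candidates, approx_top_tier, num_offers, seed=0):
    """Sample num_offers distinct ranks, one at a time, without replacement.

    Sampling without replacement is an assumption of this sketch; the
    weighting rule itself is the one described in the text.
    """
    rng = random.Random(seed)
    remaining = list(range(1, num_candidates + 1))
    chosen = []
    for _ in range(num_offers):
        weights = [1.0 / (approx_top_tier + rank) for rank in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(pick)
        chosen.append(pick)
    return sorted(chosen)


if __name__ == "__main__":
    # Hypothetical numbers: 20 applicants, roughly 5 top-tier, 4 offers.
    print(offer_probabilities(20, 5)[:3])
    print(draw_offers(20, 5, 4, seed=1))
```

Because the weights fall off slowly with rank, a small change in assessed rank only slightly changes a candidate's chance of an offer, which is the property the idea relies on.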
+ **Mitigation: Early stopping.** In machine learning, it's common to monitor a third metric, besides training loss and test performance, which we call validation loss. When the validation loss starts to get worse, we stop training, even if the training loss is still improving. This is the single most effective tool we have to prevent catastrophic overfitting. Here are some ways early stopping could be applied outside of machine learning:
    + Sharply limit the time between a call for proposals and submission date, so that proposals better reflect pre-existing readiness, and to avoid an effect where increasing resources are poured into proposal generation, rather than being used to create something useful
    + Whenever stock volatility rises above a threshold, suspend all market activity
    + The use of antitrust law to split companies that are preventing competition in a market
    + Estimate the importance of a decision in $$. When the value of the time you have already spent analyzing the decision approaches that value, make a snap decision.
    + Freeze the information that agents are allowed to use to achieve their goals. Press blackouts in the 48 hours before an election might fall under this category.

One of the best understood *causes* of extreme overfitting is that the expressivity of the model being trained *too closely matches* the complexity of the proxy task. When the model is very weak, it can only make a little bit of progress on the task, and it doesn’t exhaust the similarity between the goal and the proxy. When the model is extremely strong and expressive, it can optimize the proxy objective in isolation, without inducing extreme behavior on other objectives. When the model's expressivity roughly matches the task complexity (e.g., the number of parameters is no more than a few orders of magnitude higher or lower than the number of training examples), then it can only do well on the proxy task by doing *extreme things everywhere else*.
See Figure [capacity] for a demonstration of this idea on a simple task. This cause of overfitting motivates two final, diametrically opposed, methods for mitigating the strong version of Goodhart’s law. + **Mitigation: Restrict capabilities / capacity.** In machine learning, this is often achieved by making the model so small that it's incapable of overfitting. In the broader world, we could similarly limit the capacity of organizations or agents. Examples include: + Campaign finance limits + Set a maximum number of people that can work in companies of a given type. e.g. allow only 10 people to work in any lobbying group + Set the maximum number of parameters, or training compute, that any AI system can use. + **Mitigation: Increase capabilities / capacity.** In machine learning, if a model is made very big, it often has enough capacity to overfit to the training data without making performance on the test data worse. In the broader world, this would correspond to developing capabilities that are so great that there is no longer any tradeoff required between performance on the goal and the proxy. Examples include: + Obliterate all privacy, and make all the information about all people, governments, and other organizations available to everyone all the time, so that everyone can have perfect trust of everyone else. This could be achieved by legislating that every database be publicly accessible, and by putting cameras in every building. (to be clear -- from my value system, this would be a dystopian scenario) + Invest in basic research in clean energy + Develop as many complex, inscrutable, and diverse market trading instruments as possible, vesting on as many timescales as possible. (In nature, more complex ecosystems are more stable. Maybe there is a parallel for markets?) 
+ Use the largest, most compute and data intensive, AI model possible in every scenario 😮[^gobig] This last mitigation of just continuing to increase capabilities works surprisingly well in machine learning. It is also a path of least resistance. Trying to fix our institutions by blindly making them better at pursuing misaligned goals is a terrible idea though. Parting thoughts ========================== The strong version of Goodhart's law underlies most of my personal fears around AI (expect a future blog post about my AI fears!). If there is one thing AI will enable, it is greater efficiency, on almost all tasks, over a very short time period. We are going to need to simultaneously deal with massive numbers of diverse unwanted side effects, just as our ability to collaborate on solutions is also disrupted. There's a lot of opportunity to *research* solutions to this problem. If you are a scientist looking for research ideas which are pro-social, and have the potential to create a whole new field, you should consider building formal (mathematical) bridges between results on overfitting in machine learning, and problems in economics, political science, management science, operations research, and elsewhere[^researchideas]. This is a goldmine waiting to be tapped. (I might actually be suggesting here that we should invent the field of [psychohistory](https://en.wikipedia.org/wiki/Psychohistory), and that overfitting phenomena will have a big role in that field.) The more our social systems break due to the strong version of Goodhart's law, the less we will be able to take the concerted rational action required to fix them. Hopefully naming, and better understanding, the phenomenon will help push in the opposite direction. 
![Figure [capacity]: **Models often suffer from the strong version of Goodhart's law, and overfit catastrophically, when their complexity is well matched to the complexity of the proxy task.** If a model is instead much more or much less capable than required, it will overfit less. Here, models are trained to map from a one-dimensional input $x$ to a one-dimensional output $y$. All models are trained on the same 10 datapoints, in red. The model with 4 parameters is too weak to exactly fit the datapoints, but it smoothly approximates them. The model with 10,000 parameters is strong enough to easily fit all the datapoints, and also smoothly interpolate between them. The model with 10 parameters is exactly strong enough to fit the datapoints, but it can only contort itself to do so by behaving in extreme ways away from the training data. If asked to predict $y$ for a new value of $x$, the 10 parameter model would perform extremely poorly. For details of this toy experiment, which uses linear random feature models, see this [colab notebook](https://colab.research.google.com/drive/1mAqCsCE-6biiFxQu8swlc5MygmI9lMJA?usp=sharing).](/assets/size-mitigation.png width="290px" border="1")

[^accuracytarget]: Accuracy is not differentiable, which makes it impossible to target by naive gradient descent training. It is usually replaced during training by a proxy of softmax-cross-entropy loss, which is differentiable. There are blackbox training methods which can directly target accuracy, but they are inefficient and rarely used.

[^strathern]: This modern phrasing is due to Marilyn Strathern. Goodhart originally phrased the observation as the more clunky *any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes*.

[^overfittinggenerality]: This glosses over a lot of variation. For instance, there is an entire subfield which studies the qualitative differences in overfitting in underparameterized, critically parameterized, and overparameterized models. Despite this variation, the core observation -- that when we train on a proxy our target gets better for a while, but then grows worse -- holds broadly.

[^notoverfitting]: It's not simply overfitting. Overfitting refers to the proxy becoming better than the goal, not to the goal growing worse in an absolute sense. There are other related, but not identical, concepts -- for instance [perverse incentives](https://en.wikipedia.org/wiki/Perverse_incentive), [Campbell's law](https://en.wikipedia.org/wiki/Campbell%27s_law), the [Streisand effect](https://en.wikipedia.org/wiki/Streisand_effect), the [law of unintended consequences](https://en.wikipedia.org/wiki/Unintended_consequences), [Jevons paradox](https://en.m.wikipedia.org/wiki/Jevons_paradox), and the concept of [negative externalities](https://en.m.wikipedia.org/wiki/Externality#Negative). [Goodhart's curse](https://arbital.com/p/goodharts_curse/) is perhaps the closest. However, the definition of Goodhart's curse incorporates not only the phenomenon, but also a specific mechanism, and the mechanism is incorrect[^Goodhartcurse]. *Edit 2022/11/9: Andrew Hundt [suggests](https://twitter.com/athundt/status/1589591738792177664) that similar observations that optimization isn't always desirable have been made in the social sciences, and gives specific examples of "The New Jim Code" and "[Weapons of Math Destruction](https://en.m.wikipedia.org/wiki/Weapons_of_Math_Destruction)". Kiran Vodrahalli [points out](https://mathstodon.xyz/@kiranvodrahalli/109300676096306738) connections to robust optimization and the "[price of robustness](https://www.robustopt.com/references/Price%20of%20Robustness.pdf)." [Leo Gao](https://bmk.sh/) points me at a [recent paper](https://arxiv.org/abs/2210.10760) which uses the descriptive term "overoptimization" for this phenomenon, which I think is good.*

[^strongunintended]: I also considered calling it the strong law of unintended consequences -- it's not just that there are unexpected side effects, but that the more effectively you accomplish your task, the more those side effects will act against your original goal.

[^gobig]: Note that for sufficiently strong AI, limitations on its capabilities might be determined by the laws of physics, rather than by its compute scale or training dataset size. So if you're worried about misaligned AGI, this mitigation may offer no comfort.

[^researchideas]: For instance, take PAC Bayes bounds from statistical learning theory, and use them to predict the optimal amount of power unions should have, in order to maximize the wealth of workers in an industry. Or, estimate the spectrum of candidate-controllable and uncontrollable variables in political contests, to predict points of political breakdown. (I'm blithely suggesting these examples as if they would be easy, and are well formed in their description. Of course, neither is true -- actually doing this would require hard work and brilliance in some ratio.)

[^Goodhartcurse]: The [definition of Goodhart's curse](https://arbital.com/p/goodharts_curse/) includes [the optimizer's curse](https://www.semanticscholar.org/paper/The-Optimizer's-Curse%3A-Skepticism-and-Postdecision-Smith-Winkler/28cfed594544215673db802dce79b8c12d3ab5ab) as its causal mechanism. This is where the word 'curse' comes from in its name. If an objective $u$ is an imperfect proxy for a goal objective $v$, the optimizer's curse explains why optimizing $u$ finds an anomalously good $u$, and makes the *gap* between $u$ and $v$ grow large. It doesn't explain why optimizing $u$ makes $v$ grow worse in an absolute sense. That is, the optimizer's curse provides motivation for why Goodhart's law occurs. It does not provide motivation for why the strong version of Goodhart's law occurs. (As I briefly discuss elsewhere in the post, one common causal mechanism for $v$ growing worse is that the model's expressivity is too closely matched to the complexity of the task it is performing. This is a very active research area though, and our understanding is both incomplete and actively changing.)
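The capacity-matching mechanism mentioned in the last footnote, and illustrated in Figure [capacity], can be explored with a small stand-alone experiment. The sketch below is not the author's colab notebook. It is a pure-Python approximation that assumes random cosine features, evenly spaced inputs, random targets, a minimum-norm fit for the overparameterized model, and 2000 rather than 10,000 features to keep it fast. Every name and constant in it is hypothetical.

```python
import math
import random


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def make_features(num_features, rng, freq_scale=4.0):
    """Random cosine features cos(w*x + b) / sqrt(p) (an assumed feature family)."""
    params = [(rng.gauss(0.0, freq_scale), rng.uniform(0.0, 2.0 * math.pi))
              for _ in range(num_features)]
    scale = 1.0 / math.sqrt(num_features)
    return lambda x: [scale * math.cos(w * x + b) for w, b in params]


def fit(xs, ys, num_features, rng):
    """Fit a linear model on random features and return a predictor f(x)."""
    phi = make_features(num_features, rng)
    Phi = [phi(x) for x in xs]  # n rows (datapoints) by p columns (features)
    n, p = len(xs), num_features
    if p < n:
        # Too few parameters to interpolate: least squares via the normal equations.
        A = [[sum(Phi[k][i] * Phi[k][j] for k in range(n)) for j in range(p)]
             for i in range(p)]
        rhs = [sum(Phi[k][i] * ys[k] for k in range(n)) for i in range(p)]
        w = solve(A, rhs)
    elif p == n:
        # Exactly enough parameters: solve Phi w = y directly.
        w = solve(Phi, ys)
    else:
        # More parameters than data: minimum-norm interpolant w = Phi^T (Phi Phi^T)^{-1} y.
        G = [[sum(Phi[i][k] * Phi[j][k] for k in range(p)) for j in range(n)]
             for i in range(n)]
        a = solve(G, ys)
        w = [sum(a[i] * Phi[i][k] for i in range(n)) for k in range(p)]
    return lambda x: sum(wi * fi for wi, fi in zip(w, phi(x)))


def train_mse(model, xs, ys):
    """Mean squared error on the training points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def max_abs_on_grid(model, lo=-1.0, hi=1.0, steps=200):
    """Largest |prediction| on a dense grid: a crude measure of extreme behavior."""
    return max(abs(model(lo + (hi - lo) * i / steps)) for i in range(steps + 1))


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [-1.0 + 2.0 * i / 9 for i in range(10)]  # 10 evenly spaced training inputs
    ys = [rng.gauss(0.0, 1.0) for _ in xs]        # arbitrary targets
    for p in (4, 10, 2000):
        model = fit(xs, ys, p, rng)
        print(p, train_mse(model, xs, ys), max_abs_on_grid(model))
```

The printed columns are the parameter count, the training error, and the largest prediction magnitude between the training points. How sharply the 10-parameter model misbehaves will depend on the random draw and on the assumed feature family, so treat the numbers as illustrative and not as a reproduction of the figure.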


More in AI

Mass Intelligence

From GPT-5 to nano banana: everyone is getting access to powerful AI

Pluralistic: The capitalism of fools (28 Aug 2025)

Today's links The capitalism of fools: Trump's mirror-world New Deal. Hey look at this: Delights to delectate. Object permanence: IBM's fabric design; Nixon Cthulu; Surveillance capitalism is capitalism, with surveillance; Dismaland ad; Outdoor ed vs TB; Mathematicians' fave chalk. Upcoming appearances: Where to find me. Recent appearances: Where I've been. Latest books: You keep readin' em, I'll keep writin' 'em. Upcoming books: Like I said, I'll keep writin' 'em. Colophon: All the rest. The capitalism of fools (permalink) As Trump rails against free trade, demands public ownership stakes in corporations that receive government funds, and (selectively) enforces antitrust law, some (stupid) people are wondering, "Is Trump a communist?" In The American Prospect, David Dayen writes about the strange case of Trump's policies, which fly in the face of right wing economic orthodoxy and have the superficial trappings of a leftist economic program: https://prospect.org/economy/2025-08-28-judge-actually-existing-trump-economy/ The problem isn't that tariffs are always bad, nor is it that demanding state ownership stakes in structurally important companies that depend on public funds is bad policy. The problem is that Trump's version of these policies sucks, because everything Trump touches dies, and because he governs solely on vibes, half-remembered wisdom imparted by the last person who spoke to him, and the dying phantoms of old memories as they vanish beneath a thick bark of amyloid plaque. Take Trump's demand for a 10% stake in Intel (a course of action endorsed by no less than Bernie Sanders). Intel is a company in trouble, whose financialization has left it dependent on other companies (notably TSMC) to make its most advanced chips. The company has hollowed itself out, jettisoning both manufacturing capacity and cash reserves, pissing away the funds thus freed up on stock buybacks and dividends. 
Handing Trump a 10% "golden share" does nothing to improve Intel's serious structural problems. And if you take Trump at his word and accept that securing US access to advanced chips is a national security priority, Trump's Intel plan does nothing to advance that access. But it gets worse: Trump also says denying China access to these chips is a national security priority, but he greenlit Nvidia's plan to sell its top-of-the-range silicon to China in exchange for a gaudy statuette and a 15% export tax. It's possible to pursue chip manufacturing as a matter of national industrial policy, and it's even possible to achieve this goal by taking ownership stakes in key firms – because it's often easier to demand corporate change via a board seat than it is to win the court battles needed to successfully invoke the Defense Production Act. The problem is that Trumpland is uninterested in making any of that happen. They just want a smash and grab and some red meat for the base: "Look, we made Intel squeal!" Then there's the Trump tariffs. Writing in Vox EU, Lausanne prof of international business Richard Baldwin writes about the long and checkered history of using tariffs to incubate and nurture domestic production: https://www.nakedcapitalism.com/2025/08/trumpian-tariffs-rerun-the-failed-strategy-of-import-substitution-industrialization.html The theory of tariffs goes like this: if we make imports more expensive by imposing a tax on them (tariffs are taxes that are paid by consumers, after all), then domestic manufacturers will build factories and start manufacturing the foreign goods we've just raised prices on. This is called "import substitution," and it really has worked, but only in a few cases. What do those cases have in common? 
They were part of a comprehensive program of "export discipline, state-directed credit, and careful government–business coordination": https://academic.oup.com/book/10201 In other words, tariffs only work to reshore production where there is a lot of careful planning, diligent data-collection, and review. Governments have to provide credit to key firms to get them capitalized, provide incentives, and smack nonperformers around. Basically, this is the stuff that Biden did for renewables with the energy sector, and – to a lesser extent – for silicon with the CHIPS Act. Trump's not doing any of that. He's just winging it. There's zero follow-through. It's all about appearances, soundbites, and the libidinal satisfaction of watching corporate titans bend the knee to your cult leader. This is also how Trump approaches antitrust. When it comes to corporate power, both Trump and Biden's antitrust enforcers are able to strike terror into the hearts of corporate behemoths. The difference is that the Biden administration prioritized monopolists based on how harmful they were to the American people and the American economy, whereas Trump's trustbusters target companies based on whether Trump is mad at them: https://pluralistic.net/2024/11/12/the-enemy-of-your-enemy/#is-your-enemy What's more, any company willing to hand a million or two to a top Trump enforcer can just walk away from the charges: https://prospect.org/power/2025-08-19-doj-insider-blows-whistle-pay-to-play-antitrust-corruption/ In her 2023 book Doppelganger, Naomi Klein introduces the idea of a right-wing "mirror world" that offers a conspiratorial, unhinged version of actual problems that leftists wrestle with: https://pluralistic.net/2023/09/05/not-that-naomi/#if-the-naomi-be-klein-youre-doing-just-fine For example, the antivax movement claims that pharma companies operate on the basis of unchecked greed, without regard to the harm their defective products cause to everyday people. 
When they talk about this, they sound an awful lot like leftists who are angry that the Sacklers killed a million Americans with their opioids and then walked away with billions of dollars: https://pluralistic.net/2023/12/05/third-party-nonconsensual-releases/#au-recherche-du-pedos-perdue Then there are the conspiracy theories about voting machines. Progressives have been sounding the alarm about the security defects in voting machines since the Bush v Gore years, but that doesn't mean that Venezuelan hackers stole the 2020 election for Biden: https://pluralistic.net/2021/01/11/seeing-things/#ess When anti-15-minute-city weirdos warn that automated license-plate cameras are a gift to tyrants both petty and gross, they are repeating a warning that leftists have sounded since the Patriot Act: https://locusmag.com/2023/05/commentary-cory-doctorow-the-swivel-eyed-loons-have-a-point/ The mirror-world is a world where real problems (the rampant sexual abuse of children by powerful people and authority figures) are met with fake solutions (shooting up pizza parlors and transferring Ghislaine Maxwell to a country-club prison): https://www.bbc.com/news/articles/czd049y2qymo Most of the people stuck in the mirror world are poor and powerless, because desperation makes you an easy mark for grifters peddling conspiracy theories. But Trump's policies on corporate power are what happens in the mirror world inhabited by the rich and powerful. Trump is risking the economic future of every person in America (except a few cronies), but that's not the only risk here. There's also the risk that reasonable people will come to view industrial policy, government stakes in publicly supported companies, and antitrust as reckless showboating, a tactic exclusively belonging to right wing nutjobs and would-be dictators. Sociologists have a name for this: they call it "schismogenesis," when a group defines itself in opposition to its rivals. 
Schismogenesis is progressives insisting that voting machines and pharma companies are trustworthy and that James Comey is a resistance hero: https://pluralistic.net/2021/12/18/schizmogenesis/ After we get rid of Trump, America will be in tatters. We're going to need big, muscular state action to revive the nation and rebuild its economy. We can't afford to let Trump poison the well for the very idea of state intervention in corporate activity. Hey look at this (permalink) Thinking Ahead to the Full Military Takeover of Cities https://www.hamiltonnolan.com/p/thinking-ahead-to-the-full-military Framework is working on a giant haptic touchpad, Trackpoint nub, and eGPU for its laptops https://www.theverge.com/news/766161/framework-egpu-haptic-touchpad-trackpoint-nub National says "fuck you" on the right to repair https://norightturn.blogspot.com/2025/08/national-says-fuck-you-on-right-to.html?m=1 Tax the Rich. They’ll Stay https://www.rollingstone.com/politics/political-commentary/zohran-mamdani-tax-rich-new-york-city-1235414327/ Welcome to the Free Online Tax Preparation Feedback Survey https://irsresearch.gov1.qualtrics.com/jfe/form/SV_ewDJ6DeBj3ockGa Object permanence (permalink) #20yrsago Cops have to pay $41k for stopping man from videoing them https://web.archive.org/web/20050905015507/http://www.paed.uscourts.gov/documents/opinions/05D0847P.pdf #20yrsago Commercial music in podcasts: the end of free expression? 
https://memex.craphound.com/2005/08/26/commercial-music-in-podcasts-the-end-of-free-expression/ #10yrsago North Dakota cops can now use lobbyist-approved taser/pepper-spray drones https://www.thedailybeast.com/first-state-legalizes-taser-drones-for-cops-thanks-to-a-lobbyist/ #10yrsago Illinois mayor appoints failed censor to town library board https://ncac.org/news/blog/mayor-appoints-would-be-censor-to-library-board #10yrsago IBM’s lost, glorious fabric design https://collection.cooperhewitt.org/users/mepelman/visits/qtxg/87597377/ #10yrsago Former mayor of SLC suing NSA for warrantless Olympic surveillance https://www.techdirt.com/2015/08/26/prominent-salt-lake-city-residents-sue-nsa-over-mass-warrantless-surveillance-during-2002-olympics/ #10yrsago Health’s unkillable urban legend: “You must drink 8 glasses of water/day” https://www.nytimes.com/2015/08/25/upshot/no-you-do-not-have-to-drink-8-glasses-of-water-a-day.html?_r=0 #10yrsago Austin Grossman’s CROOKED: the awful, cthulhoid truth about Richard Nixon https://memex.craphound.com/2015/08/26/austin-grossmans-crooked-the-awful-cthulhoid-truth-about-richard-nixon/ #10yrsago After Katrina, FBI prioritized cellphone surveillance https://www.muckrock.com/news/archives/2015/aug/27/stingray-katrina/ #10yrsago Germany’s spy agency gave the NSA the private data of German citizens in exchange for Xkeyscore access https://www.zeit.de/digital/datenschutz/2015-08/xkeyscore-nsa-domestic-intelligence-agency #10yrsago Elaborate spear-phishing attempt against global Iranian and free speech activists, including an EFF staffer https://citizenlab.ca/2015/08/iran_two_factor_phishing/ #10yrsago Commercial for Banksy’s Dismaland https://www.youtube.com/watch?v=V2NG-MgHqEk #5yrsago Outdoor education beat TB in 1907 https://pluralistic.net/2020/08/27/cult-chalk/#tb #5yrsago Hagoromo, mathematicians' cult chalk https://pluralistic.net/2020/08/27/cult-chalk/#hagoromo #5yrsago Principles for platform regulation 
https://pluralistic.net/2020/08/27/cult-chalk/#eff-eu #5yrsago It's blursday https://pluralistic.net/2020/08/26/destroy-surveillance-capitalism/#blursday #5yrsago Surveillance Capitalism is just capitalism, plus surveillance https://pluralistic.net/2020/08/26/destroy-surveillance-capitalism/#surveillance-monopolism Upcoming appearances (permalink) Ithaca: AD White keynote (Cornell), Sep 12 https://deanoffaculty.cornell.edu/events/keynote-cory-doctorow-professor-at-large/ DC: Enshittification at Politics and Prose, Oct 8 https://politics-prose.com/cory-doctorow-10825 New Orleans: DeepSouthCon63, Oct 10-12 http://www.contraflowscifi.org/ Chicago: Enshittification with Kara Swisher (Chicago Humanities), Oct 15 https://www.oldtownschool.org/concerts/2025/10-15-2025-kara-swisher-and-cory-doctorow-on-enshittification/ San Francisco: Enshittification at Public Works (The Booksmith), Oct 20 https://app.gopassage.com/events/doctorow25 Miami: Enshittification at Books & Books, Nov 5 https://www.eventbrite.com/e/an-evening-with-cory-doctorow-tickets-1504647263469 Recent appearances (permalink) Divesting from Amazon’s Audible and the Fight for Digital Rights (Libro.fm) https://pocketcasts.com/podcasts/9349e8d0-a87f-013a-d8af-0acc26574db2/00e6cbcf-7f27-4589-a11e-93e4ab59c04b The Utopias Podcast https://www.buzzsprout.com/2272465/episodes/17650124 Tariffs vs IP Law (Firewalls Don't Stop Dragons) https://www.youtube.com/watch?v=LFABFe-5-uQ Latest books (permalink) "Picks and Shovels": a sequel to "Red Team Blues," about the heroic era of the PC, Tor Books (US), Head of Zeus (UK), February 2025 (https://us.macmillan.com/books/9781250865908/picksandshovels). "The Bezzle": a sequel to "Red Team Blues," about prison-tech and other grifts, Tor Books (US), Head of Zeus (UK), February 2024 (the-bezzle.org). "The Lost Cause:" a solarpunk novel of hope in the climate emergency, Tor Books (US), Head of Zeus (UK), November 2023 (http://lost-cause.org). 
"The Internet Con": A nonfiction book about interoperability and Big Tech (Verso) September 2023 (http://seizethemeansofcomputation.org). Signed copies at Book Soup (https://www.booksoup.com/book/9781804291245). "Red Team Blues": "A grabby, compulsive thriller that will leave you knowing more about how the world works than you did before." Tor Books http://redteamblues.com. "Chokepoint Capitalism: How to Beat Big Tech, Tame Big Content, and Get Artists Paid, with Rebecca Giblin", on how to unrig the markets for creative labor, Beacon Press/Scribe 2022 https://chokepointcapitalism.com Upcoming books (permalink) "Canny Valley": A limited edition collection of the collages I create for Pluralistic, self-published, September 2025 "Enshittification: Why Everything Suddenly Got Worse and What to Do About It," Farrar, Straus, Giroux, October 7 2025 https://us.macmillan.com/books/9780374619329/enshittification/ "Unauthorized Bread": a middle-grades graphic novel adapted from my novella about refugees, toasters and DRM, FirstSecond, 2026 "Enshittification, Why Everything Suddenly Got Worse and What to Do About It" (the graphic novel), Firstsecond, 2026 "The Memex Method," Farrar, Straus, Giroux, 2026 "The Reverse-Centaur's Guide to AI," a short book about being a better AI critic, Farrar, Straus and Giroux, 2026 Colophon (permalink) Today's top sources: Currently writing: "The Reverse Centaur's Guide to AI," a short book for Farrar, Straus and Giroux about being an effective AI critic. (1090 words yesterday, 45491 words total). A Little Brother short story about DIY insulin PLANNING This work – excluding any serialized fiction – is licensed under a Creative Commons Attribution 4.0 license. That means you can use it any way you like, including commercially, provided that you attribute it to me, Cory Doctorow, and include a link to pluralistic.net. 
https://creativecommons.org/licenses/by/4.0/ Quotations and images are not included in this license; they are included either under a limitation or exception to copyright, or on the basis of a separate license. Please exercise caution. How to get Pluralistic: Blog (no ads, tracking, or data-collection): Pluralistic.net Newsletter (no ads, tracking, or data-collection): https://pluralistic.net/plura-list Mastodon (no ads, tracking, or data-collection): https://mamot.fr/@pluralistic Medium (no ads, paywalled): https://doctorow.medium.com/ Twitter (mass-scale, unrestricted, third-party surveillance and advertising): https://twitter.com/doctorow Tumblr (mass-scale, unrestricted, third-party surveillance and advertising): https://mostlysignssomeportents.tumblr.com/tagged/pluralistic "When life gives you SARS, you make sarsaparilla" -Joey "Accordion Guy" DeVilla READ CAREFULLY: By reading this, you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. ISSN: 3066-764X

ML for SWEs 65: The AI bubble is popping and why that's a good thing

The future of the industry and how to get the most out of your AI coding assistant

Pluralistic: By all means, tread on those people (26 Aug 2025)

Today's links By all means, tread on those people: We know you love freedom, we just wish you'd share. Hey look at this: Delights to delectate. Object permanence: The right to bear cameras; GOP wants slavery for undocumented migrants; Telepresence Nazi-punching. Upcoming appearances: Where to find me. Recent appearances: Where I've been. Latest books: You keep readin' em, I'll keep writin' 'em. Upcoming books: Like I said, I'll keep writin' 'em. Colophon: All the rest. By all means, tread on those people (permalink) Just as Martin Niemöller's "First They Came" has become our framework for understanding the rise of fascism in Nazi Germany, so, too is Wilhoit's Law the best way to understand America's decline into fascism: https://en.wikipedia.org/wiki/First_They_Came In case you're not familiar with Frank Wilhoit's amazing law, here it is: Conservatism consists of exactly one proposition, to wit: There must be in-groups whom the law protects but does not bind, alongside out-groups whom the law binds but does not protect. https://crookedtimber.org/2018/03/21/liberals-against-progressives/#comment-729288 The thing that makes Wilhoit's Law so apt to this moment – and to our understanding of the recent history that produced this moment – is how it connects the petty with the terrifying, the trivial with the radical, the micro with the macro. It's a way to join the dots between fascists' business dealings, their interpersonal relationships, and their political views. It describes a continuum that ranges from minor commercial grifts to martial law, and shows how tolerance for the former creates the conditions for the latter. The gross ways in which Wilhoit's Law applies are easy to understand. 
The dollar value of corporate wage-theft far outstrips the total dollars lost to all other forms of property crime, and yet there is virtually no enforcement against bosses who steal their workers' paychecks, while petty property crimes can result in long prison sentences (depending on your skin color and/or bank balance): https://www.opportunityinstitute.org/blog/post/organized-retail-theft-wage-theft/ Elon Musk values "free speech" and insists on his right to brand innocent people as "pedos," but he also wants the courts to destroy organizations that publish their opinions about his shitty business practices: https://www.mediamatters.org/elon-musk Fascists turn crybaby when they're imprisoned for attempting a murderous coup, but buy merch celebrating the construction of domestic concentration camps where people are locked up without trial: https://officialalligatoralcatraz.com/shop That stuff is all easy to see, but I want to draw a line between these gross violations of Wilhoit's Law and pettier practices that have been creating the conditions for the present day Wilhoit Dystopia. Take terms of service. The Federalist Society – whose law library could save a lot of space by throwing away all its books and replacing them with a framed copy of Wilhoit's Law – has long held that merely glancing at a web-page or traversing the doorway of a shop is all it takes for you to enter into a "contract" by which you surrender all of your rights. Every major corporation – and many smaller ones – now routinely seeks to bind both workers and customers to garbage-novellas of onerous, unreadable legal conditions. If we accept that this is how contracts work, then this should be perfectly valid, right? 
By reading these words, you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. This indemnity will survive the termination of your relationship with your employer. I mean, why not? What principle – other than "in-groups whom the law protects but does not bind, alongside out-groups whom the law binds but does not protect" – makes terms of service valid, and this invalid? Then there's binding arbitration. Corporations routinely bind their workers and customers to terms that force them to surrender their right to sue, no matter how badly they are injured through malice or gross negligence. 
This practice used to be illegal, until Antonin Scalia opened the hellmouth and unleashed binding arbitration on the world: https://brooklynworks.brooklaw.edu/cgi/viewcontent.cgi?article=1443&&context=blr There's a pretty clever hack around binding arbitration: mass arbitration, whereby lots of wronged people coordinate to file claims, which can cost a dirty corporation more than a plain old class-action suit: https://pluralistic.net/2021/06/02/arbitrary-arbitration/#petard Of course, Wilhoit's Law provides corporations with a way around this: they can reserve the right not to arbitrate and to force you into a class action suit if that's advantageous to them: https://pluralistic.net/2025/08/15/dogs-breakfast/#by-clicking-this-you-agree-on-behalf-of-your-employer-to-release-me-from-all-obligations-and-waivers-arising-from-any-and-all-NON-NEGOTIATED-agreements Heads they win, tails you lose. Or take the nature of property rights themselves. Conservatives say they revere property rights above all else, claiming that every other human right stems from the vigorous enforcement of property relations. What is private property? For that, we turn to the key grifter thinkfluencer Sir William Blackstone, and his 1768 "Commentaries on the Laws of England": That sole and despotic dominion which one man claims and exercises over the external things of the world, in total exclusion of the right of any other individual in the universe. https://oll.libertyfund.org/pages/blackstone-on-property-1753 Corporations love the idea of their property rights, but they're not so keen on your property rights. 
Think of the practice of locking down digital devices – from phones to cars to tractors – so that they can't be repaired by third parties, use generic ink or parts, or load third-party apps except via an "app store": https://memex.craphound.com/2012/01/10/lockdown-the-coming-war-on-general-purpose-computing/ A device you own, but can only use in ways that its manufacturer approves of, sure doesn't sound like "sole and despotic dominion" to me. Some corporations (and their weird apologists) like to claim that, by buying their product, you've agreed not to use it except in ways that benefit their shareholders, even when that is to your own detriment: https://pluralistic.net/2024/01/12/youre-holding-it-wrong/#if-dishwashers-were-iphones Apple will say, "We've been selling iPhones for nearly 20 years now. It can't possibly come as a surprise to you that you're not allowed to install apps that we haven't approved. If that's important to you, you shouldn't have bought an iPhone." But the obvious rejoinder to this is, "People have been given sole and despotic dominion over the things they purchased since time immemorial. If the thought of your customers using their property in ways that displease you causes you to become emotionally dysregulated, perhaps you shouldn't have gotten into the manufacturing business." But as indefensibly wilhoitian as Apple's behavior might be, Google has just achieved new depths of wilhoitian depravity, with a rule that says that starting soon, you will no longer be able to install apps of your choosing on your Android device unless Google first approves of them: https://9to5google.com/2025/08/25/android-apps-developer-verification/ Like Apple, Google says that this is to prevent you from accidentally installing malicious software. Like Apple, Google does put a lot of effort into preventing its customers from being remotely attacked. 
And, like Apple, Google will not protect you from itself: https://pluralistic.net/2023/02/05/battery-vampire/#drained When it comes to vetoing your decisions about which programs your Android device can run, Google has an irreconcilable conflict of interest. Google, after all, is a thrice-convicted monopolist that has an interest in blocking you from installing programs that interfere with its profits, under the pretense of preventing you from coming to harm. And – like Apple – Google has a track record of selling its users out to oppressive governments. Apple blocked all working privacy tools for its Chinese users at the behest of the Chinese government, while Google secretly planned to release a version of its search engine that would enforce Chinese censorship edicts and help the Chinese government spy on its people: https://en.wikipedia.org/wiki/Dragonfly_(search_engine) Google's CEO, Sundar Pichai, personally gave one million dollars to Donald Trump for a seat on the dais at this year's inauguration (so did Apple CEO Tim Cook). Both men are in a position to help the self-described dictator make good on his promise to spy on and arrest Americans who disagree with his totalitarian edicts. All of this makes Google's announcement extraordinarily reckless, but also very, very wilhoitian. After all, Google jealously guards its property rights from you, but insists that your property rights need to be subordinated to its corporate priorities: "in-groups whom the law protects but does not bind, alongside out-groups whom the law binds but does not protect." We can see this at work in the way that Google treats open source software and free software. Google's software is "open source" – for us. We have the right to look at the code and do free work for Google to identify and fix bugs in the code. But only Google gets a say in how that code is deployed on its cloud servers. 
They have software freedom, while we merely have software transparency: https://pluralistic.net/2025/07/14/pole-star/#gnus-not-utilitarian Big companies love to assert their own property rights while denying you yours. Take the music industry: they are required to pay different royalties to musicians depending on whether they're "selling" music, or "licensing" music. Sales pay a fraction of the royalties of a licensing deal, so it's far better for musicians when their label licenses their music than when they sell it. When you or I click the "buy" button in an online music store, we are confronted with a "licensing agreement" that limits what we may do with our digital purchase. Things that you get automatically when you buy music in physical form – on a CD, say – are withheld through these agreements. You can't re-sell your digital purchases as used goods. You can't give them away. You can't lend them out. You can't divide them up in a divorce. You can't leave them to your kids in your will. It's not a sale, so the file isn't your property. But when the label accounts for that licensing deal to a musician, the transaction is booked as a sale, which entitles the creative worker to a fraction of the royalties that they'd get from a license. Somehow, digital media exists in quantum superposition: it is a licensing deal when we click the buy button, but it is a sale when it shows up on a royalty statement. 
It's Schroedinger's download: https://pluralistic.net/2022/06/21/early-adopters/#heads-i-win Now, a class action suit against Amazon over this very issue has been given leave to progress to trial: https://www.hollywoodreporter.com/business/business-news/prime-video-lawsuit-movie-license-ownership-1236353127/ The plaintiffs insist that because Amazon showed them a button that said, "Buy this video" but then slapped it with licensing conditions that take away all kinds of rights (Amazon can even remotely delete your videos after you "buy" them) that they have been ripped off in a bait-and-switch. Amazon's defense is amazing. They've done what any ill-prepared fifth grader would do when called on the carpet; they quoted Webster's: Quoting Webster’s Dictionary, it said that the term means “rights to the use or services of payment” rather than perpetual ownership and that its disclosures properly warn people that they may lose access. People are increasingly pissed off with this bullshit, whereby things that you "buy" are not yours, and your access to them can be terminated at any time. The Stop Killing Games campaign is pushing for the rights of gamers to own the games they buy forever, even if the company decides to shut down its servers: https://www.stopkillinggames.com/ I've been pissed off about this bullshit since forever. It's one of the main reasons I convinced my publishers to let me sell my own ebooks and audiobooks, out of my own digital storefront. All of those books are sold, not licensed, and come without any terms or conditions: https://craphound.com/shop/ The ability to change the terms after the sale is a major source of enshittification. I call it the "Darth Vader MBA," as in "I am altering the deal. 
Pray I do not alter it any further": https://pluralistic.net/2023/10/26/hit-with-a-brick/#graceful-failure Naturally the ebooks and audiobooks in the Kickstarter for pre-sales of my next book, Enshittification, are also sold without any terms and conditions: https://www.kickstarter.com/projects/doctorow/enshittification-the-drm-free-audiobook/ Look, I don't think that personal consumption choices can fix systemic problems. You're not going to fix enshittification – let alone tyranny – by shopping, even if you're very careful: https://pluralistic.net/2025/07/31/unsatisfying-answers/#systemic-problems But that doesn't mean that there isn't a connection between the unfair bullshit that monopolies cram down our throat and the rise of fascism. It's not just that the worst enshittifiers are also the biggest Trump donors, it's that Wilhoit's Law powers enshittification. Wilhoitism is shot through the Maga movement. The Flu Klux Klan wants to ban you from wearing a mask for health reasons, but they will defend to the death the right of ICE brownshirts to run around in gaiters and Oakleys as they kidnap our neighbors off the streets. Conservative bedwetters will donate six figures to a Givesendgo set up by some crybaby with a viral Rumble video about getting 86'ed from a restaurant for wearing a Maga hat, but they literally want to imprison trans people for wearing clothes that don't conform to their assigned-at-birth genders. They'll piss and moan about being "canceled" because of hecklers at the speeches they give for the campus chapter of the Hitler Youth, but they experience life-threatening priapism when students who object to the Israeli genocide of Palestinians are expelled, arrested and deported. Then there's their abortion policies, which hold that personhood begins at conception, but ends at birth, and can only be re-established by forming an LLC. 
It's "in-groups whom the law protects but does not bind, alongside out-groups whom the law binds but does not protect" all the way down. I'm not saying that bullshit terms of service, wage theft, binding arbitration gotchas, or victim complexes about your kids going no-contact because you won't shut the fuck up about "the illegals" at Thanksgiving are the same as the actual fascist dictatorship being born around us right now or the genocide taking place in Gaza. But I am saying that they come from the same place. The ideology of "in-groups whom the law protects but does not bind, alongside out-groups whom the law binds but does not protect" underpins the whole ugly mess. After we defeat these fucking fascists, after the next installment of the Nuremberg trials, after these eichmenn and eichwomenn get their turns in the dock, we're going to have to figure out how to keep them firmly stuck to the scrapheap of history. For this, I propose a form of broken windows policing: zero-tolerance for any activity or conduct that implies that there are "in-groups whom the law protects but does not bind, alongside out-groups whom the law binds but does not protect." We should treat every attempt to pull any of these scams as an inch (or a yard, or a mile) down the road to fascist collapse. We shouldn't suffer practitioners of this ideology to be in our company, to run our institutions, or to work alongside us. We should recognize them for the monsters they are. Hey look at this (permalink) Citizen Is Using AI to Generate Crime Alerts With No Human Review. 
It’s Making a Lot of Mistakes https://www.404media.co/citizen-is-using-ai-to-generate-crime-alerts-with-no-human-review-its-making-a-lot-of-mistakes/ How To Argue With An AI Booster https://www.wheresyoured.at/how-to-argue-with-an-ai-booster/ We must fight age verification with all we have https://www.usermag.co/p/we-must-fight-age-verification-with Sqinks: A Transreal Cyberpunk Love Story https://www.kickstarter.com/projects/rudyrucker/sqinks LibreOffice 25.8: a Strategic Asset for Governments and Enterprises Focused on Digital Sovereignty and Privacy https://blog.documentfoundation.org/blog/2025/08/25/libreoffice-25-8-backgrounder/ Object permanence (permalink) #20yrsago Oakland sheriffs detain people for carrying cameras https://thomashawk.com/2005/08/right-to-bear-cameras.html #10yrsago New Zealand gov’t promises secret courts for accused terrorists https://www.nzherald.co.nz/nz/attorney-general-says-law-society-got-it-wrong-over-secret-courts/E5JHYBTMVSIBZ62UNGEWB4DPEA/?c_id=1&objectid=11503094 #10yrsago Platform Cooperativism: a worker-owned Uber for everything https://platformcoop.net/ #10yrsago GOP “kingmaker” proposes enslavement as an answer to undocumented migrants https://www.thedailybeast.com/iowa-gop-kingmaker-has-a-slavery-proposal-for-immigration/ #10yrsago Six years after unprovoked beating, Denver cop finally fired https://kdvr.com/news/video-evidence-determined-fate-of-denver-officer-in-excessive-force-dispute-fired-after-6-years/ #10yrsago Samsung fridges can leak your Gmail logins https://web.archive.org/web/20150825014450/https://www.pentestpartners.com/blog/hacking-defcon-23s-iot-village-samsung-fridge/ #10yrsago German student ditches apartment, buys an unlimited train pass https://www.washingtonpost.com/news/worldviews/wp/2015/08/22/how-one-german-millennial-chose-to-live-on-trains-rather-than-pay-rent/ #10yrsago Ashley Madison’s founding CTO claimed he hacked competing dating site 
https://www.wired.com/2015/08/ashley-madison-leak-reveals-ex-cto-hacked-competing-site/ #5yrsago Telepresence Nazi-punching https://pluralistic.net/2020/08/25/anxietypunk/#smartibots #5yrsago Ballistic Kiss https://pluralistic.net/2020/08/25/anxietypunk/#bk Upcoming appearances (permalink) Ithaca: AD White keynote (Cornell), Sep 12 https://deanoffaculty.cornell.edu/events/keynote-cory-doctorow-professor-at-large/ DC: Enshittification at Politics and Prose, Oct 8 https://politics-prose.com/cory-doctorow-10825 New Orleans: DeepSouthCon63, Oct 10-12 http://www.contraflowscifi.org/ Chicago: Enshittification with Kara Swisher (Chicago Humanities), Oct 15 https://www.oldtownschool.org/concerts/2025/10-15-2025-kara-swisher-and-cory-doctorow-on-enshittification/ San Francisco: Enshittification at Public Works (The Booksmith), Oct 20 https://app.gopassage.com/events/doctorow25 Miami: Enshittification at Books & Books, Nov 5 https://www.eventbrite.com/e/an-evening-with-cory-doctorow-tickets-1504647263469 Recent appearances (permalink) Divesting from Amazon’s Audible and the Fight for Digital Rights (Libro.fm) https://pocketcasts.com/podcasts/9349e8d0-a87f-013a-d8af-0acc26574db2/00e6cbcf-7f27-4589-a11e-93e4ab59c04b The Utopias Podcast https://www.buzzsprout.com/2272465/episodes/17650124 Tariffs vs IP Law (Firewalls Don't Stop Dragons) https://www.youtube.com/watch?v=LFABFe-5-uQ Latest books (permalink) "Picks and Shovels": a sequel to "Red Team Blues," about the heroic era of the PC, Tor Books (US), Head of Zeus (UK), February 2025 (https://us.macmillan.com/books/9781250865908/picksandshovels). "The Bezzle": a sequel to "Red Team Blues," about prison-tech and other grifts, Tor Books (US), Head of Zeus (UK), February 2024 (the-bezzle.org). "The Lost Cause:" a solarpunk novel of hope in the climate emergency, Tor Books (US), Head of Zeus (UK), November 2023 (http://lost-cause.org). 
"The Internet Con": A nonfiction book about interoperability and Big Tech (Verso) September 2023 (http://seizethemeansofcomputation.org). Signed copies at Book Soup (https://www.booksoup.com/book/9781804291245). "Red Team Blues": "A grabby, compulsive thriller that will leave you knowing more about how the world works than you did before." Tor Books http://redteamblues.com. "Chokepoint Capitalism: How to Beat Big Tech, Tame Big Content, and Get Artists Paid, with Rebecca Giblin", on how to unrig the markets for creative labor, Beacon Press/Scribe 2022 https://chokepointcapitalism.com Upcoming books (permalink) "Canny Valley": A limited edition collection of the collages I create for Pluralistic, self-published, September 2025 "Enshittification: Why Everything Suddenly Got Worse and What to Do About It," Farrar, Straus, Giroux, October 7 2025 https://us.macmillan.com/books/9780374619329/enshittification/ "Unauthorized Bread": a middle-grades graphic novel adapted from my novella about refugees, toasters and DRM, FirstSecond, 2026 "Enshittification, Why Everything Suddenly Got Worse and What to Do About It" (the graphic novel), Firstsecond, 2026 "The Memex Method," Farrar, Straus, Giroux, 2026 "The Reverse-Centaur's Guide to AI," a short book about being a better AI critic, Farrar, Straus and Giroux, 2026 Colophon (permalink) Today's top sources: Currently writing: "The Reverse Centaur's Guide to AI," a short book for Farrar, Straus and Giroux about being an effective AI critic. (1019 words yesterday, 42282 words total). A Little Brother short story about DIY insulin PLANNING This work – excluding any serialized fiction – is licensed under a Creative Commons Attribution 4.0 license. That means you can use it any way you like, including commercially, provided that you attribute it to me, Cory Doctorow, and include a link to pluralistic.net. 
https://creativecommons.org/licenses/by/4.0/ Quotations and images are not included in this license; they are included either under a limitation or exception to copyright, or on the basis of a separate license. Please exercise caution. How to get Pluralistic: Blog (no ads, tracking, or data-collection): Pluralistic.net Newsletter (no ads, tracking, or data-collection): https://pluralistic.net/plura-list Mastodon (no ads, tracking, or data-collection): https://mamot.fr/@pluralistic Medium (no ads, paywalled): https://doctorow.medium.com/ Twitter (mass-scale, unrestricted, third-party surveillance and advertising): https://twitter.com/doctorow Tumblr (mass-scale, unrestricted, third-party surveillance and advertising): https://mostlysignssomeportents.tumblr.com/tagged/pluralistic "When life gives you SARS, you make sarsaparilla" -Joey "Accordion Guy" DeVilla READ CAREFULLY: By reading this, you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. ISSN: 3066-764X
