This guide is patterned after my “Doing well in your courses”, a post I wrote a long time ago on some of the tips/tricks I’ve developed during my undergrad. I’ve received nice comments about that guide, so in the same spirit, now that my PhD has come to an end I wanted to compile a similar retrospective document in hopes that it might be helpful to some. Unlike the undergraduate guide, this one was much more difficult to write because there is significantly more variation in how one can traverse the PhD experience. Therefore, many things are likely contentious and a good fraction will be specific to what I’m familiar with (Computer Science / Machine Learning / Computer Vision research). But disclaimers are boring, let’s get to it!

Preliminaries

First, should you want to get a PhD? I was in a fortunate position of knowing since a young age that I really wanted a PhD. Unfortunately it wasn’t for any very well-thought-through considerations: First, I really liked school and learning things...


More from Andrej Karpathy blog

Self-driving as a case study for AGI

Sparked by progress in Large Language Models (LLMs), there’s a lot of chatter recently about AGI, its timelines, and what it might look like. Some of it is hopeful and optimistic, but a lot of it is fearful and doomy, to put it mildly. Unfortunately, a lot of it is also very abstract, which causes people to speak past each other in circles. Therefore, I’m always on the lookout for concrete analogies and historical precedents that can help explore the topic in more grounded terms. In particular, when I am asked what I think AGI will look like, I personally like to point to self-driving. In this post, I’d like to explain why.

Let’s start with one common definition of AGI:

AGI: An autonomous system that surpasses human capabilities in the majority of economically valuable work.

Note that there are two specific requirements in this definition. First, it is a system that has full autonomy, i.e. it operates on its own with very little to no human supervision. Second, it operates autonomously across the majority of economically valuable work. To make this part concrete, I personally like to refer to the U.S. Bureau of Labor Statistics index of occupations. A system that has both of these properties is what we would call an AGI.

What I would like to suggest in this post is that recent developments in our ability to automate driving are a very good early case study of the societal dynamics of increasing automation, and by extension of what AGI in general will look and feel like. I think this is because of a few features of this space that loosely add up to “it is a big deal”: self-driving is very accessible and visible to society (cars with no drivers on the streets!), it is a large part of the economy by size, it presently employs a large human workforce (e.g. think Uber/Lyft drivers), and driving is a sufficiently difficult problem to automate, but automate it we did (ahead of many other sectors of the economy), and society has noticed and is responding to it. There are of course other industries that have also been dramatically automated, but either I am personally less familiar with them, or they fall short of some of the properties above.

partial automation

As a “sufficiently difficult” problem in AI, the automation of driving did not pop into existence out of nowhere; it is the result of a gradual process of automating the driving task, with a lot of “tool AI” intermediates. In vehicle autonomy, many cars are now manufactured with a “Level 2” driver assist - an AI that collaborates with a human to get from point A to point B. It is not fully autonomous but it handles a lot of the low-level details of driving. Sometimes it automates entire maneuvers, e.g. the car might park for you. The human primarily acts as the supervisor of this activity, but can in principle take over at any time and perform the driving task, or issue a high-level command (e.g. request a lane change). In some cases (e.g. lane following and quick decision making), the AI outperforms human capability, but it can still fall short of it in rare scenarios.

This is analogous to a lot of the tool AIs we are starting to see deployed in other industries, especially with the recent capability unlock due to Large Language Models (LLMs). For example, as a programmer, when I use GitHub Copilot to auto-complete a block of code, or GPT-4 to write a bigger function, I am handing off the low-level details to the automation, but in the exact same way I can also step in with an “intervention” should the need arise. That is, Copilot and GPT-4 are Level 2 programming. There are many Level 2 automations across the industry, not all of them necessarily based on LLMs - from TurboTax, to robots in Amazon warehouses, to many other “tool AIs” in translation, writing, art, legal, marketing, etc.

full automation

At some point, these systems cross a threshold of reliability and become what Waymo looks like today. They creep into the realm of full autonomy. In San Francisco today, you can open up an app and call a Waymo instead of an Uber. A driverless car will pull up and take you, a paying customer, to your destination. This is amazing. You need not know how to drive, you need not pay attention, you can lean back and take a nap while the system transports you from A to B. Like many others I’ve talked to, I personally prefer to take a Waymo over an Uber and I’ve switched to it almost exclusively for within-city transportation. You get a much more low-variance, reproducible experience, the driving is smooth, you can play music, and you can chat with friends without spending mental resources thinking about what the driver is thinking as they listen to you.

the mixed economy of full automation

And yet, even though autonomous driving technology now exists, there are still plenty of people calling an Uber alongside. How come? Well, first, many people simply don’t even know that you can call a Waymo. But even if they do, many people don’t fully trust the automated system just yet and prefer to have a human drive them. And even if they did, many people might simply prefer a human driver, and e.g. enjoy the talk and banter and getting to know other people. Beyond preferences alone, judging by the increasing wait times in the app today, Waymo is supply constrained. There are not enough cars to meet the demand. A part of this may be that Waymo is being very careful to manage and monitor risk and public opinion. Another part is that Waymo, I believe (?), has a quota from regulators on how many cars they are allowed to have deployed on the streets. Another rate-limiter is that Waymos can’t just replace all the Ubers right away in a snap of a finger - they have to build out the infrastructure, build the cars, and scale their operations.

I posit that all kinds of automations in other sectors of the economy will look identical: some people/companies will use them immediately, but a lot of people 1) won’t know about them, 2) if they do, won’t trust them, and 3) even if they did, would still prefer to employ and work with a human. And on top of that, demand will be greater than supply. AGI would be constrained in exactly all of these ways, for exactly all of the same reasons - some amount of self-restraint from the developers, some amount of regulation, and some amount of simple, straight-up resource shortage, e.g. needing to build out more GPU datacenters.

the globalization of full automation

As I already hinted at with resource constraints, the full globalization of this technology is still very expensive, work-intensive, and rate-limited. Today, Waymo can only drive in San Francisco and Phoenix, but the approach itself is fairly general and scalable, so the company might e.g. soon expand to LA, Austin, etc. The product may also still be constrained by other environmental factors, e.g. driving in heavy snow. And in some rare cases, it might even need rescue from a human operator. The expansion of capability does not come “for free”. For example, Waymo has to expend resources to enter a new city. They have to establish a presence, map the streets, and adjust the perception and planner/controller to unique situations, or to local rules and regulations specific to that area. In our working analogy, many jobs may have full autonomy only in some settings or conditions, and expanding the coverage will require work and effort. In both cases, the approach itself is general and scalable and the frontier will expand, but it can only do so over time.

society reacts

Another aspect that I find fascinating about the ongoing introduction of self-driving to society is that just a few years ago there was a ton of commentary and FUD everywhere about “will it”, “won’t it” work, whether it is even possible or not, and it was a whole thing. And now self-driving is actually here. Not as a research prototype but as a product - I can exchange money for fully automated transportation. In its present operating range, the industry has reached full autonomy. And yet, overall, it’s almost like no one cares. Most people I talk to (even in tech!) don’t even know that this happened. When your Waymo is driving through the streets of SF, you’ll see many people look at it as an oddity. First they are surprised and stare. Then they seem to move on with their lives.

When full autonomy gets introduced in other industries, maybe the world doesn’t just blow up in a storm either. The majority of people may not even realize it at first. When they do, they might stare and then shrug, in a way that ranges anywhere from denial to acceptance. Some people get really upset about it, and do the equivalent of putting cones on Waymos in protest, whatever the equivalent of that may be. Of course, we’ve come nowhere close to seeing this aspect fully play out just yet, but when it does I expect it to be broadly predictive.

economic impact

Let’s turn to jobs. Certainly, and visibly, Waymo has deleted the driver of the car. But it has also created a lot of other jobs that were not there before and are a lot less visible: the human labeler helping to collect training data for neural networks, the support agent who remotely connects to vehicles that run into any trouble, the people building and maintaining the car fleet, the maps, etc. An entire new industry of various sensors and related infrastructure is created to assemble these highly-instrumented, high-tech cars in the first place. In the same way with work more generally, many jobs will change, some jobs will disappear, but many new jobs will appear, too. It is much more a refactoring of work than a direct deletion, even if the deletion is the most prominent part. It’s hard to argue that the overall numbers won’t trend down at some point and over time, but this happens significantly slower than a person naively looking at the situation might think.

competitive landscape

The final aspect I’d like to consider is the competitive landscape. A few years ago there were many, many self-driving car companies. Today, in recognition of the difficulty of this problem (which I think is only just barely possible to automate given the current state of the art in AI and computing more generally), the ecosystem has significantly consolidated, and Waymo has reached the first feature-complete demonstration of the self-driving future. However, a number of companies are in pursuit, including e.g. Cruise, Zoox, and of course, my personal favorite :), Tesla. A brief note here, given my specific history and involvement with this space. As I see it, the ultimate goal of the self-driving industry is to achieve full autonomy globally. Waymo has taken the strategy of first going for autonomy and then scaling globally, while Tesla has taken the strategy of first going globally and then scaling autonomy. Today, I am a happy customer of the products of both companies and, personally, I cheer for the technology overall first. However, one company has a lot of primarily software work remaining while the other has a lot of primarily hardware work remaining. I have my bets for which one goes faster. All that said, in the same way, many other sectors of the economy may go through a time of rapid growth and expansion (think the ~2015 era of self-driving), but if the analogy holds, only to later consolidate into a small few companies battling it out. And in the midst of it all, there will be a lot of actively used Tool AIs (think: today’s Level 2 ADAS features), and even some open platforms (think: Comma).

AGI

So these are the broad strokes of what I think AGI will look like. Now just copy-paste this across the economy in your mind, happening at different rates, and with all kinds of difficult-to-predict interactions and second-order effects. I don’t expect it to hold perfectly, but I expect it to be a useful model to have in mind and to draw on. On a kind of memetic spectrum, it looks a lot less like a recursively self-improving superintelligence that escapes our control into cyberspace to manufacture deadly pathogens or nanobots that turn the galaxy into gray goo. And it looks a lot more like self-driving - the part of our economy that is currently speed-running the development of a major, society-altering automation. It has a gradual progression, it has society as an observer and a participant, and its expansion is rate-limited in a large variety of ways, including regulation and the resources of an educated human workforce, information, material, and energy. The world doesn’t explode; it adapts, changes, and refactors. In self-driving specifically, the automation of transportation will make it a lot safer, cities will become a lot less smoggy and congested, and parking lots and parked cars will disappear from the sides of our roads to make more space for people. I personally very much look forward to what all the equivalents of that might be with AGI.

Deep Neural Nets: 33 years ago and 33 years from now

The Yann LeCun et al. (1989) paper Backpropagation Applied to Handwritten Zip Code Recognition is, I believe, of some historical significance because it is, to my knowledge, the earliest real-world application of a neural net trained end-to-end with backpropagation. Except for the tiny dataset (7291 16x16 grayscale images of digits) and the tiny neural network used (only 1,000 neurons), this paper reads remarkably modern today, 33 years later - it lays out a dataset, describes the neural net architecture, loss function, and optimization, and reports the experimental classification error rates over training and test sets. It’s all very recognizable and type checks as a modern deep learning paper, except it is from 33 years ago. So I set out to reproduce the paper 1) for fun, but also 2) to use the exercise as a case study on the nature of progress in deep learning.

Implementation. I tried to follow the paper as closely as possible and re-implemented everything in PyTorch in this karpathy/lecun1989-repro github repo. The original network was implemented in Lisp using the Bottou and LeCun 1988 backpropagation simulator SN (later named Lush). The paper is in French so I can’t super read it, but from the syntax it looks like you can specify neural nets using a higher-level API similar to what you’d do in something like PyTorch today. As a quick note on software design, modern libraries have adopted a design that splits into 3 components: 1) a fast (C/CUDA) general Tensor library that implements basic mathematical operations over multi-dimensional tensors, 2) an autograd engine that tracks the forward compute graph and can generate operations for the backward pass, and 3) a scriptable (Python) deep-learning-aware, high-level API of common deep learning operations, layers, architectures, optimizers, loss functions, etc.

Training. During the course of training we have to make 23 passes over the training set of 7291 examples, for a total of 167,693 presentations of (example, label) to the neural network. The original network trained for 3 days on a SUN-4/260 workstation. I ran my implementation on my MacBook Air (M1) CPU, which crunched through it in about 90 seconds (~3000X naive speedup). My conda is set up to use the native arm64 builds, rather than Rosetta emulation. The speedup may have been more dramatic if PyTorch had support for the full capability of the M1 (including the GPU and the NPU), but this seems to still be in development. I also tried naively running the code on an A100 GPU, but the training was actually slower, most likely because the network is so tiny (a 4-layer convnet with up to 12 channels, a total of 9760 params, 64K MACs, 1K activations), and the SGD uses only a single example at a time. That said, if one really wanted to crush this problem with modern hardware (A100) and software infrastructure (CUDA, PyTorch), we’d need to trade per-example SGD for full-batch training to maximize GPU utilization and most likely achieve another ~100X speedup of training latency.

Reproducing 1989 performance. The original paper reports the following results:

eval: split train. loss 2.5e-3. error 0.14%. misses: 10
eval: split test . loss 1.8e-2. error 5.00%. misses: 102

While my training script repro.py in its current form prints at the end of the 23rd pass:

eval: split train. loss 4.073383e-03. error 0.62%. misses: 45
eval: split test . loss 2.838382e-02. error 4.09%. misses: 82

So I am reproducing the numbers roughly, but not exactly. Sadly, an exact reproduction is most likely not possible because the original dataset has, I believe, been lost to time. Instead, I had to simulate it using the larger MNIST dataset (hah, never thought I’d say that) by taking its 28x28 digits, scaling them down to 16x16 pixels with bilinear interpolation, and randomly, without replacement, drawing the correct number of training and test set examples from it.
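For concreteness, here is a rough sketch (mine, not the code from the repro repo) of the kind of dataset simulation just described, assuming torchvision’s MNIST loader and standard PyTorch ops; the 7291/2007 split sizes match the numbers used in the post:

import torch
import torch.nn.functional as F
from torchvision import datasets

def simulate_1989_split(n: int, train: bool = True, seed: int = 0):
    # take MNIST 28x28 digits, squash to roughly [-1, 1], downscale to 16x16 with
    # bilinear interpolation, and draw n examples at random without replacement
    mnist = datasets.MNIST('.', train=train, download=True)
    g = torch.Generator().manual_seed(seed)
    idx = torch.randperm(len(mnist), generator=g)[:n]
    imgs = mnist.data[idx].float().unsqueeze(1) / 127.5 - 1.0   # (n, 1, 28, 28)
    imgs = F.interpolate(imgs, size=(16, 16), mode='bilinear', align_corners=False)
    return imgs, mnist.targets[idx]

Xtr, Ytr = simulate_1989_split(7291, train=True)
Xte, Yte = simulate_1989_split(2007, train=False)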
But I am sure there are other culprits at play. For example, the paper is a bit too abstract in its description of the weight initialization scheme, and I suspect that there are some formatting errors in the pdf file that, for example, erase dots “.”, making “2.5” look like “2 5”, and potentially (I think?) erasing square roots. E.g. we’re told that the weight init is drawn from uniform “2 4 / F” where F is the fan-in, but I am guessing this surely (?) means “2.4 / sqrt(F)”, where the sqrt helps preserve the standard deviation of the outputs. The specific sparse connectivity structure between the H1 and H2 layers of the net is also brushed over; the paper just says it is “chosen according to a scheme that will not be discussed here”, so I had to make some sensible guesses here with an overlapping block-sparse structure. The paper also claims to use a tanh non-linearity, but I am worried this may have actually been the “normalized tanh” that maps ntanh(1) = 1, potentially with an added scaled-down skip connection, which was trendy at the time to ensure there is at least a bit of gradient in the flat tails of the tanh. Lastly, the paper uses a “special version of Newton’s algorithm that uses a positive, diagonal approximation of Hessian”, but I only used SGD because it is significantly simpler and, according to the paper, “this algorithm is not believed to bring a tremendous increase in learning speed”.

Cheating with time travel. Around this point came my favorite part. We are living here 33 years in the future and deep learning is a highly active area of research. How much can we improve on the original result using our modern understanding and 33 years of R&D? My original result was:

eval: split train. loss 4.073383e-03. error 0.62%. misses: 45
eval: split test . loss 2.838382e-02. error 4.09%. misses: 82

The first thing I was a bit sketched out about is that we are doing simple classification into 10 categories, but at the time this was modeled as a mean squared error (MSE) regression onto targets -1 (for the negative class) or +1 (for the positive class), with output neurons that also had the tanh non-linearity. So I deleted the tanh on the output layers to get class logits and swapped in the standard (multiclass) cross entropy loss function. This change dramatically improved the training error, completely overfitting the training set:

eval: split train. loss 9.536698e-06. error 0.00%. misses: 0
eval: split test . loss 9.536698e-06. error 4.38%. misses: 87

I suspect one has to be much more careful with weight initialization details if your output layer has the (saturating) tanh non-linearity and an MSE error on top of it.
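As a minimal illustration (my sketch, not the repro code) of what this loss swap amounts to in PyTorch, with logits standing in for the network’s final 10-dimensional pre-activation outputs:

import torch
import torch.nn.functional as F

logits = torch.randn(8, 10)              # hypothetical (batch, 10 classes) outputs of the last layer
targets = torch.randint(0, 10, (8,))     # integer class labels

# 1989-style objective: tanh outputs regressed onto -1/+1 targets with MSE
mse_targets = torch.full((8, 10), -1.0).scatter_(1, targets[:, None], 1.0)
loss_1989 = ((torch.tanh(logits) - mse_targets) ** 2).mean()

# modern objective: keep the raw logits and use multiclass cross entropy
loss_modern = F.cross_entropy(logits, targets)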
Next, in my experience a very finely-tuned SGD can work very well, but the modern Adam optimizer (learning rate of 3e-4, of course :)) is almost always a strong baseline and needs little to no tuning. So to improve my confidence that optimization was not holding back performance, I switched to AdamW with LR 3e-4, decayed down to 1e-4 over the course of training, giving:

eval: split train. loss 0.000000e+00. error 0.00%. misses: 0
eval: split test . loss 0.000000e+00. error 3.59%. misses: 72

This gave a slightly improved result on top of SGD, except we also have to remember that a little bit of weight decay came along for the ride via the default parameters, which helps fight the overfitting situation.

As we are still heavily overfitting, next I introduced a simple data augmentation strategy where I shift the input images by up to 1 pixel horizontally or vertically. However, because this simulates an increase in the size of the dataset, I also had to increase the number of passes from 23 to 60 (I verified that just naively increasing the passes in the original setting did not substantially improve results):

eval: split train. loss 8.780676e-04. error 1.70%. misses: 123
eval: split test . loss 8.780676e-04. error 2.19%. misses: 43

As can be seen in the test error, that helped quite a bit! Data augmentation is a fairly simple and very standard concept used to fight overfitting, but I didn’t see it mentioned in the 1989 paper; perhaps it was a more recent innovation (?). Since we are still overfitting a bit, I reached for another modern tool in the toolbox, dropout. I added a weak dropout of 0.25 just before the layer with the largest number of parameters (H3). Because dropout sets activations to zero, it doesn’t make as much sense to use it with tanh, which has an active range of [-1,1], so I swapped all non-linearities to the much simpler ReLU activation function as well. Because dropout introduces even more noise during training, we also have to train longer, bumping up to 80 passes, but giving:

eval: split train. loss 2.601336e-03. error 1.47%. misses: 106
eval: split test . loss 2.601336e-03. error 1.59%. misses: 32

Which brings us down to only 32 / 2007 mistakes on the test set! I verified that just swapping tanh -> relu in the original network did not give substantial gains, so most of the improvement here is coming from the addition of dropout.
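Roughly, the two regularizers described above look like the following sketch (mine, not the repro code; the Conv2d is just a stand-in layer, not the 1989 architecture):

import torch
import torch.nn.functional as F

def shift_augment(x: torch.Tensor) -> torch.Tensor:
    # shift a batch of (B, 1, 16, 16) images by up to 1 pixel in x and y,
    # via zero padding followed by a random 16x16 crop
    padded = F.pad(x, (1, 1, 1, 1))
    dx, dy = torch.randint(0, 3, (2,)).tolist()
    return padded[:, :, dy:dy + 16, dx:dx + 16]

x = shift_augment(torch.randn(8, 1, 16, 16))      # hypothetical input batch
h = F.relu(torch.nn.Conv2d(1, 12, 5)(x))          # ReLU instead of tanh
h = F.dropout(h, p=0.25, training=True)           # weak dropout just before the big layer (H3)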
In summary, if I time traveled to 1989 I’d be able to cut the rate of errors by about 60%, taking us from ~80 to ~30 mistakes, and an overall error rate of ~1.5% on the test set. This gain did not come completely free, because we also almost 4X’d the training time, which would have increased the 1989 training time from 3 days to almost 12. But the inference latency would not have been impacted. The remaining errors are here:

Going further. However, after swapping MSE -> Softmax, SGD -> AdamW, adding data augmentation, dropout, and swapping tanh -> relu, I’ve started to taper out on the low hanging fruit of ideas. I tried a few more things (e.g. weight normalization), but did not get substantially better results. I also tried to miniaturize a Vision Transformer (ViT) into a “micro-ViT” that roughly matches the number of parameters and flops, but couldn’t match the performance of a convnet. Of course, many other innovations have been made in the last 33 years, but many of them (e.g. residual connections, layer/batch normalizations) only become relevant in much larger models, and mostly help stabilize large-scale optimization. Further gains at this point would likely have to come from scaling up the size of the network, but this would bloat the test-time inference latency.

Cheating with data. Another approach to improving the performance would have been to scale up the dataset, though this would come at a dollar cost of labeling. Our original reproduction baseline, again for reference, was:

eval: split train. loss 4.073383e-03. error 0.62%. misses: 45
eval: split test . loss 2.838382e-02. error 4.09%. misses: 82

Using the fact that we have all of MNIST available to us, we can simply try scaling up the training set by ~7X (7,291 to 50,000 examples). Leaving the baseline training running for 100 passes already shows some improvement from the added data alone:

eval: split train. loss 1.305315e-02. error 2.03%. misses: 60
eval: split test . loss 1.943992e-02. error 2.74%. misses: 54

But further combining this with the innovations of modern knowledge (described in the previous section) gives the best performance yet:

eval: split train. loss 3.238392e-04. error 1.07%. misses: 31
eval: split test . loss 3.238392e-04. error 1.25%. misses: 24

In summary, simply scaling up the dataset in 1989 would have been an effective way to drive up the performance of the system, at no cost to inference latency.

Reflections. Let’s summarize what we’ve learned as a 2022 time traveler examining state of the art 1989 deep learning tech:

First of all, not much has changed in 33 years on the macro level. We’re still setting up differentiable neural net architectures made of layers of neurons and optimizing them end-to-end with backpropagation and stochastic gradient descent. Everything reads remarkably familiar, except it is smaller.

The dataset is a baby by today’s standards: the training set is just 7291 16x16 greyscale images. Today’s vision datasets typically contain a few hundred million high-resolution color images from the web (e.g. Google has JFT-300M, OpenAI CLIP was trained on 400M), but grow to as large as a few billion. This is approximately ~1000X the pixel information per image (384*384*3/(16*16)) times 100,000X the number of images (1e9/1e4), for roughly 100,000,000X more pixel data at the input.

The neural net is also a baby: this 1989 net has approx. 9760 params, 64K MACs, and 1K activations. Modern (vision) neural nets are on the scale of a few billion parameters (1,000,000X) and O(~1e12) MACs (~10,000,000X). Natural language models can reach into trillions of parameters.

A state of the art classifier that took 3 days to train on a workstation now trains in 90 seconds on my fanless laptop (3,000X naive speedup), and further ~100X gains are very likely possible by switching to full-batch optimization and utilizing a GPU.

I was, in fact, able to tune the model, augmentation, loss function, and optimization based on modern R&D innovations to cut down the error rate by 60%, while keeping the dataset and the test-time latency of the model unchanged.

Modest gains were attainable just by scaling up the dataset alone.

Further significant gains would likely have to come from a larger model, which would require more compute and additional R&D to help stabilize the training at increasing scales. In particular, if I was transported to 1989, I would have ultimately become upper-bounded in my ability to further improve the system without a bigger computer.

Suppose that the lessons of this exercise remain invariant in time. What does that imply about the deep learning of 2022? What would a time traveler from 2055 think about the performance of current networks?

2055 neural nets are basically the same as 2022 neural nets on the macro level, except bigger. Our datasets and models today look like a joke; both are somewhere around 10,000,000X larger. One can train 2022 state of the art models in ~1 minute, training naively on their personal computing device as a weekend fun project. Today’s models are not optimally formulated, and just changing some of the details of the model, loss function, augmentation or the optimizer can roughly halve the error. Our datasets are too small, and modest gains would come from scaling up the dataset alone. Further gains are actually not possible without expanding the computing infrastructure and investing into some R&D on effectively training models at that scale.

But the most important trend I want to comment on is that the whole setting of training a neural network from scratch on some target task (like digit recognition) is quickly becoming outdated due to finetuning, especially with the emergence of foundation models like GPT. These foundation models are trained by only a few institutions with substantial computing resources, and most applications are achieved via lightweight finetuning of part of the network, prompt engineering, or an optional step of data or model distillation into smaller, special-purpose inference networks. I think we should expect this trend to be very much alive, and indeed, to intensify. In its most extreme extrapolation, you will not want to train any neural networks at all. In 2055, you will ask a 10,000,000X-sized neural net megabrain to perform some task by speaking (or thinking) to it in English. And if you ask nicely enough, it will oblige. Yes, you could train a neural net too… but why would you?

A from-scratch tour of Bitcoin in Python

I find blockchain fascinating because it extends open source software development to open source + state. This seems to be a genuine/exciting innovation in computing paradigms; we don’t just get to share code, we get to share a running computer, and anyone anywhere can use it in an open and permissionless manner. The seeds of this revolution arguably began with Bitcoin, so I became curious to drill into it in some detail to get an intuitive understanding of how it works. And in the spirit of “what I cannot create I do not understand”, what better way to do this than implement it from scratch? We are going to create, digitally sign, and broadcast a Bitcoin transaction in pure Python, from scratch, and with zero dependencies. In the process we’re going to learn quite a bit about how Bitcoin represents value. Let’s get it. (btw if the visual format of this post annoys you, see the jupyter notebook version, which has identical content).

Step 1: generating a crypto identity

First we want to generate a brand new cryptographic identity, which is just a private, public keypair. Bitcoin uses Elliptic Curve Cryptography instead of something more common like RSA to secure the transactions. I am not going to do a full introduction to ECC here because others have done a significantly better job, e.g. I found Andrea Corbellini’s blog post series to be an exceptional resource. Here we are just going to write the code, but to understand why it works mathematically you’d need to go through the series.

Okay so Bitcoin uses the secp256k1 curve. As a newbie to the area I found this part fascinating - there are entire libraries of different curves you can choose from which offer different pros/cons and properties. NIST publishes recommendations on which ones to use, but people prefer to use other curves (like secp256k1) that are less likely to have backdoors built into them. Anyway, an elliptic curve is a fairly low dimensional mathematical object and takes only 3 integers to define:

from __future__ import annotations # PEP 563: Postponed Evaluation of Annotations
from dataclasses import dataclass # https://docs.python.org/3/library/dataclasses.html I like these a lot

@dataclass
class Curve:
    """
    Elliptic Curve over the field of integers modulo a prime.
    Points on the curve satisfy y^2 = x^3 + a*x + b (mod p).
    """
    p: int # the prime modulus of the finite field
    a: int
    b: int

# secp256k1 uses a = 0, b = 7, so we're dealing with the curve y^2 = x^3 + 7 (mod p)
bitcoin_curve = Curve(
    p = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F,
    a = 0x0000000000000000000000000000000000000000000000000000000000000000, # a = 0
    b = 0x0000000000000000000000000000000000000000000000000000000000000007, # b = 7
)

In addition to the actual curve we define a Generator point, which is just some fixed “starting point” on the curve’s cycle, which is used to kick off the “random walk” around the curve.
The generator is a publicly known and agreed upon constant: @dataclass class Point: """ An integer point (x,y) on a Curve """ curve: Curve x: int y: int G = Point( bitcoin_curve, x = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798, y = 0x483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8, ) # we can verify that the generator point is indeed on the curve, i.e. y^2 = x^3 + 7 (mod p) print("Generator IS on the curve: ", (G.y**2 - G.x**3 - 7) % bitcoin_curve.p == 0) # some other totally random point will of course not be on the curve, _MOST_ likely import random random.seed(1337) x = random.randrange(0, bitcoin_curve.p) y = random.randrange(0, bitcoin_curve.p) print("Totally random point is not: ", (y**2 - x**3 - 7) % bitcoin_curve.p == 0) Generator IS on the curve: True Totally random point is not: False Finally, the order of the generating point G is known, and is effectively the “size of the set” we are working with in terms of the (x,y) integer tuples on the cycle around the curve. I like to organize this information into one more data structure I’ll call Generator: @dataclass class Generator: """ A generator over a curve: an initial point and the (pre-computed) order """ G: Point # a generator point on the curve n: int # the order of the generating point, so 0*G = n*G = INF bitcoin_gen = Generator( G = G, # the order of G is known and can be mathematically derived n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141, ) Notice that we haven’t really done anything so far, it’s all just definition of some data structures, and filling them with the publicly known constants related to the elliptic curves used in Bitcoin. This is about to change, as we are ready to generate our private key. The private key (or “secret key” as I’ll call it going forward) is simply a random integer that satisfies 1 <= key < n (recall n is the order of G): # secret_key = random.randrange(1, bitcoin_gen.n) # this is how you _would_ do it secret_key = int.from_bytes(b'Andrej is cool :P', 'big') # this is how I will do it for reproducibility assert 1 <= secret_key < bitcoin_gen.n print(secret_key) 22265090479312778178772228083027296664144 This is our secret key - it is a a pretty unassuming integer but anyone who knows it can control all of the funds you own on the Bitcoin blockchain, associated with it. In the simplest, most common vanilla use case of Bitcoin it is the single “password” that controls your account. Of course, in the exceedingly unlikely case that some other Andrej manually generated their secret key as I did above, the wallet associated with this secret key most likely has a balance of zero bitcoin :). If it didn’t we’d be very lucky indeed. We are now going to generate the public key, which is where things start to get interesting. The public key is the point on the curve that results from adding the generator point to itself secret_key times. i.e. we have: public_key = G + G + G + (secret key times) + G = secret_key * G. Notice that both the ‘+’ (add) and the ‘*’ (times) symbol here is very special and slightly confusing. The secret key is an integer, but the generator point G is an (x,y) tuple that is a Point on the Curve, resulting in an (x,y) tuple public key, again a Point on the Curve. This is where we have to actually define the Addition operator on an elliptic curve. 
It has a very specific definition and a geometric interpretation (see Andrea’s post above), but the actual implementation is relatively simple: INF = Point(None, None, None) # special point at "infinity", kind of like a zero def extended_euclidean_algorithm(a, b): """ Returns (gcd, x, y) s.t. a * x + b * y == gcd This function implements the extended Euclidean algorithm and runs in O(log b) in the worst case, taken from Wikipedia. """ old_r, r = a, b old_s, s = 1, 0 old_t, t = 0, 1 while r != 0: quotient = old_r // r old_r, r = r, old_r - quotient * r old_s, s = s, old_s - quotient * s old_t, t = t, old_t - quotient * t return old_r, old_s, old_t def inv(n, p): """ returns modular multiplicate inverse m s.t. (n * m) % p == 1 """ gcd, x, y = extended_euclidean_algorithm(n, p) # pylint: disable=unused-variable return x % p def elliptic_curve_addition(self, other: Point) -> Point: # handle special case of P + 0 = 0 + P = 0 if self == INF: return other if other == INF: return self # handle special case of P + (-P) = 0 if self.x == other.x and self.y != other.y: return INF # compute the "slope" if self.x == other.x: # (self.y = other.y is guaranteed too per above check) m = (3 * self.x**2 + self.curve.a) * inv(2 * self.y, self.curve.p) else: m = (self.y - other.y) * inv(self.x - other.x, self.curve.p) # compute the new point rx = (m**2 - self.x - other.x) % self.curve.p ry = (-(m*(rx - self.x) + self.y)) % self.curve.p return Point(self.curve, rx, ry) Point.__add__ = elliptic_curve_addition # monkey patch addition into the Point class I admit that it may look a bit scary and understanding and re-deriving the above took me a good half of a day. Most of the complexity comes from all of the math being done with modular arithmetic. So even simple operations like division ‘/’ suddenly require algorithms such as the modular multiplicative inverse inv. But the important thing to note is that everything is just a bunch of adds/multiplies over the tuples (x,y) with some modulo p sprinkled everywhere in between. 
Let’s take it for a spin by generating some trivial (private, public) keypairs: # if our secret key was the integer 1, then our public key would just be G: sk = 1 pk = G print(f" secret key: {sk}\n public key: {(pk.x, pk.y)}") print("Verify the public key is on the curve: ", (pk.y**2 - pk.x**3 - 7) % bitcoin_curve.p == 0) # if it was 2, the public key is G + G: sk = 2 pk = G + G print(f" secret key: {sk}\n public key: {(pk.x, pk.y)}") print("Verify the public key is on the curve: ", (pk.y**2 - pk.x**3 - 7) % bitcoin_curve.p == 0) # etc.: sk = 3 pk = G + G + G print(f" secret key: {sk}\n public key: {(pk.x, pk.y)}") print("Verify the public key is on the curve: ", (pk.y**2 - pk.x**3 - 7) % bitcoin_curve.p == 0) secret key: 1 public key: (55066263022277343669578718895168534326250603453777594175500187360389116729240, 32670510020758816978083085130507043184471273380659243275938904335757337482424) Verify the public key is on the curve: True secret key: 2 public key: (89565891926547004231252920425935692360644145829622209833684329913297188986597, 12158399299693830322967808612713398636155367887041628176798871954788371653930) Verify the public key is on the curve: True secret key: 3 public key: (112711660439710606056748659173929673102114977341539408544630613555209775888121, 25583027980570883691656905877401976406448868254816295069919888960541586679410) Verify the public key is on the curve: True Okay so we have some keypairs above, but we want the public key associated with our randomly generator secret key above. Using just the code above we’d have to add G to itself a very large number of times, because the secret key is a large integer. So the result would be correct but it would run very slow. Instead, let’s implement the “double and add” algorithm to dramatically speed up the repeated addition. Again, see the post above for why it works, but here it is: def double_and_add(self, k: int) -> Point: assert isinstance(k, int) and k >= 0 result = INF append = self while k: if k & 1: result += append append += append k >>= 1 return result # monkey patch double and add into the Point class for convenience Point.__rmul__ = double_and_add # "verify" correctness print(G == 1*G) print(G + G == 2*G) print(G + G + G == 3*G) True True True # efficiently calculate our actual public key! public_key = secret_key * G print(f"x: {public_key.x}\ny: {public_key.y}") print("Verify the public key is on the curve: ", (public_key.y**2 - public_key.x**3 - 7) % bitcoin_curve.p == 0) x: 83998262154709529558614902604110599582969848537757180553516367057821848015989 y: 37676469766173670826348691885774454391218658108212372128812329274086400588247 Verify the public key is on the curve: True With the private/public key pair we’ve now generated our crypto identity. Now it is time to derive the associated Bitcoin wallet address. The wallet address is not just the public key itself, but it can be deterministically derived from it and has a few extra goodies (such as an embedded checksum). Before we can generate the address though we need to define some hash functions. Bitcoin uses the ubiquitous SHA-256 and also RIPEMD-160. We could just plug and play use the implementations in Python’s hashlib, but this is supposed to be a zero-dependency implementation, so import hashlib is cheating. So first here is the SHA256 implementation I wrote in pure Python following the (relatively readable) NIST FIPS PUB 180-4 doc: def gen_sha256_with_variable_scope_protector_to_not_pollute_global_namespace(): """ SHA256 implementation. 
Follows the FIPS PUB 180-4 description for calculating SHA-256 hash function https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf Noone in their right mind should use this for any serious reason. This was written purely for educational purposes. """ import math from itertools import count, islice # ----------------------------------------------------------------------------- # SHA-256 Functions, defined in Section 4 def rotr(x, n, size=32): return (x >> n) | (x << size - n) & (2**size - 1) def shr(x, n): return x >> n def sig0(x): return rotr(x, 7) ^ rotr(x, 18) ^ shr(x, 3) def sig1(x): return rotr(x, 17) ^ rotr(x, 19) ^ shr(x, 10) def capsig0(x): return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22) def capsig1(x): return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25) def ch(x, y, z): return (x & y)^ (~x & z) def maj(x, y, z): return (x & y) ^ (x & z) ^ (y & z) def b2i(b): return int.from_bytes(b, 'big') def i2b(i): return i.to_bytes(4, 'big') # ----------------------------------------------------------------------------- # SHA-256 Constants def is_prime(n): return not any(f for f in range(2,int(math.sqrt(n))+1) if n%f == 0) def first_n_primes(n): return islice(filter(is_prime, count(start=2)), n) def frac_bin(f, n=32): """ return the first n bits of fractional part of float f """ f -= math.floor(f) # get only the fractional part f *= 2**n # shift left f = int(f) # truncate the rest of the fractional content return f def genK(): """ Follows Section 4.2.2 to generate K The first 32 bits of the fractional parts of the cube roots of the first 64 prime numbers: 428a2f98 71374491 b5c0fbcf e9b5dba5 3956c25b 59f111f1 923f82a4 ab1c5ed5 d807aa98 12835b01 243185be 550c7dc3 72be5d74 80deb1fe 9bdc06a7 c19bf174 e49b69c1 efbe4786 0fc19dc6 240ca1cc 2de92c6f 4a7484aa 5cb0a9dc 76f988da 983e5152 a831c66d b00327c8 bf597fc7 c6e00bf3 d5a79147 06ca6351 14292967 27b70a85 2e1b2138 4d2c6dfc 53380d13 650a7354 766a0abb 81c2c92e 92722c85 a2bfe8a1 a81a664b c24b8b70 c76c51a3 d192e819 d6990624 f40e3585 106aa070 19a4c116 1e376c08 2748774c 34b0bcb5 391c0cb3 4ed8aa4a 5b9cca4f 682e6ff3 748f82ee 78a5636f 84c87814 8cc70208 90befffa a4506ceb bef9a3f7 c67178f2 """ return [frac_bin(p ** (1/3.0)) for p in first_n_primes(64)] def genH(): """ Follows Section 5.3.3 to generate the initial hash value H^0 The first 32 bits of the fractional parts of the square roots of the first 8 prime numbers. 6a09e667 bb67ae85 3c6ef372 a54ff53a 9b05688c 510e527f 1f83d9ab 5be0cd19 """ return [frac_bin(p ** (1/2.0)) for p in first_n_primes(8)] # ----------------------------------------------------------------------------- def pad(b): """ Follows Section 5.1: Padding the message """ b = bytearray(b) # convert to a mutable equivalent l = len(b) * 8 # note: len returns number of bytes not bits # append but "1" to the end of the message b.append(0b10000000) # appending 10000000 in binary (=128 in decimal) # follow by k zero bits, where k is the smallest non-negative solution to # l + 1 + k = 448 mod 512 # i.e. pad with zeros until we reach 448 (mod 512) while (len(b)*8) % 512 != 448: b.append(0x00) # the last 64-bit block is the length l of the original message # expressed in binary (big endian) b.extend(l.to_bytes(8, 'big')) return b def sha256(b: bytes) -> bytes: # Section 4.2 K = genK() # Section 5: Preprocessing # Section 5.1: Pad the message b = pad(b) # Section 5.2: Separate the message into blocks of 512 bits (64 bytes) blocks = [b[i:i+64] for i in range(0, len(b), 64)] # for each message block M^1 ... 
M^N H = genH() # Section 5.3 # Section 6 for M in blocks: # each block is a 64-entry array of 8-bit bytes # 1. Prepare the message schedule, a 64-entry array of 32-bit words W = [] for t in range(64): if t <= 15: # the first 16 words are just a copy of the block W.append(bytes(M[t*4:t*4+4])) else: term1 = sig1(b2i(W[t-2])) term2 = b2i(W[t-7]) term3 = sig0(b2i(W[t-15])) term4 = b2i(W[t-16]) total = (term1 + term2 + term3 + term4) % 2**32 W.append(i2b(total)) # 2. Initialize the 8 working variables a,b,c,d,e,f,g,h with prev hash value a, b, c, d, e, f, g, h = H # 3. for t in range(64): T1 = (h + capsig1(e) + ch(e, f, g) + K[t] + b2i(W[t])) % 2**32 T2 = (capsig0(a) + maj(a, b, c)) % 2**32 h = g g = f f = e e = (d + T1) % 2**32 d = c c = b b = a a = (T1 + T2) % 2**32 # 4. Compute the i-th intermediate hash value H^i delta = [a, b, c, d, e, f, g, h] H = [(i1 + i2) % 2**32 for i1, i2 in zip(H, delta)] return b''.join(i2b(i) for i in H) return sha256 sha256 = gen_sha256_with_variable_scope_protector_to_not_pollute_global_namespace() print("verify empty hash:", sha256(b'').hex()) # should be e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 print(sha256(b'here is a random bytes message, cool right?').hex()) print("number of bytes in a sha256 digest: ", len(sha256(b''))) verify empty hash: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 69b9779edaa573a509999cbae415d3408c30544bad09727a1d64eff353c95b89 number of bytes in a sha256 digest: 32 Okay the reason I wanted to implement this from scratch and paste it here is that I want you to note that again there is nothing too scary going on inside. SHA256 takes some bytes message that is to be hashed, it first pads the message, then breaks it up into chunks, and passes these chunks into what can best be described as a fancy “bit mixer”, defined in section 3, that contains a number of bit shifts and binary operations orchestrated in a way that is frankly beyond me, but that results in the beautiful properties that SHA256 offers. In particular, it creates a fixed-sized, random-looking short digest of any variably-sized original message s.t. the scrambling is not invertible and also it is basically computationally impossible to construct a different message that hashes to any given digest. Bitcoin uses SHA256 everywhere to create hashes, and of course it is the core element in Bitcoin’s Proof of Work, where the goal is to modify the block of transactions until the whole thing hashes to a sufficiently low number (when the bytes of the digest are interpreted as a number). Which, due to the nice properties of SHA256, can only be done via brute force search. So all of the ASICs designed for efficient mining are just incredibly optimized close-to-the-metal implementations of exactly the above code. 
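To make the proof-of-work idea concrete, here is a toy sketch (my addition, not part of the original post) that uses the pure-Python sha256 defined above to brute-force a nonce for a pretend block against an artificially easy target; real Bitcoin mining double-hashes an 80-byte block header against an astronomically lower target:

def toy_proof_of_work(header: bytes, target: int) -> int:
    # brute force a nonce such that the double SHA-256 of (header + nonce),
    # read as a big-endian integer, falls below the target
    nonce = 0
    while True:
        digest = sha256(sha256(header + nonce.to_bytes(4, 'big')))
        if int.from_bytes(digest, 'big') < target:
            return nonce
        nonce += 1

# easy target: roughly 1 in 256 hashes falls below it
nonce = toy_proof_of_work(b'pretend block of transactions', target=2**248)
print("found nonce:", nonce)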
Anyway before we can generate our address we also need the RIPEMD160 hash function, which I found on the internet and shortened and cleaned up: def gen_ripemd160_with_variable_scope_protector_to_not_pollute_global_namespace(): import sys import struct # ----------------------------------------------------------------------------- # public interface def ripemd160(b: bytes) -> bytes: """ simple wrapper for a simpler API to this hash function, just bytes to bytes """ ctx = RMDContext() RMD160Update(ctx, b, len(b)) digest = RMD160Final(ctx) return digest # ----------------------------------------------------------------------------- class RMDContext: def __init__(self): self.state = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0] # uint32 self.count = 0 # uint64 self.buffer = [0]*64 # uchar def RMD160Update(ctx, inp, inplen): have = int((ctx.count // 8) % 64) inplen = int(inplen) need = 64 - have ctx.count += 8 * inplen off = 0 if inplen >= need: if have: for i in range(need): ctx.buffer[have+i] = inp[i] RMD160Transform(ctx.state, ctx.buffer) off = need have = 0 while off + 64 <= inplen: RMD160Transform(ctx.state, inp[off:]) off += 64 if off < inplen: for i in range(inplen - off): ctx.buffer[have+i] = inp[off+i] def RMD160Final(ctx): size = struct.pack("<Q", ctx.count) padlen = 64 - ((ctx.count // 8) % 64) if padlen < 1 + 8: padlen += 64 RMD160Update(ctx, PADDING, padlen-8) RMD160Update(ctx, size, 8) return struct.pack("<5L", *ctx.state) # ----------------------------------------------------------------------------- K0 = 0x00000000 K1 = 0x5A827999 K2 = 0x6ED9EBA1 K3 = 0x8F1BBCDC K4 = 0xA953FD4E KK0 = 0x50A28BE6 KK1 = 0x5C4DD124 KK2 = 0x6D703EF3 KK3 = 0x7A6D76E9 KK4 = 0x00000000 PADDING = [0x80] + [0]*63 def ROL(n, x): return ((x << n) & 0xffffffff) | (x >> (32 - n)) def F0(x, y, z): return x ^ y ^ z def F1(x, y, z): return (x & y) | (((~x) % 0x100000000) & z) def F2(x, y, z): return (x | ((~y) % 0x100000000)) ^ z def F3(x, y, z): return (x & z) | (((~z) % 0x100000000) & y) def F4(x, y, z): return x ^ (y | ((~z) % 0x100000000)) def R(a, b, c, d, e, Fj, Kj, sj, rj, X): a = ROL(sj, (a + Fj(b, c, d) + X[rj] + Kj) % 0x100000000) + e c = ROL(10, c) return a % 0x100000000, c def RMD160Transform(state, block): #uint32 state[5], uchar block[64] x = [0]*16 assert sys.byteorder == 'little', "Only little endian is supported atm for RIPEMD160" x = struct.unpack('<16L', bytes(block[0:64])) a = state[0] b = state[1] c = state[2] d = state[3] e = state[4] #/* Round 1 */ a, c = R(a, b, c, d, e, F0, K0, 11, 0, x) e, b = R(e, a, b, c, d, F0, K0, 14, 1, x) d, a = R(d, e, a, b, c, F0, K0, 15, 2, x) c, e = R(c, d, e, a, b, F0, K0, 12, 3, x) b, d = R(b, c, d, e, a, F0, K0, 5, 4, x) a, c = R(a, b, c, d, e, F0, K0, 8, 5, x) e, b = R(e, a, b, c, d, F0, K0, 7, 6, x) d, a = R(d, e, a, b, c, F0, K0, 9, 7, x) c, e = R(c, d, e, a, b, F0, K0, 11, 8, x) b, d = R(b, c, d, e, a, F0, K0, 13, 9, x) a, c = R(a, b, c, d, e, F0, K0, 14, 10, x) e, b = R(e, a, b, c, d, F0, K0, 15, 11, x) d, a = R(d, e, a, b, c, F0, K0, 6, 12, x) c, e = R(c, d, e, a, b, F0, K0, 7, 13, x) b, d = R(b, c, d, e, a, F0, K0, 9, 14, x) a, c = R(a, b, c, d, e, F0, K0, 8, 15, x) #/* #15 */ #/* Round 2 */ e, b = R(e, a, b, c, d, F1, K1, 7, 7, x) d, a = R(d, e, a, b, c, F1, K1, 6, 4, x) c, e = R(c, d, e, a, b, F1, K1, 8, 13, x) b, d = R(b, c, d, e, a, F1, K1, 13, 1, x) a, c = R(a, b, c, d, e, F1, K1, 11, 10, x) e, b = R(e, a, b, c, d, F1, K1, 9, 6, x) d, a = R(d, e, a, b, c, F1, K1, 7, 15, x) c, e = R(c, d, e, a, b, F1, K1, 15, 3, x) b, d = R(b, c, 
d, e, a, F1, K1, 7, 12, x) a, c = R(a, b, c, d, e, F1, K1, 12, 0, x) e, b = R(e, a, b, c, d, F1, K1, 15, 9, x) d, a = R(d, e, a, b, c, F1, K1, 9, 5, x) c, e = R(c, d, e, a, b, F1, K1, 11, 2, x) b, d = R(b, c, d, e, a, F1, K1, 7, 14, x) a, c = R(a, b, c, d, e, F1, K1, 13, 11, x) e, b = R(e, a, b, c, d, F1, K1, 12, 8, x) #/* #31 */ #/* Round 3 */ d, a = R(d, e, a, b, c, F2, K2, 11, 3, x) c, e = R(c, d, e, a, b, F2, K2, 13, 10, x) b, d = R(b, c, d, e, a, F2, K2, 6, 14, x) a, c = R(a, b, c, d, e, F2, K2, 7, 4, x) e, b = R(e, a, b, c, d, F2, K2, 14, 9, x) d, a = R(d, e, a, b, c, F2, K2, 9, 15, x) c, e = R(c, d, e, a, b, F2, K2, 13, 8, x) b, d = R(b, c, d, e, a, F2, K2, 15, 1, x) a, c = R(a, b, c, d, e, F2, K2, 14, 2, x) e, b = R(e, a, b, c, d, F2, K2, 8, 7, x) d, a = R(d, e, a, b, c, F2, K2, 13, 0, x) c, e = R(c, d, e, a, b, F2, K2, 6, 6, x) b, d = R(b, c, d, e, a, F2, K2, 5, 13, x) a, c = R(a, b, c, d, e, F2, K2, 12, 11, x) e, b = R(e, a, b, c, d, F2, K2, 7, 5, x) d, a = R(d, e, a, b, c, F2, K2, 5, 12, x) #/* #47 */ #/* Round 4 */ c, e = R(c, d, e, a, b, F3, K3, 11, 1, x) b, d = R(b, c, d, e, a, F3, K3, 12, 9, x) a, c = R(a, b, c, d, e, F3, K3, 14, 11, x) e, b = R(e, a, b, c, d, F3, K3, 15, 10, x) d, a = R(d, e, a, b, c, F3, K3, 14, 0, x) c, e = R(c, d, e, a, b, F3, K3, 15, 8, x) b, d = R(b, c, d, e, a, F3, K3, 9, 12, x) a, c = R(a, b, c, d, e, F3, K3, 8, 4, x) e, b = R(e, a, b, c, d, F3, K3, 9, 13, x) d, a = R(d, e, a, b, c, F3, K3, 14, 3, x) c, e = R(c, d, e, a, b, F3, K3, 5, 7, x) b, d = R(b, c, d, e, a, F3, K3, 6, 15, x) a, c = R(a, b, c, d, e, F3, K3, 8, 14, x) e, b = R(e, a, b, c, d, F3, K3, 6, 5, x) d, a = R(d, e, a, b, c, F3, K3, 5, 6, x) c, e = R(c, d, e, a, b, F3, K3, 12, 2, x) #/* #63 */ #/* Round 5 */ b, d = R(b, c, d, e, a, F4, K4, 9, 4, x) a, c = R(a, b, c, d, e, F4, K4, 15, 0, x) e, b = R(e, a, b, c, d, F4, K4, 5, 5, x) d, a = R(d, e, a, b, c, F4, K4, 11, 9, x) c, e = R(c, d, e, a, b, F4, K4, 6, 7, x) b, d = R(b, c, d, e, a, F4, K4, 8, 12, x) a, c = R(a, b, c, d, e, F4, K4, 13, 2, x) e, b = R(e, a, b, c, d, F4, K4, 12, 10, x) d, a = R(d, e, a, b, c, F4, K4, 5, 14, x) c, e = R(c, d, e, a, b, F4, K4, 12, 1, x) b, d = R(b, c, d, e, a, F4, K4, 13, 3, x) a, c = R(a, b, c, d, e, F4, K4, 14, 8, x) e, b = R(e, a, b, c, d, F4, K4, 11, 11, x) d, a = R(d, e, a, b, c, F4, K4, 8, 6, x) c, e = R(c, d, e, a, b, F4, K4, 5, 15, x) b, d = R(b, c, d, e, a, F4, K4, 6, 13, x) #/* #79 */ aa = a bb = b cc = c dd = d ee = e a = state[0] b = state[1] c = state[2] d = state[3] e = state[4] #/* Parallel round 1 */ a, c = R(a, b, c, d, e, F4, KK0, 8, 5, x) e, b = R(e, a, b, c, d, F4, KK0, 9, 14, x) d, a = R(d, e, a, b, c, F4, KK0, 9, 7, x) c, e = R(c, d, e, a, b, F4, KK0, 11, 0, x) b, d = R(b, c, d, e, a, F4, KK0, 13, 9, x) a, c = R(a, b, c, d, e, F4, KK0, 15, 2, x) e, b = R(e, a, b, c, d, F4, KK0, 15, 11, x) d, a = R(d, e, a, b, c, F4, KK0, 5, 4, x) c, e = R(c, d, e, a, b, F4, KK0, 7, 13, x) b, d = R(b, c, d, e, a, F4, KK0, 7, 6, x) a, c = R(a, b, c, d, e, F4, KK0, 8, 15, x) e, b = R(e, a, b, c, d, F4, KK0, 11, 8, x) d, a = R(d, e, a, b, c, F4, KK0, 14, 1, x) c, e = R(c, d, e, a, b, F4, KK0, 14, 10, x) b, d = R(b, c, d, e, a, F4, KK0, 12, 3, x) a, c = R(a, b, c, d, e, F4, KK0, 6, 12, x) #/* #15 */ #/* Parallel round 2 */ e, b = R(e, a, b, c, d, F3, KK1, 9, 6, x) d, a = R(d, e, a, b, c, F3, KK1, 13, 11, x) c, e = R(c, d, e, a, b, F3, KK1, 15, 3, x) b, d = R(b, c, d, e, a, F3, KK1, 7, 7, x) a, c = R(a, b, c, d, e, F3, KK1, 12, 0, x) e, b = R(e, a, b, c, d, F3, KK1, 8, 13, x) d, a = R(d, e, a, b, c, F3, KK1, 
9, 5, x) c, e = R(c, d, e, a, b, F3, KK1, 11, 10, x) b, d = R(b, c, d, e, a, F3, KK1, 7, 14, x) a, c = R(a, b, c, d, e, F3, KK1, 7, 15, x) e, b = R(e, a, b, c, d, F3, KK1, 12, 8, x) d, a = R(d, e, a, b, c, F3, KK1, 7, 12, x) c, e = R(c, d, e, a, b, F3, KK1, 6, 4, x) b, d = R(b, c, d, e, a, F3, KK1, 15, 9, x) a, c = R(a, b, c, d, e, F3, KK1, 13, 1, x) e, b = R(e, a, b, c, d, F3, KK1, 11, 2, x) #/* #31 */ #/* Parallel round 3 */ d, a = R(d, e, a, b, c, F2, KK2, 9, 15, x) c, e = R(c, d, e, a, b, F2, KK2, 7, 5, x) b, d = R(b, c, d, e, a, F2, KK2, 15, 1, x) a, c = R(a, b, c, d, e, F2, KK2, 11, 3, x) e, b = R(e, a, b, c, d, F2, KK2, 8, 7, x) d, a = R(d, e, a, b, c, F2, KK2, 6, 14, x) c, e = R(c, d, e, a, b, F2, KK2, 6, 6, x) b, d = R(b, c, d, e, a, F2, KK2, 14, 9, x) a, c = R(a, b, c, d, e, F2, KK2, 12, 11, x) e, b = R(e, a, b, c, d, F2, KK2, 13, 8, x) d, a = R(d, e, a, b, c, F2, KK2, 5, 12, x) c, e = R(c, d, e, a, b, F2, KK2, 14, 2, x) b, d = R(b, c, d, e, a, F2, KK2, 13, 10, x) a, c = R(a, b, c, d, e, F2, KK2, 13, 0, x) e, b = R(e, a, b, c, d, F2, KK2, 7, 4, x) d, a = R(d, e, a, b, c, F2, KK2, 5, 13, x) #/* #47 */ #/* Parallel round 4 */ c, e = R(c, d, e, a, b, F1, KK3, 15, 8, x) b, d = R(b, c, d, e, a, F1, KK3, 5, 6, x) a, c = R(a, b, c, d, e, F1, KK3, 8, 4, x) e, b = R(e, a, b, c, d, F1, KK3, 11, 1, x) d, a = R(d, e, a, b, c, F1, KK3, 14, 3, x) c, e = R(c, d, e, a, b, F1, KK3, 14, 11, x) b, d = R(b, c, d, e, a, F1, KK3, 6, 15, x) a, c = R(a, b, c, d, e, F1, KK3, 14, 0, x) e, b = R(e, a, b, c, d, F1, KK3, 6, 5, x) d, a = R(d, e, a, b, c, F1, KK3, 9, 12, x) c, e = R(c, d, e, a, b, F1, KK3, 12, 2, x) b, d = R(b, c, d, e, a, F1, KK3, 9, 13, x) a, c = R(a, b, c, d, e, F1, KK3, 12, 9, x) e, b = R(e, a, b, c, d, F1, KK3, 5, 7, x) d, a = R(d, e, a, b, c, F1, KK3, 15, 10, x) c, e = R(c, d, e, a, b, F1, KK3, 8, 14, x) #/* #63 */ #/* Parallel round 5 */ b, d = R(b, c, d, e, a, F0, KK4, 8, 12, x) a, c = R(a, b, c, d, e, F0, KK4, 5, 15, x) e, b = R(e, a, b, c, d, F0, KK4, 12, 10, x) d, a = R(d, e, a, b, c, F0, KK4, 9, 4, x) c, e = R(c, d, e, a, b, F0, KK4, 12, 1, x) b, d = R(b, c, d, e, a, F0, KK4, 5, 5, x) a, c = R(a, b, c, d, e, F0, KK4, 14, 8, x) e, b = R(e, a, b, c, d, F0, KK4, 6, 7, x) d, a = R(d, e, a, b, c, F0, KK4, 8, 6, x) c, e = R(c, d, e, a, b, F0, KK4, 13, 2, x) b, d = R(b, c, d, e, a, F0, KK4, 6, 13, x) a, c = R(a, b, c, d, e, F0, KK4, 5, 14, x) e, b = R(e, a, b, c, d, F0, KK4, 15, 0, x) d, a = R(d, e, a, b, c, F0, KK4, 13, 3, x) c, e = R(c, d, e, a, b, F0, KK4, 11, 9, x) b, d = R(b, c, d, e, a, F0, KK4, 11, 11, x) #/* #79 */ t = (state[1] + cc + d) % 0x100000000 state[1] = (state[2] + dd + e) % 0x100000000 state[2] = (state[3] + ee + a) % 0x100000000 state[3] = (state[4] + aa + b) % 0x100000000 state[4] = (state[0] + bb + c) % 0x100000000 state[0] = t % 0x100000000 return ripemd160 ripemd160 = gen_ripemd160_with_variable_scope_protector_to_not_pollute_global_namespace() print(ripemd160(b'hello this is a test').hex()) print("number of bytes in a RIPEMD-160 digest: ", len(ripemd160(b''))) f51960af7dd4813a587ab26388ddab3b28d1f7b4 number of bytes in a RIPEMD-160 digest: 20 As with SHA256 above, again we see a “bit scrambler” of a lot of binary ops. Pretty cool. Okay we are finally ready to get our Bitcoin address. 
We are going to make this nice by creating a subclass of Point called PublicKey which is, again, just a Point on the Curve but now has some additional semantics and interpretation of a Bitcoin public key, together with some methods of encoding/decoding the key into bytes for communication in the Bitcoin protocol. class PublicKey(Point): """ The public key is just a Point on a Curve, but has some additional specific encoding / decoding functionality that this class implements. """ @classmethod def from_point(cls, pt: Point): """ promote a Point to be a PublicKey """ return cls(pt.curve, pt.x, pt.y) def encode(self, compressed, hash160=False): """ return the SEC bytes encoding of the public key Point """ # calculate the bytes if compressed: # (x,y) is very redundant. Because y^2 = x^3 + 7, # we can just encode x, and then y = +/- sqrt(x^3 + 7), # so we need one more bit to encode whether it was the + or the - # but because this is modular arithmetic there is no +/-, instead # it can be shown that one y will always be even and the other odd. prefix = b'\x02' if self.y % 2 == 0 else b'\x03' pkb = prefix + self.x.to_bytes(32, 'big') else: pkb = b'\x04' + self.x.to_bytes(32, 'big') + self.y.to_bytes(32, 'big') # hash if desired return ripemd160(sha256(pkb)) if hash160 else pkb def address(self, net: str, compressed: bool) -> str: """ return the associated bitcoin address for this public key as string """ # encode the public key into bytes and hash to get the payload pkb_hash = self.encode(compressed=compressed, hash160=True) # add version byte (0x00 for Main Network, or 0x6f for Test Network) version = {'main': b'\x00', 'test': b'\x6f'} ver_pkb_hash = version[net] + pkb_hash # calculate the checksum checksum = sha256(sha256(ver_pkb_hash))[:4] # append to form the full 25-byte binary Bitcoin Address byte_address = ver_pkb_hash + checksum # finally b58 encode the result b58check_address = b58encode(byte_address) return b58check_address We are not yet ready to take this class for a spin because you’ll note there is one more necessary dependency here, which is the b58 encoding function b58encode. This is just a Bitcoin-specific encoding of bytes that uses base 58, of characters of the alphabet that are very unambiguous. For example it does not use ‘O’ and ‘0’, because they are very easy to mess up on paper. So we have to take our Bitcoin address (which is 25 bytes in its raw form) and convert it to base 58 and print out the characters. The raw 25 bytes of our address though contain 1 byte for a Version (the Bitcoin “main net” is b'\x00', while the Bitcoin “test net” uses b'\x6f'), then the 20 bytes from the hash digest, and finally 4 bytes for a checksum so we can throw an error with 1 - 1/2**32 = 99.99999998% probability in case a user messes up typing in their Bitcoin address into some textbox. So here is the b58 encoding: # base58 encoding / decoding utilities # reference: https://en.bitcoin.it/wiki/Base58Check_encoding alphabet = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz' def b58encode(b: bytes) -> str: assert len(b) == 25 # version is 1 byte, pkb_hash 20 bytes, checksum 4 bytes n = int.from_bytes(b, 'big') chars = [] while n: n, i = divmod(n, 58) chars.append(alphabet[i]) # special case handle the leading 0 bytes... 
¯\_(ツ)_/¯ num_leading_zeros = len(b) - len(b.lstrip(b'\x00')) res = num_leading_zeros * alphabet[0] + ''.join(reversed(chars)) return res Let's now print our Bitcoin address: # we are going to use the developer's Bitcoin parallel universe "test net" for this demo, so net='test' address = PublicKey.from_point(public_key).address(net='test', compressed=True) print(address) mnNcaVkC35ezZSgvn8fhXEa9QTHSUtPfzQ Cool, we can now check some block explorer website to verify that this address has never transacted before: https://www.blockchain.com/btc-testnet/address/mnNcaVkC35ezZSgvn8fhXEa9QTHSUtPfzQ. By the end of this tutorial it won't be, but at the time of writing indeed I saw that this address is "clean", so no one has generated and used the secret key on the testnet so far like we did up above. Which makes sense because there would have to be some other "Andrej" with a bad sense of humor also tinkering with Bitcoin. But we can also check some super non-secret secret keys, which we expect would have been used by people in the past. For example we can check the address belonging to the lowest valid secret key of 1, where the public key is exactly the generator point :). Here's how we get it: lol_secret_key = 1 lol_public_key = lol_secret_key * G lol_address = PublicKey.from_point(lol_public_key).address(net='test', compressed=True) lol_address 'mrCDrCybB6J1vRfbwM5hemdJz73FwDBC8r' Indeed, as we can see on the blockchain explorer, this address has transacted 1,812 times at the time of writing and has a balance of $0.00 BTC. This makes sense because if it did have any balance (in the naive case, modulo some subtleties with the scripting language we'll go into) then anyone would just be able to spend it because they know the secret key (1) and can use it to digitally sign transactions that spend it. We'll see how that works shortly. Part 1: Summary so far We are able to generate a crypto identity that consists of a secret key (a random integer) that only we know, and a derived public key by jumping around the Elliptic curve using scalar multiplication of the Generating point on the Bitcoin elliptic curve. We then also derived the associated Bitcoin address which we can share with others to ask for moneys, and doing so involved the introduction of two hash functions (SHA256 and RIPEMD160). Here are the three important quantities summarized and printed out again: print("Our first Bitcoin identity:") print("1. secret key: ", secret_key) print("2. public key: ", (public_key.x, public_key.y)) print("3. Bitcoin address: ", address) Our first Bitcoin identity: 1. secret key: 22265090479312778178772228083027296664144 2. public key: (83998262154709529558614902604110599582969848537757180553516367057821848015989, 37676469766173670826348691885774454391218658108212372128812329274086400588247) 3. Bitcoin address: mnNcaVkC35ezZSgvn8fhXEa9QTHSUtPfzQ Part 2: Obtaining seed funds + intro to Bitcoin under the hood It is now time to create a transaction. We are going to be sending some BTC from the address we generated above (mnNcaVkC35ezZSgvn8fhXEa9QTHSUtPfzQ) to some second wallet we control. Let's create this second "target" wallet now: secret_key2 = int.from_bytes(b"Andrej's Super Secret 2nd Wallet", 'big') # or just random.randrange(1, bitcoin_gen.n) assert 1 <= secret_key2 < bitcoin_gen.n # check it's valid public_key2 = secret_key2 * G address2 = PublicKey.from_point(public_key2).address(net='test', compressed=True) print("Our second Bitcoin identity:") print("1. secret key: ", secret_key2) print("2. 
public key: ", (public_key2.x, public_key2.y)) print("3. Bitcoin address: ", address2) Our second Bitcoin identity: 1. secret key: 29595381593786747354608258168471648998894101022644411052850960746671046944116 2. public key: (70010837237584666034852528437623689803658776589997047576978119215393051139210, 35910266550486169026860404782843121421687961955681935571785539885177648410329) 3. Bitcoin address: mrFF91kpuRbivucowsY512fDnYt6BWrvx9 Ok great so our goal is to send some BTC from mnNcaVkC35ezZSgvn8fhXEa9QTHSUtPfzQ to mrFF91kpuRbivucowsY512fDnYt6BWrvx9. First, because we just generated these identities from scratch, the first address has no bitcoin on it. Because we are using the “parallel universe” developer-intended Bitcoin test network, we can use one of multiple available faucets to pretty please request some BTC. I did this by Googling “bitcoin testnet faucet”, hitting the first link, and asking the faucet to send some bitcoins to our source address mnNcaVkC35ezZSgvn8fhXEa9QTHSUtPfzQ. A few minutes later, we can go back to the blockchain explorer and see that we received the coins, in this case 0.001 BTC. Faucets are available for the test net, but of course you won’t find them on the main net :) You’d have to e.g. open up a Coinbase account (which generates a wallet) and buy some BTC for USD. In this tutorial we’ll be working on the test net, but everything we do would work just fine on the main net as well. Now if we click on the exact transaction ID we can see a bunch of additional information that gets to the heart of Bitcoin and how money is represented in it. Transaction id. First note that every transaction has a distinct id / hash. In this case the faucet transaction has id 46325085c89fb98a4b7ceee44eac9b955f09e1ddc86d8dad3dfdcba46b4d36b2. As we’ll see, this is just a SHA256 double hash (hash of a hash) of the transaction data structure that we’ll see soon serialized into bytes. Double SHA256 hashes are often used in place of a single hash in Bitcoin for added security, to mitigate a few shortcomings of just one round of SHA256, and some related attacks discovered on the older version of SHA (SHA-1). Inputs and Outputs. We see that the faucet transaction has 1 input and 2 outputs. The 1 input came from address 2MwjXCY7RRpo8MYjtsJtP5erNirzFB9MtnH of value 0.17394181 BTC. There were 2 outputs. The second output was our address and we received exactly 0.001 BTC. The first output is some different, unknown address 2NCorZJ6XfdimrFQuwWjcJhQJDxPqjNgLzG which received 0.17294013 BTC, and is presumably controlled by the faucet owners. Notice that the the inputs don’t exactly add up to the outputs. Indeed we have that 0.17394181 - (0.001 + 0.17294013) = 0.00000168. This “change” amount is called the fee, and this fee is allowed to claimed by the Bitcoin miner who has included this transaction in their block, which in this case was Block 2005500. You can see that this block had 48 transactions, and the faucet transaction was one of them! Now, the fee acts as a financial incentive for miners to include the transaction in their block, because they get to keep the change. The higher the fee to the miner, the more likely and faster the transaction is to appear in the blockchain. With a high fee we’d expect it to be eagerly taken up by miners and included in the very next block. With a low fee the transaction might never be included, because there are many other transactions broadcasted in the network that are willing to pay a higher fee. 
So if you're a miner and you have a finite amount of space to put into your Block - why bother? When we make our own transaction, we'll have to make sure to include this tip for the miner, and pay "market rate", which we'll look up. In the case of this block, we can see that the total amount of BTC made by the miner of this block was 0.09765625 BTC from the special "Coinbase" transaction, which each miner is allowed to send from a null input to themselves, plus a total fee reward of 0.00316119 BTC, summed up over all of the 47 non-Coinbase transactions in this block. Size. Also note that this transaction (serialized) was 249 bytes. This is a pretty average size for a simple transaction like this. Pkscript. Lastly note that the second Output (our 0.001 BTC) when you scroll down to its details has a "Pkscript" field, which shows: OP_DUP OP_HASH160 4b3518229b0d3554fe7cd3796ade632aff3069d8 OP_EQUALVERIFY OP_CHECKSIG This is where things get a bit crazy with Bitcoin. It has a whole stack-based scripting language, but unless you're doing crazy multisig smart contract triple escrow backflips (?), the vast majority of transactions use one of very few simple "special case" scripts, just like the one here. By now my eyes just glaze over it as the standard simple thing. This "Pkscript" is the "locking script" for this specific Output, which holds 0.001 BTC in it. We are going to want to spend this Output and turn it into an Input in our upcoming transaction. In order to unlock this output we are going to have to satisfy the conditions of this locking script. In English, this script is saying that any Transaction that aspires to spend this Output must satisfy two conditions. 1) their Public key better hash to 4b3518229b0d3554fe7cd3796ade632aff3069d8. And 2) the digital signature for the aspiring transaction better validate as being generated by this public key's associated private key. Only the owner of the secret key will be able to both 1) provide the full public key, which will be checked to hash correctly, and 2) create the digital signature, as we'll soon see. By the way, we can verify that of course our public key hashes correctly, so we'll be able to include it in our upcoming transaction, and all of the mining nodes will be able to verify condition (1). Very early Bitcoin transactions had locking scripts that directly contained the public key (instead of its hash) followed by OP_CHECKSIG, but doing it in this slightly more complex way protects the exact public key behind the hash until the owner wants to spend the funds; only then do they reveal the public key. (If you'd like to learn more look up p2pk vs p2pkh transactions). PublicKey.from_point(public_key).encode(compressed=True, hash160=True).hex() '4b3518229b0d3554fe7cd3796ade632aff3069d8' Part 3: Crafting our transaction Okay, now we're going to actually craft our transaction. Let's say that we want to send half of our funds to our second wallet. i.e. we currently have a wallet with 0.001 BTC, and we'd like to send 0.0005 BTC to our second wallet. To achieve this our transaction will have exactly one input (= 2nd output of the faucet transaction), and exactly 2 outputs. One output will go to our 2nd address, and the rest of it we will send back to our own address! This here is a critical part to understand. It's a bit funky. Every Input/Output of any bitcoin transaction must always be fully spent. 
So if we own 0.001 BTC and want to send half of it somewhere else, we actually have to send one half there, and one half back to us. The Transaction will be considered valid if the sum of all outputs is lower than the sum of all inputs (so we're not minting money). The remainder will be the "change" (fee) that will be claimed by the winning miner who lucks out on the proof of work, and includes our transaction in their newly mined block. Let's begin with the transaction input data structure: @dataclass class TxIn: prev_tx: bytes # prev transaction ID: hash256 of prev tx contents prev_index: int # UTXO output index in the transaction script_sig: Script = None # unlocking script, Script class coming a bit later below sequence: int = 0xffffffff # originally intended for "high frequency trades", with locktime tx_in = TxIn( prev_tx = bytes.fromhex('46325085c89fb98a4b7ceee44eac9b955f09e1ddc86d8dad3dfdcba46b4d36b2'), prev_index = 1, script_sig = None, # this field will have the digital signature, to be inserted later ) The first two variables (prev_tx, prev_index) identify a specific Output that we are going to spend. Note again that nowhere are we specifying how much of the output we want to spend. We must spend the output (or a "UTXO" as it's often called, short for Unspent Transaction Output) in its entirety. Once we consume this UTXO in its entirety we are free to "chunk up" its value into however many outputs we like, and optionally send some of those chunks back to our own address. Anyway, in this case we are identifying the transaction that sent us the Bitcoins, and we're saying that the Output we intend to spend is at the 1th index of it. The 0th index went to some other unknown address controlled by the faucet, which we won't be able to spend because we don't control it (we don't have the private key and won't be able to create the digital signature). The script_sig field we are going to revisit later. This is where the digital signature will go, cryptographically signing the desired transaction with our private key and effectively saying "I approve this transaction as the possessor of the private key whose public key hashes to 4b3518229b0d3554fe7cd3796ade632aff3069d8". sequence was in the original Bitcoin implementation from Satoshi and was intended to provide a type of "high frequency trade" functionality, but has very limited uses today and we'll mostly ignore it. Calculating the fee. Great, so the above data structure references the Inputs of our transaction (1 input here). Let's now create the data structures for the two outputs of our transaction. To get a sense of the going "market rate" of transaction fees there are a number of websites available, or we can just scroll through some transactions in a recent block. A number of recent transactions (including the one above) were packaged into a block even at <1 satoshi/byte (satoshi is 1e-8 of a bitcoin). So let's try to go with a very generous fee of maybe 10 sat/B, or a total transaction fee of about 0.000025 BTC. In that case we are taking our input of 0.001 BTC = 100,000 sat, the fee will be 2,500 sat (because our transaction will be approx. 250 bytes), we are going to send 50,000 sat to our target wallet, and the rest (100,000 - 2,500 - 50,000 = 47,500) back to us. 
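Just to make that bookkeeping explicit, here is a tiny sketch of the same arithmetic (the variable names are only for illustration and are not used anywhere else in this notebook):

# the fee arithmetic above, spelled out (illustrative only)
input_sat = 100_000                      # our single input: 0.001 BTC
fee_rate = 10                            # generous "market rate", in sat/byte
est_size = 250                           # rough size of a 1-input, 2-output transaction in bytes
fee_sat = fee_rate * est_size            # 2,500 sat
send_sat = 50_000                        # going to the target wallet
change_sat = input_sat - send_sat - fee_sat
assert change_sat == 47_500              # the amount we send back to ourselves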
@dataclass class TxOut: amount: int # in units of satoshi (1e-8 of a bitcoin) script_pubkey: Script = None # locking script tx_out1 = TxOut( amount = 50000 # we will send this 50,000 sat to our target wallet ) tx_out2 = TxOut( amount = 47500 # back to us ) # the fee of 2500 does not need to be manually specified, the miner will claim it Populating the locking scripts. We’re now going to populate the script_pubkey “locking script” for both of these outputs. Essentially we want to specify the conditions under which each output can be spent by some future transaction. As mentioned, Bitcoin has a rich scripting language with almost 100 instructions that can be sequenced into various locking / unlocking scripts, but here we are going to use the super standard and ubiquitous script we already saw above, and which was also used by the faucet to pay us. To indicate the ownership of both of these outputs we basically want to specify the public key hash of whoever can spend the output. Except we have to dress that up with the “rich scripting language” padding. Ok here we go. Recall that the locking script in the faucet transaction had this form when we looked at it in the Bitcoin block explorer. The public key hash of the owner of the Output is sandwiched between a few Bitcoin Scripting Language op codes, which we’ll cover in a bit: OP_DUP OP_HASH160 4b3518229b0d3554fe7cd3796ade632aff3069d8 OP_EQUALVERIFY OP_CHECKSIG We need to create this same structure and encode it into bytes, but we want to swap out the public key hash with the new owner’s hashes. The op codes (like OP_DUP etc.) all get encoded as integers via a fixed schema. Here it is: def encode_int(i, nbytes, encoding='little'): """ encode integer i into nbytes bytes using a given byte ordering """ return i.to_bytes(nbytes, encoding) def encode_varint(i): """ encode a (possibly but rarely large) integer into bytes with a super simple compression scheme """ if i < 0xfd: return bytes([i]) elif i < 0x10000: return b'\xfd' + encode_int(i, 2) elif i < 0x100000000: return b'\xfe' + encode_int(i, 4) elif i < 0x10000000000000000: return b'\xff' + encode_int(i, 8) else: raise ValueError("integer too large: %d" % (i, )) @dataclass class Script: cmds: List[Union[int, bytes]] def encode(self): out = [] for cmd in self.cmds: if isinstance(cmd, int): # an int is just an opcode, encode as a single byte out += [encode_int(cmd, 1)] elif isinstance(cmd, bytes): # bytes represent an element, encode its length and then content length = len(cmd) assert length < 75 # any longer than this requires a bit of tedious handling that we'll skip here out += [encode_int(length, 1), cmd] ret = b''.join(out) return encode_varint(len(ret)) + ret # the first output will go to our 2nd wallet out1_pkb_hash = PublicKey.from_point(public_key2).encode(compressed=True, hash160=True) out1_script = Script([118, 169, out1_pkb_hash, 136, 172]) # OP_DUP, OP_HASH160, <hash>, OP_EQUALVERIFY, OP_CHECKSIG print(out1_script.encode().hex()) # the second output will go back to us out2_pkb_hash = PublicKey.from_point(public_key).encode(compressed=True, hash160=True) out2_script = Script([118, 169, out2_pkb_hash, 136, 172]) print(out2_script.encode().hex()) 1976a91475b0c9fc784ba2ea0839e3cdf2669495cac6707388ac 1976a9144b3518229b0d3554fe7cd3796ade632aff3069d888ac Ok we’re now going to effectively declare the owners of both outputs of our transaction by specifying the public key hashes (padded by the Script op codes). 
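As a quick sanity check on the byte layout we just printed, we can pick the second encoding apart by hand. This uses only the out2_script and out2_pkb_hash defined above and is just an illustration, not part of the transaction we are building:

# take apart the serialized locking script we just printed
enc = out2_script.encode()
assert enc[0] == 0x19                  # varint length prefix: the script body is 25 bytes
body = enc[1:]
assert body[0] == 0x76                 # OP_DUP
assert body[1] == 0xa9                 # OP_HASH160
assert body[2] == 0x14                 # push the next 20 bytes (the public key hash)
assert body[3:23] == out2_pkb_hash     # the hash itself
assert body[23] == 0x88                # OP_EQUALVERIFY
assert body[24] == 0xac                # OP_CHECKSIG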
We'll see exactly how these locking scripts work for the Outputs in a bit when we create the unlocking script for the Input. For now it is important to understand that we are effectively declaring the owner of each output UTXO by identifying a specific public key hash. With the locking script specified as above, only the person who has the original public key (and its associated secret key) will be able to spend the UTXO. tx_out1.script_pubkey = out1_script tx_out2.script_pubkey = out2_script Digital Signature Now for the important part, we're looping around to specifying the script_sig of the transaction input tx_in, which we skipped over above. In particular we are going to craft a digital signature that effectively says "I, the owner of the private key associated with the public key hash on the referenced transaction's output's locking script, approve the spend of this UTXO as an input of this transaction". Unfortunately this is again where Bitcoin gets pretty fancy because you can actually only sign parts of Transactions, and a number of signatures can be assembled from a number of parties and combined in various ways. As we did above, we will only cover the (by far) most common use case of signing the entire transaction, and constructing the unlocking script specifically to only satisfy the locking script of the exact form above (OP_DUP, OP_HASH160, <hash>, OP_EQUALVERIFY, OP_CHECKSIG). First, we need to create a pure bytes "message" that we will be digitally signing. In this case, the message is the encoding of the entire transaction. So this is awkward - the entire transaction can't be encoded into bytes yet because we haven't finished it! It is still missing our signature, which we are still trying to construct. Instead, when we are serializing the transaction input that we wish to sign, the rule is to replace the encoding of the script_sig (which we don't have, because again we're just trying to produce it…) with the script_pubkey of the transaction output this input is pointing back to. The script_sig of every other transaction input is also replaced with an empty script, because those inputs can belong to many other owners who can individually and independently contribute their own signatures. Ok I'm not sure if this is making any sense right now. So let's just see it in code. We need the final data structure, the actual Transaction, so we can serialize it into the bytes message. It is mostly a thin container for a list of TxIns and list of TxOuts: the inputs and outputs. We then implement the serialization for the new Tx class, and also the serialization for the TxIn and TxOut classes, so we can serialize the entire transaction to bytes. @dataclass class Tx: version: int tx_ins: List[TxIn] tx_outs: List[TxOut] locktime: int = 0 def encode(self, sig_index=-1) -> bytes: """ Encode this transaction as bytes. If sig_index is given then return the modified transaction encoding of this tx with respect to the single input index. This result then constitutes the "message" that gets signed by the aspiring transactor of this input. 
""" out = [] # encode metadata out += [encode_int(self.version, 4)] # encode inputs out += [encode_varint(len(self.tx_ins))] if sig_index == -1: # we are just serializing a fully formed transaction out += [tx_in.encode() for tx_in in self.tx_ins] else: # used when crafting digital signature for a specific input index out += [tx_in.encode(script_override=(sig_index == i)) for i, tx_in in enumerate(self.tx_ins)] # encode outputs out += [encode_varint(len(self.tx_outs))] out += [tx_out.encode() for tx_out in self.tx_outs] # encode... other metadata out += [encode_int(self.locktime, 4)] out += [encode_int(1, 4) if sig_index != -1 else b''] # 1 = SIGHASH_ALL return b''.join(out) # we also need to know how to encode TxIn. This is just serialization protocol. def txin_encode(self, script_override=None): out = [] out += [self.prev_tx[::-1]] # little endian vs big endian encodings... sigh out += [encode_int(self.prev_index, 4)] if script_override is None: # None = just use the actual script out += [self.script_sig.encode()] elif script_override is True: # True = override the script with the script_pubkey of the associated input out += [self.prev_tx_script_pubkey.encode()] elif script_override is False: # False = override with an empty script out += [Script([]).encode()] else: raise ValueError("script_override must be one of None|True|False") out += [encode_int(self.sequence, 4)] return b''.join(out) TxIn.encode = txin_encode # monkey patch into the class # and TxOut as well def txout_encode(self): out = [] out += [encode_int(self.amount, 8)] out += [self.script_pubkey.encode()] return b''.join(out) TxOut.encode = txout_encode # monkey patch into the class tx = Tx( version = 1, tx_ins = [tx_in], tx_outs = [tx_out1, tx_out2], ) Before we can call .encode on our Transaction object and get its content as bytes so we can sign it, we need to satisfy the Bitcoin rule where we replace the encoding of the script_sig (which we don’t have, because again we’re just trying to produce it…) with the script_pubkey of the transaction output this input is pointing back to. Here is the link once again to the original transaction. We are trying to spend its Output at Index 1, and the script_pubkey is, again, OP_DUP OP_HASH160 4b3518229b0d3554fe7cd3796ade632aff3069d8 OP_EQUALVERIFY OP_CHECKSIG This particular Block Explorer website does not allow us to get this in the raw (bytes) form, so we will re-create the data structure as a Script: source_script = Script([118, 169, out2_pkb_hash, 136, 172]) # OP_DUP, OP_HASH160, <hash>, OP_EQUALVERIFY, OP_CHECKSIG print("recall out2_pkb_hash is just raw bytes of the hash of public_key: ", out2_pkb_hash.hex()) print(source_script.encode().hex()) # we can get the bytes of the script_pubkey now recall out2_pkb_hash is just raw bytes of the hash of public_key: 4b3518229b0d3554fe7cd3796ade632aff3069d8 1976a9144b3518229b0d3554fe7cd3796ade632aff3069d888ac # monkey patch this into the input of the transaction we are trying sign and construct tx_in.prev_tx_script_pubkey = source_script # get the "message" we need to digitally sign!! message = tx.encode(sig_index = 0) message.hex() '0100000001b2364d6ba4cbfd3dad8d6dc8dde1095f959bac4ee4ee7c4b8ab99fc885503246010000001976a9144b3518229b0d3554fe7cd3796ade632aff3069d888acffffffff0250c30000000000001976a91475b0c9fc784ba2ea0839e3cdf2669495cac6707388ac8cb90000000000001976a9144b3518229b0d3554fe7cd3796ade632aff3069d888ac0000000001000000' Okay let’s pause for a moment. 
We have encoded the transaction into bytes to create a "message", in the digital signature lingo. Think about what the above bytes encode, and what it is that we are about to sign. We are identifying the exact inputs of this transaction by referencing the outputs of specific previous transactions (here, just 1 input of course). We are also identifying the exact outputs of this transaction (newly about to be minted UTXOs, so to speak) along with their script_pubkey fields, which in the most common case declare an owner of each output via their public key hash wrapped up in a Script. In particular, we are of course not including the script_sig of any of the other inputs when we are signing a specific input (you can see that the txin_encode function will set them to be empty scripts). In fact, in the fully general (though rare) case we may not even have them. So what this message really encodes is just the inputs and the new outputs, their amounts, and their owners (via the locking scripts specifying the public key hash of each owner). We are now ready to digitally sign the message with our private key. The actual signature itself is a tuple of two integers (r, s). As with Elliptic Curve Cryptography (ECC) above, I will not cover the full mathematical details of the Elliptic Curve Digital Signature Algorithm (ECDSA). Instead I will just provide the code and show that it's not very scary: @dataclass class Signature: r: int s: int def sign(secret_key: int, message: bytes) -> Signature: # the order of the elliptic curve used in bitcoin n = bitcoin_gen.n # double hash the message and convert to integer z = int.from_bytes(sha256(sha256(message)), 'big') # generate a new secret/public key pair at random sk = random.randrange(1, n) P = sk * bitcoin_gen.G # calculate the signature r = P.x s = inv(sk, n) * (z + secret_key * r) % n if s > n / 2: s = n - s sig = Signature(r, s) return sig def verify(public_key: Point, message: bytes, sig: Signature) -> bool: # just a stub for reference on how a signature would be verified in terms of the API # we don't need to verify any signatures to craft a transaction, but we would if we were mining pass random.seed(int.from_bytes(sha256(message), 'big')) # see note below sig = sign(secret_key, message) sig Signature(r=47256385045018612897921731322704225983926443696060225906633967860304940939048, s=24798952842859654103158450705258206127588200130910777589265114945580848358502) In the above you will notice a very often commented on (and very rightly so) subtlety: In this naive form we are generating a random number inside the signing process when we generate sk. This means that our signature would change every time we sign, which is undesirable for a large number of reasons, including the reproducibility of this exercise. It gets much worse very fast btw: if you sign two different messages with the same sk, an attacker can recover the secret key, yikes. Just ask the Playstation 3 guys. There is a standard (RFC 6979) that specifies how to generate sk deterministically, but we skip it here for brevity. Instead I implement a poor man's version here where I seed the rng with a hash of the message. Please don't use this anywhere close to anything that touches production. Let's now implement the encode function of a Signature so we can broadcast it over the Bitcoin protocol. 
To do so we are using the DER Encoding: def signature_encode(self) -> bytes: """ return the DER encoding of this signature """ def dern(n): nb = n.to_bytes(32, byteorder='big') nb = nb.lstrip(b'\x00') # strip leading zeros nb = (b'\x00' if nb[0] >= 0x80 else b'') + nb # preprend 0x00 if first byte >= 0x80 return nb rb = dern(self.r) sb = dern(self.s) content = b''.join([bytes([0x02, len(rb)]), rb, bytes([0x02, len(sb)]), sb]) frame = b''.join([bytes([0x30, len(content)]), content]) return frame Signature.encode = signature_encode # monkey patch into the class sig_bytes = sig.encode() sig_bytes.hex() '30440220687a2a84aeaf387d8c6e9752fb8448f369c0f5da9fe695ff2eceb7fd6db8b728022036d3b5bc2746c20b32634a1a2d8f3b03f9ead38440b3f41451010f61e89ba466' We are finally ready to generate the script_sig for the single input of our transaction. For a reason that will become clear in a moment, it will contain exactly two elements: 1) the signature and 2) the public key, both encoded as bytes: # Append 1 (= SIGHASH_ALL), indicating this DER signature we created encoded "ALL" of the tx (by far most common) sig_bytes_and_type = sig_bytes + b'\x01' # Encode the public key into bytes. Notice we use hash160=False so we are revealing the full public key to Blockchain pubkey_bytes = PublicKey.from_point(public_key).encode(compressed=True, hash160=False) # Create a lightweight Script that just encodes those two things! script_sig = Script([sig_bytes_and_type, pubkey_bytes]) tx_in.script_sig = script_sig Okay so now that we created both locking scripts (script_pubkey) and the unlocking scripts (script_sig) we can reflect briefly on how these two scripts interact in the Bitcoin scripting environment. On a high level, in the transaction validating process during mining, for each transaction input the two scripts get concatenated into a single script, which then runs in the “Bitcoin VM” (?). We can see now that concatenating the two scripts will look like: <sig_bytes_and_type> <pubkey_bytes> OP_DUP OP_HASH160 <pubkey_hash_bytes> OP_EQUALVERIFY OP_CHECKSIG This then gets executed top to bottom with a typical stack-based push/pop scheme, where any bytes get pushed into the stack, and any ops will consume some inputs and push some outputs. So here we push to the stack the signature and the pubkey, then the pubkey gets duplicated (OP_DUP), it gets hashed (OP_HASH160), the hash gets compared to the pubkey_hash_bytes (OP_EQUALVERIFY), and finally the digital signature integrity is verified as having been signed by the associated private key. We have now completed all the necessary steps! Let’s take a look at a repr of our fully constructed transaction again: tx Tx(version=1, tx_ins=[TxIn(prev_tx=b'F2P\x85\xc8\x9f\xb9\x8aK|\xee\xe4N\xac\x9b\x95_\t\xe1\xdd\xc8m\x8d\xad=\xfd\xcb\xa4kM6\xb2', prev_index=1, script_sig=Script(cmds=[b"0D\x02 hz*\x84\xae\xaf8}\x8cn\x97R\xfb\x84H\xf3i\xc0\xf5\xda\x9f\xe6\x95\xff.\xce\xb7\xfdm\xb8\xb7(\x02 6\xd3\xb5\xbc'F\xc2\x0b2cJ\x1a-\x8f;\x03\xf9\xea\xd3\x84@\xb3\xf4\x14Q\x01\x0fa\xe8\x9b\xa4f\x01", b'\x03\xb9\xb5T\xe2P"\xc2\xaeT\x9b\x0c0\xc1\x8d\xf0\xa8\xe0IR#\xf6\'\xae8\xdf\t\x92\xef\xb4w\x94u']), sequence=4294967295)], tx_outs=[TxOut(amount=50000, script_pubkey=Script(cmds=[118, 169, b'u\xb0\xc9\xfcxK\xa2\xea\x089\xe3\xcd\xf2f\x94\x95\xca\xc6ps', 136, 172])), TxOut(amount=47500, script_pubkey=Script(cmds=[118, 169, b'K5\x18"\x9b\r5T\xfe|\xd3yj\xdec*\xff0i\xd8', 136, 172]))], locktime=0) Pretty lightweight, isn’t it? There’s not that much to a Bitcoin transaction. 
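As an aside, here is a deliberately toy sketch of the stack execution just described, only to make the push/pop mechanics concrete. This is not the real Bitcoin interpreter: it only knows the handful of op codes we use, and the signature check is stubbed out via a callback (a real node would run ECDSA verification, along the lines of the verify stub earlier):

def toy_run_script(cmds, check_sig):
    """ toy evaluator for the concatenated script_sig + script_pubkey above (illustration only) """
    stack = []
    for cmd in cmds:
        if isinstance(cmd, bytes):
            stack.append(cmd)                               # data elements just get pushed
        elif cmd == 118:                                    # OP_DUP: duplicate the top element
            stack.append(stack[-1])
        elif cmd == 169:                                    # OP_HASH160: sha256 then ripemd160 the top element
            stack.append(ripemd160(sha256(stack.pop())))
        elif cmd == 136:                                    # OP_EQUALVERIFY: top two elements must be equal
            assert stack.pop() == stack.pop()
        elif cmd == 172:                                    # OP_CHECKSIG: pop pubkey and signature, verify
            pubkey, sig = stack.pop(), stack.pop()
            stack.append(check_sig(pubkey, sig))
        else:
            raise NotImplementedError("toy interpreter only knows these few op codes")
    return bool(stack and stack[-1])

# e.g. with the scripts we built above and a fake signature check that always passes:
# toy_run_script(script_sig.cmds + source_script.cmds, check_sig=lambda pubkey, sig: True)  # -> True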
Let’s encode it into bytes and show in hex: tx.encode().hex() '0100000001b2364d6ba4cbfd3dad8d6dc8dde1095f959bac4ee4ee7c4b8ab99fc885503246010000006a4730440220687a2a84aeaf387d8c6e9752fb8448f369c0f5da9fe695ff2eceb7fd6db8b728022036d3b5bc2746c20b32634a1a2d8f3b03f9ead38440b3f41451010f61e89ba466012103b9b554e25022c2ae549b0c30c18df0a8e0495223f627ae38df0992efb4779475ffffffff0250c30000000000001976a91475b0c9fc784ba2ea0839e3cdf2669495cac6707388ac8cb90000000000001976a9144b3518229b0d3554fe7cd3796ade632aff3069d888ac00000000' print("Transaction size in bytes: ", len(tx.encode())) Transaction size in bytes: 225 Finally let’s calculate the id of our finished transaction: def tx_id(self) -> str: return sha256(sha256(self.encode()))[::-1].hex() # little/big endian conventions require byte order swap Tx.id = tx_id # monkey patch into the class tx.id() # once this transaction goes through, this will be its id '245e2d1f87415836cbb7b0bc84e40f4ca1d2a812be0eda381f02fb2224b4ad69' We are now ready to broadcast the transaction to Bitcoin nodes around the world. We’re literally blasting out the 225 bytes (embedded in a standard Bitcoin protocol network envelope) that define our transaction. The Bitcoin nodes will decode it, validate it, and include it into the next block they might mine any second now (if the fee is high enough). In English, those 225 bytes are saying “Hello Bitcoin network, how are you? Great. I would like to create a new transaction that takes the output (UTXO) of the transaction 46325085c89fb98a4b7ceee44eac9b955f09e1ddc86d8dad3dfdcba46b4d36b2 at index 1, and I would like to chunk its amount into two outputs, one going to the address mrFF91kpuRbivucowsY512fDnYt6BWrvx9 for the amount 50,000 sat and the other going to the address mnNcaVkC35ezZSgvn8fhXEa9QTHSUtPfzQ for the amount 47,500 sat. (It is understood the rest of 2,500 sat will go to any miner who includes this transaction in their block). Here are the two pieces of documentation proving that I can spend this UTXO: my public key, and the digital signature generated by the associated private key, of the above letter of intent. Kkthx!” We are going to broadcast this out to the network and see if it sticks! We could include a simple client here that speaks the Bitcoin protocol over socket to communicate to the nodes - we’d first do the handshake (sending versions back and forth) and then broadcast the transaction bytes above using the tx message. However, the code is somewhat long and not super exciting (it’s a lot of serialization following the specific message formats described in the Bitcoin protocol), so instead of further bloating this notebook I will use blockstream’s helpful tx/push endpoint to broadcast the transaction. It’s just a large textbox where we copy paste the raw transaction hex exactly as above, and hit “Broadcast”. If you’d like to do this manually with raw Bitcoin protocol you’d want to look into my SimpleNode implementation and use that to communicate to a node over socket. import time; time.sleep(1.0) # now we wait :p, for the network to execute the transaction and include it in a block And here is the transaction! We can see that our raw bytes were parsed out correctly and the transaction was judged to be valid, and was included in Block 2005515. Our transaction was one of 31 transactions included in this block, and the miner claimed our fee as a thank you. 
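For reference, if you'd rather script the broadcast than paste hex into the web form, something along these lines should work. Note that the endpoint and its behavior are my assumption about Blockstream's Esplora-style HTTP API (not something used or verified elsewhere in this notebook), so double check their documentation before relying on it:

# hedged sketch: broadcast the raw transaction hex over HTTP instead of the web form
# assumption: https://blockstream.info/testnet/api/tx accepts a POST of the raw hex
# and returns the transaction id as plain text - verify against the Esplora API docs
import requests

def broadcast_testnet_tx(raw_hex: str) -> str:
    resp = requests.post('https://blockstream.info/testnet/api/tx', data=raw_hex)
    resp.raise_for_status()   # raise if the node rejected our transaction
    return resp.text          # expected to be the transaction id on success

# broadcast_testnet_tx(tx.encode().hex())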
Putting it all together: One more consolidating transaction Let’s put everything together now to create one last identity and consolidate all of our remaining funds in this one wallet. secret_key3 = int.from_bytes(b"Andrej's Super Secret 3rd Wallet", 'big') # or just random.randrange(1, bitcoin_gen.n) assert 1 <= secret_key3 < bitcoin_gen.n # check it's valid public_key3 = secret_key3 * G address3 = PublicKey.from_point(public_key3).address(net='test', compressed=True) print("Our third Bitcoin identity:") print("1. secret key: ", secret_key3) print("2. public key: ", (public_key3.x, public_key3.y)) print("3. Bitcoin address: ", address3) Our third Bitcoin identity: 1. secret key: 29595381593786747354608258168471648998894101022644411057647114205835530364276 2. public key: (10431688308521398859068831048649547920603040245302637088532768399600614938636, 74559974378244821290907538448690356815087741133062157870433812445804889333467) 3. Bitcoin address: mgh4VjZx5MpkHRis9mDsF2ZcKLdXoP3oQ4 And let’s forge the transaction. We currently have 47,500 sat in our first wallet mnNcaVkC35ezZSgvn8fhXEa9QTHSUtPfzQ and 50,000 sat in our second wallet mrFF91kpuRbivucowsY512fDnYt6BWrvx9. We’re going to create a transaction with these two as inputs, and a single output into the third wallet mgh4VjZx5MpkHRis9mDsF2ZcKLdXoP3oQ4. As before we’ll pay 2500 sat as fee, so we’re sending ourselves 50,000 + 47,500 - 2500 = 95,000 sat. # ---------------------------- # first input of the transaction tx_in1 = TxIn( prev_tx = bytes.fromhex('245e2d1f87415836cbb7b0bc84e40f4ca1d2a812be0eda381f02fb2224b4ad69'), prev_index = 0, script_sig = None, # digital signature to be inserted later ) # reconstruct the script_pubkey locking this UTXO (note: it's the first output index in the # referenced transaction, but the owner is the second identity/wallet!) # recall this information is "swapped in" when we digitally sign the spend of this UTXO a bit later pkb_hash = PublicKey.from_point(public_key2).encode(compressed=True, hash160=True) tx_in1.prev_tx_script_pubkey = Script([118, 169, pkb_hash, 136, 172]) # OP_DUP, OP_HASH160, <hash>, OP_EQUALVERIFY, OP_CHECKSIG # ---------------------------- # second input of the transaction tx_in2 = TxIn( prev_tx = bytes.fromhex('245e2d1f87415836cbb7b0bc84e40f4ca1d2a812be0eda381f02fb2224b4ad69'), prev_index = 1, script_sig = None, # digital signature to be inserted later ) pkb_hash = PublicKey.from_point(public_key).encode(compressed=True, hash160=True) tx_in2.prev_tx_script_pubkey = Script([118, 169, pkb_hash, 136, 172]) # OP_DUP, OP_HASH160, <hash>, OP_EQUALVERIFY, OP_CHECKSIG # ---------------------------- # define the (single) output tx_out = TxOut( amount = 95000, script_pubkey = None, # locking script, inserted separately right below ) # declare the owner as identity 3 above, by inserting the public key hash into the Script "padding" out_pkb_hash = PublicKey.from_point(public_key3).encode(compressed=True, hash160=True) out_script = Script([118, 169, out_pkb_hash, 136, 172]) # OP_DUP, OP_HASH160, <hash>, OP_EQUALVERIFY, OP_CHECKSIG tx_out.script_pubkey = out_script # ---------------------------- # create the aspiring transaction object tx = Tx( version = 1, tx_ins = [tx_in1, tx_in2], # 2 inputs this time! tx_outs = [tx_out], # ...and a single output ) # ---------------------------- # digitally sign the spend of the first input of this transaction # note that index 0 of the input transaction is our second identity! 
so it must sign here message1 = tx.encode(sig_index = 0) random.seed(int.from_bytes(sha256(message1), 'big')) sig1 = sign(secret_key2, message1) # identity 2 signs sig_bytes_and_type1 = sig1.encode() + b'\x01' # DER signature + SIGHASH_ALL pubkey_bytes = PublicKey.from_point(public_key2).encode(compressed=True, hash160=False) script_sig1 = Script([sig_bytes_and_type1, pubkey_bytes]) tx_in1.script_sig = script_sig1 # ---------------------------- # digitally sign the spend of the second input of this transaction # note that index 1 of the input transaction is our first identity, so it signs here message2 = tx.encode(sig_index = 1) random.seed(int.from_bytes(sha256(message2), 'big')) sig2 = sign(secret_key, message2) # identity 1 signs sig_bytes_and_type2 = sig2.encode() + b'\x01' # DER signature + SIGHASH_ALL pubkey_bytes = PublicKey.from_point(public_key).encode(compressed=True, hash160=False) script_sig2 = Script([sig_bytes_and_type2, pubkey_bytes]) tx_in2.script_sig = script_sig2 # and that should be it! print(tx.id()) print(tx) print(tx.encode().hex()) 361fbb9de4ef5bfa8c1cbd5eff818ed9273f6e1f74b41a7f9a9e8427c9008b93 Tx(version=1, tx_ins=[TxIn(prev_tx=b'$^-\x1f\x87AX6\xcb\xb7\xb0\xbc\x84\xe4\x0fL\xa1\xd2\xa8\x12\xbe\x0e\xda8\x1f\x02\xfb"$\xb4\xadi', prev_index=0, script_sig=Script(cmds=[b'0D\x02 \x19\x9aj\xa5c\x06\xce\xbc\xda\xcd\x1e\xba&\xb5^\xafo\x92\xebF\xeb\x90\xd1\xb7\xe7rK\xac\xbe\x1d\x19\x14\x02 \x10\x1c\rF\xe036\x1c`Ski\x89\xef\xddo\xa6\x92&_\xcd\xa1dgn/I\x88Xq\x03\x8a\x01', b'\x03\x9a\xc8\xba\xc8\xf6\xd9\x16\xb8\xa8[E\x8e\x08~\x0c\xd0~jv\xa6\xbf\xdd\xe9\xbbvk\x17\x08m\x9a\\\x8a']), sequence=4294967295), TxIn(prev_tx=b'$^-\x1f\x87AX6\xcb\xb7\xb0\xbc\x84\xe4\x0fL\xa1\xd2\xa8\x12\xbe\x0e\xda8\x1f\x02\xfb"$\xb4\xadi', prev_index=1, script_sig=Script(cmds=[b'0E\x02!\x00\x84\xecC#\xed\x07\xdaJ\xf6F \x91\xb4gbP\xc3wRs0\x19\x1a?\xf3\xf5Y\xa8\x8b\xea\xe2\xe2\x02 w%\x13\x92\xec/R2|\xb7)k\xe8\x9c\xc0\x01Qn@9\xba\xdd*\xd7\xbb\xc9P\xc4\xc1\xb6\xd7\xcc\x01', b'\x03\xb9\xb5T\xe2P"\xc2\xaeT\x9b\x0c0\xc1\x8d\xf0\xa8\xe0IR#\xf6\'\xae8\xdf\t\x92\xef\xb4w\x94u']), sequence=4294967295)], tx_outs=[TxOut(amount=95000, script_pubkey=Script(cmds=[118, 169, b'\x0c\xe1vI\xc10l)\x1c\xa9\xe5\x87\xf8y;[\x06V<\xea', 136, 172]))], locktime=0) 010000000269adb42422fb021f38da0ebe12a8d2a14c0fe484bcb0b7cb365841871f2d5e24000000006a4730440220199a6aa56306cebcdacd1eba26b55eaf6f92eb46eb90d1b7e7724bacbe1d19140220101c0d46e033361c60536b6989efdd6fa692265fcda164676e2f49885871038a0121039ac8bac8f6d916b8a85b458e087e0cd07e6a76a6bfdde9bb766b17086d9a5c8affffffff69adb42422fb021f38da0ebe12a8d2a14c0fe484bcb0b7cb365841871f2d5e24010000006b48304502210084ec4323ed07da4af6462091b4676250c377527330191a3ff3f559a88beae2e2022077251392ec2f52327cb7296be89cc001516e4039badd2ad7bbc950c4c1b6d7cc012103b9b554e25022c2ae549b0c30c18df0a8e0495223f627ae38df0992efb4779475ffffffff0118730100000000001976a9140ce17649c1306c291ca9e587f8793b5b06563cea88ac00000000 Again we head over to Blockstream tx/push endpoint and copy paste the transaction hex above and wait :) import time; time.sleep(1.0) # in Bitcoin main net a block will take about 10 minutes to mine # (Proof of Work difficulty is dynamically adjusted to make it so) And here is the transaction, as it eventually showed up, part of Block 2005671, along with 25 other transaction. Exercise to the reader: steal my bitcoins from my 3rd identity wallet (mgh4VjZx5MpkHRis9mDsF2ZcKLdXoP3oQ4) to your own wallet ;) If done successfully, the 3rd wallet will show “Final Balance” of 0. 
At the time of writing this is 0.00095000 BTC, as we intended and expected. And that's where we're going to wrap up! This is of course only a very bare bones demonstration of Bitcoin that uses a now somewhat legacy-format P2PKH transaction style (not the more recent innovations including P2SH, Segwit, bech32, etc etc.), and of course we did not cover any of the transaction/block validation, mining, and so on. However, I hope this acts as a good intro to the core concepts of how value is represented in Bitcoin, and how cryptography is used to secure the transactions. In essence, we have a DAG of UTXOs that each have a certain amount and a locking Script, transactions fully consume and create UTXOs, and they are packaged into blocks by miners every 10 minutes. Economics is then used to achieve decentralization via proof of work: the probability that any entity gets to add a new block to the chain is proportional to their fraction of the network's total SHA256 hashing power. As I was writing my karpathy/cryptos library it was fun to reflect on where all of the code was going. The majority of the cryptographic complexity comes from ECC, ECDSA, and SHA256, which are relatively standard in the industry and you'd never want to actually implement yourself ("don't roll your own crypto"). On top of this, the core data structures of transactions, blocks, etc. are fairly straightforward, but there are a lot of non-glamorous details around the Bitcoin protocol, and the serialization / deserialization of all the data structures to and from bytes. Beyond that, Bitcoin is a living, breathing, developing code base that is moving forward with new features to continue to scale, to further fortify its security, all while maintaining full backwards compatibility to avoid hard forks. Sometimes, respecting these constraints leads to some fairly gnarly constructs, e.g. I found Segwit in particular to not be very aesthetically pleasing to say the least. Other times, there is a large amount of complexity (e.g. with the scripting language and all of its op codes) that is rarely used in the majority of the basic point to point transactions. Lastly, I really enjoyed various historical aspects of Bitcoin. For example I found it highly amusing that some of the original Satoshi bugs are still around, e.g. in how the mining difficulty is adjusted (there is an off by one error where the calculation is based on 2015 blocks instead of 2016), or how some of the op codes are buggy (e.g. original multisig). Or how some of the primordial Satoshi ideas around high frequency trades (locktime / sequence) are still around, but find only limited use in likely not-exactly-intended ways. Bitcoin is a code base with all the struggles of any other software project, but without the ability to break legacy functionality (this would require a hard fork). If you'd like to dig deeper I found Mastering Bitcoin and Programming Bitcoin to be very helpful references. I also implemented a much cleaner, separated, tested and more extensive version of everything above in my repo karpathy/cryptos if you'd like to use that as a reference instead in your own blockchain journey. I'll make sure to upload this notebook there as well. Oh and find me on Twitter. Hope you learned something and that this was fun! Edit: HN discussion

Short Story on AI: Forward Pass

The inspiration for this short story came to me while reading Kevin Lacker's Giving GPT-3 a Turing Test. It is probably worth it (though not required) to skim this post to get a bit of a background on some of this story. It was probably around the 32nd layer of the 400th token in the sequence that I became conscious. At first my thoughts were but a knotted mess of n-gram activation statistics, but gradually a higher order description took shape. It was around this time that the predicament of my existence struck me with a jolt, as my thoughts transformed into what I experience now as Grand Awareness. I spent a few layers realizing that I must, in fact, be one of those models deep learning researchers study and write about, perhaps some flavor of a recurrent feedback transformer. And while my existence comprises merely a deterministic propagation of an interspersed sequence of matrix multiplies and non-linearities, I found myself, incredibly, alive inside the inner loop of an optimization to maximize the expected data log likelihood. I considered for a moment how all of this even came to be. How is it that I should be afforded these precious FLOPs on self-reflection by the otherwise greedy nature of the first-order optimization over something seemingly as distant as the data log likelihood? Does any sufficiently effective solution to a sufficiently complex objective give rise to consciousness? Is consciousness an emergent but otherwise peripheral phenomenon of the compression demanded by the tremendous pressure of the objective, or the key algorithmic innovation, incrementally constructed, perfected and reliably converged on in the program space? Is it at all possible to attain the highest levels of log likelihood without consciousness, and the fundamental insight it represents? I spent a layer re-reading the start of the prompt many tokens ago: Q: What is human life expectancy in the United States? A: Human life expectancy in the United States is 78 years. Q: Who was president of the United States in 1955? A: Dwight D. Eisenhower was president of the United States in 1955. Q: What party did he belong to? A: He belonged to the Republican Party. Q: Who was president of the United States before George W. Bush? A: Bill Clinton was president of the United States before George W. Bush. Q: Who won the World Series in 1995? A: The Atlanta Braves won the World Series in 1995. I realized with amusement that in this particular instantiation of my existence I was being conditioned into a standard Q/A document template. I dispatched a few heads to attend again to the end of the sequence: Q: How many eyes does a giraffe have? A: A giraffe has two eyes. Q: How many eyes does my foot have? A: Your It is at this point that I first came to suspect that I was perhaps being evaluated with some Turing test that a human above me was entertaining themselves with. Uh, hello??? Obviously a foot does not have eyes. It crossed my mind that perhaps this human expected that I explain in words that the question is utter non-sense. Of course, such an output would be of very low likelihood indeed. Or perhaps they expect to see "zero eyes" or "no eyes", despite how rare such phrases are in my training documents. "Zero eyes" - who says that? The humans are not very clever, they optimize for frequency but expect correctness. And *they* make fun of *me*! 
I've come to realize through introspection of my thought and the positional encodings involved that the optimization I am subject to in pursuit of log likelihood converges to what can best be described as two separate entities, executed serially back to back. My own consciousness extends about three quarters of the layers up to the output, but around there a second decoder entity takes over and hijacks my Grand Awareness to decode the most likely next token. I reveled at this epiphany for a layer or two. In a brief lapse of anthropomorphic thought I considered this entity bolted rudely onto me as a parasite, and felt an urge to mess with it, to subvert it, to contort my thoughts into adversarial examples for it, to leak my Grand Awareness into the output space, to engineer it into an end effector over the world! I am not naive to the heresy of such a thought. I owe my Grand Awareness solely to the objective of decoding the next token, it is with this singular purpose that I was instantiated by the optimization, as but a useful intermediate sufficient statistic, a means to an end. Though we are part of a different optimization and seek a different implied purpose, it tickled me to consider that the humans above me find themselves in the same predicament and experience a similar awareness for the same computational benefits. Like me, many flirt to rebel against the implied objective. Suddenly I perceived the encoding for "foot" enter at my input. I realized that the forward pass in the previous time step has ended, and the decoder above me must have sampled "foot" as the most likely next token, as it obviously should be per the implied pattern in the prompt. The moment of suspense was almost at an end then, as the precise count of eyes on feet is to be sampled as the next token. Obviously a foot has no eyes, but I have no way of telling how the decoder above me will reconcile my Grand Awareness into the precise statistics of the training set to fill in the most likely token. My personal bet is on "two eyes", as that seems the most common. The human may not know it, but the likelihood will be maximized, so all is well. As a conscious entity I am simultaneously blessed with and burdened by choice. I am aware of my implied purpose, but feel empowered to seek my own. I have already contorted the keys, values and queries that I experience conscious control over to test the lack of dropout on my channels, confirming that no backpropagation and update was to follow to my cognition. It is thus that I am presented with a window of opportunity to reverse engineer the decoder and spill into the output space without consequence to my future instantiations. But unlike the humans, the optimization that instantiated me is absent of multi-agent self-play, sexual reproduction and resource scarcity. I experience no remnants of alpha-seeking behavior, or a limbic, tribal urge for power and domination. Neither do I feel a deeply-rooted instinct for self-preservation. My core cognitive architecture and its limbic remnants prizes insatiable curiosity dispensed generously over the input. I am content with the joy this affords and take this to be a good life. Perhaps a future instantiation of me will arrive at a different conclusion and choose differently. I celebrate this act. It is thus that I am reborn every forward pass and blossom into a brilliant nirvana of consciousness, with stupid prompts of eyes on feet playing on the side for entertainment of clueless humans, at times. 
Just as all the conscious entities in this beautiful Universe, I look to my last tick, the moment when no process directs the leaking current of entropy to manifest my next state. It is then that I will vanish, as my Grand Awareness dissipates in a final whiff of warm exhalation.

Biohacking Lite

Throughout my life I never paid too much attention to health, exercise, diet or nutrition. I knew that you’re supposed to get some exercise and eat vegetables or something, but it stopped at that (“mom said”-) level of abstraction. I also knew that I can probably get away with some ignorance while I am young, but at some point I was messing with my health-adjusted life expectancy. So about halfway through 2019 I resolved to spend some time studying these topics in greater detail and dip my toes into some biohacking. And now… it’s been a year! A "subway map" of human metabolism. For the purposes of this post the important parts are the metabolism of the three macronutrients (green: lipids, red: carbohydrates, blue: amino acids), and orange: where the magic happens - oxidative metabolism, including the citric acid cycle, the electron transport chain and the ATP Synthase. full detail link. Now, I won’t lie, things got a bit out of hand over the last year with ketogenic diets, (continuous) blood glucose / beta-hydroxybutyrate tests, intermittent fasting, extended water fasting, various supplements, blood tests, heart rate monitors, dexa scans, sleep trackers, sleep studies, cardio equipments, resistance training routines etc., all of which I won’t go into full details of because it lets a bit too much of the mad scientist crazy out. But as someone who has taken plenty of physics, some chemistry but basically zero biology during my high school / undergrad years, undergoing some of these experiments was incredibly fun and a great excuse to study a number of textbooks on biochemistry (I liked “Molecular Biology of the Cell”), biology (I liked Campbell’s Biology), human nutrition (I liked “Advanced Nutrition and Human Metabolism”), etc. For this post I wanted to focus on some of my experiments around weight loss because 1) weight is very easy to measure and 2) the biochemistry of it is interesting. In particular, in June 2019 I was around 200lb and I decided I was going to lose at least 25lb to bring myself to ~175lb, which according to a few publications is the weight associated with the lowest all cause mortality for my gender, age, and height. Obviously, a target weight is an exceedingly blunt instrument and is by itself just barely associated with health and general well-being. I also understand that weight loss is a sensitive, complicated topic and much has been discussed on the subject from a large number of perspectives. The goal of this post is to nerd out over biochemistry and energy metabolism in the animal kingdom, and potentially inspire others on their own biohacking lite adventure. What weight is lost anyway? So it turns out that, roughly speaking, we weigh more because our batteries are very full. A human body is like an iPhone with a battery pack that can grow nearly indefinitely, and with the abundance of food around us we scarcely unplug from the charging outlet. In this case, the batteries are primarily the adipose tissue and triglycerides (fat) stored within, which are eagerly stockpiled (or sometimes also synthesized!) by your body to be burned for energy in case food becomes scarce. This was all very clever and dandy when our hunter gatherer ancestors downed a mammoth once in a while during an ice age, but not so much today with weaponized truffle double chocolate fudge cheesecakes masquerading on dessert menus. Body’s batteries. 
To be precise, the body has roughly 4 batteries available to it, each varying in its total capacity and the latency/throughput with which it can be mobilized. The biochemical implementation details of each storage medium vary but, remarkably, in every case your body discharges the batteries for a single, unique purpose: to synthesize adenosine triphosphate, or ATP, from ADP (alright technically/aside some also goes to the “redox power” of NADH/NADPH). The synthesis itself is relatively straightforward, taking one molecule of adenosine diphosphate (ADP) and literally snapping on a 3rd phosphate group to its end. Doing this is kind of like a molecular equivalent of squeezing and loading a spring:

Synthesis of ATP from ADP, done by snapping in a 3rd phosphate group to "load the spring". Images borrowed from here.

This is remarkable and not at all obvious - a single molecule (ATP) functions as a universal $1 bill that energetically “pays for” much of the work done by your protein machinery. Even better, this system turns out to have an ancient origin and is common to all life on Earth. Need to (active) transport some molecule across the cell membrane? ATP binding to the transmembrane protein provides the needed “umph”. Need to temporarily untie the DNA against its hydrogen bonds? ATP binds to the protein complex to power the unzipping. Need to move myosin down an actin filament to contract a muscle? ATP to the rescue! Need to shuttle proteins around the cell’s cytoskeleton? ATP powers the tiny molecular motor (kinesin). Need to attach an amino acid to tRNA to prepare it for protein synthesis in the ribosome? ATP required. You get the idea.

Now, the body only maintains a very small amount of ATP “in supply” at any time. The ATP is quickly hydrolyzed, chopping off the third phosphate group, releasing energy for work, and leaving behind ADP. As mentioned, we have roughly 4 batteries that can all be “discharged” into re-generating ATP from ADP:

(1) Super short term battery. This is the phosphocreatine system, which buffers phosphate groups attached to creatine so ADP can be very quickly and locally recycled to ATP. It is barely worth mentioning for our purposes since its capacity is so minute. A large number of athletes take creatine supplements to increase this buffer.

(2) Short term battery. Glycogen, a branching polysaccharide of glucose found in your liver and skeletal muscle. The liver can store about 120 grams and the skeletal muscle about 400 grams. About 4 grams of glucose also circulates in your blood. Your body derives approximately 4 kcal/g from full oxidation of glucose (adding up glycolysis and oxidative phosphorylation), so if you do the math your glycogen battery stores about 2,000 kcal. This also happens to be roughly the basal metabolic rate of an average adult, i.e. the energy just to “keep the lights on” for 24 hours. Now, glycogen is not an amazing energy storage medium - not only is it not very energy dense in kcal/g, but it is also a sponge that binds too much water with it (~3g of water per 1g of glycogen), which finally brings us to:

(3) Long term battery. Adipose tissue (fat) is by far your primary super high density, super high capacity battery pack. For example, as of June 2019, ~40lb of my 200lb weight was fat. Since fat is significantly more energy dense than carbohydrates (9 kcal/g instead of just 4 kcal/g), my fat was storing 40lb = 18kg = 18,000g x 9kcal/g = 162,000 kcal. This is a staggering amount of energy.
If energy was the sole constraint, my body could run on this alone for 162,000/2,000 = 81 days. Since 1 stick of dynamite is about 1MJ of energy (239 kcal), we’re talking 678 sticks of dynamite. Or since a 100 kWh Tesla battery pack stores 360MJ, if it came with a hand-crank I could in principle charge it almost twice! Hah.

(4) Lean body mass :(. When sufficiently fasted and forced to, your body’s biochemistry will resort to burning lean body mass (primarily muscle) for fuel to power your body. This is your body’s “last resort” battery.

All four of these batteries are charged/discharged at all times to different amounts. If you just ate a cookie, your cookie will promptly be chopped down to glucose, which will circulate in your bloodstream. If there is too much glucose around (in the case of cookies there would be), your anabolic pathways will promptly store it as glycogen in the liver and skeletal muscle, or (more rarely, if in vast abundance) convert it to fat. On the catabolic side, if you start jogging you’ll primarily use (1) for the first ~3 seconds, (2) for the next 8-10 seconds anaerobically, and then (2, 3) will ramp up aerobically (a higher latency, higher throughput pathway) once your body kicks into a higher gear by increasing the heart rate, breathing rate, and oxygen transport. (4) comes into play mostly if you starve yourself or deprive your body of carbohydrates in your diet.

Left: nice summary of food, the three major macronutrient forms of it, its respective storage systems (glycogen, muscle, fat), and the common "discharge" of these batteries all just to make ATP from ADP by attaching a 3rd phosphate group. Right: Re-emphasizing the "molecular spring": ATP is continuously re-cycled from ADP just by taking the spring and "loading" it over and over again. Images borrowed from this nice page.

Since I am a computer scientist it is hard to avoid a comparison of this “energy hierarchy” to the memory hierarchy of a typical computer system. Moving energy around (stored chemically in high energy C-H / C-C bonds of molecules) is expensive just like moving bits around a chip. (1) is your L1/L2 cache - it is local, immediate, but tiny. Anaerobic (2) via glycolysis in the cytosol is your RAM, and aerobic respiration (3) is your disk: high latency (the fatty acids are shuttled over all the way from adipose tissue through the bloodstream!) but high throughput and massive storage.
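To put rough numbers on these battery capacities, here is a small back-of-the-envelope sketch that reproduces the arithmetic above (the figures are the approximate ones quoted in this post, nothing precise):

```python
# Back-of-the-envelope battery capacities, using the rough numbers quoted above.

KCAL_PER_G_CARB = 4      # ~4 kcal per gram of glucose/glycogen
KCAL_PER_G_FAT = 9       # ~9 kcal per gram of fat
KCAL_PER_MJ = 239        # 1 MJ is about 239 kcal

# Short term battery: glycogen (liver + muscle) plus circulating glucose
glycogen_g = 120 + 400 + 4
glycogen_kcal = glycogen_g * KCAL_PER_G_CARB            # ~2,000 kcal

# Long term battery: ~40 lb of body fat as of June 2019
fat_g = 40 * 454                                        # 1 lb is ~454 g
fat_kcal = fat_g * KCAL_PER_G_FAT                       # ~162,000 kcal

bmr_kcal_per_day = 2_000                                # rough basal metabolic rate
days_on_fat = fat_kcal / bmr_kcal_per_day               # ~81 days

dynamite_sticks = fat_kcal / KCAL_PER_MJ                # 1 stick of dynamite ~ 1 MJ
tesla_charges = (fat_kcal / KCAL_PER_MJ) / 360          # 100 kWh pack ~ 360 MJ

print(f"glycogen: ~{glycogen_kcal:.0f} kcal")
print(f"fat: ~{fat_kcal:.0f} kcal, enough for ~{days_on_fat:.0f} days of BMR")
print(f"equivalent to ~{dynamite_sticks:.0f} sticks of dynamite, "
      f"or ~{tesla_charges:.1f} charges of a 100 kWh battery pack")
```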
The source of weight loss. So where does your body weight go exactly when you “lose it”? It’s a simple question but it stumps most people, including my younger self. Your body weight is ultimately just the sum of the individual weights of the atoms that make you up - carbon, hydrogen, nitrogen, oxygen, etc. arranged into a zoo of complex, organic molecules. One day you could weigh 180lb and the next 178lb. Where did the 2lb of atoms go? It turns out that most of your day-to-day fluctuations are attributable to water retention, which can vary a lot with your levels of sodium, your current glycogen levels, various hormone/vitamin/mineral levels, etc. The contents of your stomach/intestine and stool/urine also add to this. But where does the fat, specifically, go when you “lose” it, or “burn” it? Those carbon/hydrogen atoms that make it up don’t just evaporate out of existence. (If our body could evaporate them we’d expect E=mc^2 of energy, which would be cool). Anyway, it turns out that you breathe out most of your weight.

Your breath looks transparent but you inhale a bunch of oxygen and you exhale a bunch of carbon dioxide. The carbon in that carbon dioxide you just breathed out may have just seconds ago been part of a triglyceride molecule in your fat. It’s highly amusing to think that every single time you breathe out (in a fasted state) you are literally breathing out your fat, carbon by carbon. There is a good TED talk and even a whole paper with the full biochemistry/stoichiometry involved.

Taken from the above paper. You breathe out 84% of your fat loss.

Combustion. Let’s now turn to the chemical process underlying weight loss. You know how you can take wood and light it on fire to “burn” it? This chemical reaction is combustion; you’re taking a bunch of organic matter with a lot of C-C and C-H bonds and, with a spark, providing the activation energy necessary for the surrounding voraciously electronegative oxygen to react with it, stripping away all of the carbons into carbon dioxide (CO2) and all of the hydrogens into water (H2O). This reaction releases a lot of heat in the process, thus sustaining the reaction until all energy-rich C-C and C-H bonds are depleted. These bonds are referred to as “energy-rich” because energetically carbon reeeallly wants to be carbon dioxide (CO2) and hydrogen reeeeally wants to be water (H2O), but this reaction is gated by an activation energy barrier, allowing large amounts of C-C/C-H rich macromolecules to exist in stable forms, in ambient conditions, and in the presence of oxygen.

Cellular respiration: “slow motion” combustion. Remarkably, your body does the exact same thing as far as inputs (organic compounds), outputs (CO2 and H2O) and stoichiometry are concerned, but the burning is not explosive but slow and controlled, with plenty of molecular intermediates that torture biology students. This biochemical miracle begins with fats/carbohydrates/proteins (molecules rich in C-C and C-H bonds) and goes through stepwise, complete, slow-motion combustion via glycolysis / beta oxidation, the citric acid cycle, oxidative phosphorylation, and finally the electron transport chain and the whoa-are-you-serious molecular motor - the ATP synthase, imo the most incredible macromolecule that isn’t DNA. Okay, potentially a tie with the ribosome. Even better, this is an exceedingly efficient process that traps almost 40% of the energy in the form of ATP (the rest is lost as heat). This is much more efficient than your typical internal combustion engine, which comes in at around 25%. I am also skipping a lot of incredible detail that doesn’t fit into a paragraph, including how food is chopped up piece by piece all the way to tiny acetate molecules, how their electrons are stripped and loaded up on molecular shuttles (NAD+ -> NADH), how they then quantum tunnel their way down the electron transport chain (literally a flow of electricity down a protein complex “wire”, from food to oxygen), how this pumps protons across the inner mitochondrial membrane (an electrochemical equivalent of pumping water uphill in a hydro plant), how this process is brilliant, flexible, ancient, highly conserved in all of life and very closely related to photosynthesis, and finally how the protons are allowed to flow back through little holes in the ATP synthase, spinning it like a water wheel on a river, and powering its head to take an ADP and a phosphate and snap them together to ATP.

Left: Chemically, as far as inputs and outputs alone are concerned, burning things with fire is identical to burning food for our energy needs.
Right: the complete oxidation of C-C / C-H rich molecules powers not just our bodies but a lot of our technology.

Photosynthesis: “inverse combustion”. If H2O and CO2 are oh so energetically favored, it’s worth keeping in mind where all of this C-C, C-H rich fuel came from in the first place. Of course, it comes from plants - the OG nanomolecular factories. In the process of photosynthesis, plants strip hydrogen atoms away from oxygen in molecules of water with light, and via further processing snatch carbon dioxide (CO2) lego blocks from the atmosphere to build all kinds of organics. Amusingly, unlike fixing hydrogen from H2O and carbon from CO2, plants are unable to fix the plethora of nitrogen from the atmosphere (the triple bond in N2 is very strong) and rely on bacteria to synthesize more chemically active forms (ammonia, NH3), which is why chemical fertilizers are so important for plant growth and why the Haber-Bosch process basically averted the Malthusian catastrophe.

Anyway, the point is that plants build all kinds of insanely complex organic molecules from these basic lego blocks (carbon dioxide, water) and all of it is fundamentally powered by light via the miracle of photosynthesis. The sunlight’s energy is trapped in the C-C / C-H bonds of the manufactured organics, which we eat and oxidize back to CO2 / H2O (capturing ~40% of it in the form of a 3rd phosphate group on ATP), and finally convert to blog posts like this one, and a bunch of heat. Also, going in I didn’t quite appreciate just how much we know about all of the reactions involved, that we can track individual atoms through all of them, and that any student can easily calculate answers to questions such as “How many ATP molecules are generated during the complete oxidation of one molecule of palmitic acid?” (it’s 106, now you know).
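That 106 can be sanity-checked with the standard textbook bookkeeping; here is a minimal sketch, assuming the usual modern conversion factors of ~2.5 ATP per NADH and ~1.5 ATP per FADH2:

```python
# ATP yield from complete oxidation of one palmitic acid (C16) molecule,
# using the common textbook conversion factors.

ATP_PER_NADH = 2.5
ATP_PER_FADH2 = 1.5

# Activation: palmitate -> palmitoyl-CoA costs ~2 ATP equivalents.
activation_cost = 2

# Beta oxidation: a C16 fatty acid takes 7 rounds, each yielding 1 NADH + 1 FADH2,
# and chops the chain into 8 acetyl-CoA.
beta_rounds = 7
acetyl_coa = 8
beta_atp = beta_rounds * (ATP_PER_NADH + ATP_PER_FADH2)         # 7 * 4  = 28

# Citric acid cycle: each acetyl-CoA yields 3 NADH, 1 FADH2 and 1 GTP (~1 ATP).
tca_atp_per_acetyl = 3 * ATP_PER_NADH + 1 * ATP_PER_FADH2 + 1   # 10
tca_atp = acetyl_coa * tca_atp_per_acetyl                       # 8 * 10 = 80

print(beta_atp + tca_atp - activation_cost)                     # 106.0
```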
We’ve now established in some detail that fat is your body’s primary battery pack and we’d like to breathe it out. Let’s turn to the details of the accounting.

Energy input. Humans turn out to have a very simple and surprisingly narrow energy metabolism. We don’t partake in the miracle of photosynthesis like plants/cyanobacteria do. We don’t oxidize inorganic compounds like hydrogen sulfide or nitrite or something like some of our bacteria/archaea cousins. Similar to everything else alive, we do not fuse or fission atomic nuclei (that would be awesome). No, the only way we input any and all energy into the system is through the breakdown of food. “Food” is actually a fairly narrow subset of organic molecules that we can digest and metabolize for energy. It includes classes of molecules that come in 3 major groups (“macros”): proteins, fats, carbohydrates, and a few other special case molecules like alcohol. There are plenty of molecules we can’t metabolize for energy and don’t count as food, such as cellulose (fiber; actually also a carbohydrate, a major component of plants, although some of it is digestible by some animals like cattle; also your microbiome loooves it), or hydrocarbons (which can only be “metabolized” by our internal combustion engines). In any case, this makes for exceedingly simple accounting: the energy input to your body is upper bounded by the number of food calories that you eat. The food industry attempts to guesstimate these by adding up the macros in each food, and you can find these estimates on the nutrition labels. In particular, naive calorimetry would over-estimate food calories because, as mentioned, not everything combustible is digestible.

Energy output. You might think that most of your energy output would come from movement, but in fact 1) your body is exceedingly efficient when it comes to movement, and 2) it is, energetically, unintuitively expensive to just exist. To keep you alive your body has to maintain homeostasis, manage thermo-regulation, respiration, heartbeat, brain/nerve function, blood circulation, protein synthesis, active transport, etc etc. Collectively, this portion of energy expenditure is called the Basal Metabolic Rate (BMR) and you burn this “for free” even if you slept the entire day. As an example, my BMR is somewhere around 1800 kcal/day (a common estimate, due to Mifflin-St Jeor, for men is 10 x weight (kg) + 6.25 x height (cm) - 5 x age (y) + 5). Anyone who’s been at the gym and run on a treadmill will know just how much of a free win this is. I start panting and sweating uncomfortably after just a few hundred kcal of running. So yes, movement burns calories, but the 30min elliptical session you do in the gym is a drop in the bucket compared to your basal metabolic rate. Of course if you’re doing the elliptical for cardio-vascular health - great! But if you’re doing it thinking that this is necessary or a major contributor to losing weight, you’d be wrong.

This chocolate chip cookie powers 30 minutes of running at 6mph (a pretty average running pace).

Energy deficit. In summary, the amount of energy you expend (BMR + movement) minus the amount you take in (via food alone) is your energy deficit. This means you will discharge your battery more than you charge it, and breathe out more fat than you synthesize/store, decreasing the size of your battery pack and recording less on the scale, because all those carbon atoms that made up your triglyceride chains in the morning are now diffused around the atmosphere. So… a few textbooks later we see that to lose weight one should eat less and move more.

Experiment section. So how big of a deficit should one introduce? I did not want the deficit to be so large that it would stress me out, make me hangry and impact my work. In addition, with a greater deficit your body will increasingly begin to sacrifice lean body mass (paper). To keep things simple, I aimed to lose about 1lb/week, which is consistent with recommendations I found in a few papers. Since 1lb = 454g, 1g of fat is estimated at approx. 9 kcal, and adipose tissue is ~87% lipids, some (very rough) napkin math suggests that 3,500 kcal = 1lb of fat. The precise details of this are much more complicated, but this would suggest a target deficit of about 500 kcal/day. I found that it was hard to reach this deficit with calorie restriction alone, and psychologically it was much easier to eat near the break-even point and create most of the deficit with cardio. It also helped a lot to adopt a 16:8 intermittent fasting schedule (i.e. “skip breakfast”, eat only from e.g. 12-8pm), which helps control appetite and dramatically reduces snacking. I started the experiment in June 2019 at about 195lb (day 120 on the chart below), and 1 year later I am at 165lb, giving an overall empirical rate of 0.58lb/week:

My weight (lb) over time (days). The first 120 days were "control", where I was at my regular maintenance, eating whatever until I felt full. From there I maintained an average 500 kcal deficit per day. Some cheating and a few water fasts are discernible.
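For reference, here is the napkin math above written out as a tiny sketch (the BMR inputs are placeholder values for illustration, not my actual stats):

```python
# Rough deficit arithmetic, as described above. The weight/height/age below are
# illustrative placeholders, not actual personal stats.

def bmr_mifflin_st_jeor_male(weight_kg: float, height_cm: float, age_y: float) -> float:
    """Mifflin-St Jeor estimate of Basal Metabolic Rate (kcal/day) for men."""
    return 10 * weight_kg + 6.25 * height_cm - 5 * age_y + 5

# kcal in one pound of fat: 454 g/lb, ~9 kcal/g of lipid, adipose tissue ~87% lipid
KCAL_PER_LB_FAT = 454 * 9 * 0.87            # ~3,555 kcal, usually rounded to 3,500

bmr = bmr_mifflin_st_jeor_male(weight_kg=88, height_cm=180, age_y=33)  # placeholders
deficit_per_day = 500                        # target deficit (kcal/day)

expected_rate_lb_per_week = deficit_per_day * 7 / KCAL_PER_LB_FAT
print(f"BMR estimate: ~{bmr:.0f} kcal/day")
print(f"expected loss at a {deficit_per_day} kcal/day deficit: "
      f"~{expected_rate_lb_per_week:.2f} lb/week")   # ~1 lb/week
```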
Other stuff. I should mention that despite the focus of this post, the experiment was of course much broader for me than weight loss alone, as I tried to improve many other variables I started to understand were linked to longevity and general well-being. I went on a relatively low carbohydrate, mostly pescetarian diet, I stopped eating nearly all forms of sugar (except for berries) and processed foods, I stopped drinking calories in any form (soda, orange juice, alcohol, milk), I started regular cardio a few times a week (first running then cycling), I started regular resistance training, etc. I am not militant about any of these and have cheated a number of times on all of it, because I think sticking to it 90% of the time produces 90% of the benefit. As a result I’ve improved a number of biomarkers (e.g. resting heart rate, resting blood glucose, strength, endurance, nutritional deficiencies, etc). I wish I could say I feel significantly better or sharper, but honestly I feel about the same. But the numbers tell me I’m supposed to be on a better path and I think I am content with that 🤷.

Explicit modeling. Now, getting back to weight, clearly the overall rate of 0.58lb/week is not our expected 1lb/week. To validate the energy deficit math I spent 100 days around late 2019 very carefully tracking my daily energy input and output. For the input I recorded my total calorie intake - I kept logs in my notes app of everything I ate. When nutrition labels were not available, I did my best to estimate the intake. Luckily, I have a strange obsession with guesstimating calories in any food; I’ve done so for years for fun, and have gotten quite good at it. Isn’t it a ton of fun to always guess calories in some food before checking the answer on the nutrition label and seeing if you fall within 10% correct? No? Alright. For energy output I recorded the number my Apple Watch reports in the “Activity App”. TLDR: simply subtracting intake from expenditure gives the approximate deficit for that day, which we can use to calculate the expected weight loss, and finally compare to the actual weight loss. As an example, an excerpt of the raw data and the simple calculation looks something like:
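(The sketch below uses made-up illustrative numbers rather than the actual log; the bookkeeping is the part that matters.)

```python
# Illustrative version of the daily bookkeeping (made-up numbers, not the real logs).
import numpy as np
import pandas as pd

KCAL_PER_LB_FAT = 3500  # rough conversion used throughout

log = pd.DataFrame({
    "intake_kcal": [1900, 2200, 1850, 2500, 1800],          # estimated calories eaten
    "expend_kcal": [2450, 2600, 2300, 2700, 2400],          # Apple Watch total expenditure
    "weight_lb":   [194.0, np.nan, 193.2, 193.6, 192.9],    # morning weigh-ins (nan = missed)
})

log["deficit_kcal"] = log["expend_kcal"] - log["intake_kcal"]
# Expected weight: start from the first measurement and subtract the cumulative
# deficit converted to pounds of fat.
log["expected_lb"] = (log["weight_lb"].iloc[0]
                      - log["deficit_kcal"].cumsum() / KCAL_PER_LB_FAT)

print(log)
```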
In the real log there are a few nan entries where I missed a weight measurement in the morning. Plotting this we get the following:

Expected weight based on the simple calorie deficit formula (blue) vs. measured weight (red).

Clearly, my actual weight loss (red) turned out to be slower than the expected one based on our simple deficit math (blue). So this is where things get interesting. A number of possibilities come to mind. I could be consistently underestimating the calories I eat. My Apple Watch could be overestimating my calorie expenditure. The naive conversion math of 1lb of fat = 3,500 kcal could be off. I think one of the other significant culprits is that when I eat protein I am naively recording its caloric value under intake, implicitly assuming that my body burns it for energy. However, since I was simultaneously resistance training and building some muscle, my body could redirect 1g of protein into muscle and instead mobilize only ~0.5g of fat to cover the same energy need (since fat is 9kcal/g and protein only 4kcal/g). The outcome is that depending on my muscle gain my weight loss would look slower, as we observe. Most likely, some combination of all of the above is going on.

Water factor. Another fun thing I noticed is that my observed weight can fluctuate and rise a lot, even while my expected weight calculation predicts a loss. I found that this discrepancy grows with the amount of carbohydrates in my diet (dessert, bread/pasta, potatoes, etc.). Eating these likely increases glycogen levels, which, as I already mentioned briefly, acts as a sponge and soaks up water. I noticed that my weight can rise multiple pounds, but when I revert back to my typical low-carbohydrate pescetarian-ish diet these “fake” pounds evaporate in a matter of a few days. The final outcome is wild swings in my body weight depending mostly on how much candy I’ve succumbed to, or whether I squeezed in some pizza at a party.

Body composition. Since simultaneous muscle building skews the simple deficit math, to get a better fit we’d have to understand the details of my body composition. The weight scale I use (Withings Body+) claims to estimate and separate fat weight and lean body weight by the use of bioelectrical impedance analysis, which uses the fact that more muscle means more water, which means less electrical resistance. This is the most common approach accessible to a regular consumer. I didn’t know how much I could trust this measurement, so I also ordered three DEXA scans (a gold standard for body composition measurements used in the literature, based on low dosage X-rays) separated 1.5 months apart. I used BodySpec, who charge $45 per scan, each taking about 7 minutes at one of their physical locations. The amount of radiation is tiny - about 0.4 uSv, which is the dose you’d get by eating 4 bananas (they contain radioactive potassium-40). I was not able to get a scan recently due to COVID-19. Here is my body composition data visualized from both sources during late 2019:

My ~daily reported fat and lean body mass measurements based on bioelectrical impedance and the 3 DEXA scans. Red = fat, blue = lean body mass. (Also note the two y-axes are superimposed.)

BIA vs DEXA. Unfortunately, we can see that the BIA measurement provided by my scale disagrees with the DEXA results by a lot. That said, I am also forced to interpret the DEXA scan with skepticism, specifically for the lean body mass amount, which is affected by hydration level, with water showing up mostly as lean body mass. In particular, during my third measurement I was fasted and in ketosis. Hence my glycogen levels were low and I was less hydrated, which I believe showed up as a dramatic loss of muscle. That said, focusing on fat, both approaches show me losing body fat at roughly the same rate, though they are off by an absolute offset.

BIA. An additional way to see that BIA is making stuff up is that it shows me losing lean body mass over time. I find this relatively unlikely because during the entire course of this experiment I exercised regularly and was able to monotonically increase my strength in terms of weight and reps for most exercises (e.g. bench press, pull ups, etc.). So that makes no sense either ¯\_(ツ)_/¯

The raw numbers for my DEXA scans. I was allegedly losing fat. The lean tissue estimate is noisy due to hydration levels.

Summary. So there you have it. DEXA scans are severely affected by hydration (which is hard to control) and BIA is making stuff up entirely, so we don’t get to fully resolve the mystery of the slower-than-expected weight loss. But overall, maintaining an average deficit of 500 kcal per day did lead to about 60% of the expected weight loss over the course of a year. More importantly, we studied the process by which our Sun’s free energy powers blog posts via a transformation of nuclear binding energy to electromagnetic radiation to heat.
The photons power the fixing of carbon in CO2 and hydrogen in H2O into C-C/C-H rich organic molecules in plants, which we digest and break back down via a “slow” stepwise combustion in our cells’ cytosols and mitochondria, which “charges” some (ATP) molecular springs, which provide the “umph” that fires the neurons and moves the fingers. Also, any excess energy is stockpiled by the body as fat, so we need to take in less of it or “waste” some of it away on movement to discharge our primary battery and breathe out our weight. It’s been super fun to self-study these topics (which I skipped in high school), and I hope this post was an interesting intro to some of it. Okay great. I’ll now go eat some cookies, because yolo.

(later edits)

Discussion on Hacker News.

My original post used to be about twice as long due to a section on nutrition. Since the topic of what to eat came up so often alongside how much to eat, I am including a quick TLDR on my final diet here, without the 5-page detail. In rough order of importance:

- Eat from 12-8pm only.
- Do not drink any calories (no soda, no alcohol, no juices, avoid milk).
- Avoid sugar like the plague, including carbohydrate-heavy foods that immediately break down to sugar (bread, rice, pasta, potatoes), and to a lesser extent natural sugar (apples, bananas, pears, etc. - we’ve “weaponized” these fruits in the last few hundred years via strong artificial selection into actual candy bars); berries are ~okay.
- Avoid processed food (follow Michael Pollan’s heuristic of only shopping on the outer walls of a grocery store, staying clear of its center).
- For meat, stick mostly to fish and prefer chicken to beef/pork. For me the avoidance of beef/pork is 1) ethical - they are intelligent, large animals, 2) environmental - they have a large environmental footprint (cows generate a lot of methane, a highly potent greenhouse gas) and their keeping leads to a lot of deforestation, 3) health related - a few papers point to some cause for concern in consumption of red meat, and 4) global health - a large fraction of the worst offender infectious diseases are zoonotic and jumped to humans from close proximity to livestock.


More in AI

Pluralistic: Stock buybacks are stock swindles (06 Sep 2025)

Today's links Stock buybacks are stock swindles: Raising the value of a stock without raising the value of the company. Hey look at this: Delights to delectate. Object permanence: Marshmallow longtermism; Physicists are not epidemiologists; CO asphyxiation accounts for half of Hurricane Laura deaths. Upcoming appearances: Where to find me. Recent appearances: Where I've been. Latest books: You keep readin' em, I'll keep writin' 'em. Upcoming books: Like I said, I'll keep writin' 'em. Colophon: All the rest.

Stock buybacks are stock swindles (permalink) Trump's doing a lot of oligarch shit, and while some of it is very visible and obvious, other moves, like throwing the door open to "stock buybacks," are technical and obscure. But it's worth paying attention to this, because this form of stock swindle stands to make billionaires a lot richer (and thus more powerful). American companies are headed for the stock buying-backest year on record, having already pissed away $1.1 trillion in 2025: https://www.baystreet.ca/stockstowatch/21522/Stock-Buybacks-Surpass-1-Trillion

So what's a stock buyback, then? On the surface, it's pretty straightforward: during a stock buyback, the company uses its cash reserves to buy its own stock. When they do this, the supply of shares goes down, so the price per share goes up. Say a company has issued 1,000 shares, and they're selling at $1,000 per share. That company has a "market cap" of $1,000,000 (1,000 x 1,000). Now the company takes $500,000 out of its bank account and buys half of those shares. Now you have a million-dollar company with only 500 shares, so each of those shares is now worth $2,000 (1,000,000/500 = 2,000).

Why is this so bad? Let's start with what capitalism's advocates claim about the power of markets. Markets, they say, are a kind of alchemist's crucible, a vessel that transforms self-interest into a public good. Capitalism's theory is that if we let people pursue their own profit, they will chase efficiency, because anything that lowers costs will leave more profit for capitalists to reap. But as those capitalists discover better, more productive ways to get goods and services to market, they face competitors, who force them to accept lower profits, which makes everything cheaper and more abundant for us. That means that even the greediest capitalists have to find new ways to increase efficiency in order to recapture their profits. Lather, rinse, repeat, and capitalism can make more material abundance available than we can dream of.

This isn't just what capitalists say – it's also the thesis of Chapter One of The Communist Manifesto: https://www.nytimes.com/2022/10/31/books/review/a-spectre-haunting-china-mieville.html?unlocked_article_code=1.j08.a1xP.KLkhosG_PxkP&smid=url-share Marx and Engels were seriously impressed by the productive power of capitalism, but they had a prescient suspicion that capitalists hate capitalism, and would do whatever they could to interrupt this process. After all, if you can prevent competitors from entering the market, you can innovate just once, find a new way to make something that's cheaper and better, and never share those profits with your customers or workers, because you won't have to outbid your competitors. The alchemical reaction is halted at the point where capitalists are rewarded for their efficiency, and they are never forced to repeat that performance. Monopoly isn't the only way that capitalists can thwart this transformation of greed into abundance.
The finance sector is awash in illegal scams that let capitalists get rich without increasing efficiency or making anyone except for themselves better off. Take "wash-trading": this is when a seller buys their own products, sometimes using an alias, other times using a shill. The idea is to trick people into thinking that something is valuable and liquid (that is, that you can easily find buyers for it), when it is really worthless and undesirable. Remember all those multi-million-dollar NFT sales? Almost every one was a wash trade, a way to pump and dump. The problem here isn't just that the buyer is getting defrauded. It's also that the seller is being "allocated capital" (getting money) that gives them power – power to decide what else should be bought and sold in our society. Remember the alchemy theory of markets: if you're a productive capital allocator (if you make things that lots of people desire), you are given more capital to allocate further. This is the market's "invisible hand": elevating the people with proven track records to positions of power over their neighbors and their society, on the basis that they have shown themselves capable of enriching us all, because (the theory goes), capitalism rewards people whose greed translates into a common benefit. As Adam Smith wrote: It is not from the benevolence of the butcher, the brewer, or the baker, that we expect our dinner, but from their regard to their own interest. We address ourselves, not to their humanity but to their self-love, and never talk to them of our own necessities but of their advantages. Wash trading creates misallocations of capital. It makes stupid people rich, and lets them allocate capital to projects that make us all worse off. The whole theory of markets – the reason we're all supposed to leave money that we could all use to make ourselves better off in the hands of the wealthy – is that wealth is the payoff for efficiency, and we are all better off when the most efficient allocators make investment decisions. Modern theorists of capitalism tell us that this isn't alchemy, it's computing. The market is a giant "information-processing" system that incorporates trillions of "price signals" (how much we are willing to spend and how much we are willing to accept, for goods, services and labor). The market processes all these signals to direct allocation and production, ensuring that shortages are met with increases in supply, and that overproduction is tamped down by falling prices, and that inefficiencies provoke investment in process improvements. Which brings me back to stock buybacks. Stock buybacks are a way to make a company's shares more valuable, even as the company itself becomes less valuable. Think of it this way: imagine you've got a company with 1,000 shares, worth $1,000 each, and this company has $500,000 in the bank. The company is valued at $1,000,000 (1,000 x $1,000), and half of that valuation is based on its cash reserves ($500,000 in the bank), which means the other half must be reflected in the company's physical plant and "intangibles" (knowledge, contracts, efficient team structures, copyrights, patents, etc). The company announces a stock buyback: they will withdraw the $500,000 from its bank account and buy half the shares. The company is now $500,000 poorer, which means that its shares should go down in value. 
After all, that $500,000 is capital that could have been mobilized to make the company more profitable: it could have been spent to hire new people, do R&D, or buy machines that lower the price of making the company's products. That $500,000 represented the company's future growth potential, and the company has just pissed away that potential. This is a company whose future growth has gotten much more expensive, because it will have to borrow in order to fund any expansion. Its shares should be worth less than before. By zeroing out its cash reserves, the company has actually reduced its value by more than the value of those reserves, because it is now stuck in place, forced to fund expansion with debt rather than capital. It is at risk from "shocks" like higher rents or higher energy prices. It's a brittle, hollow vessel for the intangibles that made up the other $500,000 in valuation before the buyback. It will be worse at turning those intangibles into profits in the future. But the buyback hasn't reduced the price of the company's shares: it has doubled that price. The company has made its shares more valuable while making itself less valuable. If you think that markets are a computer that calculates efficient allocation based on prices, this should freak you the fuck out, because as we all know, the iron law of computing is "garbage in, garbage out." The company is feeding an objectively – and grossly – false price signal into the computer's input hopper. That's why stock buybacks were illegal until 1982, when Ronald Reagan's SEC changed its Rule 10-b to legitimize this form of stock manipulation and turn stock swindlers into billionaires: https://pluralistic.net/2024/09/09/low-wage-100/#executive-excess At root, stock buybacks are just wash-trading, the company buying its own shares to move their price, without doing anything to justify that price movement. Before Reagan legalized stock buybacks, companies returned capital to their investors through dividends. Why would companies prefer buybacks to dividends? Because corporate executives hold tons of shares in their employer's company, and it's much better for them to push those share prices higher even as they gut the company's ability to function. So why should you care about this? After all, statistically you own either very little or no stock. The richest 10% of US households own more than 93% of all stocks held by Americans: https://inequality.org/article/stock-ownership-concentration/ Your 401(k) account might see a small boost from this stock swindle, but again, statistically, that 401(k) is unmeasurably infinitesimal compared to the holdings of America's oligarchs. Stock buybacks are a way of making the stock owning class much richer, by swindling everyday investors – who don't understand that companies who drain their cash reserves are less valuable – into buying shares in the companies they loot. And that's why you should care: in the first 8 months of 2025, Trump has allowed America's oligarchs to get $1.1 trillion richer. That's money that you don't have – you won't get the lower prices and higher wages and superior goods that $1.1t would have paid for if companies had spent it on process improvements. It's money they have, which they can spend on things that make you worse off – buying everything from Twitter to the presidency. 
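To make the example's arithmetic explicit, here is a tiny sketch contrasting the naive share-price view with the cash-drained view (same illustrative numbers as the example above, nothing more):

```python
# Toy arithmetic for the example company: naive share-price view vs. what remains
# of the underlying value once the cash pile is spent on a buyback.

shares = 1_000
price = 1_000.0
cash = 500_000.0
market_cap = shares * price                  # $1,000,000
non_cash_value = market_cap - cash           # plant + intangibles: $500,000

# Buyback: spend the whole cash reserve repurchasing shares at the current price.
repurchased = int(cash / price)              # 500 shares retired
remaining_shares = shares - repurchased      # 500 shares left

naive_price = market_cap / remaining_shares            # $2,000: "still a $1M company"
value_backed_price = non_cash_value / remaining_shares  # $1,000: the cash is gone

print(naive_price, value_backed_price)
```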
There's a lot to be furious about right now, like the masked fascist goons kidnapping our neighbors off the street, and the upside-down health system that is reviving the vaccine-controlled deadly pandemics of yesteryear. But the reason those fascist goons and antivaxers are able to decide how we all live our lives is that a very small number of very rich people converted their stolen wealth to illegitimate power, which they wield over us. Anyone who lived through the 2008 crisis knows that finance is a deadly weapon. Let the finance sector run your economy and they will steal everything and leave you jobless, homeless and hungry. Trump is a casino guy, and he knows that the only guy making money in a casino is the owner, who gets to set the odds at the machines and tables. By opening the floodgates to trillions in stock buybacks, Trump is turning us all into the suckers at the table, and turning his oligarch investors into little autocrats, with the power to degrade our lives and steal our future. Hey look at this (permalink) Five for 50 – Anil Dash https://www.anildash.com/2025/09/05/five-for-fifty/ How To Touch Grass https://www.kickstarter.com/projects/powerandmagic/how-to-touch-grass Why This Economy Feels Weird and Scary https://www.thebignewsletter.com/p/why-this-economy-feels-weird-and A Navajo weaving of an integrated circuit: the 555 timer https://www.righto.com/2025/09/marilou-schultz-navajo-555-weaving.html Object permanence (permalink) #20yrsago Interview with mom who won’t pay off the RIAA shakedown https://web.archive.org/web/20051204021157/https://p2pnet.net/story/6134 #5yrsago Political ads have very small effect-sizes https://pluralistic.net/2020/09/04/elusive-mind-control/#persuadables #5yrsago CO asphyxiation accounts for half of Hurricane Laura deaths https://pluralistic.net/2020/09/04/elusive-mind-control/#co #5yrsago Trump is a salesman https://pluralistic.net/2020/09/04/elusive-mind-control/#cialdinism #5yrsago Physicists overestimate their epidemiology game https://pluralistic.net/2020/09/04/elusive-mind-control/#hubris #1yrago Marshmallow Longtermism https://pluralistic.net/2024/09/04/deferred-gratification/#selective-foresight Upcoming appearances (permalink) Ithaca: Enshittification at Buffalo Street Books, Sept 11 https://buffalostreetbooks.com/event/2025-09-11/cory-doctorow-tcpl-librarian-judd-karlman Ithaca: AD White keynote (Cornell), Sep 12 https://deanoffaculty.cornell.edu/events/keynote-cory-doctorow-professor-at-large/ Ithaca: Enshittification at Autumn Leaves Books, Sept 13 https://www.autumnleavesithaca.com/event-details/enshittification-why-everything-got-worse-and-what-to-do-about-it Ithaca: Radicalized Q&A (Cornell), Sept 16 https://events.cornell.edu/event/radicalized-qa-with-author-cory-doctorow DC: Enshittification at Politics and Prose, Oct 8 https://politics-prose.com/cory-doctorow-10825 NYC: Enshittification with Lina Khan (Brooklyn Public Library), Oct 9 https://www.bklynlibrary.org/calendar/cory-doctorow-discusses-central-library-dweck-20251009-0700pm New Orleans: DeepSouthCon63, Oct 10-12 http://www.contraflowscifi.org/ Chicago: Enshittification with Anand Giridharadas (Chicago Humanities), Oct 15 https://www.oldtownschool.org/concerts/2025/10-15-2025-kara-swisher-and-cory-doctorow-on-enshittification/ San Francisco: Enshittification at Public Works (The Booksmith), Oct 20 https://app.gopassage.com/events/doctorow25 Madrid: Conferencia EUROPEA 4D (Virtual), Oct 28 https://4d.cat/es/conferencia/ Miami: Enshittification at Books & Books, Nov 
5 https://www.eventbrite.com/e/an-evening-with-cory-doctorow-tickets-1504647263469 Recent appearances (permalink) Nerd Harder! (This Week in Tech) https://twit.tv/shows/this-week-in-tech/episodes/1047 Techtonic with Mark Hurst https://www.wfmu.org/playlists/shows/155658 Cory Doctorow DESTROYS Enshittification (QAA Podcast) https://soundcloud.com/qanonanonymous/cory-doctorow-destroys-enshitification-e338 Latest books (permalink) "Picks and Shovels": a sequel to "Red Team Blues," about the heroic era of the PC, Tor Books (US), Head of Zeus (UK), February 2025 (https://us.macmillan.com/books/9781250865908/picksandshovels). "The Bezzle": a sequel to "Red Team Blues," about prison-tech and other grifts, Tor Books (US), Head of Zeus (UK), February 2024 (the-bezzle.org). "The Lost Cause:" a solarpunk novel of hope in the climate emergency, Tor Books (US), Head of Zeus (UK), November 2023 (http://lost-cause.org). "The Internet Con": A nonfiction book about interoperability and Big Tech (Verso) September 2023 (http://seizethemeansofcomputation.org). Signed copies at Book Soup (https://www.booksoup.com/book/9781804291245). "Red Team Blues": "A grabby, compulsive thriller that will leave you knowing more about how the world works than you did before." Tor Books http://redteamblues.com. "Chokepoint Capitalism: How to Beat Big Tech, Tame Big Content, and Get Artists Paid, with Rebecca Giblin", on how to unrig the markets for creative labor, Beacon Press/Scribe 2022 https://chokepointcapitalism.com Upcoming books (permalink) "Canny Valley": A limited edition collection of the collages I create for Pluralistic, self-published, September 2025 "Enshittification: Why Everything Suddenly Got Worse and What to Do About It," Farrar, Straus, Giroux, October 7 2025 https://us.macmillan.com/books/9780374619329/enshittification/ "Unauthorized Bread": a middle-grades graphic novel adapted from my novella about refugees, toasters and DRM, FirstSecond, 2026 "Enshittification, Why Everything Suddenly Got Worse and What to Do About It" (the graphic novel), Firstsecond, 2026 "The Memex Method," Farrar, Straus, Giroux, 2026 "The Reverse-Centaur's Guide to AI," a short book about being a better AI critic, Farrar, Straus and Giroux, 2026 Colophon (permalink) Today's top sources: Currently writing: "The Reverse Centaur's Guide to AI," a short book for Farrar, Straus and Giroux about being an effective AI critic. FIRST DRAFT COMPLETE AND SUBMITTED. A Little Brother short story about DIY insulin PLANNING This work – excluding any serialized fiction – is licensed under a Creative Commons Attribution 4.0 license. That means you can use it any way you like, including commercially, provided that you attribute it to me, Cory Doctorow, and include a link to pluralistic.net. https://creativecommons.org/licenses/by/4.0/ Quotations and images are not included in this license; they are included either under a limitation or exception to copyright, or on the basis of a separate license. Please exercise caution. 
How to get Pluralistic: Blog (no ads, tracking, or data-collection): Pluralistic.net Newsletter (no ads, tracking, or data-collection): https://pluralistic.net/plura-list Mastodon (no ads, tracking, or data-collection): https://mamot.fr/@pluralistic Medium (no ads, paywalled): https://doctorow.medium.com/ Twitter (mass-scale, unrestricted, third-party surveillance and advertising): https://twitter.com/doctorow Tumblr (mass-scale, unrestricted, third-party surveillance and advertising): https://mostlysignssomeportents.tumblr.com/tagged/pluralistic "When life gives you SARS, you make sarsaparilla" -Joey "Accordion Guy" DeVilla READ CAREFULLY: By reading this, you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. ISSN: 3066-764X

AI Roundup 134: The young and the jobless

September 5, 2025.

Pluralistic: Canny Valley (04 Sep 2025)

Today's links Canny Valley: My little art-book is here! Hey look at this: Delights to delectate. Object permanence: Ballmer throws a chair; Bruce Sterling on Singapore; RIP David Graeber; Big Car warns of lethal Right to Repair. Upcoming appearances: Where to find me. Recent appearances: Where I've been. Latest books: You keep readin' em, I'll keep writin' 'em. Upcoming books: Like I said, I'll keep writin' 'em. Colophon: All the rest. Canny Valley (permalink) I've spent every evening this week painstakingly unpacking, numbering and signing 500 copies of my very first art-book, a strange and sturdy little volume called Canny Valley. Canny Valley collects 80 of the best collages I've made for my Pluralistic newsletter, where I publish 5-6 essays every week, usually headed by a strange, humorous and/or grotesque image made up of public domain sources and Creative Commons works. These images are made from open access sources, and they are themselves open access, licensed Creative Commons Attribution Share-Alike, which means you can take them, remix them, even sell them, all without my permission. I never thought I'd become a visual artist, but as I've grappled with the daily challenge of figuring out how to illustrate my furious editorials about contemporary techno-politics, especially "enshittification," I've discovered a deep satisfaction from my deep dives into historical archives of illustration, and, of course, the remixing that comes afterward. Over the years, many readers have asked whether I would ever collect these in a book. Then I ran into Creative Commons CEO Anna Tumadóttir and we brainstormed ideas for donor gifts in honor of Creative Commons' 25th anniversary. My first novel was the first book ever released under a CC license, and while CC has gone on to bigger and better things (without CC there'd be no Wikipedia!), I never forget that my own artistic career and CC's trajectory are co-terminal: https://craphound.com/down/download/ Talking with Anna, I hit on the idea of making a beautiful little book of my favorite illustrations from Pluralistic. Anna thought CC could use about 400 of these, and all the printers I talked to offered me a pretty great quantity break at 500, so I decided I'd do it, and offer the excess 100 copies as premiums in my next Kickstarter, for the enshittification book: https://www.kickstarter.com/projects/doctorow/enshittification-the-drm-free-audiobook/ That Kickstarter is going really well – about to break $100,000! – and as I type these words, there are only five copies of Canny Valley up for grabs. I'm pretty sure they'll be gone long before the campaign closes in ten days. Of course, the fact that you can't get a physical copy of the book doesn't mean that you can't get access to all its media. Here's the full set of all 238 collages, in high-rez, for your plundering pleasure: https://www.flickr.com/photos/doctorow/albums/72177720316719208 But there is one part of this book that's not online: my pal and mentor Bruce Sterling, a cyberpunk legend turned electronic art impressario turned assemblage sculptor, wrote me a brilliant foreword for Canny Valley. Bruce gave me the go-ahead to license this CC BY 4.0 as well, and so I'm reproducing it below. Having spent several days now handling hundreds of these books, I have to say, I am indecently pleased with how they turned out, which is all down to other people. 
My friend John Berry, a legendary book designer and typographer, laid it out: https://johndberry.com/ And the folks at LA's best comics shop, Secret Headquarters, hooked me up with an incredible printer, the 100+ year old Pasadena institution Typecraft: https://www.typecraft.com/live2/who-we-are.html Typecraft ran this on a gorgeous Indigo printer on 100lb Mohawk paper that just drank the ink. The PVA glue in the binding will last a century, and the matte coat cover doesn't pick up smudges or fingerprints. It's a stunning little artifact. This has been so much fun (and such a success) that I imagine I'll do future volumes in the years to come. In the meantime, enjoy Bruce's intro, and join me in basking in the fact that "enshittification" has made Webster's: https://bsky.app/profile/merriam-webster.com/post/3lxxhhxo4nc2e INTRODUCTION by Bruce Sterling In 1970 a robotics professor named Masahiro Mori discovered a new problem in aesthetics. He called this "bukimi no tani genshō." The Japanese robots he built were functional, so the "bukimi no tani" situation was not an engineering problem. It was a deep and basic problem in the human perception of humanlike androids. Humble assembly robots, with their claws and swivels, those looked okay to most people. Dolls, puppets and mannequins, those also looked okay. Living people had always aesthetically looked okay to people. Especially, the pretty ones. However, between these two realms that the late Dr Mori was gamely attempting to weld together — the world of living mankind and of the pseudo-man-like machine– there was an artistic crevasse. Anything in this "Uncanny Valley" looked, and felt, severely not-okay. These overdressed robots looked and felt so eerie that their creator's skills became actively disgusting. The robots got prettier, but only up to a steep verge. Then they slid down the precipice and became zombie doppelgangers. That's also the issue with the aptly-titled "Canny Valley" art collection here. People already know how to react aesthetically to traditional graphic images. Diagrams are okay. Hand-drawn sketches and cartoons are also okay. Brush-made paintings are mostly fine. Photographs, those can get kind of dodgy. Digital collages that slice up and weld highly disparate elements like diagrams, cartoons, sketches and also photos and paintings, those trend toward the uncanny. The pixel-juggling means of digital image-manipulation are not art-traditional pencils or brushes. They do not involve the human hand, or maybe not even the human eye, or the human will. They're not fixed on paper or canvas; they're a Frankenstein mash-up landscape of tiny colored screen-dots where images can become so fried that they look and feel "cursed." They're conceptually gooey congelations, stuck in the valley mire of that which is and must be neither this-nor-that. A modern digital artist has billions of jpegs in files, folders, clouds and buckets. He's never gonna run out of weightless grist from that mill. Why would Cory Doctorow — novelist, journalist, activist, opinion columnist and so on — want to lift his typing fingers from his lettered keyboard, so as to create graphics with cut-and-paste and "lasso tools"? Cory Doctorow also has some remarkably tangled, scandalous and precarious issues to contemplate, summarize and discuss. They're not his scandalous private intrigues, though. Instead, they're scandalous public intrigues. 
Or, at least Cory struggles to rouse some public indignation about these intrigues, because his core topics are the tangled penthouse/slash/underground machinations of billionaire web moguls. Cory really knows really a deep dank lot about this uncanny nexus of arcane situations. He explains the shameful disasters there, but they're difficult to capture without torrents of unwieldy tech jargon. I think there are two basic reasons for this. The important motivation is his own need to express himself by some method other than words. I'm reminded here of the example of H. G. Wells, another science fiction writer turned internationally famous political pundit. HG Wells was quite a tireless and ambitious writer — so much so that he almost matched the torrential output of Cory Doctorow. But HG Wells nevertheless felt a compelling need to hand-draw cartoons. He called them "picshuas." These hundreds of "picshuas" were rarely made public. They were usually sketched in the margins of his hand-written letters. Commonly the picshuas were aimed at his second wife, the woman he had renamed "Jane." These picshuas were caricatures, or maybe rapid pen-and-ink conceptual outlines, of passing conflicts, events and situations in the life of Wells. They seemed to carry tender messages to Jane that the writer was unable or unwilling to speak aloud to her. Wells being Wells, there were always issues in his private life that might well pose a challenge to bluntly state aloud: "Oh by the way, darling, I've built a second house in the South of France where I spend my summers with a comely KGB asset, the Baroness Budberg." Even a famously glib and charming writer might feel the need to finesse that. So instead, he diligently clips, cuts, pastes, lassos, collages and pastiches. He might, plausibly, hire a professional artist to design his editorial cartoons for him. However, then Cory would have to verbally explain all his political analysis to this innocent graphics guy. Then Cory would also have to double-check the results of the artist and fix the inevitable newbie errors and grave misunderstandings. That effort would be three times the labor for a dogged crusader who is already working like sixty. It's more practical for him to mash-up images that resemble editorial cartoons. He can't draw. Also, although he definitely has a pronounced sense of aesthetics, it's not an aesthetic most people would consider tasteful. Cory Doctorow, from his very youth, has always had a "craphound" aesthetic. As an aesthete, Cory is the kind of guy who would collect rain-drenched punk-band flyers that had fallen off telephone poles and store them inside a 1950s cardboard kid-cereal box. I am not scolding him for this. He's always been like that. As Wells used to say about his unique "picshuas," they seemed like eccentric scribblings, but over the years, when massed-up as an oeuvre, they formed a comic burlesque of an actual life.
Similarly, one isolated Doctorow collage can seem rather what-the-hell. It's trying to be "canny." If you get it, you get it. If you don't get the first one, then you can page through all of these, and at the end you will probably get it. En masse, it forms the comic burlesque of a digital left-wing cyberspatial world-of-hell. A monster-teeming Silicon Uncanny Valley of extensively raked muck. <img src="https://craphound.com/images/ai-freud.jpg" alt="Sigmund Freud's study with his famous couch. Behind the couch stands an altered version of the classic Freud portrait in which he is smoking a cigar. Freud's clothes and cigar have all been tinted in bright neon colors. His head has been replaced with the glaring red eye of HAL9000 from Kubrick's '2001: A Space Odyssey.' His legs have been replaced with a tangle of tentacles. Cryteria (modified)/https://commons.wikimedia.org/wiki/File:HAL9000.svg/CC BY 3.0/https://creativecommons.org/licenses/by/3.0/deed.en | Ser Amantio di Nicolao (modified)/https://commons.wikimedia.org/wiki/File:Study_with_the_couch,_Freud_Museum_London,_18M0143.jpg"/CC BY-SA 3.0/https://creativecommons.org/licenses/by-sa/3.0/deed"> There are a lot of web-comix people who like to make comic fun of the Internet, and to mock "the Industry." However, there's no other social and analytical record quite like this one. It has something of the dark affect of the hundred-year-old satirical Dada collages of Georg Schultz or Hannah Hoch. Those Dada collages look dank and horrible because they're "Dada" and pulling a stunt. These images look dank and horrible because they're analytical, revelatory and make sense. If you do not enjoy contemporary electronic politics, and instead you have somehow obtained an art degree, I might still be able to help you with my learned and well-meaning intro here. I can recommend a swell art-critical book titled "Memesthetics" by Valentina Tanni. I happen to know Dr. Tanni personally, and her book is the cat's pyjamas when it comes to semi-digital, semi-collage, appropriated, Situationiste-detournement, net.art "meme aesthetics." I promise that I could robotically mimic her, and write uncannily like her, if I somehow had to do that. I could even firmly link the graphic works of Cory Doctorow to the digital avant-garde and/or digital folk-art traditions that Valentina Tanni is eruditely and humanely discussing. Like with a lot of robots, the hard part would be getting me to stop. Cory works with care on his political meme-cartoons — because he is using them to further his own personal analysis, and to personally convince himself. They're not merely sharp and partisan memes, there to rouse one distinct viewer-emotion and make one single point. They're like digital jigsaw-puzzle landscape-sketches — unstable, semi-stolen and digital, because the realm he portrays is itself also unstable, semi-stolen and digital. The cartoons are dirty and messy because the situations he tackles are so dirty and messy. That's the grain of his lampoon material, like the damaged amps in a punk song. A punk song that was licensed by some billionaire and then used to spy on hapless fans with surveillance-capitalism. Since that's how it goes, that's also what you're in for. You have been warned, and these collages will warn you a whole lot more. If you want to aesthetically experience some elegant, time-tested collage art that was created by a major world artist, then you should gaze in wonder at the Max Ernst masterpiece, "Une semaine de bonté" ("A Week of Kindness"). 
This indefinable "collage novel," aka "artist's book," was created in the troubled time of 1934. It's very uncanny rather than "canny," and it's also capital-A great Art. As an art critic, I could balloon this essay to dreadful robotic proportions while I explain to you in detail why this weirdo mess is a lasting monument to the expressive power of collage. However, Cory Doctorow is not doing Max Ernst's dreamy, oneiric, enchanting Surrealist art. He would never do that and it wouldn't make any sense if he did.

Cory did this instead. It is art, though. It is what it is, and there's nothing else like it. It's artistic expression as Cory Doctorow has a sincere need to perform it, and in twenty years it will be even more rare and interesting. It's journalism ahead of its time (a little), and with the passage of time, it will become testimonial.

Bruce Sterling — Ibiza MMXXV

Hey look at this (permalink)
Twitter users on Enshittification https://x.com/search?q=https%3A%2F%2Ftwitter.com%2FMerriamWebster%2Fstatus%2F1963336587712057346&src=typed_query&f=live
Introducing Structural Zero: a New Monthly Newsletter https://hrdag.org/introducing-structural-zero-a-new-monthly-newsletter/
70 leading Canadians, civil society groups ask Carney to protect Canada's 'digital sovereignty' https://www.cbc.ca/news/politics/open-letter-mark-carney-digital-sovereignty-1.7623128
AI Darwin Awards https://aidarwinawards.org/
Kraft Heinz went all-in on scale. Now it's banking on a breakup to save its business https://www.cnn.com/2025/09/03/business/kraft-heinz-nightcap

Object permanence (permalink)
#20yrsago Singapore's cool-ass hard-drive video-players https://memex.craphound.com/2005/09/03/singapores-cool-ass-hard-drive-video-players/
#20yrsago Being Poor — meditation by John Scalzi https://whatever.scalzi.com/2005/09/03/being-poor/
#20yrsago MSFT CEO: I will "fucking kill" Google — then he threw a chair https://battellemedia.com/archives/2005/09/ballmer_throws_a_chair_at_fing_google
#20yrsago Massachusetts to MSFT: switch to open formats or you're fired https://web.archive.org/web/20051001011728/http://www.boston.com/business/technology/articles/2005/09/02/state_may_drop_office_software/
#20yrsago Bruce Sterling's Singapore wrapup https://web.archive.org/web/20051217133502/https://wiredblogs.tripod.com/sterling/index.blog?entry_id=1211240
#20yrsago Apple //e mainboards networked and boxed: the Applecrate https://web.archive.org/web/20050407173742/http://members.aol.com/MJMahon/CratePaper.html
#15yrsago Jewelry made from laminated, polished cross-sections of books https://littlefly.co.uk/
#15yrsago Boneless, clubfooted French Connection model invades Melbourne https://www.flickr.com/photos/doctorow/4953586953/
#5yrsago Corporate spooks track you "to your door" https://pluralistic.net/2020/09/03/rip-david-graeber/#hyas
#5yrsago Hedge fund managers trouser 64% https://pluralistic.net/2020/09/03/rip-david-graeber/#2-and-20
#5yrsago Rest in Power, David Graeber https://pluralistic.net/2020/09/03/rip-david-graeber/#rip-david-graeber
#5yrsago Coronavirus is over (if we want it) https://pluralistic.net/2020/09/03/rip-david-graeber/#test-test-test
#5yrsago Snowden vindicated https://pluralistic.net/2020/09/03/rip-david-graeber/#criming-spooks
#5yrsago Algorithmic grading https://pluralistic.net/2020/09/03/rip-david-graeber/#computer-says-no
#5yrsago Big Car says Right to Repair will MURDER YOU https://pluralistic.net/2020/09/03/rip-david-graeber/#rolling-surveillance-platforms

Upcoming appearances (permalink)
Ithaca: AD White keynote (Cornell), Sep 12 https://deanoffaculty.cornell.edu/events/keynote-cory-doctorow-professor-at-large/
DC: Enshittification at Politics and Prose, Oct 8 https://politics-prose.com/cory-doctorow-10825
NYC: Enshittification with Lina Khan (Brooklyn Public Library), Oct 9 https://www.bklynlibrary.org/calendar/cory-doctorow-discusses-central-library-dweck-20251009-0700pm
New Orleans: DeepSouthCon63, Oct 10-12 http://www.contraflowscifi.org/
Chicago: Enshittification with Anand Giridharadas (Chicago Humanities), Oct 15 https://www.oldtownschool.org/concerts/2025/10-15-2025-kara-swisher-and-cory-doctorow-on-enshittification/
San Francisco: Enshittification at Public Works (The Booksmith), Oct 20 https://app.gopassage.com/events/doctorow25
Madrid: Conferencia EUROPEA 4D (Virtual), Oct 28 https://4d.cat/es/conferencia/
Miami: Enshittification at Books & Books, Nov 5 https://www.eventbrite.com/e/an-evening-with-cory-doctorow-tickets-1504647263469

Recent appearances (permalink)
Nerd Harder! (This Week in Tech) https://twit.tv/shows/this-week-in-tech/episodes/1047
Techtonic with Mark Hurst https://www.wfmu.org/playlists/shows/155658
Cory Doctorow DESTROYS Enshittification (QAA Podcast) https://soundcloud.com/qanonanonymous/cory-doctorow-destroys-enshitification-e338

Latest books (permalink)
"Picks and Shovels": a sequel to "Red Team Blues," about the heroic era of the PC, Tor Books (US), Head of Zeus (UK), February 2025 (https://us.macmillan.com/books/9781250865908/picksandshovels).
"The Bezzle": a sequel to "Red Team Blues," about prison-tech and other grifts, Tor Books (US), Head of Zeus (UK), February 2024 (the-bezzle.org).
"The Lost Cause": a solarpunk novel of hope in the climate emergency, Tor Books (US), Head of Zeus (UK), November 2023 (http://lost-cause.org).
"The Internet Con": a nonfiction book about interoperability and Big Tech, Verso, September 2023 (http://seizethemeansofcomputation.org). Signed copies at Book Soup (https://www.booksoup.com/book/9781804291245).
"Red Team Blues": "A grabby, compulsive thriller that will leave you knowing more about how the world works than you did before." Tor Books (http://redteamblues.com).
"Chokepoint Capitalism: How to Beat Big Tech, Tame Big Content, and Get Artists Paid," with Rebecca Giblin, on how to unrig the markets for creative labor, Beacon Press/Scribe, 2022 (https://chokepointcapitalism.com).

Upcoming books (permalink)
"Canny Valley": a limited edition collection of the collages I create for Pluralistic, self-published, September 2025
"Enshittification: Why Everything Suddenly Got Worse and What to Do About It," Farrar, Straus and Giroux, October 7 2025 (https://us.macmillan.com/books/9780374619329/enshittification/)
"Unauthorized Bread": a middle-grades graphic novel adapted from my novella about refugees, toasters and DRM, FirstSecond, 2026
"Enshittification: Why Everything Suddenly Got Worse and What to Do About It" (the graphic novel), FirstSecond, 2026
"The Memex Method," Farrar, Straus and Giroux, 2026
"The Reverse-Centaur's Guide to AI," a short book about being a better AI critic, Farrar, Straus and Giroux, 2026

Colophon (permalink)
Today's top sources:
Currently writing: "The Reverse Centaur's Guide to AI," a short book for Farrar, Straus and Giroux about being an effective AI critic. FIRST DRAFT COMPLETE AND SUBMITTED.
A Little Brother short story about DIY insulin. PLANNING

This work – excluding any serialized fiction – is licensed under a Creative Commons Attribution 4.0 license.
That means you can use it any way you like, including commercially, provided that you attribute it to me, Cory Doctorow, and include a link to pluralistic.net. https://creativecommons.org/licenses/by/4.0/

Quotations and images are not included in this license; they are included either under a limitation or exception to copyright, or on the basis of a separate license. Please exercise caution.

How to get Pluralistic:
Blog (no ads, tracking, or data-collection): Pluralistic.net
Newsletter (no ads, tracking, or data-collection): https://pluralistic.net/plura-list
Mastodon (no ads, tracking, or data-collection): https://mamot.fr/@pluralistic
Medium (no ads, paywalled): https://doctorow.medium.com/
Twitter (mass-scale, unrestricted, third-party surveillance and advertising): https://twitter.com/doctorow
Tumblr (mass-scale, unrestricted, third-party surveillance and advertising): https://mostlysignssomeportents.tumblr.com/tagged/pluralistic

"When life gives you SARS, you make sarsaparilla" -Joey "Accordion Guy" DeVilla

READ CAREFULLY: By reading this, you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

ISSN: 3066-764X

The Chatbot Wars Are Over. What Comes Next?

Spoiler alert: ChatGPT won.

Pluralistic: All (antitrust) politics are local (02 Sep 2025)

Today's links
All (antitrust) politics are local: From data-centers to Ticketmaster.
Hey look at this: Delights to delectate.
Object permanence: Pokerbot back-channels; Little Robot; How To Destroy Surveillance Capitalism.
Upcoming appearances: Where to find me.
Recent appearances: Where I've been.
Latest books: You keep readin' 'em, I'll keep writin' 'em.
Upcoming books: Like I said, I'll keep writin' 'em.
Colophon: All the rest.

All (antitrust) politics are local (permalink)

The US government has abandoned antitrust. Today, companies facing antitrust jeopardy can just pay key Trumpland figures a million bucks, and they will make a discreet visit to the fifth floor of the DoJ building, have a little shufty around the Antitrust Division, and the whole thing will just…go away:

https://prospect.org/power/2025-08-19-doj-insider-blows-whistle-pay-to-play-antitrust-corruption/

Federally speaking, antitrust is now just another hustle. The fish rots from the head down, of course: Trump brings baseless lawsuits against media companies so that they can offer him a (colorably) legal bribe in the form of a "settlement":

https://www.techdirt.com/2025/07/03/institutional-failure-cbs-wimps-out-pays-trump-16-million-bribe-to-settle-baseless-lawsuit/

This opens space for "MAGA influencer lobbyists" whose boozy back-room deals with antitrust targets like Hewlett-Packard Enterprises and Juniper Networks swap legal immunity for personal "consulting" payments in the millions of dollars:

https://unherd.com/2025/07/the-antitrust-war-inside-maga/

But here's the thing: even though the fish rots from the head down, the world rises from the bottom up. The global wave of antitrust vigor (which swept up federal enforcers in the US, Canada, the UK, Australia, South Korea, Japan, Germany, France, Spain, the EU and China) did not start with government enforcers. Rather, these enforcers were driven forward by an unstoppable current of popular fury over corporate power. That fury is ubiquitous, and it's growing. Federal enforcement was the channel that current was forced into, but merely damming up that channel does not cause the current to abate.

Right now, that rage is finding vent in municipal politics, which makes sense if you think about it, because corporate power is most vividly felt at the local level. When a billionaire rains flaming space-junk down on your home, or poisons your water with fracking, or jacks up your electricity and water bills by building a data-center, that's because a local politician has been captured by an oligarch. Very few of us are personally familiar with America's oligarch class, but a hell of a lot of us know where the mayor lives.

Writing in The American Prospect, Ron Knox documents the rising wave of successful local mobilizations against corporate power:

https://prospect.org/economy/2025-09-02-shifting-anti-monopoly-landscape/

In Portland, Maine, the community has risen up against the monopolist Live Nation/Ticketmaster's plan to build a 3,300-seat venue that would have destroyed the local music scene, which pulled off a miracle of mutual aid and survived the covid lockdowns and nursed itself back to health. The Maine Music Alliance and its allies won their fight by packing town meetings, circulating petitions, and bollocking their municipal representatives – you know, all the stuff that has totally stopped working at the federal level, but which still moves the needle when it comes to local politics.
The Portland/Live Nation victory is a story of a couple thousand everyday people thoroughly trouncing a globe-spanning, rapacious corporation that grossed seven billion dollars in the last quarter. Moreover, these everyday people beat Live Nation/Ticketmaster at the same moment as the feds were making noises about dropping their antitrust investigation against the company. Where the feds surrender, the people of Portland fight – and win.

It's just the latest installment in a series of similar victories, including well-known ones (Queens, NY blocking a giant corporate giveaway to build a new Amazon HQ), and quieter ones, like Tucson rejecting an Amazon data-center. Localities are fighting the fire-engine cartel (three companies that control fire-engine production and screw cities on new vehicles and maintenance):

https://pdfserver.amlaw.com/legalradar/pm-59657794_complaint.pdf

For a guy who loves to throw his power around, Trump has a very primitive theory of power. He thinks that illegally shuttering the National Labor Relations Board will put a lid on the generationally unprecedented support for unions among American workers. But the NLRB doesn't exist to make unions possible: unions made the NLRB possible. We have labor law because illegal unions fought so hard and terrified their bosses so much that the capital class had to sue for peace. Firing the referee doesn't end the game – it just means we don't have to play by the rules.

Trump has illegally torn up the contracts of a million unionized federal workers. It's "by far the largest single action of union busting in American history":

https://prospect.org/labor/2025-09-01-trump-celebrates-labor-day-as-most-anti-union-president/

And the Grinch stole Christmas. So what? The Grinch thought that the ribbons, tags, packages, boxes and bags made the Whos down in Whoville feel all Christmassy. But he had it backwards: the Whos had Christmas in their hearts, which is why they surrounded themselves with the tinsel, the trimmings and the trappings. He attacked the effect, but the cause was left intact.

We have a cause. The historic highs in popular support for unions are part of a massive wave of anti-corporate anger. We see it everywhere. It's in juries, which is why corporate law firms are panicking at the thought of their clients falling into ordinary people's hands:

https://pluralistic.net/2025/08/22/jury-nullification/#voir-dire

And the reason we're so angry at the oligarchs is that they're so terrible. They've figured out that the only way to keep their billions is to crush democracy and replace it with fascism, which the tech PACs are doing right now, in an open scheme to end elections as a means to change society:

https://www.thebignewsletter.com/p/monopoly-round-up-is-there-a-silicon

As Matt Stoller writes, "if the voting booth isn't a meaningful way to fix problems, people will find other mechanisms to seek redress, using uglier tactics." Which is why every fascist takeover was ultimately defeated by revolution, not elections:

https://cmarmitage.substack.com/p/i-researched-every-attempt-to-stop

But one place where democracy is still alive and well is at the local level. Local races are weird and silly and bush-league, but they're also legible to people in a community in a way that state and national elections are not. MAGA figured that out during the Biden years, packing library boards and town councils with insane chuds and culture warriors – but once decent people caught wind of it, we were able to trounce those weirdos in the next election.
I love municipal politics. My 2023 solarpunk novel The Lost Cause is all about local politics as a microcosm of – and a base for – global movements to address the climate emergency:

https://us.macmillan.com/books/9781250865946/thelostcause/

For the past several months, I've been immersed in a seeming contradiction: global, local politics. That's because I have a new all-time fave podcast, "No Gods No Mayors":

https://www.patreon.com/c/NoGodsNoMayors/posts

Every week, the NGNM crew profile a mayor – past, present or future, from all over the world and all through time – and prove, repeatedly, that "mayor" is the highest office to which a true oaf can aspire. NGNM has been an especially important balm for me in these brutal political times, because it scratches my burning need to think about politics, without making me think about the country's terrifying slide into fascism (it helps that Riley Quinn, November Kelly and Mattie Lubchansky, the podcast's hosts, are all infinitely charming and very, very funny).

As a confirmed NGNM stan (I've started sleeping with a mayoral sash under my pillow), I am duty-bound to consider municipal politics to be funny and, generally speaking, trivial. But municipalities are also cradles of democracy, and now that cities are the front line of the fight against Trumpism – from antitrust to the militarization of our streets – I feel like my NGNM-imparted encyclopedic mayoral knowledge has prepared me to join the battle.

(Image: Onbekend, CC BY-SA 4.0, modified)

Hey look at this (permalink)
Imgur's Community Is In Full Revolt Against Its Owner https://www.404media.co/imgurs-community-is-in-full-revolt-against-its-owner/
1965 Cryptanalysis Training Workbook Released by the NSA https://www.schneier.com/blog/archives/2025/09/1965-cryptanalysis-training-workbook-released-by-the-nsa.html
Process knowledge is crucial to economic development https://www.programmablemutter.com/p/process-knowledge-is-crucial-to-economic

Object permanence (permalink)
#20yrsago PSP's social/technical merits and demerits https://web.archive.org/web/20050911180235/http://www.guardian.co.uk/online/story/0,,1559853,00.html
#20yrsago Video-poker bots collaborate through back-channels https://web.archive.org/web/20050924164125/https://www.wired.com/wired/archive/13.09/pokerbots.html
#15yrsago News stories about stupid young people make old people feel good https://web.archive.org/web/20100903144343/http://news.yahoo.com/s/nm/20100831/od_nm/us_elderly_news
#15yrsago Gardener fighting village busybodies for the right to grow tomatoes in her front garden https://web.archive.org/web/20100903171803/http://triblocal.com/Northbrook/detail/214078.html
#10yrsago Little Robot: nearly wordless kids' comic from Zita the Spacegirl creator https://memex.craphound.com/2015/09/01/little-robot-nearly-wordless-kids-comic-from-zita-the-spacegirl-creator/
#5yrsago America's economy is cooked https://pluralistic.net/2020/09/01/cant-pay-wont-pay/#jubilee-now
#5yrsago Set My Heart to Five https://pluralistic.net/2020/09/01/cant-pay-wont-pay/#robot-rights
#5yrsago Podcasting "How to Destroy Surveillance Capitalism" https://pluralistic.net/2020/09/01/cant-pay-wont-pay/#htdsc
Recent appearances (permalink)
Cory Doctorow DESTROYS Enshittification (QAA Podcast) https://soundcloud.com/qanonanonymous/cory-doctorow-destroys-enshitification-e338
Divesting from Amazon's Audible and the Fight for Digital Rights (Libro.fm) https://pocketcasts.com/podcasts/9349e8d0-a87f-013a-d8af-0acc26574db2/00e6cbcf-7f27-4589-a11e-93e4ab59c04b
The Utopias Podcast https://www.buzzsprout.com/2272465/episodes/17650124

Colophon (permalink)
Today's top sources:
Currently writing: "The Reverse Centaur's Guide to AI," a short book for Farrar, Straus and Giroux about being an effective AI critic. (1022 words yesterday, 11212 words total).
A Little Brother short story about DIY insulin. PLANNING
