More from Chris Grossack's Blog
The other day my friend Lucas Salim was asking me some questions about categorical logic and constructive math, and he mentioned he'd never seen a proof that there's no constructive proof of the intermediate value theorem before. I showed him the usual counterexample, and since my recent blog post about choice was so quick to write I decided to quickly write up a post about this too, since I remember being confused by it back when I was first learning it.

The key fact is Soundness and Completeness of the topos semantics of constructive logic. This says that there is a way of interpreting the usual syntax of mathematics into a topos in such a way that

(Soundness) If you can constructively prove a statement, then its interpretation in every topos is true.

(Completeness) If a statement is interpreted as true in every (elementary1) topos, then there must exist a constructive proof.

For a proof, see Chapter II of Lambek and Scott's Introduction to Higher Order Categorical Logic or Section D4.3 of The Elephant.

This means that as long as we're careful to avoid choice and excluded middle, anything we prove will be true when interpreted in any topos we like! Then there's a mechanical procedure that lets us convert this interpretation into a corresponding statement in the "real world2", and this gives us lots of "theorems for free" for each individual constructive theorem!

A nice case study is given by the Weierstrass Approximation Theorem, which I gave a talk on years ago3. Since this theorem is constructively provable4…

by interpreting it in the effective topos we learn there's a computer program $\mathtt{Approx}$ which takes as input a function $f : [0,1] \to \mathbb{R}$5 and an $\epsilon \gt 0$ and outputs the coefficients of a polynomial approximating $f$.

by interpreting it in a sheaf topos $\text{Sh}(\Theta)$ we learn that for any continuous family of functions $f_\theta(x) : \Theta \times [0,1] \to \mathbb{R}$, there's locally6 a polynomial $p_\theta(x)$ whose coefficients are continuous functions of $\theta$ which approximates each $f_\theta(x)$

etc.

If you're interested in learning how to externalize a statement in a topos to a statement about the real world, I highly recommend Ingo Blechschmidt's excellent paper Exploring mathematical objects from custom-tailored mathematical universes, which gives a high level overview of the topic while still giving enough details to let you externalize a few statements of your own!

But where were we? The usual proofs of the intermediate value theorem aren't constructive. See, for instance, Bauer's Five Stages of Accepting Constructive Mathematics or Section 1 of Taylor's A Lambda Calculus for Real Analysis for a discussion of how some common proofs fail (as well as great lists of constructively provable alternatives). Since the usual proofs seem to fail, we might guess that the IVT is not provable constructively… but how could we prove this?

Say, towards a contradiction, that there were a constructive proof of the intermediate value theorem. Then it would be true in every topos, and thus its various externalizations would all be true in the real world. So to show that there isn't a constructive proof, all we have to do is find a topos which doesn't think it's true! Following many who came before me7, we're going to use the topos of sheaves on $(-1,1)$. It's been a minute since we've externalized a statement together, so let's do it now!
In full symbolic glory, the IVT says

$$\forall f : \mathbb{R} \to \mathbb{R} . \ \forall a, b : \mathbb{R} . \ \big ( a \lt b \ \land \ f(a) \lt 0 \ \land \ f(b) \gt 0 \big ) \to \exists x : \mathbb{R} . \ \big ( a \leq x \leq b \ \land \ f(x) = 0 \big )$$

So now using the forcing language for $\text{Sh}((-1,1))$ (check out Theorem $1$ in Chapter VI.7 of Mac Lane and Moerdijk's Sheaves in Geometry and Logic if you're not sure what this means), we compute:

We cash out the universal quantifiers to get "for every open $U \subseteq (-1,1)$, and for every continuous $f : U \times \mathbb{R} \to \mathbb{R}$, $a : U \to \mathbb{R}$, $b : U \to \mathbb{R}$, we have…"

Next, cashing out the implication gives "for every open $U \subseteq (-1,1)$, and for every continuous $f : U \times \mathbb{R} \to \mathbb{R}$, $a : U \to \mathbb{R}$, $b : U \to \mathbb{R}$, so that for all $t \in U$ we know $a(t) \lt b(t)$, $f(t,a(t)) \lt 0$, and $f(t,b(t)) \gt 0$, we have…"

Finally, cashing out the existential quantifier and the stuff inside it we get the external statement:

The IVT is true inside $\text{Sh}((-1,1))$ if and only if the following is true externally: For every open $U \subseteq (-1,1)$, and for every continuous $f : U \times \mathbb{R} \to \mathbb{R}$, $a : U \to \mathbb{R}$, $b : U \to \mathbb{R}$, so that for all $t \in U$ we know $a(t) \lt b(t)$, $f(t,a(t)) \lt 0$, and $f(t,b(t)) \gt 0$, there is an open cover $\{U_i\}$ of $U$ and continuous functions $x_i : U_i \to \mathbb{R}$ so that for all $t \in U_i$ we have $a(t) \leq x_i(t) \leq b(t)$ and $f(t,x_i(t)) = 0$.

What a mouthful! Of course, we're trying to prove this fails, so all we have to do is find an open set $U$ and functions $f$, $a$, and $b$ satisfying the assumptions so that the conclusion fails.

We'll choose $U = (-1,1)$ to be the whole set, $a(t) = -2$ and $b(t) = 2$ to be constant functions, and $f(t,x) : (-1,1) \times \mathbb{R} \to \mathbb{R}$ to be the function pictured below. Then we see that, indeed, $f(t,a) \lt 0$ and $f(t,b) \gt 0$ for all $t \in (-1,1)$, so to prove the IVT fails in this topos we just need to show there's no open cover on which $x(t)$ with $f(t,x(t)) = 0$ can be chosen continuously.

The idea is that no matter how hard we try, $x(t)$ cannot be continuous in a neighborhood of $t=0$. Indeed, here's an animation showing how $x(t)$ changes as we change $t$:

You can see that when $t=0$ the root $x(t)$ jumps between $\pm 1$! Indeed, if we plot $x(t)$ we get:

and it's obvious that this is not the graph of a continuous function in any neighborhood of $t=0$.

As long as we're showing pretty graphics, you can also visualize this whole function $f$ as a surface over the strip $(-1,1) \times \mathbb{R}$. Then choosing a $t$ amounts to choosing a "slice" of the surface, and we can see that where that slice intersects the $(t,x)$-plane jumps suddenly as we cross $t=0$. In this example the axes are labeled $x$ and $y$ rather than $x$ and $t$:

So where did we start, and where did we end? If the IVT were constructively provable, it would be true inside $\mathsf{Sh}((-1,1))$ and thus for our $f(t,x)$ we could find an open cover on which the zero $x(t)$ varies continuously in $t$. But this can't possibly happen in a neighborhood of $t=0$, so we learn there is no constructive proof!

Buuuuut, all is not lost! Usually classical theorems do have constructive analogues, either by adding new assumptions, weakening the conclusion, or by finding a different statement of the theorem that's more positive. Andrej Bauer's paper Five Stages of Accepting Constructive Mathematics lists many possibilities.

For instance, one way to weaken the conclusion is to prove that for any $\epsilon$ you like, there's an $x$ with $|f(x)| \lt \epsilon$. In our example, if we plot those $x$ so that $|f(t,x)| \lt \epsilon$ we get

and it's easy to fit the graph of a continuous selection function $x(t)$ inside this thickened region.

Another approach is to recognize that the problem comes from $f$ "hovering" at $0$ when $t=0$. If we forbid this hovering, for instance by assuming $f$ is strictly monotone, then we can constructively prove the IVT (see Bauer's Five Stages paper again).
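To see the $\epsilon$-weakened conclusion in action, here's a hedged Python sketch (my own illustration, not code from the post). The only comparison it ever makes is the overlapping one, "$f(m) \lt \epsilon/2$ or $f(m) \gt -\epsilon/2$", which is the kind of comparison that's constructively available; the function `f` and the modulus `delta` below are made-up stand-ins.

```python
def approx_ivt(f, lo, hi, eps, delta):
    """Find x with |f(x)| < eps, given f(lo) < 0 < f(hi) and a modulus of
    uniform continuity delta for eps/2 (|x - y| < delta => |f(x) - f(y)| < eps/2)."""
    while hi - lo >= delta:
        m = (lo + hi) / 2
        # decide the *overlapping* disjunction f(m) < eps/2  or  f(m) > -eps/2;
        # no exact zero-test is ever needed
        if f(m) < eps / 2:
            lo = m          # invariant: f(lo) < eps/2
        else:
            hi = m          # invariant: f(hi) > -eps/2
    return lo               # here |f(lo)| < eps

# a made-up "hovering" function: it sits at 0 on the whole interval [-1, 1]
f = lambda x: min(x + 1, max(0.0, x - 1))
x = approx_ivt(f, -2.0, 2.0, eps=1e-3, delta=1e-4)
print(x, f(x))              # some x with |f(x)| < 1e-3
```

Even though this $f$ hovers at $0$, the loop happily returns an $x$ with $|f(x)| \lt \epsilon$; it just can't promise an exact root, which is exactly the point.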
There's yet another version, coming from Abstract Stone Duality, where we say that whenever $f(a) \lt 0 \lt f(b)$, the compact subspace $Z_f = \{ x \in [a,b] \mid f(x) = 0 \}$ is occupied (Cor 13.11 in A Lambda Calculus for Real Analysis). This is a condition that's weaker than inhabited but stronger than nonempty, which you can read about in Section 8 of the same paper. I don't understand this condition very well, because I haven't spent as much time thinking about ASD as I would like. Hopefully sometime soon I'll find some time to work through some examples!

Ok, thanks for reading, all! It's nice to get a few quick posts up while I'm working on some longer stuff. I'm still thinking a lot about a cool circle of ideas involving Fukaya Categories, Skein Theory and T(Q)FTs, and Hall Algebras, and I'm slowly making progress on writing posts about all these fun things. Now, though, I have to go run a review session for a calculus class, haha. I'll try to resist telling them about the fascinating subtleties that show up when you try to do everything constructively.

Stay safe, and we'll talk soon 💖

Usually on this blog when I talk about topoi I mean grothendieck topoi, but for this completeness result we really do need to allow more general elementary topoi with NNO. Indeed there are statements true in all grothendieck topoi that are not constructively provable (since they fail in some elementary topos). See here for a partial list. I would actually love to know if there's a reference for what one has to add to IHOL to get something sound and complete for grothendieck topoi… I spent some time looking, but the only thing I found was Topological Completeness for Higher-Order Logic by Awodey and Butz, but it seems like they use $1+1$ in place of $\Omega$, so this isn't the usual interpretation of logic in a grothendieck topos (which is also where the classical completeness comes from). ↩

Maybe "base topos" would be less philosophically charged ↩

I was really fumbling around with topos theory back then, haha. I'm much more confident now, and in the last few years I've just worked through a lot more examples and done more computations and read more papers and generally just learned a lot. Rereading that post was surprisingly… nostalgic isn't the right word… but it's fun to see how much I've grown! ↩

The usual proof with Bernstein polynomials works if we're careful to check some constructively relevant details. I'll copy it here in case the Wikipedia article changes someday:

Fix $\epsilon > 0$. We write $b_{k,n}(x) = \binom{n}{k} x^k (1-x)^{n-k}$, and note that:

(1) $\sum_{k=0}^n b_{k,n} = 1$

(2) $\sum_{k=0}^n \left ( x - \frac{k}{n} \right )^2 b_{k,n} = \frac{x(1-x)}{n}$

These are all provable by just expanding the left hand side, which is constructive.

We also fix a $\delta$ so that whenever $|x-y| \lt \delta$ we have $|f(x) - f(y)| \lt \epsilon$. This is possible because, constructively, every continuous function on a compact sublocale of $\mathbb{R}$ is uniformly continuous. Note that here we crucially need to be working with locales! (see, eg, Thm 10.7 in Taylor's A Lambda Calculus for Real Analysis)

Lastly, we fix $M$ an upper bound for $\lvert f \rvert$. This is possible since $[0,1]$ is compact, overt, and inhabited (see Rmk 10.4 in Bauer and Taylor's The Dedekind Reals in Abstract Stone Duality) thus the continuous $\lvert f \rvert$ admits a maximum (Thm 12.9 in A Lambda Calculus for Real Analysis).

Now let $B_n f = \sum_{k=0}^n f(k/n) b_{k,n}$.
We compute:

In step (a) we use (1), and in step (b) we use the fact that $\forall x \in \mathbb{R} . x \lt \delta \lor x \gt \frac{\delta}{2}$ is constructively true (since the intervals overlap we don't need excluded middle here!)

But we can bound the first sum by noticing in this region $|x-k/n| \lt \delta$, so that $|f(x) - f(k/n)| \lt \epsilon$ and (using (1) again) the first sum is at most $\epsilon$.

And since $\lvert f(x) - f(k/n)\rvert \leq \lvert f(x) \rvert + \lvert f(k/n) \rvert \leq 2M$, we compute:

In step (c) we use $|x - k/n| \gt \frac{\delta}{2}$ to say that $\left ( \frac{\delta}{2} \right )^{-2} \left ( x - \frac{k}{n} \right )^2 \geq 1$, so that we can multiply through by it and make our sum bigger. In step (d) we use (2), and at the end we use the fact that $x(1-x) \leq \frac{1}{4}$ on $[0,1]$.

Now since $\mathbb{R}$ is constructively archimedean (Defn. 1.1 in The Dedekind Reals in Abstract Stone Duality) we see $\exists n \in \mathbb{N} . \frac{2M}{\delta^2 n} \lt \epsilon$. Since these bounds were uniform in $x$, we learn that $\lVert f - B_n f \rVert_\infty \lt 2\epsilon$, as desired. ↩

A real number $x$ in this topos is a program that eats a natural number $n$ and outputs a rational number $x(n)$. We think about this as a sequence of rational approximations $x(n)$ converging to some real number $x$. So a function $[0,1] \to \mathbb{R}$ in this topos is a program that takes as input a program $x$ (outputting rational approximations between $0$ and $1$) and outputs a new program $f(x)$ which outputs rational approximations. ↩

Because our statement of Weierstrass $\forall \epsilon . \forall f . \exists p . \lVert f - p \rVert_\infty \lt \epsilon$ includes an existential quantifier there's no way to get around the fact that the $p$ we build is only defined locally on an open cover. If you were more careful and gave a type theoretic proof that $\prod_\epsilon \prod_f \sum_p \lVert f - p \rVert_\infty \lt \epsilon$ then you could take $p$ to be a single polynomial defined on the whole of $\Theta$… I haven't thought very hard about how possible this is (mainly because I haven't spent much time thinking about what theorems about locales are provable in type theory), but I'm sure a talented undergraduate could figure it out. ↩

The earliest version of this example which I've seen comes from Stout's Topological Properties of the Real Numbers Object in a Topos back in 1976. I think basically every other paper I've cited in this post gives some version of this example too, so it's very well trodden ground! ↩
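To make the effective-topos reading in footnotes 4 and 5 a little more concrete, here's a hedged Python sketch (my own illustration, not the post's code) of the kind of program the constructive proof externalizes to: sample $f$ at the rationals $k/n$ and return the coefficients of $B_n f$ in the Bernstein basis, all in exact rational arithmetic.

```python
from fractions import Fraction
from math import comb

def bernstein_coefficients(f, n):
    """Coefficients of B_n f in the Bernstein basis b_{k,n}(x) = C(n,k) x^k (1-x)^(n-k),
    namely the values f(k/n)."""
    return [f(Fraction(k, n)) for k in range(n + 1)]

def evaluate(coeffs, x):
    """Evaluate a polynomial given by its Bernstein coefficients at x."""
    n = len(coeffs) - 1
    return sum(c * comb(n, k) * x**k * (1 - x)**(n - k) for k, c in enumerate(coeffs))

# example: approximate f(x) = |x - 1/2| on [0,1]
f = lambda x: abs(x - Fraction(1, 2))
coeffs = bernstein_coefficients(f, 50)
print(float(evaluate(coeffs, Fraction(1, 4))))   # close to f(1/4) = 0.25
```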
A few days ago I saw a cute question on mse asking about a particularly non-intuitive failing of the axiom of choice. I remember when I was an undergrad talking to a friend of mine about various statements equivalent to choice, and being particularly hung up on the same statement that OP asks about – The product of nonempty sets is nonempty. I understood that there were models where the axiom of choice fails, and so in those models we must have some family of nonempty sets whose product is, somehow, empty! Now that I'm older and I've spent much more time thinking about these things, this is less surprising to me, but reading that question reminded me how badly I once wanted a concrete example, and so I'll share one here!

This should be a pretty quick post, since I'll basically just be fleshing out my answer to that mse question. But I think it'll also be nice to have here, since these things can be hard to find when you're first getting into logic and topos theory! Let's get to it!

First, let's remember that for any group $G$ the category $G\text{-}\mathsf{Set}$ of sets equipped with a $G$-action is a topos. Indeed you can see it as a presheaf topos, since $G\text{-}\mathsf{Set}$ is equivalent to the category of functors $G \to \mathsf{Set}$ (viewing $G$ as a one-object category). We've talked about this topos before, and it's wild to think how far I've come since writing that post!

The basic idea of $G\text{-}\mathsf{Set}$ as a topos is that any set theoretic construction we do to some $G$-sets again gives us $G$-sets! For instance, any (co)limits of $G$-sets will have a natural $G$-action. If $X$ is a $G$-set then its powerset $\mathcal{P}(X)$ has a $G$-action where if $A \in \mathcal{P}(X)$ we define \(g \cdot A = \{g \cdot a \mid a \in A \}\), which is another element of $\mathcal{P}(X)$. If $X$ and $Y$ are $G$-sets then the set of functions $X \to Y$ is again a $G$-set where we say $(g \cdot_{X \to Y} f)(x) = g \cdot_Y f(g^{-1} \cdot_X x)$.

In particular, we can recover the "$G$-equivariant" constructions as the global elements! So even though $\mathcal{P}(X)$ contains all subsets of $X$ (not just the $G$-invariant subsets), if we look at the global elements (that is the maps $1 \to \mathcal{P}(X)$) we do get exactly the $G$-invariant subsets. Similarly while the set of functions $X \to Y$ sees all functions, the global elements of this set will pick out exactly the $G$-equivariant functions.

But $G\text{-}\mathsf{Set}$ has a coreflective subcategory given by those $G$-sets all of whose orbits are finite. The coreflector takes a $G$-set and just deletes all the infinite orbits, so we have an adjunction $\iota \dashv R$ which gives us a comonad $\iota R$ on $G\text{-}\mathsf{Set}$. This comonad is idempotent, and its category of coalgebras is equivalent to \(G\text{-}\mathsf{Set}_\text{finite orbits}\). Then since $\iota$ is left exact (and so is $R$, since it's a right adjoint), we see that \(G\text{-}\mathsf{Set}_\text{finite orbits}\) is the category of coalgebras for a lex comonad on a topos, thus is itself a topos! As a cute exercise, check that $\iota$ really is left exact!

Now doing computations in this topos is pretty easy! Finite limits and arbitrary colimits are computed as in $G\text{-}\mathsf{Set}$ since $\iota$ preserves these.
Arbitrary limits and exponentials $Y^X$ come from coreflecting – that is, $Y^X$ as computed in \(G\text{-}\mathsf{Set}_\text{finite orbits}\) is just what we get by removing the infinite orbits from $Y^X$ as computed in $G\text{-}\mathsf{Set}$, and similarly for limits. The subobject classifier is just the usual set of truth values \(\{ \top, \bot \}\) with the trivial $G$-action1. In particular this topos is boolean, so set theory inside it is particularly close to the usual ZF set theory.

Now with this in mind, we can prove the main claim of this post:

In $\mathbb{Z}\text{-}\mathsf{Set}_\text{finite orbits}$, let $C_n$ be $\mathbb{Z}/n$ with its obvious $\mathbb{Z}$-action. Then each $C_n$ is inhabited2 in the sense that $\exists x . x \in C_n$, and yet \(\prod_n C_n = \emptyset\)!

So this topos shows explicitly how, in the absence of choice, you can have a family of nonempty sets3 whose product is somehow empty!

The computation is actually quite friendly! To compute $\prod_n C_n$ in this topos, we first compute the product in the category of all $\mathbb{Z}$-sets, then throw away any infinite orbits. But it's easy to see that every orbit is infinite! Any element of the product has an entry in every $C_n$, so acting by any $k \neq 0$ moves the entry in each $C_n$ with $n \nmid k$ – in particular, after any finite number of steps some entry with large $n$ won't be back where it started.

Ok, this one was actually quite quick, which I'm happy about! My parents are visiting soon, and I'm excited to take a few days to see them ^_^. I have another shorter post planned which another grad student asked me to write, and I've finally actually started the process of turning my posts on the topological topos into a paper. I'm starting to understand TQFTs better, and it's been exciting to learn a bit more physics. Hopefully I'll find time to talk about all that soon too, once I take some time to really organize my thoughts about it.

Thanks for hanging out, all! Stay safe, and we'll talk soon.

In the case $G = \mathbb{Z}$ it's a kind of cute fact that this topos is equivalent to the topos of continuous (discrete) $\widehat{\mathbb{Z}}$-sets, where $\widehat{\mathbb{Z}}$ is the profinite completion of $\mathbb{Z}$. See, for instance, Example A2.1.7 on page 72 of the elephant. This gives another computationally effective way to work with this topos! I'm pretty sure I convinced myself that more generally the category of $G$-sets all of whose orbits are finite should be equivalent to the category of continuous discrete $\widehat{G}$-sets, but I haven't thought hard enough about it to say for sure in a blog post. ↩

Of course, there's no global points for $n \neq 1$, since maps $1 \to C_n$ correspond to fixed points. But existential quantification is local, so that the topos models $\exists x \in C_n . \top$ if there's some surjection $V \twoheadrightarrow 1$ and a map $V \to C_n$. We can take $V = \mathbb{Z}$ with its left multiplication action on itself, and there is a map from $\mathbb{Z} \to C_n$. If you're more used to type theory, we don't have $\Sigma_{x : C_n} \top$, since that would imply a global element. But despite this, we do have the propositional truncation $\lVert \Sigma_{x : C_n} \top \rVert$, so that an element of $C_n$ merely exists. ↩

Since this topos is boolean, nonempty and inhabited are actually synonyms here. Moreover, this "nonempty" is closer to how a lot of working mathematicians speak, so it felt right to use this wording here. ↩
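As a quick sanity check on that orbit claim, here's a hedged Python sketch (my own, not from the post): truncate the product to its first $N$ factors and measure the orbit of a single point under the diagonal $\mathbb{Z}$-action.

```python
def orbit_size(x, mods):
    """Size of the orbit of x in prod_n Z/n under the diagonal Z-action,
    where the generator acts by adding 1 in every coordinate."""
    seen = set()
    while x not in seen:
        seen.add(x)
        x = tuple((xi + 1) % n for xi, n in zip(x, mods))
    return len(seen)

mods = range(1, 9)                  # truncate the product at Z/1 x Z/2 x ... x Z/8
print(orbit_size((0,) * 8, mods))   # 840 = lcm(1, ..., 8)
```

The truncated orbit already has size $\operatorname{lcm}(1,\ldots,N)$, which grows without bound as we add factors, so in the full product every orbit really is infinite and the coreflection throws all of them away.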
Earlier today I gave a talk in the graduate student seminar titled "Counting is Hard. Complex Analysis is Easy." based in part on my recent blog post about analytic combinatorics and based in part on Varilly's notes on Dirichlet's Theorem, showing how to count the number of trees of a certain shape and the number of "primes of a certain shape" by doing complex analysis. While prepping for this talk, I realized there's a very pretty geometric way to see what's going on when counting rooted ordered ternary trees! I don't usually write blog posts about my local seminar talks anymore, but I think this is more than worthy of an exception! For instance, I think this could serve as a great visual example of branches and mild singularities in complex analysis.

So, let's remember the problem! We want to count the number of rooted ordered ternary trees with $n$ nodes. Call this number $t_n$. We note that every such tree is either

a root node

a root with one child

a root with two children

a root with three children

where each child is recursively a rooted ordered ternary tree. If we define $T(z) = \sum_n t_n z^n$ to be the (ordinary) generating function for the $t_n$ we see from this recurrence relation that it must satisfy the functional equation

$$T = z + zT + zT^2 + zT^3$$

That is, if $P(z,w) = -w + z + zw + zw^2 + zw^3$ then $P(z,T(z)) = 0$. If we draw (the real locus of) $P$, the implicit function theorem says that we can find functions $f(z)$ so that $P(z,f(z)) = 0$ through any starting point $P(z_0,f_0)$ on the curve for as long as $\frac{\partial P}{\partial w} \neq 0$. In particular, we know that our $T(0) = t_0 = 0$ since there are no trees with zero vertices, so that our $T(z)$ takes $z$ to the unique part of this curve passing through the origin, for as long as this is defined.

Here we'll plot our curve $P$ in blue, and the implicit function $T(z)$ in orange. Note we have to stop at $z \approx 0.27695$ since here $\frac{\partial P}{\partial w} = 0$ so that the slope is infinite and we can't continue. Said another way, the curve bends backwards and if we tried to continue we would fail the vertical line test.

But since we have to stop here, we see that $0.27695$ is the radius of convergence for $T$, and is the dominant singularity!

Next, can we find a good approximation for $T$ near this singularity? In the main post we used puiseux series for this, but it's instructive to do it by hand! Note that $\frac{\partial P}{\partial z} \neq 0$ at $(0.27695, 0.65730)$ so there's nothing stopping us from approximating $z$ as a function of $w$ at this point (then inverting it). Indeed, we can solve for $z = \frac{w}{1+w+w^2+w^3}$ and then just taylor expand this at our point of interest:

$$z \approx 0.27695 + (2.97 \times 10^{-8})(w - 0.65730) - 0.34680 (w - 0.65730)^2 + \ldots$$

Of course, we chose this point because $\frac{\mathrm{d}z}{\mathrm{d}w} = \frac{- \frac{\partial P}{\partial w}}{\frac{\partial P}{\partial z}}$ is $0$ here. So we know this $2.97 \times 10^{-8}$ is a rounding error and our leading order expansion is

$$z - 0.27695 \approx -0.34680 (w - 0.6573)^2$$

Here's a graph zoomed in near the singularity. The actual graph of $P$ is shown in blue, and our approximating parabola is shown in orange:

But of course if $z - 0.27695 \approx -0.34680 (w - 0.6573)^2$ then we can solve for

$$w \approx 0.6573 \pm 0.8936 \sqrt{1 - \frac{z}{0.27695}}$$

and if we want this to agree with our $w = T(z)$ branch through the origin we obviously have to choose the negative square root (since we want the lower half of this sideways parabola1). Here we draw our $T(z)$ in blue, and we draw the approximation $0.6573 - 0.8936 \sqrt{1 - \frac{z}{0.27695}}$ in orange so you can see how well they line up!
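If you want to double check these numbers yourself, here's a hedged Python sketch using sympy (my choice of tool, not the post's): it numerically solves $P = \frac{\partial P}{\partial w} = 0$ and recomputes the coefficient of the square root from the same second order expansion we just did by hand.

```python
import sympy as sp

z, w = sp.symbols('z w')
P = -w + z + z*w + z*w**2 + z*w**3

# solve P = 0 and dP/dw = 0 simultaneously, starting near the values read off the plot
r, s = sp.nsolve([P, sp.diff(P, w)], [z, w], [0.25, 0.6])
print(r, s)                         # roughly 0.27695 and 0.65730

# from 0 = P_z(r,s)(z - r) + (1/2) P_ww(r,s)(w - s)^2 we get
# w - s = -sqrt(2 r P_z / P_ww) * sqrt(1 - z/r) on our branch
Pz  = sp.diff(P, z).subs({z: r, w: s})
Pww = sp.diff(P, w, 2).subs({z: r, w: s})
print(sp.sqrt(2*r*Pz/Pww))          # roughly 0.8936, the coefficient of the square root
```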
But now from here we can run exactly the same argument as in the previous post! We compute $t_n = \frac{1}{2\pi i} \oint \frac{T}{z^{n+1}} \ \mathrm{d}z$, along a keyhole contour around the singular point $z=0.27695$ with a branch cut along the real axis. The brunt of this integral comes from the cutout near the singular point, where $T$ is well approximated by $0.6573 - 0.8936 \sqrt{1 - \frac{z}{0.27695}}$, so that

$$t_n \approx \frac{0.8936}{2 \sqrt{\pi}} \cdot \frac{(3.6107\ldots)^n}{n^{3/2}}$$

Again, for more details about exactly what this keyhole contour is and how you estimate this integral along it, see the main post.

Let's go ahead and call it here! My actual talk was slightly longer than this, since I also sketched a proof of Dirichlet's Theorem following Varilly's notes on the subject, but there's no sense in me writing that up since those notes are already so good! Plus it's getting late and I want to go to bed, haha.

As always, here's a copy of my title and abstract. Unfortunately I don't have any slides or recordings today, but I'm giving another talk at WiSCons in Madison next week and I should have slides for that. I'm also hoping to finish a sister blog post for that talk where I do more example computations in more detail than I could possibly do in a 20 minute talk. Lots of writing to do this week!

Take care all, stay safe, and I can't wait to talk soon ^_^

Counting is Hard. Complex Analysis is Easy.

Don't you miss being a kid, when your mom would ask you to count how many of your alphabet fridge magnets were both red and vowels? When you saw the answer was "2", you felt such accomplishment…. But counting has only gotten harder since then, and now if you want to count how many objects satisfy some properties (say, being both red and vowels) it can be borderline impossible! In this talk we'll show how you can solve counting problems by doing complex analysis instead. First we'll count the number of trees of a certain shape, then (given time) we'll count how many prime numbers are of a (different) certain shape.

In the main post we used puiseux series for this, and we had to essentially guess which branch was correct. Now we see how the geometry of the situation tells us that we have to choose the negative branch of the square root! After all, this is the one that locally agrees with the rest of the graph of $T$! ↩
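As a quick numerical sanity check on that estimate, here's a hedged Python sketch of my own (separate from the sage code in the next post): compute $t_n$ exactly from $T = z(1 + T + T^2 + T^3)$ by extracting coefficients, then compare with the keyhole estimate.

```python
from math import pi, sqrt

def ordered_ternary_counts(N):
    """t[n] = number of rooted ordered ternary trees with n nodes,
    read off from T = z*(1 + T + T^2 + T^3)."""
    t = [0] * (N + 1)
    t[1] = 1
    for n in range(2, N + 1):
        # coefficient of z^(n-1) in T, T^2, and T^3 respectively
        single = t[n - 1]
        double = sum(t[i] * t[n - 1 - i] for i in range(1, n - 1))
        triple = sum(t[i] * t[j] * t[n - 1 - i - j]
                     for i in range(1, n - 1) for j in range(1, n - 1 - i))
        t[n] = single + double + triple
    return t

t = ordered_ternary_counts(30)
print(t[1:8])                                     # [1, 1, 2, 5, 13, 36, 104]
for n in (10, 20, 30):
    estimate = 0.8936 / (2 * sqrt(pi)) * 3.6107**n / n**1.5
    print(n, t[n], round(t[n] / estimate, 4))     # ratios drift toward 1, like 1 - O(1/n)
```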
Another day, another blog post that starts with "I was on mse the other day…". This time, someone asked an interesting question amounting to "how many unordered rooted ternary trees with $n$ nodes are there, up to isomorphism?". I'm a sucker for these kinds of combinatorial problems, and after finding a generating function solution I wanted to push myself to get an asymptotic approximation using Flajolet–Sedgewick style analytic combinatorics! I've never actually done this before, so I learned a lot, and I want to share some of the things I learned – especially how to do this stuff in sage!

Now, you might be thinking – didn't you write a very similar blog post three years ago? Yes. Yes I did. Did I also completely forget what was in that post? Yes. Yes I did, haha. For some reason I was getting it mixed up with this other post from even more years ago, which isn't nearly as relevant. Thankfully it didn't matter much, since I'm fairly sure what I wanted to do wouldn't work in the framework of that post anyways, and now I have a better idea of what this machinery is actually doing which is really exciting (since I've been wanting to understand this stuff for years).

Let's start with a warmup problem! How might we count the number of rooted ordered ternary trees with $n$ nodes?

These are the kinds of "ternary trees" that you might remember from a first class on algorithms and data structures. Such a tree is either a leaf or an internal node with 1, 2, or 3 children. Crucially these children come with an order, because the datatype keeps track of which child is on the left/right (in the case of 2 children) or on the left/middle/right (in the case of 3 children). After all, from a CS perspective, we need to remember this in order to traverse the tree! This means that the following two trees are distinct for our purposes, even though they're isomorphic as graphs:

In a functional programming language, you might describe this datatype by

T z = Leaf z | Unary z (T z) | Binary z (T z) (T z) | Ternary z (T z) (T z) (T z)

and this translates immediately to give a functional equation for the generating function of $t_n$, which counts the number of rooted ordered ternary trees with $n$ nodes:

$$T(z) = z + zT(z) + zT(z)^2 + zT(z)^3$$

We can rearrange this as

$$z = \frac{T}{1 + T + T^2 + T^3}$$

so that we can compute the power series expansion of $T$ by lagrange inversion:

```
R.<z> = QQbar[[]] # power series ring
T = (z/(1 + z + z^2 + z^3)).reverse()
show(T.O(10))
```

Incredibly, this agrees with A036765, which is the "number of ordered rooted trees with n non-root nodes and all outdegrees <= three", as we hoped!

You might reasonably ask if there's a closed form for these numbers, and this is too much to ask for (indeed, it's already too much to ask for a closed form for fibonacci numbers, and this is more complicated). But like the fibonacci numbers, we can come up with an excellent approximation:

$$t_n \approx C \cdot \frac{(3.6107\ldots)^n}{n^{3/2}}$$

where $C = \frac{(0.8936\ldots)}{2 \sqrt{\pi}}$ is a constant.

Indeed, this gives fantastic results! Let's plot the ratio of this approximation to the true value so we can see just how good this approximation gets as $n$ gets large. Note that it respects the promised big-oh error bound too!
```
def actual(n, N):
    R.<z> = PowerSeriesRing(QQbar, default_prec=N)
    T = (z/(1 + z + z^2 + z^3)).reverse()
    return T.coefficients()[n-1:]

def approx(n, N):
    # very very magic
    C = 0.8936373911078061 / (2 * sqrt(pi))
    w = 3.610718613276040
    return [(C * w^k * (1/k^1.5)).n() for k in range(n,N+1)]

show(html("Plot the error ratios in the range [n,N]"))

@interact
def _(n=input_box(100, width=20, label="$n$"),
      N=input_box(120, width=20, label="$N$"),
      auto_update=False):
    actuals = actual(n,N)
    approxs = approx(n,N)
    ratios = [(actuals[i]/approxs[i]) for i in range(N+1-n)]

    show(html("ratio at n={}: {}".format(n, ratios[0])))
    show(html("ratio at N={}: {}".format(N, ratios[-1])))

    text_options = {'horizontal_alignment': 'right',
                    'vertical_alignment': 'bottom',
                    'fontsize': 'large'}

    G = Graphics()
    G += scatter_plot([(n+i,r) for (i,r) in enumerate(ratios)])
    G += plot(1, (x,n,N))
    G += plot(1-1/x, (x,n,N))
    G += text('$y = 1$', (N, 1), **text_options)
    G += text('$y = 1-1/x$', (N, 1-1/N), **text_options)
    G.show()
```

This is extremely cool, but where the hell did this approximation come from?

The answer is called Singularity Analysis, and can be found in Chapter 2 Section 3 of Melczer's excellent An Invitation to Analytic Combinatorics, or Chapters VI and VII of Flajolet and Sedgewick's tome. See especially Theorem VII.3 on pg 468.

Like seemingly every theorem in complex analysis, this is basically an application of the Residue Theorem. I won't say too much about why it works, but I'll at least gesture at a proof. You can read the above references if you want something more precise.

First, we recall

$$t_n = \frac{1}{2 \pi i} \oint_C \frac{T(z)}{z^{n+1}} \ \mathrm{d}z$$

where $C$ is any contour containing the origin inside a region where $T$ is holomorphic. Then we draw the most important picture in complex analysis:

Here the obvious marked point is our singularity $\omega$, and we've chosen a branch cut for $T$ (shown in blue1) so that $T$ is holomorphic in the region where the pink curve lives. We'll estimate the value of this integral by estimating the contribution from the big circle, the little circle, and the two horizontal pieces2.

It turns out that the two horizontal pieces basically contribute $O(1)$ amount to our integral, so we ignore them. Since the big circle is compact, $T$ will attain a maximal value on it, say $M$. Then the integral along the big circle (of radius $\omega + \epsilon$, say) is bounded by $2 \pi (\omega + \epsilon) \frac{M}{(\omega + \epsilon)^{n+1}} = O((\omega+\epsilon)^{-n})$.

To estimate the integral around the little circle, it would be really helpful to have a series expansion at $\omega$ since we're staying really close to it… Unfortunately, $\omega$ is a singular point so we don't have a taylor series, but fortunately there's another tool for exactly this job: a Puiseux Series! I won't say much about what these are, especially since Richard Borcherds already put out such a great video on the topic. What matters is that Sage can compute them for us3, so we can actually get our hands on the approximation!

We compute the integral around the little circle to be roughly:

In step $(1)$ we approximate $T$ by the first term of its puiseux series, in step $(2)$ we apply the generalized binomial theorem, so that in step $(3)$ we can apply the residue theorem to realize this integral as the coefficient of $z^{-1}$ in our laurent expansion. This gives us something which grows like $\widetilde{O}(\omega^{-n})$, dominating the contribution $O((\omega + \epsilon)^{-n})$ from the big circle, so that asymptotically this is the only term that matters.
If we were more careful to keep track of the big-oh error for the puiseux series for $T$ we could easily sharpen this bound4 to see

$$t_n = C_\alpha (-1)^n \binom{\alpha}{n} \omega^{-n} \left ( 1 + O \left ( \frac{1}{n} \right ) \right )$$

This looks a bit weird with the $(-1)^n$, but remember that $\binom{\alpha}{n}$ also alternates sign. Indeed, asymptotics for $\binom{\alpha}{n}$ are well known so that we can rewrite this as

$$t_n = \frac{C_\alpha}{\Gamma(-\alpha)} \cdot \frac{\omega^{-n}}{n^{\alpha+1}} \left ( 1 + O \left ( \frac{1}{n} \right ) \right )$$

Which finally shows us where our magic numbers came from! Sage happily tells us that the dominant singularity for $T$ is at $\omega = (0.2769\ldots)$ and that a puiseux expansion for $T$ at $\omega$ is

$$T(z) = (0.6573\ldots) - (0.8936\ldots) \sqrt{1 - \frac{z}{\omega}} + \ldots$$

so that for us $\alpha = 1/2$, and $C_{1/2} = -(0.8936\ldots)$. Finally, since $\Gamma(-1/2) = -2 \sqrt{\pi}$ and $(0.2769\ldots)^{-1} = (3.6107\ldots)$ we see

$$t_n \approx \frac{(0.8936\ldots)}{2 \sqrt{\pi}} \cdot \frac{(3.6107\ldots)^n}{n^{3/2}}$$

as promised!

Indeed, here's how you could actually get sage to do all this for you!

```
import abelfunctions

R.<z> = QQbar[]
S.<T> = R[]

# our defining polynomial.
# we want to solve for T as a function of z
P = z*T^3 + z*T^2 + (z-1)*T + z

# following Example 2.12 in Melczer,
# we compute the dominant singularity w.
w = min([r for (r,_) in P.discriminant().roots() if r != 0],
        key=lambda r: r.abs())

# to work with abelfunctions we need P
# to be a polynomial in two variables
R.<z,T> = QQbar[]
P = R(P)

# compute the puiseux expansion for T at w.
# I know from trial and error that the correct
# branch to pick is entry [1] in the list. You
# just need to check which one gives you
# positive real coefficients for T at 0.
s = abelfunctions.puiseux(P,w)[1]
c = -sqrt(-s.x0/s.xcoefficient)
```

As a fun exercise, you might modify this code (using s.add_term()) to compute a longer puiseux series and get asymptotics valid up to a multiplicative error of $O \left ( \frac{1}{n^2} \right )$ for the number of rooted ternary trees. Try modifying the previous code block to see that our current approximation is not accurate to $O \left ( \frac{1}{n^2} \right )$, and then check that your approximation is!

Ok, now that we understand the warmup, let's get to the actual problem!

How many unordered rooted ternary trees are there, up to isomorphism?

Now we're counting up to graph isomorphism, so that our two trees from before are now considered isomorphic. It's actually much less obvious how one might pin down a generating function for something like this, but the answer, serendipitously, comes from Pólya-Redfield counting! If this is new to you, you might be interested in my recent blog post where I talk a bit about one of its corollaries. Today, though, we'll be using it in a much more sophisticated way.

Say you have a structure $X$ acted on by a group $G$ and a collection $C$ of "colors". How can we count the number of ways to give each $x \in X$ a color from $C$, up to the action of $G$? We start by building the cycle index polynomial of the action $G \curvearrowright X$, which we call $P_G(x_1, \ldots, x_n)$. Then we can plug all sorts of things into the variables $x_i$ in order to solve various counting problems.

For example, if $C$ is literally just a finite set of colors, we can plug in $x_1 = x_2 = \ldots = x_n = |C|$ to recover the expression from my recent blog post. But we can also do much more!

Say that $C$ is a possibly infinite family of allowable "colors", each of a fixed "weight". (For us, our "colors" will be trees, and the "weight" will be the number of nodes). Then we can arrange them into a generating function $F(t) = \sum c_i t^i$, where $c_i$ counts the number of colors of weight $i$5.
Then an easy argument (given in full on the wikipedia page) shows that $P_G(F(t), F(t^2), \ldots, F(t^n))$ is a generating function counting the number of ways to color $X$ by colors from $C$, counted up to the $G$ action, and sorted by their total weight6. Check out the wikipedia page for a bunch of great examples!

For us, we realize an unordered rooted ternary tree is either empty, or a root with 3 recursive children7. In the recursive case we also want to count up to the obvious $\mathfrak{S}_3$ action permuting the children, so by the previous discussion we learn that

$$T(z) = 1 + z \cdot P_{\mathfrak{S}_3}\left(T(z), T(z^2), T(z^3)\right)$$

where \(P_{\mathfrak{S}_3}(x_1, x_2, x_3) = \frac{1}{6}(x_1^3 + 3x_1 x_2 + 2 x_3)\) is the cycle index polynomial for the symmetric group on three letters. Plugging this in we see

$$T(z) = 1 + \frac{z}{6}\left(T(z)^3 + 3 T(z) T(z^2) + 2 T(z^3)\right)$$

Now, how might we get asymptotics for $t_n$ using this functional equation?

First let's think about how our solution to the warmup worked. We wrote $F(z,T)=0$ for a polynomial $F$, used the implicit function theorem to get a taylor series for $T$ at the origin, then got a puiseux series near the dominant singularity $\omega$ which let us accurately estimate the taylor coefficients $t_n$8.

We're going to play the same game here, except we'll assume that $F$ is merely holomorphic rather than a polynomial. Because we're no longer working with a polynomial, this really seems to require an infinite amount of data, so I'm not sure how one might get an exact solution for the relevant constants… But following Section VII.5 in Flajolet and Sedgewick we can get as precise a numerical solution as we like!

We'll assume that the functions $T(z^2)$ and $T(z^3)$ are already known analytic functions, so that we can write $F(z,w) = -w + 1 + \frac{z}{6} \left ( w^3 + 3 T(z^2) w + 2 T(z^3) \right )$. This is an analytic function satisfying $F(z,T) = 0$.

Now for a touch of magic. Say we can find a pair $(r,s)$ with both $F(r,s)=0$ and $\left . \frac{\partial F}{\partial w} \right |_{(r,s)} = 0$… Then $F$ is singular in the $w$ direction at $(r,s)$ so this is a branch point for $T$. Since both $F$ and $F_w = \frac{\partial F}{\partial w}$ vanish at $(r,s)$ the taylor series for $F$ at $(r,s)$ starts

$$F(z,w) = F_z(r,s)(z-r) + \frac{1}{2} F_{ww}(r,s)(w-s)^2 + \ldots$$

Since we know $F(z,T)$ vanishes, we estimate up to smaller order terms

$$0 \approx F_z(r,s)(z-r) + \frac{1}{2} F_{ww}(r,s)(T-s)^2$$

so that a puiseux series for $T$ at $(r,s)$ begins

$$T \approx s - \sqrt{\frac{2 r F_z(r,s)}{F_{ww}(r,s)}} \sqrt{1 - \frac{z}{r}}$$

If we write $\gamma = \sqrt{\frac{2r F_z(r,s)}{F_{ww}(r,s)}}$ and are slightly more careful with our error terms, the same technique from the warmup shows

$$t_n = \frac{\gamma}{2 \sqrt{\pi}} \cdot \frac{r^{-n}}{n^{3/2}} \left ( 1 + O \left ( \frac{1}{n} \right ) \right )$$

See Flajolet and Sedgewick Theorem VII.3 on page 468 for a more careful proof of this theorem.

Now how do we use this? We can approximate $T$ by its taylor series at the origin, then numerically solve for the unique9 positive real solution to the system $F(r,s) = F_w(r,s) = 0$ using this approximation:

```
n = 20                            # number of taylor coefficients to keep
B = PolynomialRing(QQ, 't', n)    # unknowns t[0], ..., t[n-1]
t = B.gens()
R.<z> = B[[]]

# cycle index polynomials
S.<x1,x2,x3> = QQ[]
P3 = (x1^3 + 3*x1*x2 + 2*x3)/6

# to start, the taylor coefficients of T are variables
T = sum([t[i] * z^i for i in range(n)]) + O(z^n)

lhs = T
rhs = 1 + z*P3.subs(x1=T(z), x2=T(z^2), x3=T(z^3))

# formally expand out the rhs and set the coefficients
# equal to each other. This gives a recurrence relation
# for t[n] in terms of the t[i] for i < n
```

If we run this code locally with $N=20$, we get a numerical approximation for $(r,s)$, hence for the constants in the asymptotic formula, and indeed this seems to work really well! Our taylor expansion for $T$ agrees with A000598, as we expected, and comparing our approximation to our taylor expansion gives:
```
def actualList(n, N):
    # the exact coefficients t[i], computed from the functional equation as above
    B = PolynomialRing(QQ, 't', N)
    t = B.gens()
    R.<z> = B[[]]

    S.<x1,x2,x3> = QQ[]
    P3 = (x1^3 + 3*x1*x2 + 2*x3)/6

    T = sum([t[i] * z^i for i in range(N)]) + O(z^N)

    lhs = T
    rhs = 1 + z*P3.subs(x1=T(z), x2=T(z^2), x3=T(z^3))

    I = B.ideal([lhs.coefficients()[i] - rhs.coefficients()[i] for i in range(N)])
    return [I.reduce(t[i]) for i in range(n,N)]

# approxList(n,N) plugs the numerically computed constants into the
# asymptotic formula, just like approx(n,N) did in the warmup

show(html("Plot the error ratios in the range [n,N]"))
show(html("This is likely to time out if you make N too large online,"))
show(html("so you might want to play around with it locally instead!"))

@interact
def _(n=input_box(10, width=20, label="$n$"),
      N=input_box(30, width=20, label="$N$"),
      auto_update=False):
    actuals = actualList(n,N)
    approxs = approxList(n,N)
    ratios = [actuals[i]/approxs[i] for i in range(N-n)]

    show(html("ratio at n={}: {}".format(n, ratios[0].n())))
    show(html("ratio at N={}: {}".format(N, ratios[-1].n())))

    text_options = {'horizontal_alignment': 'right',
                    'vertical_alignment': 'bottom',
                    'fontsize': 'large'}

    G = Graphics()
    G += scatter_plot([(n+i,r) for (i,r) in enumerate(ratios)])
    G += plot(1, (x,n,N))
    G += plot(1-1/x, (x,n,N))
    G += text('$y = 1$', (N, 1), **text_options)
    G += text('$y = 1-1/x$', (N, 1-1/N), **text_options)
    G.show()
```

I'm pretty proud of this approximation, so I think this is a good place to stop ^_^.

As a fun exercise, can you write a program that outputs the number of cyclic rooted ternary trees on $n$ vertices? For these we consider two trees the same if they're related by cyclicly permuting their children. Compare your solution to A000625. For bonus points, can you work out the asymptotics for the number of such trees?

Wow! It's been super nice to be writing up so many posts lately! Like I said in one of my other recent posts, I've had a lot of time to think about more bite sized problems and mse stuff while waiting for my DOI generating code to run, so I've had more things that felt quick to write up on my mind. My research is actually going quite well too! I have a few interesting directions to explore, and at least one project that might be wrapping up soon. Of course when that happens I'll be sure to talk about it here, and I'm still planning out a series on fukaya categories, hall algebras, skein algebras, and more! That's a pretty long one, though, so it's easy for me to deprioritize, haha.

It's already heating up in Riverside, consistently in the 80s (Fahrenheit), so I can really feel the summer coming. I'm enjoying the sunny days, though, and it's been nice to spend time working outside and under trees.

Thanks for hanging out, all! Take care of each other, and I can't wait to chat soon ^_^.

Just like $w = \sqrt{z}$, a solution to $w^2 = z$, has branches, so too does $T$, a solution to $T = z + zT + zT^2 + zT^3$. ↩

Really these are circles with a little arc cut out, say by integrating from $\epsilon$ to $2 \pi - \epsilon$… But we'll end up taking $\epsilon \to 0$ and we're all friends here, so let's not worry about it. ↩

Though at time of writing you need the abelfunctions package to do it. There is a builtin implementation of puiseux series, but it won't actually compute a series expansion for you. ↩

Of course, it's easy to see how to extend this technique to get better asymptotics. The first approach is to just keep more terms of the puiseux series of $T$ at $\omega$. Then apply the generalized binomial formula multiple times for each term you kept. You can also do better by keeping track of more of the singularities.
Build a contour with multiple keyholes in order to get sharper lower order asymptotics:

Of course, you can also combine these approaches to keep track of both more singularities and more puiseux coefficients at each singularity. ↩

In fact we can push things even farther and work with multivariate generating functions, but we won't need that here. ↩

This is why we plug $F(t^k)$ into $x_k$. Because $x_k$ is responsible for $k$-cycles, we choose a single color for each $k$-cycle, but we have to count it $k$-many times towards our total weight. See the proof on the wikipedia page for more information! ↩

This actually isn't the version of the recurrence I used in my mse answer. There I used the convention that a rooted tree had to be nonempty, since… you know… it has a root. But allowing possibly empty trees makes the recurrence much simpler, which in turn allows for much easier to analyze asymptotics. Hilariously this exact example is on the wikipedia page for the Pólya-Redfield theorem, which could have saved me a lot of time writing up that answer. I was a bit worried at first about doing these asymptotics by myself, since this was my first serious attempt at using analytic combinatorics, but serendipitously this exact example was also worked out in Flajolet and Sedgewick VII.5, though slightly more tersely than I would have liked, haha. ↩

Officially we have to check that the choice of puiseux series matches up with our choice of taylor series (since there's multiple branches to our function). But this is easy to arrange for us by choosing the branch of the puiseux series that leads to all our coefficients being positive reals. If you want to do this purely analytically you need to solve a "connection problem". See figure VII.9 in Flajolet and Sedgewick, as well as the surrounding text. ↩

Under mild technical conditions this pair $(r,s)$ is unique. See Flajolet and Sedgewick Theorem VII.3. ↩
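As one last sanity check on the unordered count, here's a hedged, pure-Python sketch of my own (independent of the sage code above): read the coefficients of $T(z) = 1 + \frac{z}{6}\left(T(z)^3 + 3T(z)T(z^2) + 2T(z^3)\right)$ off directly and compare with A000598.

```python
def unordered_ternary_counts(N):
    """a[n] = number of unordered rooted trees with n nodes and outdegree <= 3,
    from T(z) = 1 + z/6 * (T(z)^3 + 3 T(z) T(z^2) + 2 T(z^3))."""
    def mul(p, q):                       # truncated product of coefficient lists
        r = [0] * N
        for i, pi in enumerate(p):
            for j, qj in enumerate(q):
                if pi and i + j < N:
                    r[i + j] += pi * qj
        return r

    def stretch(p, k):                   # p(z) |-> p(z^k), truncated
        r = [0] * N
        for i, pi in enumerate(p):
            if i * k < N:
                r[i * k] = pi
        return r

    a = [1] + [0] * (N - 1)              # a[0] = 1 for the empty tree
    for n in range(1, N):
        cube = mul(mul(a, a), a)         # only coefficients below n matter here,
        mix  = mul(a, stretch(a, 2))     # and those are already correct
        a[n] = (cube[n - 1] + 3 * mix[n - 1] + 2 * stretch(a, 3)[n - 1]) // 6
    return a

print(unordered_ternary_counts(10))      # [1, 1, 1, 2, 4, 8, 17, 39, 89, 211], matching A000598
```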
This will be a really quick one! Over the last two weeks I've been finishing up a big project to make DOIs for all the papers published in TAC, and my code takes a while to run. So while testing I would hit "go" and have like 10 minutes to kill… which means it's time to start answering questions on mse again! I haven't been very active recently because I've been spending a lot of time on research and music, but it's been nice to get back into it. I'm especially proud of a few recent answers, so I think I might quickly turn them into blog posts like I did in the old days!

In this post, we'll try to understand the Capping Algorithm which turns a graph embedded in a surface into a particularly nice embedding where the graph cuts the surface into disks. I drew some pretty pictures to explain what's going on, and I'm really pleased with how they turned out!

So, to start, what's this "capping algorithm" all about? Say you have a (finite) graph $G$ and you want to know what surfaces it embeds into. For instance planar graphs are those which embed in $\mathbb{R}^2$ (equivalently $S^2$), and owners of this novelty mug know that even the famously nonplanar $K_{3,3}$ embeds in a torus1:

Obviously every graph embeds into some high genus surface – just add an extra handle for every edge of the graph, and the edges can't possibly cross each other! Also once you can embed in some surface you can obviously embed in higher genus surfaces by just adding handles you don't use. This leads to two obvious "extremal" questions:

What is the smallest genus surface which $G$ embeds into?

What is the largest genus surface which $G$ embeds into where all the handles are necessary?

Note we can check if a handle is "necessary" or not by cutting our surface along the edges of our graph. If the handle doesn't get cut apart then our graph $G$ must not have used it! This leads to the precise definition:

Defn: A $2$-Cell Embedding of $G$ in a surface $S$ is an embedding so that all the connected components of $S \setminus G$ are 2-cells (read: homeomorphic to disks).

Then the "largest genus surface where all the handles are necessary" amounts to looking for the largest genus surface where $G$ admits a 2-cell embedding! But in fact, we can restrict attention to 2-cell embeddings in the smallest genus case too, since if we randomly embed $G$ into a surface, there's an algorithm which only ever decreases the genus and outputs a 2-cell embedding! So if $S$ is the minimal genus surface that $G$ embeds in, we can run this algorithm to get a 2-cell embedding of $G$ in $S$.

And what is that algorithm? It's called Capping, see for instance Minimal Embeddings and the Genus of a Graph by J.W.T. Youngs. The idea is to cut your surface along $G$, look for anything that isn't a disk, and "cap it off" to make it a disk. Then you repeat until everything is a disk, and you stop.

The other day somebody on mse asked about this algorithm, and I had a lot of fun drawing some pictures to show what's going on2! This post basically exists because I was really proud of how these drawings turned out, and wanted to share them somewhere more permanent, haha. Anyways, on with the show!

We'll start with an embedding of a graph $G$ (shown in purple) in a genus 2 surface:

we'll cut it into pieces along $G$, and choose one of our non-disk pieces (call it $S$) to futz with:

Now we choose3 a big submanifold $T \subseteq S$ which leaves behind cylinders when we remove it.
Pay attention to the boundary components of $T$, called $J_1$ and $J_2$ below, since that's where we'll attach a disk to "cap off" where $T$ used to be

We glue all our pieces back together, but remove the interior of $T$ and then, as promised, "cap off" the boundary components $J_1$ and $J_2$ with disks. Note that the genus decreased when we did this! It used to be genus 2, and now we're genus 1! Note also that $G$ still embeds into our new surface:

Let's squish it around to a homeomorphic picture, then do the same process a second time! But faster now that we know what's going on:

At this point, we can try to do it again, but we'll find that removing $G$ cuts our surface into disks:

This tells us the algorithm is done, since we've successfully produced a 2-cell embedding of $G$ ^_^.

Wow that was a really quick one today! Start to finish in under an hour, but it makes sense since I'd already drawn the pictures and spent some time doing research for my answer the other day. Maybe I'll go play flute for a bit.

Thanks for hanging out, all! Stay safe, and see you soon ^_^

This photo of a solution was taken from games4life.co.uk ↩

You know it's funny, even over the course of drawing just these pictures the other day I feel like I improved a lot… I have half a mind to redraw all these pictures even better, but that would defeat the point of a quick post, so I'll stay strong! ↩

It's possible there's a unique "best" choice of $T$ and I'm just inexperienced with this algorithm. I hadn't heard of it until I wrote this answer, so there's a lot of details that I'm fuzzy on. If you happen to know a lot about this stuff, definitely let me know more! ↩
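If you want to poke at the planar end of this in code, here's a hedged Python sketch (networkx is my choice of library, not something from the post): it certifies that $K_{3,3}$ has no embedding in the plane, and for a graph that is planar it hands back an explicit combinatorial embedding.

```python
import networkx as nx

# K_{3,3} is not planar, which is why the mug needs its handle
is_planar, _ = nx.check_planarity(nx.complete_bipartite_graph(3, 3))
print(is_planar)                       # False

# a planar graph comes back with a rotation system: the cyclic order
# of edges around each vertex, i.e. a combinatorial embedding
is_planar, embedding = nx.check_planarity(nx.cycle_graph(5))
print(is_planar, embedding.get_data())
```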
More in science
Heaps of discarded clothing from the U.K. have been dumped in protected wetlands in Ghana, an investigation found. Read more on E360 →
[Note that this article is a transcript of the video embedded above.]

Wichita Falls, Texas, went through the worst drought in its history in 2011 and 2012. For two years in a row, the area saw its average annual rainfall roughly cut in half, decimating the levels in the three reservoirs used for the city's water supply. Looking ahead, the city realized that if the hot, dry weather continued, they would be completely out of water by 2015. Three years sounds like a long runway, but when it comes to major public infrastructure projects, it might as well be overnight. Between permitting, funding, design, and construction, three years barely gets you to the starting line. So the city started looking for other options. And they realized there was one source of water nearby that was just being wasted - millions of gallons per day just being flushed down the Wichita River. I'm sure you can guess where I'm going with this. It was the effluent from their sewage treatment plant.

The city asked the state regulators if they could try something that had never been done before at such a scale: take the discharge pipe from the wastewater treatment plant and run it directly into the purification plant that produces most of the city's drinking water. And the state said no. So they did some more research and testing and asked again. By then, the situation had become an emergency. This time, the state said yes. And what happened next would completely change the way cities think about water. I'm Grady and this is Practical Engineering.

You know what they say, wastewater happens. It wasn't that long ago that raw sewage was simply routed into rivers, streams, or the ocean to be carried away. Thankfully, environmental regulations put a stop to that, or at least significantly curbed the amount of wastewater being set loose without treatment. Wastewater plants across the world do a pretty good job of removing pollutants these days. In fact, I have a series of videos that go through some of the major processes if you want to dive deeper after this. In most places, the permits that allow these plants to discharge set strict limits on contaminants like organics, suspended solids, nutrients, and bacteria. And in most cases, they're individualized. The permit limits are based on where the effluent will go, how that water body is used, and how well it can tolerate added nutrients or pollutants.

And here's where you start to see the issue with reusing that water: "clean enough" is a sliding scale. Depending on how water is going to be used or what or who it's going to interact with, our standards for cleanliness vary. If you have a dog, you probably know this. They should drink clean water, but a few sips of a mud puddle in a dirty street, and they're usually just fine. For you, that might be a trip to the hospital. Natural systems can tolerate a pretty wide range of water quality, but when it comes to drinking water for humans, it should be VERY clean. So the easiest way to recycle treated wastewater is to use it in ways that don't involve people. That idea's been around for a while. A lot of wastewater treatment plants apply effluent to land as a disposal method, avoiding the need for discharge to a natural water body. Water soaks into the ground, kind of like a giant septic system. But that comes with some challenges. It only works if you've got a lot of land with no public access, and a way to keep the spray from drifting into neighboring properties.
Easy at a small scale, but for larger plants, it just isn't practical engineering. Plus, the only benefits a utility gets from the effluent are some groundwater recharge and maybe a few hay harvests per season. So, why not send the effluent to someone else who can actually put it to beneficial use? If only it were that simple. As soon as a utility starts supplying water to someone else, things get complicated because you lose a lot of control over how the effluent is used. Once it's out of your hands, so to speak, it's a lot harder to make sure it doesn't end up somewhere it shouldn't, like someone's mouth. So, naturally, the permitting requirements become stricter. Treatment processes get more complicated and expensive. You need regular monitoring, sampling, and laboratory testing. In many places in the world, reclaimed water runs in purple pipes so that someone doesn't inadvertently connect to the lines thinking they're potable water. In many cases, you need an agreement in place with the end user, making sure they're putting up signs, fences, and other means of keeping people from drinking the water. And then you need to plan for emergencies - what to do if a pipe breaks, if the effluent quality falls below the standards, or if a cross-connection is made accidentally. It's a lot of work - time, effort, and cost - to do it safely and follow the rules. And those costs have to be weighed against the savings that reusing water creates. In places that get a lot of rain or snow, it's usually not worth it. But in many US states, particularly those in the southwest, this is a major strategy to reduce the demand on fresh water supplies.

Think about all the things we use water for where its cleanliness isn't that important. Irrigation is a big one - crops, pastures, parks, highway landscaping, cemeteries - but that's not all. Power plants use huge amounts of water for cooling. Street sweeping, dust control. In nearly the entire developed world, we use drinking-quality water to flush toilets! You can see where there might be cases where it makes good sense to reclaim wastewater, and despite all the extra challenges, its use is fairly widespread. One of the first plants was built in 1926 at Grand Canyon Village which supplied reclaimed water to a power plant and for use in steam locomotives. Today, these systems can be massive, with miles and miles of purple pipes run entirely separate from the freshwater piping. I've talked about this a bit on the channel before. I used to live near a pair of water towers in San Antonio that were at two different heights above ground. That just didn't make any sense until I realized they weren't connected; one of them was for the reclaimed water system that didn't need as much pressure in the lines. Places like Phoenix, Austin, San Antonio, Orange County, Irvine, and Tampa all have major water reclamation programs. And it's not just a US thing. Abu Dhabi, Beijing, and Tel Aviv all have infrastructure to make beneficial use of treated municipal wastewater, just to name a few.

Because of the extra treatment and requirements, many places put reclaimed water in categories based on how it gets used. The higher the risk of human contact, the tighter the pollutant limits get. For example, if a utility is just selling effluent to farmers, ranchers, or for use in construction, exposure to the public is minimal. Disinfecting the effluent with UV or chlorine may be enough to meet requirements. And often that's something that can be added pretty simply to an existing plant.
But many reclaimed water users are things like golf courses, schoolyards, sports fields, and industrial cooling towers, where people are more likely to be exposed. In those cases, you often need a sewage plant specifically designed for the purpose or at least major upgrades to include what the pros call tertiary treatment processes - ways to target pollutants we usually don't worry about and improve the removal rates of the ones we do. These can include filters to remove suspended solids, chemicals that bind to nutrients, and stronger disinfection to more effectively kill pathogens.

This creates a conundrum, though. In many cases, we treat wastewater effluent to higher standards than we normally would in order to reclaim it, but only for nonpotable uses, with strict regulations about human contact. But if it's not being reclaimed, the quality standards are lower, and we send it downstream. If you know how rivers work, you probably see the inconsistency here. Because in many places, down the river, is the next city with its water purification plant whose intakes, in effect, reclaim that treated sewage from the people upstream. This isn't theoretical - it's just the reality of how humans interact with the water cycle. We've struggled with the problems it causes for ages. In 1906, Missouri sued Illinois in the Supreme Court when Chicago reversed their river, redirecting its water (and all the city's sewage) toward the Mississippi River. If you live in Houston, I hate to break it to you, but a big portion of your drinking water comes from the flushes and showers in Dallas. There have been times when wastewater effluent makes up half of the flow in the Trinity River.

But the question is: if they can do it, why can't we? If our wastewater effluent is already being reused by the city downstream to purify into drinking water, why can't we just keep the effluent for ourselves and do the same thing? And the answer again is complicated. It starts with what's called an environmental buffer. Natural systems offer time to detect failures, dilute contaminants, and even clean the water a bit—sunlight disinfects, bacteria consume organic matter. That's the big difference in one city, in effect, reclaiming water from another upstream. There's nature in between. So a lot of water reclamation systems, called indirect potable reuse, do the same thing: you discharge the effluent into a river, lake, or aquifer, then pull it out again later for purification into drinking water. By then, it's been diluted and treated somewhat by the natural systems. Direct potable reuse projects skip the buffer and pipe straight from one treatment plant to the next. There's no margin for error provided by the environmental buffer. So, you have to engineer those same protections into the system: real-time monitoring, alarms, automatic shutdowns, and redundant treatment processes.

Then there's the issue of contaminants of emerging concern: pharmaceuticals, PFAS [P-FAS], personal care products - things that pass through people or households and end up in wastewater in tiny amounts. Individually, they're in parts per billion or trillion. But when you close the loop and reuse water over and over, those trace compounds can accumulate. Many of these aren't regulated because they've never reached concentrations high enough to cause concern, or there just isn't enough knowledge about their effects yet. That's slowly changing, and it presents a big challenge for reuse projects.
These contaminants can be dealt with at the source by regulating consumer products, encouraging proper disposal of pharmaceuticals (instead of flushing them), and imposing pretreatment requirements for industries. It can also happen at the treatment plant with advanced technologies like reverse osmosis, activated carbon, advanced oxidation, and bioreactors that break down micro-contaminants. Either way, it adds cost and complexity to a reuse program. But really, the biggest problem with wastewater reuse isn’t technical - it’s psychological. The so-called “yuck factor” is real. People don’t want to drink sewage. Indirect reuse projects have a big benefit here. With some nature in between, it’s not just treated wastewater; it’s a natural source of water with treated wastewater in it. It’s kind of a story we tell ourselves, but we lose the benefit of that with direct reuse: knowing your water came from a toilet—even if it’s been purified beyond drinking water standards—makes people uneasy. You might not think about it, but turning the tap on, putting that water in a glass, and taking a drink is an enormous act of trust. Most of us don’t understand water treatment and how it happens at a city scale. So the trust that it’s safe to drink largely comes from seeing other people do it and from the experience of doing it over and over without getting sick. The issue is that, when you add one bit of knowledge to that relative void of understanding - this water came directly from sewage - it throws that trust off balance. It forces you to rely not on past experience but on the people and processes in place, most of which you don’t understand deeply, and generally none of which you can actually see. It’s not as simple as just revulsion. It shakes up your entire belief system. And there’s no engineering fix for that. Especially for direct potable reuse, public trust is critical. So on top of the infrastructure, these programs also involve major public awareness campaigns. Utilities have to put themselves out there, gather feedback, respond to questions, be empathetic to a community’s values, and try to help people understand how we ensure water quality, no matter what the source is. But also, like I said, a lot of that trust comes from past experience. Not everyone can be an environmental engineer or licensed treatment plant operator. And let’s be honest - utilities can’t reach everyone. How many public meetings about water treatment have you ever attended? So, in many places, that trust is just going to have to be built by doing it right, doing it well, and doing it for a long time. But someone has to be first. In the U.S., at least on the city scale, that drinking water guinea pig was Wichita Falls. They launched a massive outreach campaign, invited experts for tours, and worked to build public support. But at the end of the day, they didn’t really have a choice. The drought really was that severe. They spent nearly four years under intense water restrictions. Usage dropped to a third of normal demand, but it still wasn’t enough. So, in collaboration with state regulators, they designed an emergency direct potable reuse system. They literally helped write the rules as they went, since no one had ever done it before. After two months of testing and verification, they turned on the system in July 2014. It made national headlines. The project ran for exactly one year. Then, in 2015, a massive flood ended the drought and filled the reservoirs in just three weeks.
The emergency system was always meant to be temporary. Water essentially went through three treatment plants: the wastewater plant, a reverse osmosis plant, and then the regular water purification plant. That’s a lot of treatment, which is a lot of expense, but they needed to have the failsafe and redundancy to get the state on board with the project. The pipe connecting the two plants was above ground and later repurposed for the city’s indirect potable reuse system, which is still in use today. In the end, they reclaimed nearly two billion gallons of wastewater as drinking water. And they did it with 100% compliance with the standards. But more importantly, they showed that it could be done, essentially unlocking a new branch on the skill tree of engineering that other cities can emulate and build on.
A version of this post originally appeared on Tedium, Ernie Smith’s newsletter, which hunts for the end of the long tail.

For roughly three decades, the JPEG has been the World Wide Web’s primary image format. But it wasn’t the one the Web started with. In fact, the first mainstream graphical browser, NCSA Mosaic, didn’t initially support inline JPEG files—just inline GIFs, along with a couple of other formats forgotten to history. However, the JPEG had many advantages over the format it quickly usurped.

Despite not appearing together right away—inline JPEG support first arrived in Netscape in 1995, three years after the image standard was officially published—the JPEG and the web browser fit together naturally. JPEG files degraded more gracefully than GIFs, retaining more of the picture’s initial form—and that allowed the format to scale to greater levels of success. While it wasn’t capable of animation, it progressively expanded from something a modem could pokily render to a format that was good enough for high-end professional photography. For the internet’s purposes, the degradation was the important part. But it wasn’t the only thing that made the JPEG immensely valuable to the digital world. An essential part was that it was a documented standard built by numerous stakeholders.

The GIF was a de facto standard. The JPEG was an actual one

How important is it that JPEG was a standard? Let me tell you a story. During a 2013 New York Times interview conducted just before he received an award honoring his creation, GIF creator Steve Wilhite stepped into a debate he unwittingly created. Simply put, nobody knew how to pronounce the acronym for the image format he had fostered, the Graphics Interchange Format. He used the moment to attempt to set the record straight—it was pronounced like the peanut butter brand: “It is a soft ‘G,’ pronounced ‘jif.’ End of story,” he said. I posted a quote from Wilhite on my popular Tumblr around that time, a period when the social media site was the center of the GIF universe. And soon afterward, my post got thousands of reblogs—nearly all of them disagreeing with Wilhite. Soon, Wilhite’s quote became a meme.

The situation illustrates how Wilhite, who died in 2022, did not develop his format by committee. He could say it sounded like “JIF” because he built it himself. He was handed the project as a CompuServe employee in 1987; he produced the object, and that was that. The initial document describing how it works? Dead simple. 38 years later, we’re still using the GIF—but it never rose to the same prevalence as the JPEG. The JPEG, which formally emerged about five years later, was very much not that situation. Far from it, in fact—it’s the difference between a de facto standard and an actual one. And that proved essential to its eventual ubiquity.

We’re going to degrade the quality of this image throughout this article. At its full image size, it’s 13.7 megabytes. Image: Irina Iriser

How the JPEG format came to life

Built with input from dozens of stakeholders, the Joint Photographic Experts Group ultimately aimed to create a format that fit everyone’s needs. (Reflecting its committee-led roots, there would be no confusion about the format’s name—an acronym of the organization that designed it.) And when the format was finally unleashed on the world, it was the subject of a more than 600-page book. JPEG: Still Image Data Compression Standard, written by IBM employees and JPEG organization stakeholders William B. Pennebaker and Joan L. Mitchell, describes a landscape of multimedia imagery held back by the lack of a way to balance the need for photorealistic images with the need for immediacy. Standardization, they believed, could fix this. “The problem was not so much the lack of algorithms for image compression (as there is a long history of technical work in this area),” the authors wrote, “but, rather, the lack of a standard algorithm—one which would allow an interchange of images between diverse applications.” And they were absolutely right. For more than 30 years, JPEG has made high-quality, high-resolution photography accessible in operating systems far and wide. Although we no longer need to compress JPEGs to within an inch of their life, having that capability helped enable the modern internet.

As the book notes, Mitchell and Pennebaker were given IBM’s support to follow through on this research and work with the JPEG committee, and that support led them to develop many of the JPEG format’s foundational patents. As described in patents Mitchell and Pennebaker filed in 1988, IBM, along with other members of the JPEG standards committee such as AT&T and Canon, was developing ways to use compression to make high-quality images easier to deliver in confined settings. Each member brought their own needs to the process. Canon, obviously, was more focused on printers and photography, while AT&T’s interests were tied to data transmission. Together, the companies left behind a standard that has stood the test of time. All this means, funnily enough, that the first place a program capable of using JPEG compression appeared was not MacOS or Windows, but OS/2—a fascinating-but-failed graphical operating system created by Pennebaker and Mitchell’s employer, IBM. As early as 1990, OS/2 supported the format through the OS/2 Image Support application.

At 50 percent of its initial quality, the image is down to about 2.6 MB. By dropping half of the image’s quality, we brought it down to one-fifth of the original file size. Original image: Irina Iriser

What a JPEG does when you heavily compress it

The thing that differentiates a JPEG file from a PNG or a GIF is how the data degrades as you compress it. The goal for a JPEG image is to still look like a photo when all is said and done, even if some compression is necessary to make it all work at a reasonable size. That way, you can display something that looks close to the original image in fewer bytes. Or, as Pennebaker and Mitchell put it, “the most effective compression is achieved by approximating the original image (rather than reproducing it exactly).” Central to this is a compression process called discrete cosine transform (DCT), a lossy form of compression encoding heavily used in all sorts of compressed formats, most notably in digital audio and signal processing. Essentially, it delivers a lower-quality product by removing details, while still keeping the heart of the original product through approximation. The stronger the cosine transformation, the more compressed the final result. The algorithm, developed by researchers in the 1970s, essentially takes a grid of data and treats it as if you’re controlling its frequency with a knob. The data rate is controlled like water from a faucet: The more data you want, the higher the setting. DCT allows a trickle of data to still come out in highly compressed situations, even if it means a slightly compromised result. In other words, you may not keep all the data when you compress it, but DCT allows you to keep the heart of it.
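To make that knob metaphor concrete, here is a minimal, hand-rolled sketch in Python of the core idea, using only numpy rather than any real JPEG codec: transform an 8×8 block of pixel values into frequency coefficients with a 2-D DCT, discard the high-frequency ones, and transform back. The gradient test block and the choice to keep a 4×4 corner of coefficients are just illustrative assumptions; an actual JPEG encoder layers quantization tables, zig-zag ordering, and entropy coding on top of this.

```python
import numpy as np

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block (the transform JPEG applies to 8x8 blocks)."""
    n = block.shape[0]
    out = np.zeros((n, n))
    for u in range(n):
        for v in range(n):
            cu = np.sqrt(1 / n) if u == 0 else np.sqrt(2 / n)
            cv = np.sqrt(1 / n) if v == 0 else np.sqrt(2 / n)
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x, y]
                          * np.cos((2 * x + 1) * u * np.pi / (2 * n))
                          * np.cos((2 * y + 1) * v * np.pi / (2 * n)))
            out[u, v] = cu * cv * s
    return out

def idct2(coeffs):
    """Inverse transform (2-D DCT-III), rebuilding pixel values from frequency coefficients."""
    n = coeffs.shape[0]
    out = np.zeros((n, n))
    for x in range(n):
        for y in range(n):
            s = 0.0
            for u in range(n):
                for v in range(n):
                    cu = np.sqrt(1 / n) if u == 0 else np.sqrt(2 / n)
                    cv = np.sqrt(1 / n) if v == 0 else np.sqrt(2 / n)
                    s += (cu * cv * coeffs[u, v]
                          * np.cos((2 * x + 1) * u * np.pi / (2 * n))
                          * np.cos((2 * y + 1) * v * np.pi / (2 * n)))
            out[x, y] = s
    return out

# A smooth 8x8 test block (a diagonal gradient of grayscale values), standing in
# for a low-detail patch of a photo. Real encoders work on blocks exactly like this.
ramp = np.arange(8, dtype=float)
block = 32 + 12 * ramp[:, None] + 8 * ramp[None, :]

coeffs = dct2(block - 128)      # JPEG centers values around zero before transforming
kept = np.zeros_like(coeffs)
kept[:4, :4] = coeffs[:4, :4]   # "turn the knob down": keep only the low frequencies
approx = idct2(kept) + 128

print("worst pixel error after discarding 3/4 of the coefficients:",
      round(float(np.max(np.abs(approx - block))), 2))
```

For a smooth block like this the error is tiny; blocks full of fine texture lose more information when the high frequencies are dropped, though, as noted below, the eye tends to be more forgiving there.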
(See this video for a more technical but still somewhat easy-to-follow description of DCT.)

DCT is everywhere. If you have ever seen a streaming video or an online radio stream that degraded in quality because your bandwidth suddenly declined, you’ve witnessed DCT being utilized in real time. A JPEG file doesn’t have to leverage the DCT with just one method, as JPEG: Still Image Data Compression Standard explains:

The JPEG standard describes a family of large image compression techniques, rather than a single compression technique. It provides a “tool kit” of compression techniques from which applications can select elements that satisfy their particular requirements.

The toolkit has four modes:

- Sequential DCT, which displays the compressed image in order, like a window shade slowly being rolled down
- Progressive DCT, which displays the full image in the lowest-resolution format, then adds detail as more information rolls in
- Sequential lossless, which uses the window shade format but doesn’t compress the image
- Hierarchical mode, which combines the prior three modes—so maybe it starts with a progressive mode, then loads DCT compression slowly, but then reaches a lossless final result

At the time the JPEG was being created, modems were extremely common. That meant images loaded slowly, making Progressive DCT the most fitting format for the early internet. Over time, the progressive DCT mode has become less common, as many computers can simply load the sequential DCT in one fell swoop.

That same forest, saved at 5 percent quality. Down to about 419 kilobytes. Original image: Irina Iriser

When an image is compressed with DCT, the change tends to be less noticeable in busier, more textured areas of the picture, like hair or foliage. Those areas are harder to compress, which means they keep their integrity longer. It tends to be more noticeable, however, with solid colors or in areas where the image sharply changes from one color to another—like text on a page. Ever screenshot a social media post, only for it to look noisy? Congratulations, you just made a JPEG file. Other formats, like PNG, do better with text, because their compression format is intended to be non-lossy. (Side note: PNG’s compression format, DEFLATE, was designed by Phil Katz, who also created the ZIP format. The PNG format uses it in part because it was a license-free compression format. So it turns out the brilliant coder with the sad life story improved the internet in multiple ways before his untimely passing.)

In many ways, the JPEG is one tool in our image-making toolkit. Despite its age and maturity, it remains one of our best options for sharing photos on the internet. But it is not a tool for every setting—despite the fact that, like a wrench sometimes used as a hammer, we often leverage it that way.
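If you want to try the quality knob and the progressive mode yourself, the Pillow imaging library in Python exposes both as save options for JPEG files. A small sketch, with the file names as placeholders for whatever photo you have on hand:

```python
# Minimal sketch using Pillow (pip install Pillow). "forest.jpg" is a placeholder
# for any photo you want to experiment with.
from PIL import Image

img = Image.open("forest.jpg")

# Baseline (sequential) JPEG at 50 percent quality: rendered top to bottom,
# like the window shade described above.
img.save("forest_q50.jpg", "JPEG", quality=50)

# Progressive JPEG at the same quality: loads as a coarse full image first,
# then sharpens as more data arrives.
img.save("forest_q50_progressive.jpg", "JPEG", quality=50, progressive=True)

# Crank the knob way down to make the compression artifacts easy to see.
img.save("forest_q5.jpg", "JPEG", quality=5)
```

The exact file sizes depend on the photo, but on a detailed image the drop from full quality to 50 and then 5 percent should roughly mirror the megabytes-to-kilobytes slide in the captions above.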
Forgent Networks claimed to own the JPEG’s defining algorithm

The JPEG format gained popularity in the ’90s for reasons beyond the quality of the format. Patents also played a role: Starting in 1994, the tech company Unisys attempted to bill individual users who relied on GIF files, which used a compression algorithm patented by the company. This made the free-to-use JPEG more popular. (This situation also led to the creation of the patent-free PNG format.) While the JPEG was standards-based, it could still have faced the same fate as the GIF, thanks to the quirks of the patent system.

A few years before the file format came to life, a pair of Compression Labs employees filed a patent application that dealt with the compression of motion graphics. By the time anyone noticed its similarity to JPEG compression, the format was ubiquitous.

Our forest, saved at 1 percent quality. This image is only about 239 KB in size, yet it’s still easily recognizable as the same photo. That’s the power of the JPEG. Original image: Irina Iriser

Then in 1997, a company named Forgent Networks acquired Compression Labs. The company eventually spotted the patent and began filing lawsuits over it, a series of events it saw as a stroke of good luck. “The patent, in some respects, is a lottery ticket,” Forgent Chief Financial Officer Jay Peterson told CNET in 2005. “If you told me five years ago that ‘You have the patent for JPEG,’ I wouldn’t have believed it.” While Forgent’s claim of ownership of the JPEG compression algorithm was tenuous, it ultimately saw more success with its legal battles than Unisys did. The company earned more than $100 million from digital camera makers before the patent finally ran out of steam around 2007. The company also attempted to extract licensing fees from the PC industry. Eventually, Forgent agreed to a modest $8 million settlement. As the company took an increasingly aggressive approach to its acquired patent, it began to lose battles both in the court of public opinion and in actual courtrooms. Critics pounced on examples of prior art, while courts limited the patent’s use to motion-based uses like video. By 2007, Forgent’s compression patent expired—and its litigation-heavy approach to business went away. That year, the company became Asure Software, which now specializes in payroll and HR solutions. Talk about a reboot.

Why the JPEG won’t die

The JPEG file format has served us well. It’s been difficult to remove the format from its perch. The JPEG 2000 format, for example, was intended to supplant it by offering more lossless options and better performance. The format is widely used by the Library of Congress and specialized sites like the Internet Archive; however, it is less popular as an end-user format.

See the forest JPEG degrade from its full resolution to 1 percent quality in this GIF. Original image: Irina Iriser

Other image technologies have had somewhat more luck getting past the JPEG format. The Google-supported WebP is popular with website developers (and controversial with end users). Meanwhile, the formats AVIF and HEIC, each developed by standards bodies, have largely outpaced both JPEG and JPEG 2000. Still, the JPEG will be difficult to kill at this juncture. These days, the format is similar to MP3 or ZIP files—two legacy formats too popular and widely used to kill. Other formats that compress the files better and do the same things more efficiently are out there, but it’s difficult to topple a format with a 30-year head start. Shaking off the JPEG is easier said than done. I think most people will be fine to keep it around.

Ernie Smith is the editor of Tedium, a long-running newsletter that hunts for the end of the long tail.