Full Width [alt+shift+f] Shortcuts [alt+shift+k]
Sign Up [alt+shift+s] Log In [alt+shift+l]
39
I’ve been thinking a lot about the internal logic of topoi again, and I want to have more examples of topoi that I understand well enough to externalize some statements. There’s more to life than just a localic $\mathsf{Sh}(B)$, and since I’m starting to feel like I understand that example pretty well, it’s time to push myself to understand other important examples too! In particular, it would be nice to throw some gros topoi into the mix, and where better to start than Johnstone’s topological topos? This topos is fairly small (which makes explicit computation easy) and is very well studied (which makes finding references and examples merely annoying instead of totally impossible). Eventually I’ll want to learn about the effective topos1 (and other realizability topoi more generally), various smooth topoi, etc. but let’s take them on one-at-a-time! The topological topos $\mathcal{T}$ is a world where every set is intrinsically a space. What does this mean? Well, if we’re...
7 months ago

More from Chris Grossack's Blog

Where Do Those Undergraduate Divisibility Problems Come From?

Oftentimes in your “intro to proofs” class or your first “discrete math” class or something similar, you’ll be shown problems of the form “prove that for $n^6 + n^3 + 2n^2 + 2n$ is a multiple of $6$ for every $n$”… But where do these problems come from? And have you ever stopped to think how magical this is? If I gave you some random polynomial in $n$ and asked you if it always output multiples of $6$, the answer would almost always be “no”! So if you really needed to come up with an example of this phenomenon, how would you do it? In this blog post, we give one approach! I want to give some sort of attribution for this, since I remember reading about this exact topic… like a long time ago. Maybe 6 or 7 years ago? I’m sure that I saved the relevant article1 in my zotero, but I can’t find it for the life of me! Regardless, I want to emphasize that the whole idea for this topic (using Pólya-Redfield counting to build these divisibility problems) is not mine. I’ve wanted to write a post on Pólya-Redfield counting for years now, since it was a pet topic of mine as an undergrad, but I think I was always planning too big a scope. This is a very bite-sized problem, and I won’t go into the theory very deeply, so I think it should make for a blog post I can finish in a day2. Let’s get to it! Ok, first, how might we come up with problems like this? We want a polynomial $P(n)$ and an integer $k$ so that $P(n)$ is always a multiple of $k$. That is, so that $P(n) / k$ is always an integer! But what are sources of integers? If we put our combinatorial hats on, we learn that we had better be counting something! That’s the quickest way to ensure that we get an integer answer at the end of the day. So we want a polynomial so that as we vary $n$, the value $P(n) / k$ counts… something. At this point, you might be inspired by The Lemma Which is Not Burnside’s, which says that when a group $G$ acts on a set $X$, the number of orbits is where \(X^g = \{ x \mid gx = x \}\) is the set of fixed points of $g$. This is great, since in some sense it’s “where division comes from”. I don’t want to get into categorification here, but when we say we’re thinking of numbers as the cardinality of some set (to guarantee they’re integers) we’re really categorifying our problem. There’s been lots of work showing how various operations on numbers lift to categorified operations on the category of finite sets, and the only way I know of to categorify division is to take the orbits of some group action3. Hopefully this also serves to show that categorification doesn’t need to be scary! It can be incredibly simple, just thinking about finite sets and what we can do to them… Though maybe people who read my blog are already convinced of that, haha. Regardless, this orbit-counting formula is close to what we want! It gives us access to division. So if we could only find a family of sets $X_n$, all of which admit a $G$-action, then maybe we could have $P(n) = \sum_g |X_n^g|$ and thus $P(n)$ will always be divisible by $|G|$, since $P(n)/|G|$ counts the orbits $|X_n \big / G|$! This is exactly what Pólya-Redfield counting buys us! I really want to write a blog post with lots of pretty pictures that explains this in detail, but maybe just for today I’ll allow myself to be a bit less thorough. If you want to see this motivated with pretty pictures, I’ll point you to some slides by my undergrad advisor Klaus Sutner4. These are from the 2023 version of the class where I first met him back in… 2017? It’s better to not think about that, haha. Moving on, say we have a set $X$ with a $G$-action. Then we think about all the ways to “color” $X$ with $n$-many colors. Precisely these are functions from \(X \to [n]\), and they pick up a natural $g$ action by sending the function $f : X \to [n]$ to the function $(gf)(x) = f(g^{-1}x)$5. These function spaces6 \((X \to [n])\) will be our sets $X_n$, and all that’s left is to see how to compute $|X_n \big / G|$, ideally in a way that’s uniform in $n$. Now (a corollary of) the main Pólya-Redfield theorem says: where $c(g)$ is the “cycle count” of $g$. We can view the action of $G$ on $X$ as a homomorphism $\alpha$ from $G$ to the symmetric group $\mathfrak{S}_X$. Then $\alpha(g) \in \mathfrak{S}_X$ is a permutation, so decomposes into a product of cycles. The number $c(g)$ is perhaps the dumbest invariant you can think of: just count the number of cycles! This is exactly what we want, since it tells us that the polynomial $P(n) = \sum_g n^{c(g)}$ is always divisible by $|G|$, since the quotient is exactly counting the orbits of the action of $G$ on \((X \to [n])\)! Again, unfortunately I’ll leave the derivation of this formula (and the many many other useful things the Pólya-Redfield theory buys you) for another day, but at least we can do some quick examples! First, say we want to count the number of bracelets you can make with $6$ beads, provided you have $n$ many types of beads available. Obviously if you were looking at bracelets in the real world that you can move around, the following two bracelets are actually the same: so we want to count the number of ways to color this picture with $n$ colors (this is a choice of bead in each position), but only up to the obvious $\mathbb{Z} / 6$ action on the space of colorings7! Now choose an isomorphism between the colorable places of our configuration and the standard set of size $6$, for instance, this is likely to be a common choice: After making this identification our action is a map $\mathbb{Z}/6 \to \mathfrak{S}_6$, and thus we can compute cycle decompositions as follows: Element $g \in \mathbb{Z}/6$ Image in $\mathfrak{S}_6$ Number of Cycles, $c(g)$ $0$ $(1)(2)(3)(4)(5)(6)$ $6$ $1$ $(1 \ 2 \ 3 \ 4 \ 5 \ 6)$ $1$ $2$ $(1 \ 3 \ 5)(2 \ 4 \ 6)$ $2$ $3$ $(1 \ 4)(2 \ 5)(3 \ 6)$ $3$ $4$ $(1 \ 5 \ 3)(2 \ 6 \ 4)$ $2$ $5$ $(1 \ 6 \ 5 \ 4 \ 3 \ 2)$ $1$ Using this, we see that the number of bracelets with $n$ beads, up to our $\mathbb{Z}/6$ action is Since this is counting the number of orbits, it must be an integer, and so for every $n$, the polynomial $P(n) = n^6 + n^3 + 2n^2 + 2n$ has to be divisible by $6$, as promised! Let’s see a harder one, which (in the interest of speed) I stole from Klaus’s lectures: How many ways can you fill a tic-tac-toe board with $X$s and $O$s, up to rotation and reflection? We have configurations like the following (I’ve already chosen a bijection between our colorable places and the standard set with $9$ elements): Now we want to color each slot $X$ or $O$, and quotient out the action of the dihedral group in order to view the following colorings as “the same” (of course, these aren’t the whole orbit! There’s many other rotations and reflections which are also equivalent): Notice we’re also not worried about which colorings come from actual games of tic-tac-toe! How does Pólya-Redfield tell us to proceed? Well, the dihedral group $D_{2 \cdot 4}$ has $8$ elements, built out of rotations and reflections. Say that $r$ is the clockwise rotation shown above, and $s$ is the horizontal flip shown above. Then using the numbering system from before, we compute so that, in cycle notation: Since $r$ and $s$ generate the whole dihedral group, we’re done with the hard work! Now a computer algebra system like sage can compute the rest of the table from these by multiplying them together: As a cute exercise for those new to group theory, try computing these 8 permutations yourself! Can you figure out which one comes from which element of the dihedral group? Can you see how they relate to the usual presentation $G = \langle r, s \mid r^4, s^2, srs=r^{-1} \rangle$? From here it’s easy to read off the polynomial! If we have $n$-many available colors to put in each slot of the tic-tac-toe board, then the number of possible boards, counted up to rotation and reflection is given by Since we’re trying to color the board with only two colors, $X$ and $O$, we see the number of ways is Now we’ve really made two predictions here. First, that $P(n) = n^9 + 4n^6 + n^5 + 2n^3$ will be a multiple of $8$ for whichever $n$ you plug in. Second, that this quotient really is counting the tic-tac-toe boards! Let’s take a quick second and ask sage how true those look. First, we can just plug in a few thousand $n$s and see if we ever hit anything other than a multiple of $8$: Second, we know there’s only $2^9 = 512$ many ways to 2-color a tic-tac-toe board without counting rotations and reflections. So it’s not too time consuming to just list all of them and remove any we’ve seen before! So there we go! This actually ended up taking two days to write, since yesterday I got distracted from my distraction when I was talking to a friend about connections and how curvature is related to force in physics. I realized I don’t actually understand that as well as I thought I did, so I had to spend some time rereading a bunch of physics stuff, which was super fun, even if it took a while, haha. If you’re ever on a deserted island and you find yourself needing polynomials whose outputs are always divisible by some fixed number, this is an endless source! You might ask if every such polynomial arises from Pólya-Redfield counting in this way, and that’s obviously false (since, for instance, we’ll never get any constant terms)… But it’s not obviously false that every such polynomial arises from Pólya-Redfield counting after a change of variables! So with no intuition at all for whether it’s true or false, let me pose that as a conjecture: Conjecture: If $p$ is a polynomial so that $p(n)$ is a multiple of $k$ for every $n$, then there is a set $X$ and a group $G$ of order $k$ and a polynomial $f$ sending integers to integers so that Maybe $f$ can even be taken to be of the form $an+b$, why not. I haven’t thought about it either way, haha. For anyone interested in thinking about this conjecture, it’ll be nice to know that every polynomial sending integers to integers is an integer linear combination of binomial coefficients (see here). Amusingly, this was also shown by Pólya! You can push this further (as mentioned in the same wikipedia article) to note that $k \mid p(n)$ for all $n$ if and only if in the above representation $p = \sum c_i \binom{x}{i}$, all the $c_i$s are multiples of $k$. So we have an extremely concrete classification of these polynomials, which one might use to (dis)prove the conjecture8! As a cute aside, we’ve actually talked about this basis for polynomials in terms of binomial coefficients before! Seeing them turn up here was like seeing an old friend. Thanks again for hanging out, all! This was super fun, and it was a nice diversion from the blog post about my thesis work. It’s loooong, but really interesting, and I think people will enjoy it. This stuff relating fukaya categories, topological field theories, and representation theory is some of the coolest math I’ve ever seen, and I couldn’t have asked for a more fun thesis topic. Of course, I also need to get the topological topos posts cleaned up and submitted to journals, and I have a fun project that I want to finish up which will be interesting to categorical logicians and algebraic geometers! (At least, algebraic geometers of a certain kind, haha). There’s always more to do, but that’s part of the fun of it! After two months applying to postdocs and barely doing any math at all, I’m thrilled to be back at it ^_^. Alright, stay safe, all! We’ll talk soon 💖 or maybe it was a blog post or an mse question or something… Or maybe a footnote in a published paper? Part of why I can’t find it is because I don’t remember anything about where I read it! And back when I was an undergrad I was much worse about leaving myself searchable notes so I can quickly find interesting things again. ↩ If you’re curious why I decided to make this blog post now, I just started learning about cluster algebras today, and in the first lecture of Pavel Galashin’s series on the subject he mentions a recurrence relation which magically always gives integers, despite division happening. This reminded me of these polynomials which always give outputs divisible by some number (which is an easier version of the same phenomenon), and I went looking for the original article to share on mastodon. I couldn’t find it, so… I guess I’m writing one! John Baez and Jim Dolan and I also talked about species and structure types in our last meeting, and so I think I was also somewhat primed to think about generating functions and Pólya-Redfield counting based on that conversation. ↩ You need groupoids and “groupoid cardinality” to have access to division in general, since this lets you get rational numbers (and even certain real numbers!) as cardinalities. See, for instance, John and Jim’s famous paper From Finite Sets to Feynman Diagrams or this MO post for more. ↩ I saved a copy of these slides local to this blog to make sure they’re always available, but if you want them right from the horse’s mouth you can find them here, and a course schedule with all the available lecture slides here. ↩ As long as I’m missing my undergrad years, I’ll pass on an insight from one of my favorite undergrad professors Clinton Conley about this seemingly weird $g^{-1}$: Remember when you were learning how to graph functions and if you want to move the graph of $f(x)$ to the right by $c$, you look at the function $f(x-c)$? This trips up a lot of students first learning things, since it seems backwards to subtract $c$ to move to the right, especially since moving up and down has the much more sane behavior that $f(x)+c$ moves you up by $c$ units! This is the same phenomenon, where the action on the input is contravariant (and we act on $x$ by the inverse of $c$) while the action on the output is covariant (and we act on $f(x)$ by $c$)! There’s also something to be said here about left vs right actions, but I won’t say it, haha. Hopefully this Clinton-ism helps this make more sense, since I remember it really helped me when I was younger! ↩ I guess my type theory background is showing in this notation, haha. ↩ I think when most people talk about these bracelet problems, they quotient by the dihedral group instead of the cyclic group. This is because you can flip a bracelet over in 3D, so you have access to reflections too. I’m ignoring that for the sake of the example, though, and just counting things up to rotation. ↩ This is also a much more efficient source of polynomials for undergrad divisibility problems, but it wouldn’t have been anywhere near as fun! ↩

2 weeks ago 19 votes
$\mathsf{B}\text{Diff}(\Sigma)$ Classifies $\Sigma$-bundles

I’ve been trying to learn all about topological (quantum) field theories, the cobordism hypothesis, and how to use $(\infty,n)$-categories. This is all in service of some stuff I’m doing with skein algebras (which are part of a “$3+1$ TQFT” often named after Crane–Yetter, but I believe the state of the art is due to Costantino–Geer–Haïoun–Patureau-Mirand), but it’s incredibly interesting in its own right and it’s been super fun to get a chance to learn more physics and learn way more about topology and higher categories. Along the way, I was watching a lecture by Jacob Lurie where he mentions something that should probably be obvious (but wasn’t to me): For a (smooth) manifold $F$, $\mathsf{B}\text{Diff}(F)$ classifies bundles with fibre $F$ That is, given a manifold $M$ we have a natural bijection I asked about this in the category theory zulip, and got some great responses that still took me a while to digest. Actually, it wasn’t until I started reviewing a bunch of material on principal bundles1 that everything clicked into place for me. Over the course of this post, I’ll hopefully make this bijection obvious, and I’ll explain some facts about bundles along the way. This should serve as a nice quick blog post to write during my winter break since I’ve been extremely stressed for the last two months trying to get all my postdoc applications done2. Thankfully it’s all finally behind me, and I’m back in New York with people I love, so I’m resting up and doing math again (which have both been fantastic for my mental health). Let’s get to it! First, let’s remember what a bundle even is. Say we have a base space $M$ and a map $\pi : E \to M$. Then for each point $m \in M$ we have the fibre $E_m = \pi^{-1}(m)$, and so we can view $\pi$ as a family of spaces $(E_m)$ indexed by points of $M$. We think of $E_m$ as being “vertical” over $m \in M$, and we think of the topology on $E$ as telling us how these fibres are glued to each other “horizontally”3: We say that $\pi : E \to M$ is a fibre bundle if it’s “locally trivial” in the sense that (locally) the fibres are all $F$, and these copies of $F$ are all glued to each other in the most obvious possible way. Because we mathematicians feel the need to make newcomers feel inadequate, we call this most obvious way the “trivial” bundle: $\pi : M \times F \to M$. For a bundle to be locally trivial we don’t ask for it to literally be $M \times F$, that’s asking too much. Instead we ask that for every point $m \in M$ there should be a neighborhood $U_m$ so that $\pi^{-1}(U_m) \cong F \times U_m$4. We say that such a bundle is a Fibre Bundle with Fibre $F$, and here’s a completely natural question: Can we classify all fibre bundles with fibre $F$ over a base space $M$? Let’s think about how we might do this from a conceptual point of view, and that will lead us to the answer in the title of this post. Then we’ll reinterpret that conceptual understanding in much more concrete language, and hopefully explain why people care so much about principal bundles at the same time! If we put our type theory hats on5, then we want to think about a fibre bundle as being a map from our base space $M$ into a “space of all $F$s”. Then we think of this map as assigning an $F$ to every point of $M$, and this is exactly6 the data of a fibre bundle! But what should a “space of all $F$s” look like7? Well, as a first try let’s just take a huge manifold $\mathcal{U}$ and look at all copies of $F$ sitting inside it! For instance, we might take $\mathbb{R}^\infty$, and then an “$F$ sitting inside it” is an embedding $\iota : F \hookrightarrow \mathbb{R}^\infty$… almost! A choice of embedding has the ~bonus data~ of a parameterization, which an abstract “copy of $F$” wouldn’t have. We’re only interested in what’s morally the image of a particular embedding, so we should quotient out by reparameterization. Concretely, if $\varphi : F \overset{\sim}{\rightarrow} F$ is a diffeomorphism, then $\iota$ and $\iota \circ \varphi$ represent the same “copy” of $F$ in $\mathbb{R}^\infty$, so we find ourselves interested in the space Now it feels like we made a choice here, taking $\mathcal{U} = \mathbb{R}^\infty$, but we can worm our way out of this choice with homotopy theory! Indeed it’s believable that \(\{ \iota : F \hookrightarrow \mathbb{R}^\infty \}\) should be contractible, since if $F$ is finite dimensional then any two embeddings lie in a small subspace of $\mathbb{R}^\infty$, so there’s plenty of room to homotope them into each other. Then for any two such homotopies there’s still room to fill in the resulting circle, and so on. Since the space of embeddings is contractible, up to homotopy we’re working with8 and if we make “$\mathcal{U}$ is sufficiently large” mean “the space of embeddings of $F$ in $\mathcal{U}$ is contractible”, then we’ll get $\star \big / \text{Diff}(F)$ independent of this choice. Of course, $\star \big / \text{Diff}(F)$ is frequently called the Classifying Space $\mathsf{B} \text{Diff}(F)$, and so we’re halfway to our goal! We normally think of $\mathsf{B}G$ for a group $G$ as an object classifying “principal $G$-bundles”, so that a map $M \to \mathsf{B}G$ is the same thing as a bundle over $M$ whose fibres are all isomorphic to $G$… So, naively, it sounds like a map $M \to \mathsf{B}\text{Diff}(F)$ should be the data of a bundle over $M$ whose fibres are $\text{Diff}(F)$! But we wanted a bundle over $M$ whose fibres are $F$! What gives? The answer comes from the Associated Bundle Construction, which is the reason to care about principal bundles at all! In a remarkably strong sense, we can understand every “$G$-bundle” as long as we’re able to understand the principal $G$-bundles! Let’s see how, then use it to solve our problem in a (more) down-to-earth way. Concretely, we might understand a fibre bundle by picking a good open cover \(\{ U_\alpha \}\). Then we pick local trivializations $\varphi_\alpha : \pi^{-1}(U_\alpha) \overset{\sim}{\to} U_\alpha \times F$. If we want to answer a (local) question about the bundle, we can use one of these $\varphi_\alpha$s to transfer the question to a question about $U_\alpha \times F$, which is often easier to work with. But crucially we need to know that we’ll get the same answer no matter which of these “charts” $U_\alpha$ we work with. For instance, say our fibres are all $\mathbb{R}$. Then a natural question might be “is this element of the fibre $0$?”. If we always glue $0$ to $0$ then we get a consistent answer, but if at some point we glue $0$ in $U_\alpha$ to $1$ in $U_\beta$ then the answer will depend on which chart we use to do the computation! It turns out that more fundamental than the particular choice of trivialization are the “transition maps” $\varphi_\beta \circ \varphi_\alpha^{-1} : (U_\alpha \cap U_\beta) \times F \overset{\sim}{\to} (U_\alpha \cap U_\beta) \times F$ which tell us how to translate computations between our $\alpha$ and $\beta$ charts. These have to be the identity on the $U_\alpha \cap U_\beta$ part, so the interesting data is a map $\varphi_{\beta \alpha} : F \overset{\sim}{\to} F$ which tells us how to glue the $\alpha$ and $\beta$ charts along their intersection. Now if we want to answer questions corresponding to “bonus structure” on our fibre, we need to make sure we only ever glue in a way that preserves the answers to these questions. That is, instead of $\varphi_{\beta \alpha}$ being any old diffeomorphism of $F$, we want it to be in the subgroup of diffeomorphisms respecting this structure9. If we call this subgroup $G$, then we call our bundle a $G$-bundle. For example, if our fibres are all $\mathbb{R}^n$ and we want to ask vector-spacey questions, then we should only glue them using diffeomorphisms that respect the vector space structure. That is, we should only glue along $\mathsf{GL}(n) \subseteq \text{Diff}(\mathbb{R}^n)$, and so we’re working with a $\mathsf{GL}(n)$-bundle. If we moreover want to ask questions about the usual inner product on $\mathbb{R}^n$, then we have to glue in a way that preserves the inner product so our transition functions will be valued in $\mathsf{O}(n)$10. So it turns out the crucial data for a $G$-bundle are the transition maps $\varphi_{\beta \alpha} : U_\alpha \cap U_\beta \to G$. These have to satisfy the Cocycle Condition11 that $\varphi_{\gamma \beta} \varphi_{\beta \alpha} = \varphi_{\gamma \alpha}$, which crucially never mentions the fibre $F$! So we learn that $G$-bundles can be classified without referencing the fibre $F$ at all! Indeed, say we have the data of a $G$-cocycle $\varphi_{\beta \alpha}$. Then for any space $F$ with a $G$-action we can build a bundle with fibres $F$ by using these as our transition functions. Since we can choose any $F$ we want (at least for the purposes of classification) it’s reasonable to ask if there’s a “best” choice of fibre that makes things easier. To answer this one might ask if there’s a “best” choice of set that $G$ acts on, and from there we might be led to say that the best choice is $G$ acting on itself by multiplication! In this case our fibres are all $G$, and it’s easy to see this preserves “relative information” about $G$ since $h^{-1}k = (gh)^{-1}(gk)$ and the value of $h^{-1}k$ is preserved by the multiplication action. However this action does not preserve “absolute information” about $G$, such as the identity, since $ge = g \neq e$. The structure that’s preserved is called a torsor, and so we see that any $G$-valued cocycle on $M$ naturally gives us a bundle of $G$-torsors on $M$! This is called a Principal $G$-Bundle, and from the earlier discussion if we can classify the principal $G$-bundles, we can classify all $G$-bundles easily! It turns out that in many many ways if we want to understand $G$-bundles it’s enough to understand the principal ones. For instance connections on a $G$-bundle arise in a natural way from connections on a principal $G$-bundle, and principal $G$-bundles give us a way to work with bundles in a coordinate-invariant way. I won’t say any more about this, in the interest of keeping this post quick, but I’ll point you to the excellent blog posts by Nicolas James Marks Ford12 on $G$-bundles and connections, as well as the relevant sections of Baez and Muniain’s excellent book Gauge Fields, Knots, and Gravity. Concretely, how can we implement this bijection? Well, given a $G$-bundle $\pi : E \to M$ we can fix a good open cover \(\{ U_\alpha \}\) of $M$, and we can read off the transition maps $\varphi_{\beta \alpha} : U_\alpha \cap U_\beta \to G$, and then we can build a new $G$-bundle $\widetilde{E} \to M$ whose fibres are all $G$ (called the Associated Bundle to $E$) by gluing copies of $G$ together with these same transition maps $\varphi_{\beta \alpha}$. This is a principal $G$-bundle. In the other direction, say we have a principal $G$-bundle $\pi : P \to M$. Then if $G$ acts on any space $F$ we can build a new $G$-bundle which is fibrewise given by $E_m = (P_m \times F) \big / \sim$ where we quotient by the relation $(hg,f) \sim (h, gf)$. This is, in some sense, coming from the “universality” of $G$ as a $G$-set that made $G$ the “best” choice of set with a $G$-action13! It’s a cute exercise to check that these operations are inverse to each other! Now. Where did we start, and where did we end? Well we can read on wikipedia that $\mathsf{B}G$ classifies principal $G$-bundles. But we know that once we fix an action of $G$ on $F$, the principal $G$-bundles are in bijection with $G$-bundles with fibre $F$! So using the obvious action of $\text{Diff}(F)$ on $F$, and remembering that all bundles with fibre $F$ have transition functions landing in $\text{Diff}(F)$, we get a natural bijection Which is exactly what we were looking for! In hindsight, I’m not sure whether the story involving a “space of all $F$s” or the story about associated bundles is “more concrete”… I know the second story is easier to make precise, but I think both versions are good to see. I’m super happy to have finished this post in two days. I started writing it last night when I got to my friends’ house in New York, and I finished it this morning while they were addressing and signing a bunch of christmas cards. It’s good to be thinking and writing about math again after my two months of postdoc application hell, and I’m feeling SO comfortable and relaxed it’s just great to be around people I love. I’m going to hopefully find time to finish writing a series on skein algebras, hall algebras, and fukaya categories (which is the topic of my thesis) in the next month or two. That will be great, because I have a TON of fascinating stuff to share ^_^. As always, thanks for hanging out. Now more than ever stay safe, stay warm, and take care of each other. Nobody else is going to. Also… wow I’m tall. I’m trying to learn some gauge theory for other reasons ↩ For political reasons at UCR, it seems like none of the rising 6th years will get TA positions next year. Which means that me and a LOT of people in my cohort are having to graduate a year earlier than we planned, because like…. we need money to live, lol. We found this out in mid october, crucially after the NSF grant deadline, and only a month and a half before most postdoc applications were due. I made it happen, but it’s been a really unpleasant two months. ↩ I want this to be a quick low effort post, so I’m going to shamelessly steal all my graphics from Baez and Muniain’s book Gauge Fields, Knots, and Gravity. ↩ In this situation we say our condition is “local on the base” or “local on the target” since we have a neighborhood $U_m$ for every $m \in M$ the “base space”, which is the target of the map $\pi : E \to M$. This is to be contrasted with a condition that’s “local on the domain” or “local on the total space” or “local on the source”, where we have a neighborhood $U_e$ for every $e \in E$ the “domain” or “total space” or “source” of the map $\pi : E \to M$. ↩ You could just as well put your moduli space hat on, if you’re more comfortable with that. One of the ↩ It’s reasonable to ask why this definition only gives us locally trivial bundles. The reason is that a homotopy class of a map from a contractible space to the “space of all $F$s” is the same thing as a map from a point to this space. Composing with the contraction homotopy we find that the bundle we classify is trivial. So if you take a point if $M$ and restrict your map to a contractible neighborhood of $m$, you’ll find that the bundle is trivial on that neighborhood. ↩ I got this argument from Kevin Carlson in the CT Zulip, from the same thread I linked earlier. ↩ Here, of course, we have to work $\infty$-categorically and interpret this quotient as a homotopy quotient in order for it to be interesting! ↩ This is what “structure groups” are all about. I had half a mind to write up a longer thing explaining how structure groups work and why they correspond to structure on the fibre surviving to global structure on the bundle (especially because it was something that took me a while to understand)… But I also want this post to be done quickly, and I found myself wanting to draw a bunch of pictures and write a bunch of stuff about that, so… I didn’t 😌. I’m sure we’ll talk more about it someday. ↩ If at this point you’re thinking that if we don’t want to preserve ~any~ bonus structure we should take our structure group to be the whole of $\text{Diff}(F)$, then congratulations! You are making an observation that would have saved me… a lot of time and confusion a few months ago, haha. ↩ I won’t get into this here, and since I want this to be quick I don’t want to find a good reference, but this use of the word “cocycle” isn’t an accident, and indeed we’re computing the Čech cohomology of the base space with coefficients in $G$. The cocycle condition tells us we’re considering cocycles, and the coboundaries tell us when we’re describing the same bundle with different intitial choices of trivialization. ↩ It’s kind of hilarious how similar his blog looks to mine. We’re using very similar color schemes. ↩ Note also that the action of left multiplication of $G$ on itself commutes with (“preserves”) the action of right multiplication of $G$ on itself (this is associativity). So a principal $G$-bundle has a global right action of $G$, and a trivial bundle with fibre $F$ has a global left action of $G$, since the identity diffeomorphism preserves the left action of $G$ on $F$. So we have two bundles, one with a global right $G$-action and one with a global left $G$-action. So we can take thier tensor product to be left with a new bundle, not necessarily admitting a global $G$-action. ↩

a month ago 44 votes
Finiteness in Sheaf Topoi

The notion of “finiteness” is constructively subtle in ways that can be tricky for people new to the subject to understand. For a while now I’ve wanted to figure out what’s going on with the different versions of “finite” in a way that felt concrete and obvious (I mentioned this in a few older blog posts here and here), and for me that means I want to understand them inside a sheaf topos $\mathsf{Sh}(X)$. I’ve thought about this a few times, but I wasn’t able to really see what was happening until a few days ago when I realized I had a serious misconception about picturing bundles and etale spaces! In this post, we’ll talk about that misconception, and spend some time discussing constructive finiteness in its most important forms. As a short life update before we get started, I’m currently in Denmark with my advisor to hang out with Fabian Haiden. We’re going to be talking about all sorts of fun things related to my thesis work, mainly about Fukaya Categories. These can be really scary when you first start reading about them, but in the special case of surfaces they’re really not that bad! In the process of prepping for meeting Fabian, I’ve been writing a blog post that explicitly details what fukaya categories are, why you should care, and (for surfaces) how to compute them. I know I would have loved an article like this, and I’m excited to share one soon! I’m also finally going to finish my qual prep series from three years ago! Way back then I promised a post on fourier theory that I never got around to, but I recently found a draft of that post! So it might not be 100% true to what I was studying back then, but it’ll be as true to that as I can make it1. With all that out of the way, let’s get to it! First, let’s just recall the notions of finiteness you’re likely to find in the literature2. Keep in mind that, depending on the reference you’re reading, each of these is liable to be called just “finite” without disambiguation. Sometimes you even get more confusing conventions, (such as writing “subfinite” to mean what we call “kuratowski finite”!) so make sure you read carefully to know exactly what each particular author means! For this post, though, we’ll write: If $n : \mathbb{N}$, the Finite Cardinal of size $n$ is the set \([n] = \{ x : \mathbb{N} \mid x \lt n \}\). A set $X$ is called Bishop Finite if it’s (locally) isomorphic to a cardinal. That is, if \(\exists n : \mathbb{N} . \exists f : X \cong [n]\) A set $X$ is called Kuratowski Finite if it’s (locally) the image of a cardinal. That is, if \(\exists n : \mathbb{N} . \exists f : [n] \twoheadrightarrow X\). A set $X$ is called Subfinite if it’s (locally) a subobject of a cardinal. That is, if \(\exists n : \mathbb{N} . \exists f : X \hookrightarrow [n]\). A set $X$ is called Dedekind Finite if every mono $X \hookrightarrow X$ is actually an iso. That is, if we can prove \(\forall f : X \hookrightarrow X . \exists g : X \to X . fg = 1_X \land gf = 1_X\). It’s pretty obvious from these definitions that dedekind finiteness is a bit different from the others. This comes with pros and cons, but in my experience it’s rarely the thing to consider. I’m including its definition more for cultural growth than anything, and I won’t say anything more about it in this post3. Also, notice we’re putting an existential quantifier in front of everything. Externally, this means that we’re allowed to pass to an open cover of our base space and use a different $f$ (or indeed, a different $n$!) on each open. If you prefer type theory4, you can replace every instance of $\exists$ by the propositional truncation of a $\Sigma$-type. It’s interesting to ask what the “untruncated” versions of these will be. I think that untruncated bishop finite types are exactly the cardinals, untruncated subfinite types are disjoint unions of finitely many propositions… But it’s not clear to me what the untruncated kuratowski finite types should be. Something like “finitely many copies of $B$, glued together along open sets”… But I don’t see a snappy characterization of these5. Introducing Finiteness Now, how should we go about visualizing these things? Every sheaf on $B$ is also an etale space over $B$ (that is, a local homeomorphism6 $E \to B$). And by thinking of our favorite sheaf as some space over $B$ we can draw pictures of it! Let’s start simple and let $B = [-1,1]$ be an interval. Then a cardinal7 is a sheaf \([n] = \{x : \mathbb{N} \mid x \lt n \}\). In every sheaf topos $\mathsf{Sh}(B)$ the natural number object is (the sheaf of sections of) $B \times \mathbb{N}$ where $\mathbb{N}$ gets the discrete topology. So if $n$ is a global section, it must be \(B \times \{ n \}\) for some fixed $n$ in “the real world”. This also tells us that \([n] = \{ x : \mathbb{N} \mid x \lt n \}\) is given by The situation is only slightly more complicated if $B$ has multiple components8 (say $B = [-1,1] \sqcup S^1$). In this case, $\mathbb{N}$ is still $B \times \mathbb{N}$, but now global sections can choose a different natural over each component: Because of this, we end up with more sets $[n]$! Indeed, now a global section $n$ might be something like $(2,3)$ (shown in pink above) so that we get a cardinal $[(2,3)]$, shown below: So cardinals are really really easy to work with! This is what we would expect, since they’re subsheaves of a particularly simple sheaf ($\mathbb{N}$). For instance, we can see the fact that cardinals always have decidable equality by noticing that any two sections over $U$ are either equal on all of $U$, or none of $U$! Contrast this with a doodle of what other sheaves might look like, where the pink and blue sections are different, but nonetheless intersect. Then the truth value of $(\text{pink} = \text{blue}) \lor (\text{pink} \neq \text{blue})$ as computed in $\mathsf{Sh}([-1,1]) \big / U$ will not be all of $U$9! We also see why a cardinal is either empty or inhabited! As soon as you have a piece of a section over $U$, it must extend to a section on all of $U$. Next up are the bishop finite sets10! These satisfy $\exists f : X \cong [n]$, so that we can find an open cover \(\{U_\alpha \}\) of $B$ and functions \(f_\alpha : X \! \upharpoonright_\alpha \cong [n] \! \upharpoonright_\alpha\). Since $[n]$ is a constant family over $B$, this means we want $X$ to be locally constant. So bishop finite sets have to basically look like cardinals, but now we’re allowed to “twist” the fibres around. See how on each element of our cover we have something that looks like $[2]$, but globally we get something that is not isomorphic to $[2]$! In fact, the bishop finite sets are exactly the covering spaces with finite fibres. These inherit lots of nice properties from the cardinals, since as far as the internal logic is concerned they’re isomorphic to cardinals! For instance, here’s an internal proof that bishop finite sets have decidable equality: $\ulcorner$ We know $\exists f : X \cong [n]$. Fix such an $f$, and let $x,y : X$. Since $f$ is an isomorphism we know $x=y$ if and only if $fx = fy$, but we can decide this using decidable equality on $[n]$ (which comes from decidable equality on $\mathbb{N}$, which we prove by induction). $\lrcorner$ We can also see this externally. As before, this is saying that two sections over $U$ either agree everywhere or nowhere (and it’s a cute exercise to see this for yourself). But we can even see this in a third way! It says that the subsheaf \(\{(x,y) : X \times X \mid x=y \}\) is a clopen subset of $X \times X$! But we can compute $X \times X$ (the pullback of $X \to B$ along $X \to B$) and check this. Note that the subsheaf (really sub-etale space) where the fibres are the same really is a clopen subset (read: a connected component) of the whole space $X \times X$ over $B$. If it’s not obvious that this really is the pullback, it makes a fun exercise! Actually, it’s a fun exercise to check this is true in general! As a last remark, let’s also notice that bishop finite sets can be inhabited without being pointed. That is, a bishop finite set $X$ can satisfy $\exists x : X$ ($X$ is inhabited11) without actually having $x:X$ for any $x$ ($X$ is not pointed)! This is because of the locality of the existential quantifier again! If we pass to an open cover of $B$, we can find a local section over each open. Unfortunately, these might not be compatible, so won’t glue to a global section (a point $x:X$)! Ok, now let’s get to the example that led me to write this post! How should we visualize kuratowski finite sets? These are (locally) quotients of cardinals, so we start with a cardinal (which we know how to visualize) and then we want to start gluing stuff together. Let’s start with a trivial double cover of $B = [-1,1]$ (that is, with two copies of $B$), and glue them together along their common open subset $[-1,0) \cup (0,1]$. This space is the line with two origins, and it’s famously nonhausdorff! You can see that this space still deserves to be called “finite” over $B$ (for instance, all the fibres are finite), but it’s more complicated than the bishop case. For instance, it doesn’t have a well defined “cardinality”. In some places it has one element in the fibre, and in others it has two. However, it’s still possible to list or enumerate all the sections. In the above picture, there’s the gold section and the blue section. It just happens that sometimes this list has duplicates, since away from $0$ they’re the same section! The failure of hausdorffness and the failure of decidable equality are closely related. For example, consider the two sections from the previous picture. Decidable equality says that they should be either everywhere equal or everywhere unequal12. But of course they aren’t! They’re equal in some places (over $1$, say) but unequal in others (over $0$)! Following Richard Borcherds, hausdorffness is also closely related to a kind of “analytic continuation”. Indeed, say your etale space $X$ is hausdorff and you pick a point $x_b$ over $b \in B$ (in this context we think of $x_b$ as the germ of a function at $b$). Then for any small enough open $U \ni b$, there is a unique extension of $x_b$ to a function on the whole neighborhood $U$! We are able to analytically continue $x_b$ from data at just a point to data defined in some neighborhood! Here’s an easy exercise to see this for yourself and check your knowledge of the topology on the etale space of a sheaf: Let $\mathcal{F}$ be a sheaf on $B$ whose etale space $X$ is hausdorff, and let $U$ be a connected open of $X$. Let $f,g \in \mathcal{F}(U)$ be local sections with $f(b_0) = g(b_0)$ for some $b_0 \in U$. Then show $f=g$ identically on $U$. As a hint, consider the set \(\{ b \in U \mid f(b) = g(b) \}\). Why is it open? Why is it closed? In fact, if $B$ is locally connected and hausdorff, then the converse is also true, and our sheaf of sections $\mathcal{F}$ has this analytic continuation property if and only if its etale space is hausdorff13! This also makes a nice exercise (though it’s less easy), and I’ll include a solution below a fold. solution $\ulcorner$ definition of stalk says that $f=g$ on a neighborhood of $b$. Since $U$ is connected, we learn this subset is the whole of $U$ and $\mathcal{F}$ has analytic continuation. $\lrcorner$ When I was trying to picture kuratowski finite objects inside $\mathsf{Sh}(B)$, I’d somehow convinced myself that a local homeomorphism over a hausdorff space has to itself be hausdorff. This is, obviously, not true! So I was trying to picture kuratowski finite sets and struggling, because I was only ever picturing hausdorff covering spaces. And we’ll see later that all hausdorff kuratowski finite sheaves (over spaces I was picturing) are actually bishop finite! So it’s no wonder I was confused! I didn’t realize the complexity (especially the nonhausdorff complexity) that etale spaces are allowed to have, even though in hindsight I’d been warned about this before14. Going back to examples, we should ask if we recognize the sheaf of sections for our favorite “line with two origins” picture from earlier. Away from $0$, we know there’s exactly one section, but in any neighbordhood of $0$ there’s two sections. So we see this is the etale space for the skyscraper sheaf with two points over $0$! Since our epi $[n] \twoheadrightarrow X$ only has to exist locally, kuratowski finite objects in $\mathsf{Sh}(B)$ can have the same twisting behavior as bishop finite sets too! If you look up an example of a kuratowski finite set that isn’t bishop finite (for example, in Section 2.2.2 of Graham Manuell’s excellent note), you’re likely to find the example \(\{ \top, p \}\) where $p$ is some truth value other than $\top$ or $\bot$. That is, where $p$ is some open set of the base space $B$. Away from $p$, this set has two elements, $\top$ and $p$ (since away from $p$ we know $p = \bot$). But inside of $p$, this set has a single element (since in $p$ we know $p = \top$). We find its etale space is nonhausdorff as before, but now the nonhausdorffness is “spread out” over a larger set. It’s a good exercise to figure out what’s happening at the boundary of $p$. What are the basic opens of this space? Does the fibre at $p$ have one point, or two? Kuratwoski finite sets can be shockingly complicated! Especially in contrast to the relatively tame bishop finite sets. For example, consider $B = \mathbb{R}$ with a skyscraper sheaf at every integer15: This space is kuratowski finite in $\mathsf{Sh}(\mathbb{R})$ (since it’s locally a quotient of $\mathbb{R} \sqcup \mathbb{R}$), but it has continuum many global sections! Indeed, we can choose either the top or the bottom point at every integer. Since even the choice of $[n]$ is local, we can change the size of the fibres as long as they’re all finite! In fact, on mastodon (here and here), Antoine Chambert-Loir and Oscar Cunningham pointed out that you can do even weirder things. For instance, the set \(\{\frac{1}{n} \mid n \in \mathbb{N} \} \cup \{ 0 \}\) is closed in $[0,1]$. So its complement is open, and we can glue two copies of $[0,1]$ together along it! This gives us a kuratowski finite object which nonetheless has infinitely many sections in every neighborhood of $0$16! Finally, let’s talk about subfinite sets. These are locally subobjects of cardinals, so are much easier to visualize again! A subobject of a cardinal is just a disjoint union of open subsets of $B$: So a subfinite set has to locally look like these. Keep in mind, though, that like kuratowski finite sets, the choice of $[n]$ is made locally. So we can have something like this: In particular, we see that the fibres of a subfinite set don’t need to be globally bounded! So, perhaps surprisingly, a subfinite set does not need to be a subobject of a bishop finite set! As we’ve come to expect, the existential quantifier gives us the ability to twist. As subobjects of decidable objects, we see these inherit decidable equality. However, while bishop and kuratowski finite objects are all either empty or inhabited, subfinite sets don’t need to be! For an easy example, consider Here the truth value of $\exists x : X$ is $U \cup V$, and the truth value of $X = \emptyset$ is $\text{int}(U^c \cap V^c)$. (Do you see why?) An aside on decidable equality This isn’t technically about visualizing etale spaces, but I was thinking about it while writing this post, and I think it fits well enough to include. If nothing else, it’s deifnitely interesting enough to include! If you’re really paying attention, you’ll notice the “easy exercise” about analytic continuation didn’t actually use any aspects of finiteness. Indeed, we can prove the following (very general) theorem: Let $B$ be locally connected and hausdorff. Then the following are equivalent for a sheaf $\mathcal{F}$ on $B$: The etale space of $\mathcal{F}$ is hausdorff Sections of $\mathcal{F}$ satisfy “analytic continuation”, in the sense that two sections agreeing on a stalk17 must agree everywhere they’re defined $\mathcal{F}$ has decidable equality in the internal logic of $\mathsf{Sh}(B)$ This comes from a mathoverflow answer by Elías Guisado Villalgordo $\ulcorner$ The equivalence of $1$ and $2$ was the earlier exercise, and you can find a proof below it under a fold/ To check the equivalence of $2$ and $3$, we look at “decidable equality”, that $\forall f,g : \mathcal{F} . (f=g) \lor (f \neq g)$. Externally, this says that for every $U$, for every pair of local sections $f,g \in \mathcal{F}(U)$, there’s an open cover \(\{U_\alpha\}\) so that on each member of the cover \(f \! \upharpoonright_{U_\alpha}\) and \(g \! \upharpoonright_{U_\alpha}\) are either everywhere equal or everywhere unequal. Clearly if $\mathcal{F}$ has analytic continuation then $\mathcal{F}$ has decidable equality. Indeed using local connectedness of $B$ we can cover $U$ by connected opens. By analytic continuation we know that on each connected piece either $f=g$ everywhere or nowhere, as desired. Conversely, say $\mathcal{F}$ has decidable equality. Fix a connected $U$ and sections $f,g \in \mathcal{F}(U)$ which have the same stalk at some point $b \in U$. By decidable equality, there’s an open cover \(\{ U_\alpha \}\) of $U$ where on each member of the cover $f=g$ holds everywhere or nowhere. Recall that since $U$ is connected, for any two opens in the cover $U_\alpha$ and $U_\beta$ we can find a finite chain $U_\alpha = U_0, U_1, U_2, \ldots, U_n = U_\beta$ of opens in our cover where each $U_i$ intersects $U_{i+1}$. Then for any $U_\beta$ in the cover, fix such a chain from $U_0 \ni b$ to $U_\beta$. By decidable equality we know $f=g$ on $U_0$. So for the point in $U_0 \cap U_1$ we know $f=g$, and by decidable equality again we learn $f=g$ on $U_1$. Proceeding in this way we learn $f=g$ on the whole of $U_\beta$. Thus $f=g$ on every member of the cover, so on the whole of $U$, and so $\mathcal{F}$ satisfies analytic continuation. $\lrcorner$ If you’ve internalized this, it makes it geometrically believable that a decidable kuratowski finite set is actually bishop finite! I think it’s fun to see this fact both syntactically (reasoning in the internal logic) and semantically (reasoning about bundles), so let’s see two proofs! First, a syntactic proof: $\ulcorner$ Say $f : [n] \twoheadrightarrow X$. Then we note for each $x:X$ the predicate $\varphi(k) \triangleq f(k) = x$ on $[n]$ is decidable since $X$ has decidable equality. Thus it gives an inhabited decidable subset of $[n]$, which has a least element (Lemma D5.2.9(iii) in the elephant). Call such an element $g(x)$. Note the image of $g$ is a complemented subset of $[n]$, since we can decide $k \in \text{im}(g)$ by checking if $k = g(f(k))$ using decidability of $[n]$. Then $\text{im}(g)$ is isomorphic to a cardinal (D5.2.3), $[m]$, and it’s easy to see that composing $g$ and $f$ with this isomorphism gives a bijection between $X$ and $[m]$. $\lrcorner$ And now a semantic proof: $\ulcorner$ Say $\pi : E \twoheadrightarrow B$ is the etale space of a kuratowski finite object in $\mathsf{Sh}(B)$. So, locally, $E$ is the quotient of some $\coprod_n B$. Say also that $E$ is has decidable equality so that, locally, two sections of $E$ that agree somewhere must agree everywhere. By refining our covers, then, we may fix an open cover \(\{U_\alpha\}\) of $B$ and epimorphisms \(\coprod_{n_\alpha} U_\alpha \to E \! \upharpoonright_{U_\alpha}\) so that two local sections on $U_\alpha$ either agree everywhere or nowhere. In particular, the $n_\alpha$ many components $f_i : U_\alpha \overset{i\text{th inclusion}}{\longrightarrow} \coprod_{n_\alpha} \to E$ are pairwise either have the same image or disjoint images. Choosing exactly one $f_i$ for each possible image (the least $i$ that works, say) we see that actually $\pi^{-1}(U_\alpha)$ is homeomorphic to the disjoint union of copies of $U_\alpha$ so that $E$ is actually a covering space, and thus represents a bishop finite object. $\lrcorner$ It’s worth meditating on why these are actually the same proof! In both cases we use decidability to “remove duplicates” from our enumeration. Which Finite to Use? Constructively, bishop and kuratowski finiteness encode two aspects of finiteness which are conflated in the classical finite world. Bishop finite sets are those which admit a cardinality. They’re in bijection with some $[n]$, and so have a well defined notion of size. Kuratowski finite sets, in contrast, are those equipped with an enumeration. You can list all the elements of a kuratowski finite set. Of course, knowing how to biject a set $X$ with a set $[n]$ always tells you how to enumerate $X$. But constructively knowing how to enumerate $X$ doesn’t tell you how to biject it with some $[n]$! As we saw earlier, the problem lies with removing duplicates. It’s worth taking a second to visualize a kuratowski finite set that isn’t bishop finite, and convince yourself that the question “is this a duplicate element” can have more subtle answers than just “yes” or “no”. This makes it impossible to remove the duplicates. This bifurcation might feel strange at first, but in some squishy sense it happens classically too once you start working with infinite sets. Indeed in set theory the notion of finiteness bifurcates into cardinals, which have a defined notion of “size”, and ordinals, which have a defined notion of “order” (kind of like an enumeration… if you squint). So when you’re working constructively, ask why you’re using finiteness. Do you really need to know that something literally has $n$ elements for some $n$? Or is it enough to know that you can put the elements in a finite list? Does it matter if your list has duplicates? These questions will tell you whether bishop finiteness or kuratowski finiteness is right for your purposes. Subfiniteness, which amounts to being contained in some $[n]$, I find to be less useful “in its own right”. Instead, if you’re able to prove a result for bishop finite objects, it’s worth asking if you can strengthen that to a proof for subfinite ones! In any case, as long as we work with decidable things, these notions all coincide. As we saw earlier, a decidable quotient of a bishop finite set (that is, a decidable kuratowski finite set) is again bishop finite. Similarly, a complemented subset of a bishop finite set is finite (this is lemma D5.2.3 in the elephant, but it’s not hard to see yourself). So if you’re working in a context where everything is decidable, finite sets work exactly as you would expect! This is part of why constructive math doesn’t have much to say about combinatorics and algebra. Most arguments are already constructive for decidable finite sets (which is what you’re picturing when you picture a finite set and do combinatorics to it). Sometimes you can push things a bit farther though, and this can be interesting. See, for instance, Andreas Blass’s paper on the Constructive Pigeonhole Principle. Another blog post done! I’m writing this on a train to Odense right now, and once I get to my room I’ll draw all the pictures and post it. Hopefully this helps demystify the ways in which constructive math might look strange at first. Once again the seemingly bizarre behavior is explained by its vastly greater generality! When you first learn that a subset of a (bishop) finite set need not be (bishop) finite, it sounds so strange as to be unusable! But once you learn that this is an aspect of “everywhere definedness” in a space of parameters (read: the base space) it becomes much more palatable. Thanks, as always, for hanging out. Try to keep cool during this summer heat (though it’s actually really pleasant in Denmark right now), and I’ll see you all soon ^_^. Here’s a picture of me and Peter right after we got off the train: And, of course, I still really want to finish the blog post on 2-categories and why you should care… There’s just so many other things to think about and to write! It also feels like a big topic because I have a lot to say, and that makes me a bit scared to actually start. Especially since I’ve been writing a lot of longer form posts lately (like the three part series on the topological topos, and I can tell the fukaya post is going to be long too…) ↩ Here we’re using the expected abbreviations. We write $f : X \cong Y$ to mean $f : X \to Y$ and $\exists g : Y \to X . gf = 1_X \land fg = 1_Y$. We write $f : X \hookrightarrow Y$ to mean $f : X \to Y$ and $\forall x_1, x_2 : X . fx_1 = fx_2 \to x_1 = x_2$ We write $f : X \twoheadrightarrow Y$ to mean $f : X \to Y$ and $\forall y : Y . \exists x : X . fx = y$. ↩ If you want to learn more about it, I highly recommend Stout’s Dedekind Finiteness in Topoi. I don’t know of a good way to visualize the dedekind finite objects in $\mathsf{Sh}(X)$ (though I haven’t tried at all – I want this to be a quick post) which was another reason to exclude them. ↩ Which I often do, but not today. ↩ And I want this to be a quick post, finished within a day or two of starting. So I don’t have time to think if there’s something slicker (and intuitively I don’t expeect there to be). Edit: I got about as close to “within a day or two” as is possible for me, haha. I started writing this on August 12, and it looks like I’ll get it posted on August 18. Given I went a few days without working on it, that’s not too bad! ↩ That is, a space built by gluing together opens of $B$ along common intersections ↩ Johnstone’s Elephant defines a cardinal to be a pullback of the “universal cardinal”. In $\mathsf{Sh}(B)$ this is an object in $\mathsf{Sh}(B) \big / \mathbb{N} \simeq \mathsf{Sh}(B \times \mathbb{N})$ and is shown in the following picture: ↩ We’ll still assume that $B$ is locally connected, though, since that makes them much easier to draw. It’s important to remember that our geometric intuition is likely to fail us if we start considering, say, sheaves on the cantor space! ↩ Remember truth values are open sets in the base space, so the truth value of $(\text{pink} = \text{blue}) \lor (\text{pink} \neq \text{blue})$ will be \(\text{int}(\{ \text{pink} = \text{blue} \}) \cup \text{int}( \{ \text{pink} \neq \text{blue} \} )\) ↩ I’m like… at least 90% sure it’s an accident these are both named after positions of authority in catholicism. ↩ Here we use the common abuse of notation of abbreviating “$\exists x:X . \top$” by just “$\exists x:X$”. It measures “to what extent” there exists an element of $X$. That is, its truth value is the support of $X$ over $B$ – \(\text{int} \big ( \{b \in B \mid \exists x \in X_b \} \big )\) ↩ Well, it says this should be true locally. But it’s easy to see we’ll have a problem in any neighborhood of $0$. ↩ I stole this whole exercise from this mathoverflow answer by Elías Guisado Villalgordo ↩ In that same Richard Borcherds video, for instance, which I know I watched when it came out. In fact, lots of references on etale spaces, in hindsight, emphasize the fact that etale spaces can be highly nonhausdorff. My guess is that I’m not the first person to have made this mistake, haha, and I know that when I teach etale spaces going forwards I’m going to be sure to emphasize this too! In his talk Nonetheless One Should Learn the Language of Topos, Colin McLarty compares etale spaces to a piece of mica (around the 1h04m mark), and I think I’m starting to see what he means. ↩ Concretely you can build this space by gluing two copies of $\mathbb{R}$ together along their common open subset $\mathbb{R} \setminus \mathbb{Z}$. ↩ This is a fun example to think about. Why does it have only countably many sections at $0$, rather than uncountably many? Can you picture its topology at all? See the linked mastodon posts for more discussion. ↩ Saying that $f$ and $g$ agree on the stalk at $b$ is saying that there’s an open neighborhood of $b$ where $f=g$ on that neighborhood. So this is another way of saying that as soon as two sections agree on an arbitrarily small neighborhood, they must agree on their whole (connected) domain! ↩

5 months ago 52 votes
Life in Johnstone's Topological Topos 3 -- Bonus Axioms

In the first post of the series, we talked about what the topological topos is, and how we can think about its objects (and, importantly, how we can relate computations in the topos $\mathcal{T}$ to computations with topological spaces in “the real world”). In part two, we talked about algebraic structures, and how (for example) groups in $\mathcal{T}$ are related to topological groups. In that post we alluded to the presence of ~bonus axioms~ that allow us to reason in $\mathcal{T}$ more freely than in many other topoi. For instance, we have access to a certain amount of choice. We also have access to a powerful principle saying that every function between metric spaces is $\delta$-$\epsilon$ continuous! In this post we’ll spend some time talking about these bonus axioms, and proving that they’re true (since a lot of these facts are basically folklore). Let’s get to it! First, let’s take a second to recall the definition of the grothendieck topology $J$ for $\mathcal{T}$. We’re going to be externalizing a lot of theorems, and it’ll be good to have the open cover condition on hand. Here’s what we wrote back in the first post, but you can of course read more in Section 3 of Johnstone’s original paper. For the site with two objects, \(\{1, \mathbb{N}_\infty\}\), every (nonempty) family of arrows \(\{X_\alpha \to 1 \}\) is covering. So the interesting question is what a covering family of \(\mathbb{N}_\infty\) looks like. If $S$ is an infinite subset of $\mathbb{N}$, we write $f_S$ for the unique monotone map \(\mathbb{N}_\infty \to \mathbb{N}_\infty\) whose image is \(S \cup \{ \infty \}\). A family \(\{X_\alpha \to \mathbb{N}_\infty\}\) is covering if and only if both It contains every constant map \(1 \to \mathbb{N}_\infty\) For every infinite $T \subseteq \mathbb{N}$, there is a further infinite subset $S \subseteq T$ with \(f_S : \mathbb{N}_\infty \to \mathbb{N}_\infty\) in the family In particular, if a family contains every constant map \(1 \to \mathbb{N}_\infty\) and a “tail of an infinite sequence” \(f_{\{x \geq N\}}\) for some $N$, then that family is covering. So, roughly, to prove that something “merely exists” in $\mathcal{T}$, we have to provide a witness for every finite $n$, and these witnesses should converge to the witness for $\infty$. If we want to use the site with one object \(\{ \mathbb{N}_\infty \}\), the condition is almost exactly the same. A family of maps is covering if and only if both every constant map \(\mathbb{N}_\infty \to \mathbb{N}_\infty\) is in the family For each infinite $T \subseteq \mathbb{N}$, there’s a further infinite $S \subseteq T$ so that $f_S$ is in the family. This, unsurprisingly, doesn’t make too much difference. Dependent Choice To start, $\mathcal{T}$ models Dependent Choice. You can find a proof in Shulman and Simpson’s aptly named note Dependent Choice in Johnstone’s Topological Topos. Say you have a relation $R \subseteq X \times X$ which is total in the sense that $\forall x . \exists y . R(x,y)$. Then DC says for each $x_0 : X$, there’s a function $f : \mathbb{N} \to X$ so that $f(0) = x_0$ and for each $n$, $R(f(n), f(n+1))$. This is intuitively obvious. After all, we start with $x_0$, then by totality the set \(\{ y \mid R(x_0,y) \}\) is inhabited. So we can choose an element $x_1$ from this set. Similarly we can choose an $x_2$ from \(\{y \mid R(x_1,y) \}\), and so on. Notice that the choices we make are allowed to depend on the choices that came before. After all, it’s possible that for two different choices \(x_1,x_1' \in \{ y \mid R(x_0,y) \}\) the sets \(\{ y \mid R(x_1,y) \}\) and \(\{ y \mid R(x_1',y) \}\) might be different, leading to different allowable choices of $x_2$! Thus, DC basically tells us that recursive definitions work, even if we don’t have a canonical way to choose one of many options at each recursive stage. Indeed, most recursive definitions work by first choosing an $x_0$, and then arguing that the set of “allowable” next steps is always inhabited1. For more information about DC and how to think about it, I recommend Karagila’s excellent paper on the subject. For those who like type theory, DC says that for every binary relation $R$ on a type $X$, or, cashing out these \(\lVert \Sigma \rVert\)s for \(\exists\)s, \(\left ( \forall (x:X) . \exists (y:X) . R(x,y) \right ) \to \forall (x_0 : X) . \exists (f : \mathbb{N} \to X) . f(0) = x_0 \land \forall n . R(f(n), f(n+1))\) Here’s an idiomatic example of dependent choice in action: The Baire Category Theorem for complete metric spaces. Let $(X,d)$ be a (cauchy) complete metric space2 with inhabited (strongly3) dense open sets $U_1, U_2, U_3, \ldots$. Then the countable intersection $\bigcap_n U_n$ is still (strongly) dense. The usual proof (say, from Karagila’s notes) doesn’t use LEM, so it goes through unchanged. We’ll present it here paying special attention to the use of DC. $\ulcorner$ Let $V$ be open in $X$. We need to show that $V \cap \bigcap_n U_n$ is inhabited. Since $U_1$ is strongly dense, we know that $V \cap U_1$ is inhabited, say by $x_1$. Now since $U_1 \cap V$ is open, we can find a radius $r_1$ so that $0 \lt r_1 \lt 2^{-1}$ the (strongly closed) ball \(\overline{B(x_1,r_1)} = \{ y \mid d(x_1,y) \leq r_1 \}\) is contained in $V \cap U_1$ By dependent choice (and strong density of each $U_k$), we may recursively build a sequence of pairs $(x_n,r_n)$ so that $0 \lt r_{n+1} \lt 2^{-(n+1)}$ the (strongly closed) ball $\overline{B(x_{n+1},r_{n+1})}$ is contained in $B(x_n,r_n) \cap U_{n+1}$. By construction, then, the $x_n$s assemble into a cauchy sequence whose limit $x_\infty$ lies in each $B(x_n,r_n)$. Therefore $x_\infty \in \bigcap_n B(x_n,r_n) \subseteq V \cap \bigcap_n U_n$, so that $\bigcap_n U_n$ is dense, as desired. $\lrcorner$ Can you build a relation $R$ on $(X \times \mathbb{R}_{\gt 0} \times \mathbb{N})$ so that $R \big ( (x,r,n), \ (y,s,m) \big )$ holds exactly when $m=n+1$ and $(y,s)$ satisfy the above conditions compared to $(x,r)$? This will let you see precisely how dependent choice is used above. Dependent Choice implies Countable Choice, which itself implies Weak Countable Choice. But WCC implies that the dedekind reals and the cauchy reals agree. And indeed one can show directly that in $\mathcal{T}$ both the dedekind and cauchy reals are given by $よ\mathbb{R}$. Brouwer’s Principle The next ~bonus axiom~ we’ll talk about is Brouwer’s Continuity Principle that “Every function $f : \mathbb{R} \to \mathbb{R}$ is continuous”! Precisely4: \(\mathcal{T} \vDash \forall f : \mathbb{R}^\mathbb{R} . \ \forall \epsilon : \mathbb{R}_{\gt 0} . \ \forall a : \mathbb{R} . \ \exists \delta : \mathbb{R}_{\gt 0} . \ \forall b : \mathbb{R} . \ d(a,b) \lt \delta \to d(fa,fb) \lt \epsilon\) In fact, this is true for any (external) metric spaces $X$ and $Y$ where $X$ is (externally) locally compact! In particular, it’s also true that every function $\mathbb{N}^\mathbb{N} \to \mathbb{N}$ is $\epsilon$-$\delta$ continuous. Note that it’s crucial here that we use the truncated $\exists$ rather than the untruncated $\Sigma$ in the statement of this theorem. Martín Escardó and Chuangjie Xu have shown that the untruncated version of this theorem isn’t just false in $\mathcal{T}$, it’s provably false in every topos! Regardless, it’s shockingly hard to find this continuity principle written down anywhere, but it’s cited in lots of places! It’s definitely part of the folklore, and I’m happy to share a full proof! It’s a bit long, though, so I’m leaving it as an “appendix” at the bottom of this post. If you know of a reference, or of a slicker proof than the one I found, I would REALLY love to hear about it5! Regardless, the truth of Brouwer’s principle tells us that $\mathcal{T}$ is a rather stronger version of Solovay’s Model. Solovay’s model validates \(\mathsf{LEM}+\mathsf{DC}+\)“every function $\mathbb{R} \to \mathbb{R}$ is measurable”. In $\mathcal{T}$, we have $\mathsf{DC}$ and the stronger “every function $\mathbb{R} \to \mathbb{R}$ is continuous”. But the price we pay is LEM. Omniscience Principles We can ask about other nonconstructive principles too. For instance, Bishop’s Principles of Omniscience! First, let’s look at the Limited Principle of Omniscience (LPO): The Limited Principle of Omniscience: says that every sequence of bits is either $0^\omega$ or contains a $1$. That is: In the presence of countable choice (which we have in $\mathcal{T}$), this is equivalent to the analytic LPO, which says: and is further equivalent to the type-theoretic condition that the obvious embedding is an isomorphism. It’s clear from the last condition that LPO is false in $\mathcal{T}$. Since both \(\mathbb{N} + \{ \infty \}\) and \(\mathbb{N}_\infty\) are sequential, these internal types correspond to the expected spaces externally, where this is obviously not an isomorphism. After all, one space is discrete and the other isn’t. That said, it can be hard to find example computations of people statements internal to a topos to statements in the real world, so just for fun let’s prove this again in a more complicated way: $\ulcorner$ If it were true, then we would know $1 \Vdash \text{LPO}$. Now cashing out the universal quantifier, we would know that for every \(s \in 2^\mathbb{N}(\mathbb{N}_\infty)\) Here \(s_k : \mathbb{N}_\infty \to 2^\mathbb{N}\) is allowed to be any convergent sequence in cantor space, and we interpret $0_k$ and $1_k$ as constant sequences. Let’s take $s_k$ to be the sequence $0^k 1^\omega$. That is, the $k$th element of this sequence, $s_k$, should be the point in cantor space with $k$ many $0$s followed by an infinite tail of $1$s. Note this sequence converges to $0^\omega$. Now what would it mean to have $(\forall n . s(n) = 0) \lor (\exists n . s(n) = 1)$? We would have an open cover of \(\mathbb{N}_\infty\) where each element of that cover thinks that one of these disjuncts is true. But every covering seive of \(\mathbb{N}_\infty\) contains a function $f_U$ for $U$ an infinite subset of $\mathbb{N}$. Now restricting $s$ to this member of the cover amounts to restricting $s$ to a subsequence, $s_{r_k}$. Is it possible that \(\mathbb{N}_\infty \Vdash \forall n . s_{r_k}(n) = 0_{r_k}\)? This says for every convergent sequence \(n_k \in \mathbb{N}(\mathbb{N}_\infty)\), we must have \(s_{r_k}(n_k) = 0_{r_k}\) for all $k$. Of course, it’s easy to find a convergent sequence where this is false! We can just choose \(n_1 \gt r_1\) to make \(s_{r_1}(n_1) = 1 \neq 0\). Is it possible for \(\mathbb{N}_\infty \Vdash \exists n . s_{r_k}(n) = 1_{r_k}\)? This says we can pass to a further subsequence \(s_{\ell_{r_k}}\) so that for some convergent sequence \(n_k \in \mathbb{N}(\mathbb{N}_\infty)\) we have \(s_{\ell_{r_k}}(n_k) = 1_k\) for all $k$. But of course this is false too! Every convergent sequence of naturals is eventually constant, say $n_k = N$ for $k \gg 1$. Then for $k$ large enough, we’ll have both $n_k = N$ and \(\ell_{r_k} \gt N\), in which case \(s_{\ell_{r_k}}(n_k) = 0 \neq 1\). So we see that LPO externalizes to a false claim, and thus is not validated by $\mathcal{T}$. $\lrcorner$ As an easy exercise, can you use LPO to build a function $\mathbb{R} \to \mathbb{R}$ which is not $\epsilon$-$\delta$ continuous? This provides another proof that $\mathcal{T} \not \models \text{LPO}$, since it contradicts Brouwer’s principle. In fact, we don’t even need Bouwer’s principle to contradict! By yoneda, $\text{Hom}(\mathbb{R}, \mathbb{R})$ is the set of continuous functions on $\mathbb{R}$, but if you build a function in $\mathcal{T}$ you know what that function does externally on points! In particular, you can contradict with continuity “in the real world”. Next, let’s look at the Lesser Limited Principle of Omniscience (LLPO): The Lesser Limited Principle of Omniscience says that This is equivalent to a kind of de Morgan’s law and, as before, under countable choice this is equivalent to the analytic LLPO, \(\forall x : \mathbb{R}. (x \geq 0) \lor (x \leq 0)\) This turns out to be true6! Just to show more ways to reason about the internal logic of topoi, we’ll prove this one by working with the category $\mathcal{T}$ directly. Since type theoretic functions externalize to arrows in $\mathcal{T}$, though, we’ll use type theory to label our arrows. $\ulcorner$ Proposition 6.2 in Johnstone’s original paper implies that $\mathbb{R}$ is the pushout of the closed cover Now showing $\forall x : \mathbb{R} . (x \geq 0) \lor (x \leq 0)$ amounts to building a section of the projection $\pi : \sum_{x : \mathbb{R}} \lVert (x \geq 0) + (x \leq 0) \rVert$. Here I’ve also cashed out the $\lor$ for a propositional truncation of a coproduct. But by the universal property of the pushout, we get a map $s : \mathbb{R} \to \sum_{x : \mathbb{R}} \lVert (x \geq 0) + (x \leq 0) \rVert$ as below. Note the truncation \(\lVert (x \geq 0) + (x \leq 0) \rVert\) is crucial for making the middle square commute! Moreover, since both $\pi s$ and $\text{id}_\mathbb{R}$ make the outer square commute, they must be equal by uniqueness in the universal property. So $s$ is the desired section of $\pi$, and $\mathcal{T}$ models LLPO. $\lrcorner$ As a nice exercise, check that internal to $\mathcal{T}$ the type $\mathbb{R}$ is equivalent to the quotient type7 and use this fact to give a purely type theoretic proof of that LLPO holds in $\mathcal{T}$. As another nice (but quite tricky!) exercise, try externalizing the statement of LLPO and proving it “directly” by seeing that it externalizes to something true! Next on this list is Markov’s Principle (MP): Markov’s Principle says that Again this is equivalent (under CC) to an analytic version8 Here \(\#\) means that $x$ is apart from $0$. That is, $\exists q : \mathbb{Q} . (x \lt q \lt 0) \lor (0 \lt q \lt x)$. MP is true in $\mathcal{T}$, and this is not so hard to show. We’ll write this proof in a slightly more conversational style, but we encourage the reader interested in learning topos theory to check all the details with the forcing language. $\ulcorner$ We want to show so we fix a convergent sequence \(s_k : \mathbb{N}_\infty \to 2^\mathbb{N}\). We want to show that if, for all \(f : \mathbb{N}_\infty \to \mathbb{N}_\infty\) then we must have To show this existential claim, we need to provide a conergent sequence $n_k$ so that each $s_k(n_k) = 1$. But taking $f$ to be a constant function in $(\star)$, we learn that such an $n_k$ exists for each $k$. Moreover, since \(s_k \to s_\infty\), there is a $k \gg 1$ so that $s_k$ and $s_\infty$ agree on the first $n_\infty + 1$ many bits, so that we may take $n_k = n_\infty$ when $k$ is large enough. Thus the sequence converges, as desired. $\lrcorner$ Now WLPO is easy: The Weak Limited Principle of Omniscience says that This is equivalent to a kind of excluded middle, and as expected there’s an analytic version (equivalent in the presence of CC): \(\forall x : \mathbb{R} . (x = 0) \lor \lnot (x=0)\) Now, it’s easy to see that MP + WLPO $\implies$ LPO (look at the analytic versions). So we learn indirectly that $\mathcal{T}$ cannot satisfy WLPO. Of course, here’s a more direct proof that the analytic version can’t work: $\ulcorner$ Consider the sequence $x_n = \frac{1}{n}$, converging to $0$, as an element of \(\mathbb{R}(\mathbb{N}_\infty)\). If $\mathcal{T}$ were to model WLPO, it would mean (among other things) that some subsequence of $x_n$ would be either everywhere $0$ or everywhere nonzero. But no subsequence has this property, since $x_n$ is nonzero for each finite $n$, but zero at $n=\infty$. So $\mathcal{T}$ does not model WLPO. $\lrcorner$ As an aside, in all of these statements we’re using the “truncated” $\lor$ and $\exists$, and it’s natural to ask what happens if we change these to their “untruncated” versions $+$ and $\Sigma$. It’s a theorem of Martín Escardó that truncated and untruncated LPO are equivalent truncated and untruncated WLPO are equivalent, and are equivalent to untruncated LLPO truncated LLPO is weaker than untruncated LLPO In fact, in his agda files, Martín wonders if there’s a place in the literature where untruncated LLPO and WLPO are shown to be inequivalent. He mentions that $\mathcal{T}$ should be an example, and here we’ve shown that in fact it is! After all, $\mathcal{T} \models \text{LLPO}$ but $\mathcal{T} \not \models \text{WLPO}$! Bar and Fan Theorems The Bar and Fan theorems are closely related to the nice properties of baire space and cantor space (respectively) as spaces rather than as locales. The locales are always well behaved, but in some topoi these locales fail to have enough points, so that the baire space and cantor space as sets of points may be lacking. Since we showed in part 1 that spatial sequential regular locales always have enough points in $\mathcal{T}$, we see that baire space and cantor space both have enough points! By Propositions 3.12 and 3.13 in van den Berg and Moerdijk’s Derived rules for predicative set theory: An application of sheaves we see that the Monotone Bar Theorem and the Full Fan Theorem hold in $\mathcal{T}$. This is also the best we can do. Since the Full Bar Theorem is known to imply LPO, and $\mathcal{T} \not \models \mathsf{LPO}$ we know that we can’t improve the monotone bar theorem to the full bar theorem in $\mathcal{T}$. As a not-so-hard exercise, verify the decidable fan theorem by hand! That is, prove Or, entirely in symbols, prove Note that $B : 2^{\lt \mathbb{N}} \to 2$ is a decidable subset of $2^{\lt \mathbb{N}}$. To prove the full bar theorem you would want to prove the same theorem for all subsets of $2^{\lt \mathbb{N}}$. That is, for $B : 2^{\lt \mathbb{N}} \to \Omega$. As a ~bonus exercise~, use what you know about discrete topological spaces and subobjects in $\mathcal{T}$ to argue that a semantic proof of this theorem is easily modified to give a proof of the full bar theorem. discussion of the ~bonus exercise~ The idea here is that a general subobject of a sequential space X is a subset of the points of X equipped with some topology making the inclusion continuous. A decidable subobject is a clopen subset of the points of $X$ equipped with the induced topology (do you see why?). Of course, since $2^{\lt \mathbb{N}}$ is discrete, it's not hard to see that _every_ subobject is decidable! This means in proving the theorem for decidable subobjects you've actually proven the theorem for _all_ subobjects! De Morgan and LEM Obviously LEM fails, since $\Omega \not \cong 1+1$. But what about De Morgan’s Laws? It turns out that $\mathcal{T}$ is not de Morgan. In another paper (the aptly named Conditions Related to de Morgan’s Law) Johnstone gives a slew of conditions equivalent to a topos being de Morgan. In particular, de Morgan-ness is equivalent to $[\top,\bot] : 1+1 \to \Omega_{\lnot \lnot}$ being an isomorphism $1+1$ being injective But we can show both of these are false (thus giving two proofs that $\mathcal{T}$ is not de Morgan). For $1$, we can compute that \(\mathcal{T}_{\lnot \lnot}\) is equivalent to $\mathsf{Set}$, where the equivalence sends a set $X$ the space $X$ equipped with the indiscrete topology. This is stated at the end of Section 3 of Johnstone’s original paper. In particular, $1+1$, which has the discrete topology, is not isomorphic to \(\Omega_{\lnot \lnot}\) (which is indiscrete). For $2$, we know that in $\mathsf{Seq}$ there’s a map $(0,1) + (2,3) \to 1+1$ which doesn’t extend to a map $(0,3) \to 1+1$. Since monos in $\mathsf{Seq}$ are still monos in $\mathcal{T}$ (since the inclusion is a right adjoint), we’re done. This was a long one! Multiple months in the making, and easily the most research I’ve ever done for a blog post (both in terms of reading done and original proofs). It was super rewarding, though, and I feel way better about the topological topos and its internal logic, as well as about topos theory more broadly ^_^. Hopefully I was able to explain it clearly enough to be useful to all of you too! I know it’s a TON of information, and in the process of revising this I really struggled to tell if it’s well exposited or not since it kind of feels like this But even though it’s a ton of information, hopefully each section is digestible! Thanks again for reading all. I have other posts planned, about my thesis work for a change, but I think I’m going to take a break from writing after this, haha. Stay safe, and talk soon! 💖 Appendix: A Proof that Johnstone’s Topos Models Brouwer’s Continuity Principle If you’re not super familiar with externalizing formulas, you might want to read my old blog post with a bunch of simpler examples before trying to tackle this one! We’ll be doing this computation using the site with one object \(\mathbb{N}_\infty\). We’ll prove a kind of “theorem schema”. Every metric space is sequential, so for any metric spaces $X$ and $Y$ in “the real world” we can think of $X$ and $Y$ as objects of $\mathcal{T}$. Now if $X$ is moreover locally compact, we’ll prove that $\mathcal{T}$ models Precisely: \(\mathcal{T} \vDash \forall f : Y^X . \ \forall \epsilon : \mathbb{R}_{\gt 0} . \ \forall a : X . \ \exists \delta : \mathbb{R}_{\gt 0} . \ \forall b : X . \ d(a,b) \lt \delta \to d(fa,fb) \lt \epsilon\) $\ulcorner$ We want to show that Cashing out the universal quantifiers, we want to show that for any continuous functions \(f : \mathbb{N}_\infty \times X \to Y\), \(\epsilon : \mathbb{N}_\infty \to \mathbb{R}_{\gt 0}\), and \(a : \mathbb{N}_\infty \to X\) in “the real world” we have To witness the existential quantifier, we need to find a cover of \(\mathbb{N}_\infty\), and produce a real-world $\delta$ defined on each member of the cover. Finding a cover basically amounts to finding a cofinite subset of $\mathbb{N}$, and passing to a tail of all the sequences in sight. With this in mind, we choose a compact neighborhood $K$ of $a_\infty$ (using local compactness of $X$), and choose \(\delta^*\) so that \(B(a_\infty, \delta^*)\) is contained in $K$ and for every $x \in B(a_\infty, \delta^*)$ we have $d(f_\infty a_\infty, f_\infty x) \lt \frac{\epsilon_\infty}{6}$ (using continuity of $f_\infty$). Then, let $N$ be large enough that for all $n \gt N$: $\epsilon_n \gt \frac{\epsilon_\infty}{2}$ $d(f_n a_n, f_\infty a_\infty) \lt \frac{\epsilon_\infty}{6}$ For all $x \in K$, we have $d(f_\infty x, f_n x) \lt \frac{\epsilon_\infty}{6}$ $d(a_n, a_\infty) \lt \delta^*$ In condition (3) we’ve used the fact that $f_n$ converges to $f_\infty$ uniformly on the compact set $K$. Now we take as our covering familiy the constant functions \(k : \mathbb{N}_\infty \to \mathbb{N}_\infty\) \(i_S : \mathbb{N}_\infty \to \mathbb{N}_\infty\) for any infinite subset \(S \subseteq \{ n \gt N \} \subseteq \mathbb{N}\). As a reminder, the function $i_S$ is the unique monotone function \(\mathbb{N}_\infty \to \mathbb{N}_\infty\) whose image is \(S \cup \{\infty\}\). Now on each member $g$ of this family, we need to produce a function \(\delta : \mathbb{N}_\infty \to \mathbb{R}_{\gt 0}\) which witnesses Cashing out the last universal quantifier, we need to know that for all \(h : \mathbb{N}_\infty \to \mathbb{N}_\infty\) and for all continuous \(b : \mathbb{N}_\infty \to X\), If \(\mathbb{N}_\infty \Vdash d(a_{gh(-)},b) \lt \delta_{h(-)}\) then we must have \(\mathbb{N}_\infty \Vdash d(f_{gh(-)}(a_{gh(-)}), f_{gh(-)}(b)) \lt \epsilon_{gh(-)}.\) So let’s build such a $\delta$ for the two cases in our cover! First, if $g$ is the constant $k$ function. Then we need a convergent sequence $\delta_n$ so that for any function \(h : \mathbb{N}_\infty \to \mathbb{N}_\infty\) and any convergent sequence $b_n$ in $X$, If $d(a_k,b_n) \lt \delta_{hn}$ for all $n$, then $d(f_k(a_k), f_k(b_n)) \lt \epsilon_k$ for all $n$ too. Of course, this is easy to arrange by taking $\delta_n$ to be the constant sequence witnessing continuity of $f_k$ at $a_k$. Second, if $g = i_S$ is the unique monotone function whose image is \(S \cup \{ \infty \}\). Recall also that every member of $S$ is at least $N$. Then we define $\delta_n$ to be \(\delta^* - d(a_{i_S n}, a_\infty)\). Note this is convergent, with limit \(\delta^*\). Now we must show for any function \(h : \mathbb{N}_\infty \to \mathbb{N}_\infty\) and for any convergent sequence \(b_n\) in $X$ that If $d(a_{i_S h n}, b_n) \lt \delta_{hn}$ for all $n$, then $d(f_{i_S h n}(a_{i_S h n}), f_{i_S h n}(b_n)) \lt \epsilon_{i_S h n}$. But since $i_S h n \gt N$ and \(d(b_n, a_\infty) \leq d(b_n, a_{i_S h n}) + d(a_{i_S h n}, a_\infty) \lt \delta^*\) we see that $\epsilon_{i_S h n} \gt \frac{\epsilon_\infty}{2}$ $d(f_{i_S h n}(a_{i_S h n}), f_{\infty}(a_\infty)) \lt \frac{\epsilon_\infty}{6}$ $d(f_\infty b_n, f_{i_S h n} b_n) \lt \frac{\epsilon_\infty}{6}$ so that we can compute As desired. $\lrcorner$ You might wonder why we need a “nonconstructive” axiom to do this. Why can’t we use induction on $\mathbb{N}$? After all, we can prove (constructively, in type theory) that (and this makes a nice exercise!) The difference lies in $\sum$ vs $\exists$! To build a term of type $\prod_x \sum_y R(x,y)$ is to build a function that eats an $x$ and returns a $y$ alongside a proof that $R(x,y)$. This gives us a canonical choice in \(\{ y \mid R(x,y) \}\) – just use the one this function gave us! Dependent choice works with something much weaker. It says we can build such a function even when there merely exists such a $y$, without being handed a witness! (Of course, the function we’re given only merely exists too) Think about the semantics in $\mathsf{Sh}(B)$ for a moment. Here, to say that $\exists y . R(x,y)$ is to say that there’s an open cover of $B$ and a local witness $y$ on each element of the cover. But it’s entirely possible for these witnesses to not glue into a global witness! ↩ I don’t know if there are constructive subtleties with the notion of cauchy completeness which might be relevant here, and since I really want to get this blog post out I don’t want to read a bunch of literature on constructive metric spaces to try and figure it out… If anyone happens to know some facts about constructive metric spaces, though, I would love to hear about them! But for now, treat this example as being more to showcase how dependent choie works than to say anything profound about the topological topos. ↩ A set $D \subseteq X$ is called Strongly Dense if for any inhabited open set $V$ we know that $D \cap V$ is inhabited. ↩ This is the first time in a while I’ve actually written down the definition of (metric space) continuity in full! No wonder students struggle with this, haha. I’ve forgotten what a mouthful it is! ↩ Here we have to refer to external metric spaces, and we prove a kind of “theorem schema”. I suspect something like this is true purely internally, if we can find the right definition of a “metric space” in $\mathcal{T}$. Obviously we want a distance function \(d : X \times X \to \mathbb{R}_{\geq 0}\) satsifying the usual axioms. But we also need to know that the topology $d$ puts in $X$ agrees with the intrinsic topology on $X$! Since $d$ is externally continuous we know that metric-open balls will always be open in $X$, so I think the condition is something like “every open of $X$ contains an open ball” or maybe “every open of $X$ is the union of the open balls inside it”. This should be expressible in the internal logic since the opens of $X$ are exactly the inhabitants of $\Sigma^X$ where $\Sigma$ is the sierpinski space. I know that Davorin Lešnik has thought about this, but I really want to get this post out (and ideally turn it into a paper) so unfortunately I won’t be pursuing this any further… for now! ↩ This realization has probably been made by many people, but it was added to the nlab by Mike Shulman. ↩ This is the quotient type in the sense of Li’s PhD thesis, not the (higher!) quotient type in the sense of HoTT… Though the quotient type I mean is almost certainly the $0$-truncation of the higher inductive quotient type. I’ve been meaning to spend some time thinking about how you can prove theorems about a 1-topos by working in HoTT and truncating everything at the end, but I haven’t had the time. ↩ Actually, I think I remember reading somewhere that the analytic omniscience principles are always statements about the cauchy reals. The reason countable choice makes them properties of the dedekind reals is because under CC the dedekind and cauchy reals agree. If an expert sees this and happens to know offhand if that’s true, I would love to know for sure! ↩

7 months ago 58 votes
Life in Johnstone's Topological Topos 2 -- Topological Algebras

In the first post, we introduced Johnstone’s topological topos $\mathcal{T}$ and talked about what its objects look like. We showed how the interpretation of type theory in $\mathcal{T}$ gives us an “intrinsic topology” on any type we construct. We also alluded to the fact that, by working in $\mathcal{T}$ as a universe of sets, we’re able to interact with topological gadgets by forgetting about the topology entirely and just manipulating them naively as we would sets! In this post, we’ll talk about how that works in the special case of algebraic gadgets, like groups, rings, etc., and use this to prove some interesting theorems about topological groups! Recall Lawvere’s notion of Functorial Semantics. An Algebraic Theory is presented by some function symbols and equational axioms (we allow constant symbols as 0-ary functions), and this is probably best given through a “definition by examples”. The usual presentation of the theory of groups is A set $G$ equipped with function symbols $e : G^0 \to G$ $(-)^{-1} : G^1 \to G$ $\cdot : G^2 \to G$ satisfying the equational axioms $(x \cdot y) \cdot z = x \cdot (y \cdot z)$ $x \cdot e = x = e \cdot x$ $x \cdot x^{-1} = e = x^{-1} \cdot x$ Notice that the theory of posets is not algebraic1 and indeed the usual presentation involves a relation symbol $\leq$ (which is not allowed) rather than only function symbols. Similarly, the theory of fields is not algebraic2 and the usual presentation requires an axiom that’s much more complicated than just an equation: $(x = 0) \lor \exists y . xy = 1$. However, these presentations by functions and equational axioms should really be thought of as presentations. There are superficially quite different presentations which still present the same theory. For instance, here is another presentation of the theory of groups3: A set $G$ equipped with a function symbol $/ : G^2 \to G$ satisfying the equational axiom $x / \big ( ((x/x)/y)/z) / ((x/x)/x)/z \big ) = y$ With this in mind it’s natural to want an abstract characterization of an algebraic theory, that is independent of the choice of presentation. In his PhD thesis, Lawvere set this in motion by showing that for any algebraic theory $\mathbb{T}$, there’s a Classifying Category $\mathcal{C}_\mathbb{T}$ so that If we have a good understanding of $\mathbb{T}$, then we can get our hands on $\mathbb{C}_\mathbb{T}$ concretely since it’s (opposite) the full subcategory of finitely generated free models! Important for us is the related result that models of $\mathbb{T}$ in some other finite product category are given exactly by finite product functors from $\mathcal{C}_\mathbb{T}$ into that category! So, for example, a topological group is the same data is a finite product functor \(\mathcal{C}_\mathsf{Grp} \to \mathsf{Top}\), while a lie group is the data of a finite product functor \(\mathcal{C}_\mathsf{Grp} \to \mathsf{Diff}\). This is what’s going to give us the ability to relate algebras in $\mathcal{T}$ to topological algebras! Let’s see how it works! First, say we have a group object in $\mathcal{T}$. This is the data of a finite product preserving functor $\mathcal{C}_\mathsf{Grp} \to \mathcal{T}$. But we know from part 1 that the reflector $r : \mathcal{T} \to \mathsf{Seq}$ preserves finite products too! So composing these gives a finite product functor which is a group object in $\mathsf{Seq}$. That is, a (sequential) topological group4! Conversely, say we have a sequential topological group. Then the embedding $e : \mathsf{Seq} \to \mathcal{T}$ is a right adjoint, and in particular preserves finite products. So again we get a group object in $\mathcal{T}$! In fact, the adjunction $r \dashv e$ gives us an adjunction \(r_* \dashv e_*\) between the functor categories, and since $r \circ e \cong \text{id}_\mathsf{Seq}$, this is true at the level of functor categories too! So the category of sequential topological groups is a reflective subcategory of the category of groups in $\mathcal{T}$, and the reflector is exactly what you expect: Just take the topological reflection of the underlying object of $G$! There’s nothing special about groups here, and so we learn that for any algebraic theory, the category of sequential models is a reflective subcategory of the category of models in $\mathcal{T}$. Thus, any question we have about (sequential) topological models can be answered in $\mathcal{T}$ without losing information, and anything we prove about models in $\mathcal{T}$ immediately gives us results about topological models by reflecting (though in this direction we possibly lose information about the proofs of convergence). This is all kind of abstract right now, so let’s do a very down-to-earth example: In $\mathcal{T}$, a subset of $G$ is any monic $X \hookrightarrow G$ (that is, any continuous injection). In particular, $X$ does not need to have the subspace topology! With this in mind, a subgroup of $G$ is just a continuous injection $H \hookrightarrow G$ whose image is a subgroup in the usual sense. Of course, this pulls back (by injectivity) to a unique group structure on $H$ rendering the inclusion a homomorphism. Now here’s a typical (very easy!) theorem/construction: Let $X$ be any subset of $G$, a group. Then there’s a smallest subgroup $\langle X \rangle \leq G$ containing $X$. $\ulcorner$ If $X$ is any subset, we define This is a subgroup containing $X$, and any other subgroup containing $X$ is part of the intersection, rendering this the smallest such subgroup. $\lrcorner$ Notice that this proof is constructive in the sense that it doesn’t use LEM or Choice5. In particular, this proof works in every topos, and thus in $\mathcal{T}$. But what does this exceptionally simple proof tell us about topological groups? Well subsets and subgroups are continuous injections, so this tells us that6 Let $X \hookrightarrow G$ be any continuous injection into a topological group $G$. Then there’s a topological group $\langle X \rangle$ with a continuous injection $\langle X \rangle \hookrightarrow G$ so that $X \hookrightarrow G$ factors through $\langle X \rangle$ $\langle X \rangle$ is initial with this property We can actually build such an $H$ by externalizing the proof of this theorem too! Subsets are interpreted as general monics into $G$, and the “intersection” of two monics externalizes to their pullback. So the desired $\langle X \rangle$ is exactly the pullback of the family of all continuous injections $H \hookrightarrow G$ factoring the inclusion from $X$7. In case $G$ is sequentially hausdorff, this is on-the-nose correct. In case $G$ isn’t, then $\langle X \rangle$ might live in $\mathsf{Kur}$ instead of $\mathsf{Seq}$. But that’s ok! We can just hit it with the reflector to get an “honest” topological group with the same universal property (among the continuous injections whose domain is also “honest”). Now, it’s entirely possible that you would have come up with such a theorem yourself. After all, a moment’s thought shows that $\langle X \rangle$ is “just” the usual subgroup generated by $X$, equipped with the finest topology rendering $X \hookrightarrow \langle X \rangle$ continuous. The utility of the topos theoretic language is in doing more complicated constructions, where we’re still allowed to manipulate everything as though they’re sets, and we can be safe in the knowledge that, at the end of the day, we can cash out our theorem for one about topological spaces! It frees us from the burden of carrying around topologies all the time. For a more complicated example, one can show that the category of abelian groups in a (grothendieck) topos is always AB5. In particular, the category of abelian groups in $\mathcal{T}$ is abelian and has enough injectives, so we can do homological algebra to it! Contrast this with the category of abelian groups in $\mathsf{Top}$, which is famously not abelian! This is one of the big motivations for Condensed Mathematics. Indeed, in $\mathsf{Top}$, the continuous bijection of abelian groups $(\mathbb{R},\text{discrete}) \to (\mathbb{R},\text{euclidean})$ is not an isomorphism. Yet the kernel and cokernel are both trivial! In both condensed mathematics and the topological topos, this is remedied by a more complicated cokernel. Remember that the colimits preserved by the embedding $\mathsf{Seq} \to \mathcal{T}$ are only those that look like covers. In fact, we can compute the cokernel as the coequalizer of the inclusion map and the constant $0$ map. In the topos, this is the sheafififcation of the colimit of presheaves, which are computed pointwise. So the underlying set of the cokernel is the colimit of the underlying sets is ${ 0 }$. But the convergent sequences is the colimit of where one map is just the inclusion, and the other sends every eventually constant sequence to the constant $0$ sequence. So the cokernel has a single point ${ 0 }$, but there’s a proof that the constant $0$ sequence converges for every equivalence class of convergent sequences differing by an eventually constant sequence! Keeping track of these proofs (which themselves form an abelian group) is exactly what we need to do to algebraically detect that $(\mathbb{R}, \text{discrete}) \to (\mathbb{R}, \text{euclidean})$ isn’t an isomorphism! As an aside, I don’t understand condensed mathematics well enough to know how it differs from math in the topological topos. Just looking at definitions, I know it’s based on test maps from all compact hausdorff spaces instead of test maps from only \(\mathbb{N}_\infty\). This probably means it’s closely related to compactly generated spaces in much the way that $\mathcal{T}$ is related to sequential spaces. I’m sure there’s a reason to prefer this, but I don’t know what it is8. The moral to keep in mind is that the power of doing algebra in a topos that handles the topology for you is currently being used to great effect in applying homological algebra to analytic situations where it previously couldn’t go! Alright, I told you this one was going to be more leisurely than the last one! Now that we’ve seen some applications of $\mathcal{T}$ to topological algebra, and we’ve seen some basic externalization, let’s move on to part 3 and really get familiar with the internal logic and how it relates to the real world! You can show this with categorical techniques. For instance, the category of models of any algebraic theory is always regular, while the category of posets isn’t ↩ The category of models for any algebraic theory always has an initial object, yet the category of fields doesn’t! ↩ See McCune’s Single Axioms for Groups and Abelian Groups with Various Operations. This operaetion is related to the “usual” operations by $x / y = x \cdot y^{-1}$. ↩ Remember, though, that the product on $\mathsf{Seq}$ is different from the product on $\mathsf{Top}$. This never matters in practice, and the $\mathsf{Seq}$ product agrees with the product in the “convenient category” of compactly generated spaces, but if you want an honest group object in $\mathsf{Top}$, you’ll want $G$ to be locally compact. ↩ It’s not predicative, but that’s fine for a topos. And regardless, if you know enough to complain about predicativity, you know enough to give a predicative version of this proof :P. ↩ Indeed, it says something slightly stronger than this! In the case of non (sequentially) hausdorff spaces, there might be extra subsets that are merely kuratowski limit spaces! The theorem says we’re actually allowed to take $X$ to be such a subspace as well! ↩ The diligent reader will note there are a proper class of such arrows, so this pullback as written isn’t defined. Of course, the domain of any such arrow has at most $|G|$ many elements, and there’s only a set worth of topologies we can put on one of these domains. So up to isomorphism there’s only a set worth of arrows, and we’re good to go! ↩ Peter Scholze actually says a few words about why condensed sets are easier to work with than objects of $\mathcal{T}$ in a comment to his answer to this MO question. I still don’t really see it, but that’s probably because I haven’t spent a lot of time (or any time) working with condensed sets. ↩

7 months ago 46 votes

More in science

Incorruptible Skepticism

Everything, apparently, has a second life on TikTok. At least this keeps us skeptics busy – we have to redebunk everything we have debunked over the last century because it is popping up again on social media, confusing and misinforming another generation. This video is a great example – a short video discussing the “incorruptibility’ […] The post Incorruptible Skepticism first appeared on NeuroLogica Blog.

5 hours ago 1 votes
Sony Kills Recordable Blu-Ray And Other Vintage Media

Physical media fans need not panic yet—you’ll still be able to buy new Blu-Ray movies for your collection. But for those who like to save copies of their own data onto the discs, the remaining options just became more limited: Sony announced last week that it’s ending all production of several recordable media formats—including Blu-Ray discs, MiniDiscs, and MiniDV cassettes—with no successor models. “Considering the market environment and future growth potential of the market, we have decided to discontinue production,” a representative of Sony said in a brief statement to IEEE Spectrum. Though availability is dwindling, most Blu-Ray discs are unaffected. The discs being discontinued are currently only available to consumers in Japan and some professional markets elsewhere, according to Sony. Many consumers in Japan use blank Blu-Ray discs to save TV programs, Sony separately told Gizmodo. Sony, which prototyped the first Blu-Ray discs in 2000, has been selling commercial Blu-Ray products since 2006. Development of Blu-Ray was started by Philips and Sony in 1995, shortly after Toshiba’s DVD was crowned the winner of the battle to replace the VCR, notes engineer Kees Immink, whose coding was instrumental in developing optical formats such as CDs, DVDs, and Blu-Ray discs. “Philips [and] Sony were so frustrated by that loss that they started a new disc format, using a blue laser,” Immink says. Blu-Ray’s Short-Lived Media Dominance The development took longer than expected, but when it was finally introduced a decade later, Blu-Ray was on its way to becoming the medium for distributing video, as DVD discs and VHS tapes had done in their heydays. In 2008, Spectrum covered the moment when Blu-Ray’s major competitor, HD-DVD, surrendered. But the timing was unfortunate, as the rise of streaming made it an empty victory. Still, Blu-Rays continue to have value as collector’s items for many film buffs who want high-quality recordings not subject to compression artifacts that can arise with streaming, not to mention those wary of losing access to movies due to the vagaries of streaming services’ licensing deals. Sony’s recent announcement does, however, cement the death of the MiniDV cassette and MiniDisc. MiniDV, magnetic cassettes meant to replace VHS tapes at one-fifth the size, were once a popular format of digital video cassettes. The MiniDisc, an erasable magneto-optical disc that can hold up to 80 minutes of digitized audio, still has a small following. The 64-millimeter (2.5-inch) discs, held in a plastic cartridge similar to a floppy disk, were developed in the mid-1980s as a replacement for analog cassette tapes. Sony finally released the product in 1992, and it was popular in Japan into the 2000s. To record data onto optical storage like CDs and Blu-Rays, lasers etch microscopic pits into the surface of the disc to represent ones and zeros. Lasers are also used to record data onto MiniDiscs, but instead of making indentations, they’re used to change the polarization of the material; the lasers heat up one side of the disc, making the material susceptible to a magnetic field, which can then alter the polarity of the heated area. Then in playback, the polarization of reflected light translates to a one or zero. When the technology behind media storage formats like the MiniDisc and Blu-Ray was first being developed, the engineers involved believed the technology would be used well into the future, says optics engineer Joseph Braat. His research at Philips with Immink served as the basis of the MiniDisc. Despite that optimism, “the density of information in optical storage was limited from the very beginning,” Braat says. Despite using the compact wavelengths of blue light, Blu-Ray soon hit a limit of how much data could be stored. Even dual-layer Blu-Ray discs can only hold 50 gigabytes per side; that amount of data will give you 50 hours of standard definition streaming on Netflix, or about seven hours of 4K video content. MiniDiscs still have a small, dedicated niche of enthusiasts, with active social media communities and in-person disc swaps. But since Sony stopped production of MiniDisc devices in 2013, the retro format has effectively been on technological hospice care, with the company only offering blank discs and repair services. Now, it seems, it’s officially over.

6 hours ago 1 votes
How Does Life Happen When There’s Barely Any Light?

Under the sea ice during the Arctic’s pitch-black polar night, cells power photosynthesis on the lowest light levels ever observed in nature. The post How Does Life Happen When There’s Barely Any Light? first appeared on Quanta Magazine

yesterday 2 votes
Why housing shortages cause homelessness

It's not just about rents - it's also about the rooms friends and family can't afford to share

yesterday 2 votes
The Most Important Time in History Is Now

AGI Is Coming Sooner Due to o3, DeepSeek, and Other Cutting-Edge AI Developments

yesterday 2 votes