Because of its importance to the IFF metastack, we discuss, quote and paraphrase here in some detail Cantor’s diagonal argument as presented in the book Sets for Mathematics (2003) by Lawvere and Rosebrugh. As discussed in that book, in the nineteenth century George Cantor proved an important theorem using a diagonal argument, which implies the result that any set X is smaller than its power set X < 2X = PX. (For sets X and Y, the inequality X ≤ Y means that there exists at least one injection from X to Y, and the strict inequality X < Y means X ≤ Y and that no surjection X → Y exists.) The two essential ingredients in Cantor’s argument are the diagonal function and a fixed point free map. As Lawvere and Rosebrugh point out,

“This diagonal argument has been traced (by philosophers) back to ancient philosophers who used something like it to mystify people with the Liar’s paradox. Cantor, however, used his method to prove positive results, namely inequalities between cardinalities. The philosopher Bertrand Russell, who was familiar with Cantor’s theorem, applied it to demonstrate the inconsistency of a system of logic proposed by the philosopher Frege; since then philosophers have referred to Cantor’s theorem as Russell’s paradox and have even used their relapse into the ancient paradox habit as a reason for their otherwise unfounded rumor that Cantor’s set theory might be inconsistent. (Combatting this rumor became one of the main preoccupations of the developers of the axiomatized set theories of Zermelo, Fraenkel, von Neumann, and Bernays [P. Suppes. Axiomatic Set Theory. Dover, 1972]. This preoccupation assumed such an importance that the use of such axiom systems for clarifying the role of abstract sets as a guide to mathematical subjects such as geometry, analysis, combinatorial topology, etc., fell into neglect for many years.)”

Cantor’s Theorem

Powersets are strictly larger than their base. Let Y be any (base) set. An endofunction τ : Y  Y has an element y  Y as a fixed point when τ(y) = y. The set Y has the fixed point property when every endofunction τ : Y  Y has at least one fixed point. Although there are objects in the category of continuous spaces with the fixed point property (n-dimensional unit ball), most abstract sets do not have the fixed point property.

Theorem 1.
Suppose there is a set X and a function φ : X×X  Y whose curry φ̃ : X  YX, where φ̃(a) = φ(a,–) for all a ∈ X, is surjective; that is, such that for every function f : X  Y there is at least one element a  X such that f = φ̃(a) = φ(a,–). Then Y has the fixed point property.
Proof:
Consider any endofunction τ : Y → Y. We must show that τ has a fixed point. Using the diagonal function δX : X → X×X, define the function f = δX · φ · τ : X → X×X → Y → Y. In other words, for all x ∈ X, f(x) = τ(φ(x,x)). By the assumption, f = φ(a,–) for some element element a ∈ X; that is, f(x) = φ(a,x) for some element element a ∈ X and all lements x ∈ X. Hence, τ(φ(x,x)) = φ(a,x) for all x ∈ X. In particular, letting x = a, τ(φ(a,a)) = φ(a,a). Hence, φ(a,a) = φ(δX(a)) is a fixed point of τ. ∎
Corollary 1 (Cantor).
If a set Y has at least one endofunction τ : Y  Y with no fixed points, then for every set X there is no surjection X  YX.
Corollary 2.
For any set X, X < 2X = PX.
Proof:
Let 2 be the set of two elements 2 = {0,1}. Logical negation ¬ : 2 → 2, defined by ¬(0) = 1 and ¬(1) = 0, has no fixed point. Hence, there is no surjection X → 2X = PX. But, the singleton map {–} : X → 2X = PX is injective. ∎
Corollary 3.
There cannot exist a “universal set” U for which every set X is a subset X  U.
Proof:
If so, then the inclusion X → U is an injection. Hence, the exponent map 2U → 2X is a surjection. Define X = 2U to get a contradiction. ∎
Corollary 4.
The collection set of all sets is not a set.
Proof:
If set were a set, then U = set = {X | X ∈ set}, the union of all sets, would be a well-defined set. But, U would clearly be a “universal set”. Hence, the assumption that set is a set leads to a contradiction. ∎

Let us take inventory of the assumptions on sets that were needed to prove Corollary 4. We have assumed that sets have: (basic) singleton maps, diagonal maps, unbounded unions, the two-element (truth-value) set 2, logical negation; (finite limits) terminal elements, finite products; and (exponents) exponent sets, curry operators, exponent functions. From a category-theoretic viewpoint, all of these assumptions are satisfied in an elementary topos. From a set-theoretic viewpoint, we can assume Cantor’s theorem holds and that the collection set is a well-structured collection.

Cantorian Expansion

Cantor’s theorem entails an expansion. One consequence of Cantor’s theorem (Corollary 4) is that the collection set is not a set. In working practice, mathematicians distinguish between ordinary collections, such as the collection of all the current occupants of Paris or the collection of all of the stars in the Milky Way galaxy, from “larger” collections such as set. The latter kind of set is called a “proper class”, while the ordinary kind of set is called a “small set”. The collection of small sets, like the set of natural numbers, is either defined naturally, by convention or logically/mathematically. The set of natural numbers, which occurs in nature, was used in antiquity (convention) and axiomatized in modern times (logically/mathematically).

Corollary 4 states that set ∉ set; that is, that there are sets that are not small. But of course set is some kind of set. Let’s change the notation, letting set1 = set denote the collection of small sets, and set2 denote the collection of sets either small or not (call them “large sets”). Hence, the collection of all small sets is a large set set1 ∈ set2, any small set is itself a large set set1 ⊆ set2, but the collection of all small sets is not a small set set1 ∉ set1 (thus, we have the strict inequality set1 ⊂ set2). Of course, we also note that the collection of all groups is a large set grp ∈ set2, the collection of all topological spaces is a large set top ∈ set2, and the collection of all vector spaces is a large set vect ∈ set2. Hence, typical categories such as the category Grp of (small) groups and group morphisms, the category Top of (small) topological spaces and continuous maps, the category Vect of (small) vector spaces and linear transformations, and the category Set = Set1 of (small) sets and functions between small sets, are large categories in the sense that their sets of objects and morphisms are large.

In practice, in particular in the application of category theory, mathematicians sometimes need to distinguish an even larger kind of collection, called a “very large set”. Specifically, the category CAT of large categories and functors has very large object and morphism sets. Let set3 denote this even larger notion of set (kind of collection): set2 ⊆ set3 and set2 ∈ set3 (thus, we have the strict inequality set2 ⊂ set3), and let Set2 denote the category of (large) sets and functions between large sets. Then Set2 is a very large category – a category whose collection of objects (morphisms) is very large. This initiates the chain

Set1 ⊂ Set2 ⊂ Set3 ⊂ ⋯ ⊂ Setn ⊂ ⋯ .

This chain is at the heart of the IFF metastack, and results in the architecture of the core concepts, which includes not only a chain of different kinds of sets, but also (using pullbacks, in particular optimal restriction on functions and abridgment on relations) corresponding chains of (unary) functions and (binary) relations.

To reiterate, the Cantorian expansion is by convention. First, we assume by convention that a set of “small” sets exists. Call this set1. We also assume that the collection set1 is well-structured (that is, that there are power and unbounded union operators defined on set1). By Cantor's theorem, set1 ∉ set1. Hence, there are sets that are not just small. Call such sets, including set1, “properly large” sets. Call a set, that is either small or properly large, a “large” set. Second, we assume by convention that a well-structured collection of “large” sets exists. Call this set2. Note that set1 ∈ set2, set1 ⊂ set2 and set1 ∉ set1. By Cantor's theorem, set2 ∉ set2. Hence, there are sets that are not just large. Call such sets, including set2, “properly very large” sets. Call a set, that is either large or properly very large, a “very large” set. Third, we assume by convention that a well-structured collection of “very large” sets exists. Call this set3. Note that set2 ∈ set3, set2 ⊂ set3 and set2 ∉ set2. By Cantor's theorem, set3 ∉ set3. Hence, there are sets that are not just very large. And we continue in this fashion, thereby generating the Cantorian expansion.

Corollary 5 [Cantorian Expansion].
The collection of all sets unfolds into a chain (of Cantorian featureless abstract sets)
set = set1 ⊂ set2 ⊂ ⋯ ⊂ setn ⊂ …,
where set1 denotes the collection of all “small” sets.
Proof:
Starting from the small sets set1, apply Corollary 4 repeatedly. ∎

The Cantorian expansion is used as the foundation of the IFF. It is the minimal assumption required to obey Cantor’s theorem for well-structured collections. F.W. Lawvere has suggested in an email message to the CAT list that the small sets could be contructed from the large sets as the solution to a certain set isomorphism (equation) expressed using the real numbers. Since the IFF follows only a minimal approach to foundations, it should be possible to further constrain the IFF axiomatization to respect the Lawvere construction.

 

Some IFF documents such as this one contain a lot of unicode characters, and so require a recent browser. But even some recent browsers fail to render it properly. Two browsers that do render these documents correctly are Firefox and Opera.