It’s All Sets

Set Theory underpins the foundations of Mathematics. When all of modern Mathematics is formulated axiomatically with sets, it baffles me how much the syllabus requires us to hide it away from students in the HSC Mathematics courses.

At the moment it only rears its head when dealing with topics in Probability and Venn Diagrams, and beyond that it is only mentioned in passing without much definition. Moreover, in the latest Years 3-10 Draft Reformed Syllabus, basic Set Theory is not even given passing lip service!

When the last couple centuries of Mathematical development have depended so much on Sets, I often wonder why our high school advanced Mathematics courses do not also update to match these developments.

Review of Sets

A Set is a collection of distinct objects, denoted with curly brackets. Members of a set are also called elements of the set. For example:

\[ X = \{a,b,3\} \]

\[ Y = \{a,c,4\} \]

For the following, examples will use the above two sets \(X\) and \(Y\). Operations with two sets include:

  • Intersection: a set containing elements that are in both sets.
    \[ X \cap Y = \{a\} \]
  • Union: a set containing all elements from both sets.
    \[ X \cup Y = \{a,b,c,3,4\}\]
  • Set Deletion: from the first set, delete the elements in common with the second set.
    \[ X \backslash Y = \{b,3\} \]
  • Cartesian Product: a set containing the ordered pairs of elements.
    \[ X \times Y = \{(a,a), (a,c), (a,4), (b,a), (b,c), (b,4), (3,a), (3,c), (3,4) \} \]

Other set notations:

  • Subset: a set \(A\) is a subset of another set \(B\) if every element of \(A\) is contained in \(B\). Notation:
    \[ A \subseteq B\]
  • Elements: if \(x\) is an element of a set \(A\), we notate this as:
    \[ x \in A\]

For an axiomatic definition of Set Theory, one may find reading up on Zermelo-Fraenkel Set Theory a worthwhile exercise. I might do a post on that another time.

Where Are They?

Now with sets briefly reviewed, let’s have a look at some examples of where sets are at work, but sometimes not mentioned:


The different sets of numbers we are most familiar with are:

Indeed, one of the axioms that defines the set of Natural Numbers is the Axiom of Induction, which allows us to prove statements that depend on the structure of the Natural Numbers.


Current State:
Relations are currently taught as ordered pairs of real numbers (as opposed to arbitrary sets) and most often just taught as graphs on the number plane that don’t pass a vertical line test.

A relation \(R\) over the sets \(A\) and \(B\) is a subset of \(A \times B\). If \((a,b) \in R\), we write \(a R b\).

Relations we usually encounter in school are subsets of \(\mathbb{R} \times \mathbb{R}\) and we spend a great deal of time studying them and drawing them (called graphs). This cartesian product of \(\mathbb{R}\) with \(\mathbb{R}\) is usually called the number plane, or cartesian plane.

Examples of relations include:

  • When we write equations like \(x^2 + y^2 = 1\), we are dropping off the formal set notation around it: \[ \{(x,y) \in \mathbb{R}\times \mathbb{R} : x^2 + y^2 = 1 \} \]
  • Element equality \(=\): the relation of equality in a set \(A\) is the set \[\{ (x,x) \in A \times A \}\] and we can write \(x = x\).
  • Inequalities such as \(<\): the relation \(<\) over the finite set \(\{1,2,3\}\) is the set \[ \{(1,2), (1,3), (2,3) \}\]. So we can write \(1 <2\), \(1 < 3\) or \(2<3\). This same rule can easily be applied to the number sets.


Current state:
The good news is that Functions are taught as special types of Relations where the first coordinate cannot appear more than once and that’s probably where the good news stops.

A Function \(f: A \rightarrow B\) is a Relation over the sets \(A\) and \(B\) such that for every \(a \in A\) there is a unique \(b \in B\) such that \((a,b)\) is in the Relation. We write \(f(a) = b\) and call the set \(A\) the Domain, and \(B\) the Codomain. Concepts like ‘one-to-one’ (injective) are taught in school but the other equally important concepts of ‘onto’ (surjective) function and ‘one-to-one correspondence’ (bijective) are not, as the concept of the Codomain is completely ignored but rather defaulted to the set \(\mathbb{R}\) whenever any mathematics in the HSC involves a function. All that is of course hidden from the student.

The default thinking that every function has a real codomain really starts to fall apart when one encounters complex valued functions in the Extension II course.

I write more about this predicament in another post.

Functions expose related structures and properties between the Domain and Codomain and, needless to say, are ubiquitous to every part of Mathematics. Here are just some examples:

  • Counting: the cardinality, or size, of a set \(X\) is denoted as \(|X|\) and is defined as follows:
    • Recall that \(n = \{0,1,2, \ldots , n-1\}\) as per the Von Neumann construction of the Natural Numbers. \(|X| = n\) if and only if there exists a bijective function \(f: n \rightarrow X\). We call \(X\) finite.
    • \(X\) is countably infinite if and only if there exists a bijective function \(f: \mathbb{N} \rightarrow X\). Countable infinity is the smallest type of infinity and we write \(|X|= \aleph_0\) (Aleph Nought).
    • \(X\) is uncountably infinite if and only if it is not countably infinite and not finite.
  • The Pigeonhole Principle: if you regard the set of pigeons as \(A\) and the set of pigeonholes as \(B\), the pigeonhole principle states that there does not exist any injective function \(f: A \rightarrow B\).
  • Permutations: if \(X\) is a finite set, a permutation of the set \(X\) is a bijective function \(f: X \rightarrow X\). For example, if \(X = \{1,2,3\}\), an example permutation would be \[f(1) = 1\\f(2) = 3\\f(3) = 2\] Counting the number of permutations of a set \(X\) is a topic in Combinatorics.
  • Sequences: the usual definition of an (infinite) sequence is a list of numbers separated by commas like \(1,4,7,10,13, \ldots\). A sequence of elements in \(X\) is actually a function \(f: \mathbb{N} \rightarrow X\). For example, the sequence \(1,4,7,10,13, \ldots\) is \[ \begin{align*}f(0) & = 1\\ f(1) & = 4\\ f(2) & = 7 \\ f(3) & = 10\end{align*}\] etc. One can probably see why the methods for linear relationships are directly applicable to arithmetic sequences.
  • Random Variables: the HSC defines it as a variable whose possible values are outcomes of an experiment or random phenomenon. It’s really a function \(X: \Omega \rightarrow \mathbb{R}\) (and in more abstract cases it doesn’t even have to be \(\mathbb{R}\)) – you can read more about what it is in greater detail in this post. This post also explains Probability concepts with Sets.


Firstly, a thank you to Steve B. who brought to my attention this really cool way to calculate the Highest Common Factor and the Lowest Common Multiples of integers. Let’s use the numbers 72 and 60 as examples here.

Write the numbers as products of their prime factors:

\[ \begin{align*}72 & = 2^3 \times 3^2\\60 & = 2^2\times 3 \times 5\end{align*}\]

Then put each one into a Venn Diagram, with shared common factors in the intersection:

Then the HCF can be found by multiplying the intersection:

\[ 2^2 \times 3 = 12\]

The LCM can be found by multiplying the union:

\[ 2^3 \times 3^2 \times 5 = 360\]

Hang on a second! The eagle eyed reader would notice that there are repeated elements in the Venn Diagram which seemingly disqualifies them from being sets!

The actual structure being used here is called a multiset where repeated elements are allowed. So for example, the prime factors of 72 would be the multiset: \(\{ 2,2,2,3,3\}\) (which isn’t a set but…)

It’s a function, which is as discussed above, a set of ordered pairs! How?

It is a function whose domain is the underlying set of unique elements of the multiset and the codomain is the set of positive integers that represents the element’s multiplicity, i.e. the number of times the element is repeated in the multiset.

So for the example of the prime factors of 72, the multiset \(\{2,2,2,3,3\}\) is the function (which I’ll call \(M\) here):

\[ M: \{2,3\} \rightarrow \mathbb{Z}^+\\M(2) = 3\\M(3) = 2\]


Polynomials in the indeterminate \(x\) are expressions of the form

\[ a_0 + a_1x + a_2x^2 + \ldots + a_nx^n\]

where \(x\) is undefined and each of the coefficients \(a_i\) are elements from a set with addition and multiplication defined, such as \(\mathbb{Z}\), \(\mathbb{Q}\), \(\mathbb{R}\) or \(\mathbb{C}\) but are not restricted to just these sets.

Writing polynomials out this way is actually a convenient notational representation of what it actually is: a sequence! And what are sequences? They’re functions as explained above.

So the polynomial \(a_0 + a_1 x + a_2 x^2 + \ldots + a_nx^n\) is actually this:

\[ (a_0, a_1, a_2, \ldots, a_n, 0, 0, 0, \ldots) \]

where the terms with index greater than \(n\) are set to \(0\).

The indeterminate \(x\) is the sequence with \(1\) in the second position and \(0\) elsewhere:

\[ (0, 1, 0, 0, \ldots ) \]

and \(x^n\) is the sequence with \(1\) in the \(n+1\)st position and \(0\) elsewhere.

In the HSC, the theory of polynomials are quite deeply explored:

  • Expanding and factorising algebraic expressions.
  • Linear polynomials and equations are studied for the majority of Year 7-9.
  • Quadratic polynomials and equations are studied from Year 9-12.
  • Polynomial division algorithms.
  • Remainder and Factor Theorems.
  • Roots of general polynomial equations and the graphs of polynomials with real coefficients.
  • Binomial Theorem and its applications to combinatorics and the binomial distribution in statistics.


A vector in HSC and Physics is usually defined as an object with direction and length. However, mathematicians would reject this as the definition, but rather accept it as an example of what a vector is.

A vector is actually an element of a Vector Space over a Field \(\mathbb{F}\).

Without spending too much detail on the axioms of a Vector Space, essentially you can do two things with vectors: addition and scalar multiplication.

\[ \forall u,v \in V: u + v \in V\]

\[ \forall k \in \mathbb{F}\, \forall v \in V: kv \in V \]

Where do all the other things like direction and length come in?

When we talk about length of vectors, we really mean that there is a function, called a norm: \(|| v ||\), that computes a non-negative number for every vector. The ability to do that upgrades the vector space to a Normed Vector Space.

When we talk about direction of vectors, we mean that there is a function, called an inner product: \(\langle u,v\rangle\), that measures the similarity of two objects. The ability to measure this upgrades the vector space to an Inner Product Space. An example of an inner product we learn in high school is the dot product. As it turns out, an inner product actually induces a norm (length) by: \( || v || = \sqrt{ \langle v, v\rangle } \), so an Inner Product Space is automatically also a Normed Vector Space. We usually see this notated as \(|v| = \sqrt{v \cdot v}\) instead in school.

It’s All Sets

From these examples, we can see that sets, relations and functions are the core of all mathematical theory. I believe we should be teaching students how to think with sets more in high school before they have to make that adjustment in their mathematics education in university.

When students enter university and find out a lot of what they learned in high school is incomplete, or sometimes wrong, there is a certain level of trust with their former teachers that is broken – how is that evidence of a syllabus that prepares students for their tertiary future?

Leave a comment

Your email address will not be published. Required fields are marked *