# Quantifier packaging II: function limits and continuity (the fear of epsilon and delta)

In this post I want to discuss some possible approaches to teaching  function limits and continuity in terms of epsilon and delta.

The “fear of epsilon and delta” that I refer to here is not only that of the student, but also that of the teacher! Because of the difficulties students have with understanding and working with the multi-quantifier definitions of function limits, continuity and uniform continuity in terms of epsilon and delta, it is very tempting to find a way round the issue. For example, once the students have been taught about convergence of sequences, we can use sequence-based definitions of function limits and continuity. (See my earlier post, however, on teaching the material on convergence of sequences.)

In my second-year mathematical analysis course I have emphasised the sequence approach  for many years, while still exposing the students to the epsilon-delta definitions. However, I am beginning to change my views. I no longer see it as desirable to find ways round these problems. I prefer to seek ways to help the students to gain a solid understanding of the harder concepts.

To begin with, let us restrict ourselves to the case of a function $f$ from $\mathbb{R}$ to $\mathbb{R}$.

Obviously what follows should be modified in appropriate ways when dealing with function limits (where the function need not be defined at the point in question) or with functions between more general sets (e.g. subsets of $\mathbb{R}^n$, or more general metric spaces).

In this setting, in terms of sequences, the definition of the statement “$f$ is continuous” is certainly very clean. Here is one version:

For every convergent sequence $(x_n) \subseteq \mathbb{R}$,  we have

$\lim_{n \to \infty} f(x_n) = f(\lim_{n\to\infty} x_n)\,.$

This is short and elegant, and it only  requires an understanding (see above) of convergence of sequences. This definition is very easy to use to prove results about sums/products/composition of continuous functions (etc.), quoting earlier standard results about the algebra of limits for sequences of real numbers to help where necessary. This then leads to a very clear and clean theory. So why do I have reservations about its use? I have mentioned some of the issues already, but let me list them again here, with some other issues.

• In postponing our confrontation with epsilon and delta for as long as possible, we may not be acting in the best interests of our students.
• Our confidence that students are happy and confident with the notion of convergence of sequences may be misplaced.
• While this definition in terms of sequences is an easy one to use in developing the general theory, on its own it does not give a particularly good introductory, intuitive notion of continuity. Many students have difficulty in seeing what this definition really means (in terms of function values being close to the right value) for specific functions such as $f(x)=x^2$, and have no idea how one could go about checking carefully that the condition is satisfied. (Obviously you can explain how to do this using, for example, the algebra of limits for this particular function.)
• Students find it difficult to negate this definition correctly when trying to understand what it means for a function to be discontinuous. This is, of course, a general problem. Here it takes the form that students assume that if a sequence does not converge to a particular value, it is because it converges to some other value. The possibility that it does not converge at all may be overlooked.

The function $\sin x$ might be a more convincing example than $x^2$ here, except that approaches to teaching the properties of  $\sin x$ can sometimes be a little circular. Mostly we  use the “standard fact” that the derivative of $\sin x$ is $\cos x$ and the Mean Value Theorem to show that $|\sin x - \sin y| \leq |x-y|$, etc., but obviously we have assumed far more than the continuity of $\sin x$ in the process. I am as guilty as anyone else of referring students to books for more details concerning the trigonometrical functions (etc.).
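Returning to the last point in the list above, the possibility that $f(x_n)$ does not converge at all can be illustrated numerically. The following is a minimal sketch in Python, not something taken from the course: the function `f` (equal to $\sin(1/x)$ away from $0$, with the assumed value $f(0)=0$) and the sequence `x_n` are choices made here purely for illustration. The sequence converges to $0$, yet the values $f(x_n)$ alternate in sign, so they fail to converge to $f(0)$ without converging to any other value either.

```python
import math

def f(x):
    """Illustrative function: sin(1/x) away from 0, with the assumed value f(0) = 0."""
    return math.sin(1.0 / x) if x != 0 else 0.0

def x_n(n):
    """Illustrative sequence x_n = 2 / ((2n + 1) * pi), which converges to 0."""
    return 2.0 / ((2 * n + 1) * math.pi)

# Round to suppress floating-point noise; the exact values are sin((2n + 1) * pi / 2).
values = [round(f(x_n(n)), 6) for n in range(6)]
print(values)
```

A student who expects a discontinuous function to send $f(x_n)$ to the “wrong” limit can be asked what limit, if any, these values are heading towards.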

So, what is the problem with the epsilon-delta definition of continuity? Well, if you jump straight in, you end up with a four-quantifier statement. In terms of $\varepsilon$ and $\null\delta$, “$f$ is continuous” comes out as some variant of the following:

$\forall x \in \mathbb{R} \, \forall \varepsilon>0\,\exists \delta>0$ such that $\forall x' \in (x-\delta,x+\delta)$ we have $|f(x')-f(x)|<\varepsilon\,.$

Now it is obvious that we could first attack the three-quantifier statement

“$f$ is continuous at $x$”,

and then define continuity in terms of this. Still, a three-quantifier statement is already rather challenging for, say, a first-year student of analysis.

We can also, if we wish, disguise one of the quantifiers by replacing

$\forall x' \in (x-\delta,x+\delta)$ we have $|f(x')-f(x)|<\varepsilon\,$

with the somewhat less formal version

$|x'-x|<\delta \Rightarrow |f(x')-f(x)|<\varepsilon\,$

(without openly specifying that $x'$ is in $\mathbb R$).

This is perfectly fine once students have a good understanding of the concepts and can use quantifiers correctly, but I am not convinced that it is wise when students are still in the process of learning how to handle quantifiers.

Rather than disguising this quantifier, I prefer to package it using images of sets, and to say

$f((x-\delta,x+\delta)) \subseteq (f(x)-\varepsilon,f(x)+\varepsilon)\,.$

We can also define continuity of $f$ in terms of function limits (one-sided or two-sided). I have no objection to this, but of course it simply shifts the original problem to the alternative setting of function limits.

So, having pointed out problems with these approaches, what is my recommendation? Well, I have some possible ideas involving quantifier packaging, but I haven’t yet had the opportunity to try them out. Perhaps some discussion here would be wise before unleashing them on the students!

I think that whatever approach we take should be applicable to function limits,  so let us change our setting slightly.

Let $a \in \mathbb R$,  let $f$ be a function from $\mathbb R \setminus \{a\}$ to $\mathbb R$, and let $L \in \mathbb R$.

In terms of sequences, the definition of “$\lim_{x\to a} f(x) = L$” is the following:

For every sequence $(x_n) \subseteq \mathbb R \setminus \{a\}$ such that $(x_n)$ converges to $a$, we have $f(x_n) \to L$ as $n \to \infty\,.$

As before, this is clean, clear, easy to use, and allows you to sidestep the epsilon-delta definition if you wish. The epsilon-delta definition comes out as some variant of the following:

Three-quantifier version

$\forall \varepsilon>0\,\exists \delta>0$ such that $\forall x \in (a-\delta,a+\delta) \setminus \{a\}$ we have $|f(x)-L|<\varepsilon$

or, using images of sets, we have the following

Two-quantifier version

$\forall \varepsilon>0\,\exists \delta>0$ such that

$f( (a-\delta,a+\delta) \setminus \{a\}) \subseteq (L-\varepsilon,L+\varepsilon)\,.$

Perhaps some suitable notation for a punctured $\null\delta$-neighbourhood would make this look a little less unwieldy.

These are  standard, and look easy enough to the professional mathematician, but are still complicated enough to cause serious difficulties for a student trying to come to grips with analysis.
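For a concrete function such as $f(x)=x^2$, a workable $\null\delta$ can be written down explicitly. If $\delta \leq 1$ and $0<|x-a|<\delta$, then $|x^2-a^2| = |x-a|\,|x+a| < \delta(2|a|+1)$, so the choice $\delta = \min(1, \varepsilon/(2|a|+1))$ suffices. The Python sketch below spot-checks this choice on a finite grid of points. The helper names `delta_for_square` and `check_on_grid` are hypothetical, introduced only for this illustration, and a grid check can expose a bad choice of $\null\delta$ but cannot replace the proof.

```python
def delta_for_square(a, eps):
    """Candidate delta for f(x) = x**2 at the point a (from the estimate above)."""
    return min(1.0, eps / (2 * abs(a) + 1))

def check_on_grid(f, a, L, eps, delta, samples=1000):
    """Test |f(x) - L| < eps at sample points x with 0 < |x - a| < delta."""
    for k in range(1, samples + 1):
        h = delta * k / (samples + 1)   # 0 < h < delta
        for x in (a - h, a + h):        # points on both sides of a, never a itself
            if not abs(f(x) - L) < eps:
                return False            # found a sampled point violating the bound
    return True

def square(x):
    return x * x

# Try the explicit delta at a = 2 (so L = 4) for a few values of epsilon.
for eps in (1.0, 0.1, 0.001):
    d = delta_for_square(2.0, eps)
    print(eps, d, check_on_grid(square, 2.0, 4.0, eps, d))
```

Replacing `d` by a deliberately large value such as `0.05` with `eps = 0.1` makes the check fail, which may help students to see why $\null\delta$ has to depend on $\varepsilon$.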

Can we package the quantifiers further? I think that we can, but only at the expense of inventing new terminology. Here is a possible attempt, based on my method for teaching convergence of sequences. I am not at all sure that this is the definitive version.  Suggestions are welcome!

Let $\delta >0$ and let $B \subseteq \mathbb R$. Then let us say that the set $B$ absorbs the values of $f$ $\null\delta$-near $a$ if

$f( (a-\delta,a+\delta) \setminus \{a\}) \subseteq B\,.$

We then say that the set $B$ absorbs the values of $f$ near $a$ if there exists a $\delta>0$ such that $B$ absorbs the values of $f$ $\null\delta$-near $a$.

With this terminology, the definition of   “$\lim_{x\to a} f(x) = L$” becomes:

Every open interval centred on $L$ absorbs the values of $f$ near $a$.

So, after all this work, we are back to a single-quantifier definition. Why not stick to the definition using sequences? Well, we are now in a position to give the students a thorough introduction to the use of epsilon and delta. Time permitting, we can look at lots of examples of sets which do or don’t absorb the values of functions near or $\null\delta$-near various points. In the process we can reinforce intuitive notions of function limit and continuity, without sacrificing rigour.
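One way to generate such examples is to experiment numerically. The Python sketch below is an illustration under stated assumptions rather than a rigorous test: the helper names `absorbs_delta_near`, `absorbs_near` and `open_interval` are hypothetical, the set $B$ is passed as a membership predicate, and the punctured neighbourhood is only sampled at finitely many points. A result of `False` from `absorbs_delta_near` exhibits a genuine point whose value escapes $B$; a result of `True` means only that no sampled point did so. Similarly, `absorbs_near` tries only the candidates $1, 1/2, 1/4, \dots$, so a result of `None` does not prove that no suitable $\null\delta$ exists.

```python
def absorbs_delta_near(in_B, f, a, delta, samples=1000):
    """Sampled check that f maps (a - delta, a + delta) minus {a} into B.

    in_B is a predicate: in_B(y) is True exactly when y belongs to B.
    """
    for k in range(1, samples + 1):
        h = delta * k / (samples + 1)        # 0 < h < delta
        if not (in_B(f(a - h)) and in_B(f(a + h))):
            return False                     # a sampled value of f escaped B
    return True

def absorbs_near(in_B, f, a, tries=20):
    """Look for a delta among 1, 1/2, 1/4, ... that passes the sampled check.

    Returns the first such delta, or None if none of the candidates passes.
    """
    for j in range(tries):
        delta = 2.0 ** (-j)
        if absorbs_delta_near(in_B, f, a, delta):
            return delta
    return None

def square(x):
    return x * x

def open_interval(centre, radius):
    """Membership predicate for the open interval (centre - radius, centre + radius)."""
    return lambda y: abs(y - centre) < radius

# Does (3.9, 4.1) absorb the values of x**2 near 2?  And what about (4.5, 5.5)?
print(absorbs_near(open_interval(4.0, 0.1), square, 2.0))
print(absorbs_near(open_interval(5.0, 0.5), square, 2.0))
```

Passing $B$ as a predicate keeps the sketch close to the set-based wording: the same two functions can be reused with half-lines or other sets without any change.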

Note that,  if we start with $f:\mathbb{R} \rightarrow \mathbb{R}$, the definition of “$f$ is continuous at $a$” can now be stated either as the standard
$\lim_{x \to a} f(x) = f(a)$,

or, with the new terminology, as follows:

Every open interval centred on $f(a)$ absorbs the values of $f$ near $a$.

Does anyone have any alternative suggestions for names for some of these concepts? Or do they already have names in the literature that I am simply unaware of?

Joel Feinstein

15:00

Maybe there is no need to bring in the word “absorbs” here, though it is tempting to make some use of it. If we do use the notion of absorption, it should perhaps be more closely associated with some notion of movement in the domain. For example, we could say that the set $B$ absorbs the values $f(x)$ as $x \to a$. Alternatively, we could simply say that the set $B$ includes the values of $f(x)$ for $x$ near $a$.

There is a second reason that I am not yet happy with the version above,

$B$ absorbs the values of $f$ near $a$.

This version appears to be a little ambiguous, and potentially confusing: the values in question could be taken to be

“those values of $f$ which are near $a$”.

Whether we use “absorbs”, “includes” or even that dangerous word “contains”, it may be safer to say

“… the values of $f(x)$ for $x$ near $a$”

rather than

“… the values of $f$ near $a$”.

Maybe we could get away with

“… the values of $f$ at points near $a$”?

How about the following statement, formalized in terms of $\null\delta$ as above?

The set $B$ includes (all of) the values of $f$ at points near $a$.

• Do we need “all of” to make this as clear as possible?
• Is it a good idea to introduce a term such as absorption to make this statement less unwieldy?
• Should we try to introduce a notion of movement as in “as $x$ approaches $a$”?

16:00

Or maybe it would be best to eliminate the “values” and “points” altogether and go with statements such as

$B$ absorbs $f$  near $a$

and the version with  $\null\delta$,

$B$ absorbs $f$  $\null\delta$-near $a$.

What do people think?

Joel Feinstein

### 6 responses to “Quantifier packaging II: function limits and continuity (the fear of epsilon and delta)”

1. mattheath

When you put up the first post about unpacking quantifiers, I started thinking about how to do continuity. I thought it might be worth separating out a definition of a discontinuity of size $\epsilon$ at a point x. So we say f has a discontinuity of size $\epsilon$ at x if for all $\delta>0$ there is $y\in B(x,\delta)$ with $|f(x)-f(y)|>\epsilon$. (Possibly “discontinuity of size at least $\epsilon$” would be preferable.)

We then define “f is continuous at x” to be “f has no discontinuity of any size at x” and so on.

Part of my thinking was that students will typically have a naïve picture of a function that isn’t continuous and that picture has it go along smoothly until a point at which it “jumps” some distance. If you keep this picture in mind (perhaps together with a collection of pictures of functions which “climb very steeply” at x but are continuous) I think it is fairly easy to reconstruct the definition of a discontinuity.

I thought of this sort of like remembering the offside rule in football; it’s hard because it seems pretty much arbitrary, but if you remember what it is designed to avoid (goal-hanging) it becomes fairly easy.


• In reply to Matt: that is quite an interesting approach: defining continuous as “not discontinuous”. I will have to think about it!
I am currently writing a post about lim inf and lim sup: this gives another way to quantify the discontinuity at a point.
Joel


2. From some conversations I have had, I think that I should clarify one thing. Quantifier packaging is intended not to stop students from engaging with the original multiple-quantifier statements, but to help them to understand those statements. Once students can recognize parts of a complicated statement as meaning something that they understand, the whole statement should begin to make sense.
Perhaps this is a bit like learning a language. At first, when you hear people speaking quickly, you don’t even catch any of the words. After a while, you start to catch words, and then phrases that mean something, and then you start to be able to put the whole picture together.
So, when a student sees the statement
$\forall \varepsilon>0\,\exists \delta>0$ such that
$f( (a-\delta,a+\delta) \setminus \{a\}) \subseteq (L-\varepsilon,L+\varepsilon)\,$
they will be able to parse this using concepts they understand. Using terminology suggested above:

$f( (a-\delta,a+\delta) \setminus \{a\}) \subseteq (L-\varepsilon,L+\varepsilon)\,$

says that $(L-\varepsilon,L+\varepsilon)$ absorbs $f$ $\null\delta$-near $a$;

$\exists \delta>0$ such that $f( (a-\delta,a+\delta) \setminus \{a\}) \subseteq (L-\varepsilon,L+\varepsilon)\,$
says that $(L-\varepsilon,L+\varepsilon)$ absorbs $f$ near $a$;

the whole statement says that, for all $\varepsilon>0$, $(L-\varepsilon,L+\varepsilon)$ absorbs $f$ near $a$.

Understanding the simpler parts of the statement should hopefully assist the student to understand the whole of the multi-quantifier statement, even in its original form.

Joel


3. If we introduce the term ‘almost absorbs’ (appropriately defined, as in our discussions of lim inf and lim sup), we can now achieve some new “zero-quantifier” definitions of our concepts!

The definition of ‘$\lim_{x \to a} f(x) = L$’ becomes:

The single-point set $\{L\}$ almost absorbs $f$ near $a$.

The definition of ‘$f$ is continuous at $a$’ becomes:

The single-point set $\{f(a)\}$ almost absorbs $f$ near $a$.

We can also define upper semi-continuity and lower semi-continuity this way.

Recall that $f$ is upper semi-continuous at $a$ if $\limsup_{x \to a} f(x) \leq f(a)$, and $f$ is lower semi-continuous at $a$ if $\liminf_{x \to a} f(x) \geq f(a)$.

In the language of “almost absorption”, $f$ is upper semi-continuous at $a$ if and only if
$({-\infty},f(a)]$ almost absorbs $f$ near $a$.
Similarly, $f$ is lower semi-continuous at $a$ if and only if
$\null[f(a),\infty)$ almost absorbs $f$ near $a$.

If we wish to work with extended-real-valued functions, we should use $\null[{-\infty},f(a)]$ and $\null[f(a),+\infty]$ instead, with the usual care needed in the definition of ‘almost absorbs’.

One possible problem here is the parsing of ‘almost absorbs $f$ near $a$’. The student could think that this means that there is some $\delta>0$ such that the relevant set ‘almost contains’ the set $f((a-\delta,a+\delta) \setminus \{a\})$ (with some inappropriate notion of ‘almost contains’ involving ‘$\forall \varepsilon$’). This is not what we want! Would students be confused by this?

We may wish to avoid “almost absorption”. In this case, single-quantifier versions of the above (in the real-valued case) are as follows:

$f$ is upper semi-continuous at $a$ if and only if
for all $\varepsilon>0$, $({-\infty},f(a)+\varepsilon]$ absorbs $f$ near $a$;

$f$ is lower semi-continuous at $a$ if and only if
for all $\varepsilon>0$, $\null[f(a)-\varepsilon,\infty)$ absorbs $f$ near $a$.

In all cases, you can draw some nice diagrams illustrating the fact that parts of the curve near the point $a$ lie below/above appropriate horizontal lines.

Joel Feinstein 5/2/09


4. I was discussing some of these issues with my colleague Sergey Utev, and he was telling me that he was able to make good use of “little o” and “big O” notation. This, of course, does not remove the need for epsilon and delta in order to make things rigorous, but it is another way to make some of the material more accessible to more of the students.
Now, in this post, I said that students need to get to grips with epsilon and delta at some point, and I gave some approaches that might help.
With “little o” notation in mind, perhaps a compromise could be to focus on convergence to zero? Null sequences and function limits being zero may be a bit easier than the general case. It also allows some more language to be (carefully) defined, such as

$|f|$ is $\varepsilon$-small $\null\delta$-near $a$
and
$|f|$ is $\varepsilon$-small near $a$.

However, we do lose the grammatical advantage of absorption, as discussed previously.
We end up with statements like

for all $\varepsilon>0$, $|f|$ is $\varepsilon$-small near $a$

and, perhaps only informally,

$|f|$ is small near $a$.

There is a bit of a problem if $f$ is actually defined at $a$, because you can’t get nearer to $a$ than $a$ itself! This is, however, always a problem when defining function limits.
Here, “near $a$” always means “at all those points which are sufficiently-near-but-not-equal-to $a$”.

Joel Feinstein 23/4/09


5. There appears to be a (temporary?) problem that I don’t understand.
Wherever the Greek letter delta occurs on its own within some latex above, it currently says “Formula does not parse”.
If necessary I will use Tim Gowers’s \null trick to try to get around this, but has anyone else had problems with this?

Here I will try a delta on its own, using
dollar latex \delta dollar
This gives the following at the moment: $\delta$

Here is a second attempt including a \null, as
dollar latex \null \delta dollar
This gives the following: $\null \delta$

What do you know! Another success for \null!

Joel Feinstein
6/6/09

PS
The temporary problem appears to have gone away again now anyway, and both of the versions above are working to give a delta. Maybe the best thing to do with these things is to check back later?
