Dual numbers and derivatives

A basic result from calculus is the derivative of a one-term polynomial in $x$: $\frac{d}{dx} x^n = n x^{n-1}$

One proof of this is based on the limit definition of the derivative:

$\frac{d}{dx} f(x) = \displaystyle\lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$

The intuition behind this is that the derivative of a curve should tell you about its "slope", or the rate at which $f(x)$ changes as $x$ varies. A derivative is intrinsic to a curve, and so can be calculated from the curve alone. This can be done by choosing two points on the curve and bringing them arbitrarily close together. The limit definition above just captures this process in mathematical language. Plugging in $f(x) = x^n$ gives

$\frac{d}{dx} x^n = \displaystyle\lim_{h \to 0} \frac{(x+h)^n - x^n}{h}$
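Before doing any algebra, the limit can be checked numerically. The sketch below is only an illustration: the function name `difference_quotient` and the values $n = 5$, $x = 2$ are arbitrary choices, not part of the argument. It evaluates the quotient for shrinking $h$ and compares it with $n x^{n-1}$.

```python
def difference_quotient(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


n = 5
x = 2.0
exact = n * x ** (n - 1)  # the claimed derivative n x^(n-1), here 80

errors = []
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    approx = difference_quotient(lambda t: t ** n, x, h)
    errors.append(abs(approx - exact))
    print(f"h = {h:g}: quotient = {approx:.6f}, error = {errors[-1]:.2e}")
```

The error shrinks roughly in proportion to $h$, which is what the term in $h$ of the expansion below predicts.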

There are a few ways to go from here. A direct way is to invoke the binomial theorem:

$\frac{(x+h)^n - x^n}{h} =\displaystyle\sum_{k=0}^{n-1} \binom{n}{k} x^{k}h^{n-k-1}$

Taking the limit as $h \to 0$ simplifies things significantly, because every term with a positive power of $h$ vanishes and only the $k = n-1$ term survives:

$\frac{d}{dx} x^n = \displaystyle\lim_{h \to 0} \displaystyle\sum_{k=0}^{n-1} \binom{n}{k} x^{k}h^{n-k-1} = \binom{n}{n-1}x^{n-1} = nx^{n-1}$

Here the terms in $h^2, h^3, h^4, \ldots$ of the expansion of $(x+h)^n$ are effectively ignored. This was a contentious issue for a while: people knew that setting the higher powers of a nonzero number to $0$ wasn't valid for conventional numbers, which led to the idea of "infinitesimal" numbers. While proposed very early on, they were not rigorously established until the middle of the 20th century, leaving a long gap during which their status as valid mathematical objects was unknown.

A basic model of infinitesimals is the real numbers, $\mathbb{R}$, extended with an extra element $\epsilon$ such that $\epsilon^2 = 0$ (but $\epsilon \ne 0$). This forms a ring, $\mathbb{R}[\epsilon]$, of elements of the form:

$a + b \epsilon \in \mathbb{R}[\epsilon] \qquad a, b \in \mathbb{R}$

This system is known as the dual numbers. This model ignores the foundational issues of infinitesimals in favor of looking at what algebra done with infinitesimals typically looks like. So we don't need to show that $\epsilon$ exists or construct it explicitly; we just take its existence as given and see what happens. The same proof as above can be done in this number system: simply replace $h$ with $\epsilon$. The added benefit of using dual numbers is that ignoring higher-order terms is now valid, which allows us to use a weaker argument than the full binomial theorem.
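To make the rule $\epsilon^2 = 0$ concrete, here is a minimal sketch of dual-number arithmetic. The class name `Dual` and its fields `a` and `b` are assumptions made for illustration; only addition and multiplication are implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A dual number a + b*eps, where eps**2 == 0 (illustrative helper)."""
    a: float  # real part
    b: float  # coefficient of eps

    def __add__(self, other):
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps; the bd eps^2 term is zero
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)


eps = Dual(0.0, 1.0)
print(eps * eps)                         # eps^2 = 0, although eps itself is not 0
print(Dual(3.0, 1.0) * Dual(3.0, 1.0))   # (3 + eps)^2 = 9 + 6 eps
```

The second product already shows the pattern used below: squaring $3 + \epsilon$ gives $3^2$ in the real part and $2 \cdot 3$ in the $\epsilon$ part.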

Let's sketch it out: we can manually expand $(x + \epsilon)^n$ as

$(x + \epsilon)^n = (x + \epsilon) \cdot (x + \epsilon) \cdots \text{n brackets} \cdots (x + \epsilon)$

Multiplying out these brackets consists of choosing one member ($x$ or $\epsilon$) from each bracket and multiplying the choices together, then adding up all the different ways of doing so. To start, choose $x$ from every bracket. This gives the leading term, $x^n$. So we can write

$(x + \epsilon) \cdot (x + \epsilon) \cdots \text{n brackets} \cdots (x + \epsilon) = x^n + \text{other terms}$

We can calculate the next term by choosing $x$ from all but one of the brackets and $\epsilon$ from the remaining one. Since there are $n$ brackets, we have $n$ "different" epsilon terms to choose from. This means that

$(x + \epsilon) \cdot (x + \epsilon) \cdots \text{n brackets} \cdots (x + \epsilon) = x^n + n x^{n-1} \epsilon + \text{other terms}$

The next term has the form $x^{n-2}\epsilon^2$, which is trivially $0$ in the dual numbers, so the computation can stop here. With ordinary numbers we would have to continue the argument: choose two different epsilon terms from the $n$ brackets. There are $n(n-1)$ ordered ways to do this, as the epsilon in a given bracket can't be chosen twice. However, any pair of brackets can be chosen in two orders: one before the other and vice versa. We don't want to count these as separate cases, so we correct by a factor of $1/2$. Doing this gives

$(x + \epsilon) \cdot (x + \epsilon) \cdots \text{n brackets} \cdots (x + \epsilon) = x^n + n x^{n-1} \epsilon + \frac{n(n-1)}{2} x^{n-2} \epsilon ^2+ \text{other terms}$
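The counting argument can be checked by brute force. The sketch below enumerates every way of picking $x$ or $\epsilon$ from each bracket and tallies the picks by how many epsilons they contain. The function name `count_terms` and the choice $n = 6$ are illustrative assumptions.

```python
from itertools import product
from math import comb


def count_terms(n):
    """For each k, count the products that pick eps from exactly k of n brackets."""
    counts = [0] * (n + 1)
    for choice in product("xe", repeat=n):  # one pick ("x" or "e") per bracket
        counts[choice.count("e")] += 1
    return counts


n = 6
counts = count_terms(n)
print(counts[:3])  # coefficients of x^n, x^(n-1) eps, x^(n-2) eps^2: [1, 6, 15]
```

The first three counts are $1$, $n$ and $n(n-1)/2$, matching the coefficients found above.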

Following this argument for even higher powers of $\epsilon$ yields the binomial theorem. In the dual numbers, we may stop at the second step, where we got $(x + \epsilon)^n = x^n + n x^{n-1} \epsilon$. This means that

$\frac{d}{dx} x^n = \frac{(x + \epsilon)^n - x^n}{\epsilon} = \frac{x^n + n x^{n-1} \epsilon - x^n}{\epsilon} = n x^{n-1}$

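Since $\epsilon$ has no multiplicative inverse, dividing by $\epsilon$ here should be read as taking the coefficient of $\epsilon$. This hints at a general way to calculate derivatives in the dual numbers: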

$f(x + \epsilon) - f(x) =\epsilon \frac{d}{dx} f(x)$

For example, given a function $p(x) = x^3 - 4x$, we can calculate its derivative indirectly by first finding $p(x + \epsilon)$:

$p(x + \epsilon) = (x + \epsilon)^3 - 4(x + \epsilon) = x^3 - 4x + (3x^2 - 4) \epsilon$

Reading off the coefficient of $\epsilon$ gives $p'(x) = 3x^2 - 4$, as expected.
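The same calculation can be mechanized: evaluate the function at $x + \epsilon$ and read off the coefficient of $\epsilon$. This is a minimal sketch under stated assumptions, not a full implementation. The class `Dual` and the helpers `lift` and `derivative` are hypothetical names, and only the operations needed for polynomials are supported.

```python
class Dual:
    """Dual number a + b*eps with eps**2 == 0 (hypothetical helper class)."""

    def __init__(self, a, b=0.0):
        self.a, self.b = a, b

    @staticmethod
    def lift(value):
        # Treat a plain number c as the dual number c + 0*eps.
        return value if isinstance(value, Dual) else Dual(value)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.a + other.a, self.b + other.b)

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.a - other.a, self.b - other.b)

    def __mul__(self, other):
        other = Dual.lift(other)
        # The eps^2 term is dropped because eps^2 = 0.
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    __radd__ = __add__
    __rmul__ = __mul__

    def __pow__(self, n):
        # Repeated multiplication, for non-negative integer n only.
        result = Dual(1.0)
        for _ in range(n):
            result = result * self
        return result


def derivative(f, x):
    """Evaluate f at x + eps and return the coefficient of eps."""
    return f(Dual(x, 1.0)).b


def p(x):
    return x ** 3 - 4 * x


print(derivative(p, 2.0))  # 3 * 2^2 - 4 = 8.0
```

Note that `p` is written as an ordinary polynomial; the derivative comes out of the arithmetic alone, with no limit taken.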

We can go further with this idea to prove the product rule. Given the functions $f(x)$ and $g(x)$, how do we calculate $\frac{d}{dx} f(x)g(x)$? With dual numbers, this amounts to finding $f(x + \epsilon)g(x + \epsilon)$:

$f(x + \epsilon)g(x + \epsilon) = \left(f(x) + \epsilon \frac{d}{dx} f(x) \right) \left(g(x) + \epsilon \frac{d}{dx} g(x) \right)$ $= f(x)g(x) + \epsilon \left( g(x) \frac{d}{dx} f(x) + f(x) \frac{d}{dx} g(x) \right)$

This confirms that $\frac{d}{dx} f(x)g(x) = g(x) \frac{d}{dx} f(x) + f(x) \frac{d}{dx} g(x)$. The chain rule, likewise, can be found by expanding $f(g(x + \epsilon))$:

$f(g(x + \epsilon)) = f \left (g(x) + \epsilon \frac{d}{dx} g(x) \right) = f(y + \eta)$ $= f(y) + \eta (\frac{d}{dy} f(y))$ $= f(g(x) ) + \epsilon \left(\frac{d}{dx} g(x) \cdot \frac{d}{dg(x)} f(g(x)) \right)$

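Here the substitutions $y = g(x)$ and $\eta = \epsilon \frac{d}{dx} g(x)$ simplify the calculation. Since $\eta^2 = 0$, the rule $f(y + \eta) = f(y) + \eta \frac{d}{dy} f(y)$ applies to $\eta$ just as it does to $\epsilon$.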
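Both rules can be checked on small examples. The particular functions below are illustrative choices, not part of the derivations above. For the product rule, take $f(x) = x^2$ and $g(x) = x^3$:

$(x + \epsilon)^2 (x + \epsilon)^3 = (x^2 + 2x \epsilon)(x^3 + 3x^2 \epsilon) = x^5 + (3x^4 + 2x^4) \epsilon = x^5 + 5x^4 \epsilon$

The coefficient of $\epsilon$ is $5x^4$, which agrees with $\frac{d}{dx} x^5$ from the power rule. For the chain rule, take $f(y) = y^3$ and $g(x) = x^2 + 1$:

$f(g(x + \epsilon)) = \left( (x + \epsilon)^2 + 1 \right)^3 = \left( (x^2 + 1) + 2x \epsilon \right)^3 = (x^2 + 1)^3 + 3(x^2 + 1)^2 \cdot 2x \epsilon$

The coefficient of $\epsilon$ is $6x(x^2 + 1)^2$, which is $\frac{d}{dx} g(x) = 2x$ times $\frac{d}{dy} f(y) = 3y^2$ evaluated at $y = x^2 + 1$.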
