A basic result from calculus is that the derivative of a one-term polynomial in x is:
\[ \frac{d}{dx} x^n = n x^{n-1} \]
One proof of this is based on the limit definition of the derivative:
\[ \frac{d}{dx} f(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \]
The intuition behind this is that the derivative of a curve should give you information about the "slope" of the curve, or the rate at which \(f(x)\) changes as \(x\) varies. A derivative is intrinsic to a curve, and so can be calculated from it alone. This can be done by choosing two points on the curve and bringing them arbitrarily close together. The limit definition above just captures this process in mathematical language. Plugging in \(f(x) = x^n\) gives
\[ \frac{d}{dx} x^n = \lim_{h \to 0} \frac{(x+h)^n - x^n}{h} \]
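Before doing any algebra, the limit can be watched numerically. The following is a minimal sketch, not part of the original argument: the helper name `difference_quotient` and the sample values \(n = 3\), \(x = 2\) are chosen here purely for illustration.

```python
def difference_quotient(f, x, h):
    """Slope of the line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


n = 3
x = 2.0
exact = n * x ** (n - 1)  # value predicted by the power rule

# Bring the two points closer together and compare with the prediction.
for h in (1e-1, 1e-3, 1e-5):
    approx = difference_quotient(lambda t: t ** n, x, h)
    print(f"h = {h:g}: quotient = {approx:.6f}, error = {abs(approx - exact):.2e}")
```

The error shrinks roughly in proportion to \(h\), which is what we would expect if the quotient equals \(n x^{n-1}\) plus terms that carry at least one factor of \(h\).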
There are a few ways to go from here. A direct way is to invoke the binomial theorem:
\[ \frac{(x+h)^n - x^n}{h} = \sum_{k=0}^{n-1} \binom{n}{k} x^k h^{n-k-1} \]
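This expansion of the difference quotient can be sanity-checked with exact rational arithmetic. The sketch below assumes nothing beyond Python's standard library; the function names `quotient_direct` and `quotient_binomial` and the sample values are illustrative only.

```python
from fractions import Fraction
from math import comb


def quotient_direct(x, h, n):
    """((x + h)^n - x^n) / h, computed as written."""
    return ((x + h) ** n - x ** n) / h


def quotient_binomial(x, h, n):
    """The same quotient as the sum over k = 0 .. n-1 of C(n, k) x^k h^(n-k-1)."""
    return sum(comb(n, k) * x ** k * h ** (n - k - 1) for k in range(n))


x, h, n = Fraction(2), Fraction(1, 10), 5

# Fractions are exact, so the two sides can be compared with ==.
print(quotient_direct(x, h, n) == quotient_binomial(x, h, n))  # True

# The k = n-1 term is the only one without a factor of h.
print(comb(n, n - 1) * x ** (n - 1))  # 80, which is n * x^(n-1)
```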
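Taking the limit as \(h \to 0\) simplifies things significantly:
\[ \frac{d}{dx} x^n = \lim_{h \to 0} \frac{1}{h} \sum_{k=0}^{n-1} \binom{n}{k} x^k h^{n-k} = \lim_{h \to 0} \frac{1}{h} \binom{n}{n-1} x^{n-1} h = n x^{n-1} \]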
Here the terms of \((x+h)^n\) in \(h^2, h^3, h^4, \ldots\) are effectively ignored. This was a contentious issue for a while: people knew that setting higher powers of a nonzero quantity to 0 wasn't valid for conventional numbers, which led to the idea of "infinitesimal" numbers. While proposed very early on, they were not rigorously established until the middle of the 20th century, so for a long time their status as valid mathematical objects was unknown.
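A basic model of infinitesimals is the real numbers, \(\mathbb{R}\), extended with an extra element \(\epsilon\) such that \(\epsilon^2 = 0\) (but \(\epsilon \neq 0\)). This forms a ring, \(\mathbb{R}[\epsilon]\), of elements of the form:
\[ a + b\epsilon \in \mathbb{R}[\epsilon], \qquad a, b \in \mathbb{R} \]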
This system is known as the dual numbers. This model ignores the foundational issues of infinitesimals in favor of looking at what algebra done with infinitesimals typically looks like. So we don't need to show that \(\epsilon\) exists or construct it explicitly; we just take its existence as given and see what happens. The proof above can be repeated in this number system: simply replace \(h\) with \(\epsilon\). The added benefit of using dual numbers is that ignoring higher-order terms is now valid, which allows us to use a weaker argument than the full binomial theorem.
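The arithmetic of \(\mathbb{R}[\epsilon]\) is simple enough to write down directly. What follows is a minimal sketch under the assumption that a dual number is stored as a pair of floats; the class name `Dual`, its fields `a` and `b`, and the helper `_lift` are all invented here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A dual number a + b*eps, where eps**2 == 0 but eps != 0."""

    a: float        # real part
    b: float = 0.0  # coefficient of eps

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.a + other.a, self.b + other.b)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # (a + b eps)(c + d eps) = ac + (ad + bc) eps; the bd eps^2 term vanishes.
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    __rmul__ = __mul__


def _lift(value):
    """Treat an ordinary number c as the dual number c + 0*eps."""
    return value if isinstance(value, Dual) else Dual(value)


eps = Dual(0.0, 1.0)

print(eps * eps)                      # Dual(a=0.0, b=0.0)
print((2 + 3 * eps) * (5 + 7 * eps))  # Dual(a=10.0, b=29.0)
```

Note that the rule \(\epsilon^2 = 0\) never appears as a special case: it is built into `__mul__` by simply not keeping a slot for the \(\epsilon^2\) coefficient.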
Let's sketch it out: we can manually expand \((x+\epsilon)^n\) as
\[ (x+\epsilon)^n = \underbrace{(x+\epsilon) \cdot (x+\epsilon) \cdots (x+\epsilon)}_{n \text{ brackets}} \]
Multiplying out these brackets consists of choosing one member (\(x\) or \(\epsilon\)) from each bracket and multiplying them all together, then adding up all the different ways to do so. To start, choose \(x\) from every bracket. This gives us the leading term, \(x^n\). So we can write
\[ \underbrace{(x+\epsilon) \cdot (x+\epsilon) \cdots (x+\epsilon)}_{n \text{ brackets}} = x^n + \text{other terms} \]
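We can calculate the next term by choosing \(\epsilon\) from one bracket and \(x\) from each of the remaining \(n-1\) brackets. Since there are \(n\) brackets, we have \(n\) "different" \(\epsilon\) terms to choose from. This means that
\[ \underbrace{(x+\epsilon) \cdot (x+\epsilon) \cdots (x+\epsilon)}_{n \text{ brackets}} = x^n + n x^{n-1} \epsilon + \text{other terms} \]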
The next term has the form \(x^{n-2}\epsilon^2\), which is trivially 0 in the dual numbers, so the computation can stop here. In a number system where \(\epsilon^2 \neq 0\), we can continue the argument by choosing two different \(\epsilon\) terms from the \(n\) brackets. There are \(n(n-1)\) ways to do this in order, since an \(\epsilon\) in a given bracket can't be chosen twice. However, given any pair of brackets there are two different ways to "choose" them: one before the other and vice versa. We don't want to count these as separate instances, so we need to correct by a factor of \(1/2\). Doing this gives
\[ \underbrace{(x+\epsilon) \cdot (x+\epsilon) \cdots (x+\epsilon)}_{n \text{ brackets}} = x^n + n x^{n-1} \epsilon + \frac{n(n-1)}{2} x^{n-2} \epsilon^2 + \text{other terms} \]
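The counting argument can be checked by brute force: list every way of picking one member from each bracket and tally how many \(\epsilon\) terms each pick contains. This is only an illustrative sketch; the value \(n = 4\) and the variable names are arbitrary choices made here.

```python
from collections import Counter
from itertools import product
from math import comb

n = 4  # number of brackets

# Each tuple records which member ("x" or "eps") was picked from each bracket.
counts = Counter(
    choice.count("eps") for choice in product(("x", "eps"), repeat=n)
)

for k in range(n + 1):
    print(f"x^{n - k} eps^{k}: {counts[k]} ways, binomial coefficient {comb(n, k)}")
```

For one \(\epsilon\) the tally is \(n\), and for two it is \(n(n-1)/2\), matching the coefficients derived above.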
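Following this argument for even higher powers of \(\epsilon\) will yield the binomial theorem. In the dual numbers, we may terminate at the second step, where we got \((x+\epsilon)^n = x^n + n x^{n-1}\epsilon\). Dividing formally by \(\epsilon\) (that is, reading off the coefficient of \(\epsilon\), since \(\epsilon\) has no true inverse), this means that
\[ \frac{d}{dx} x^n = \frac{(x+\epsilon)^n - x^n}{\epsilon} = \frac{x^n + n x^{n-1}\epsilon - x^n}{\epsilon} = n x^{n-1} \]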
This hints at a general way to calculate derivatives in the dual numbers:
\[ f(x+\epsilon) - f(x) = \epsilon \frac{d}{dx} f(x) \]
For example, given a function \(p(x) = x^3 - 4x\), we can calculate its derivative indirectly by first finding \(p(x+\epsilon)\):
\[ p(x+\epsilon) = (x+\epsilon)^3 - 4(x+\epsilon) = x^3 - 4x + (3x^2 - 4)\epsilon \]
In this case, \(p'(x) = 3x^2 - 4\), as expected.
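The same calculation can be carried out mechanically: evaluate \(p\) at \(x + \epsilon\) and read off the coefficient of \(\epsilon\). The sketch below is one possible way to do this, not a reference implementation; the class `Dual` and the helper `derivative` are hypothetical names introduced for this example.

```python
class Dual:
    """Dual number a + b*eps with eps**2 == 0 (illustrative helper)."""

    def __init__(self, a, b=0):
        self.a, self.b = a, b

    @staticmethod
    def lift(value):
        return value if isinstance(value, Dual) else Dual(value)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.a + other.a, self.b + other.b)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.a - other.a, self.b - other.b)

    def __mul__(self, other):
        other = Dual.lift(other)
        # The eps^2 term is dropped because eps**2 == 0.
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    __rmul__ = __mul__

    def __pow__(self, n):
        # Multiply n brackets together, as in the expansion of (x + eps)^n.
        result = Dual(1)
        for _ in range(n):
            result = result * self
        return result


def derivative(f, x):
    """Coefficient of eps in f(x + eps)."""
    return f(Dual(x, 1)).b


def p(x):
    return x ** 3 - 4 * x


print(derivative(p, 2))  # 8, which equals 3 * 2**2 - 4
```

Nothing in `p` refers to dual numbers; the derivative falls out of evaluating the ordinary formula on a different kind of number.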
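We can go further with this idea to prove the product rule. Given the functions \(f(x)\) and \(g(x)\), how do we calculate \(\frac{d}{dx}\big(f(x)g(x)\big)\)? With dual numbers, this amounts to finding \(f(x+\epsilon)g(x+\epsilon)\):
\[
\begin{aligned}
f(x+\epsilon)\,g(x+\epsilon) &= \left(f(x) + \epsilon \frac{d}{dx} f(x)\right)\left(g(x) + \epsilon \frac{d}{dx} g(x)\right) \\
&= f(x)g(x) + \epsilon\left(g(x)\frac{d}{dx} f(x) + f(x)\frac{d}{dx} g(x)\right)
\end{aligned}
\]
This confirms that \(\frac{d}{dx}\big(f(x)g(x)\big) = g(x)\frac{d}{dx} f(x) + f(x)\frac{d}{dx} g(x)\). The chain rule, likewise, can be found by expanding \(f(g(x+\epsilon))\):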
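\[
\begin{aligned}
f(g(x+\epsilon)) &= f\!\left(g(x) + \epsilon \frac{d}{dx} g(x)\right) = f(y + \eta) \\
&= f(y) + \eta \frac{d}{dy} f(y) \\
&= f(g(x)) + \epsilon\left(\frac{d}{dx} g(x) \cdot \frac{d f(g(x))}{d g(x)}\right)
\end{aligned}
\]
Here the substitutions \(y = g(x)\) and \(\eta = \epsilon \frac{d}{dx} g(x)\) simplify the calculation. Since \(\eta\) is a multiple of \(\epsilon\), we have \(\eta^2 = 0\), so the rule \(f(y+\eta) = f(y) + \eta \frac{d}{dy} f(y)\) still applies.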
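Both rules can be spot-checked numerically by representing \(f(x) + \epsilon f'(x)\) as a plain pair of floats. This is a rough sketch with made-up helper names (`dual_mul`, `lift`) and arbitrarily chosen test functions \(x^2\) and \(\sin x\); it illustrates the algebra above rather than any particular library.

```python
import math


def dual_mul(u, v):
    """(f + f' eps)(g + g' eps) = f g + (g f' + f g') eps, since eps**2 == 0."""
    (f, df), (g, dg) = u, v
    return (f * g, g * df + f * dg)


def lift(func, dfunc):
    """Extend func to pairs via func(y + eta eps) = func(y) + eta func'(y) eps."""
    def lifted(u):
        y, eta = u
        return (func(y), eta * dfunc(y))
    return lifted


square = lift(lambda y: y * y, lambda y: 2 * y)
sine = lift(math.sin, math.cos)

x = 1.5
x_dual = (x, 1.0)  # the dual number x + eps

# Product rule: d/dx [x^2 sin(x)] = 2x sin(x) + x^2 cos(x)
_, product_slope = dual_mul(square(x_dual), sine(x_dual))
print(product_slope, 2 * x * math.sin(x) + x * x * math.cos(x))

# Chain rule: d/dx [sin(x^2)] = 2x cos(x^2)
_, chain_slope = sine(square(x_dual))
print(chain_slope, 2 * x * math.cos(x * x))
```

Each `print` shows the \(\epsilon\) coefficient next to the textbook formula; the two numbers on each line should agree up to floating-point rounding.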