The derivative of a function $f$ is another function, $f'$, defined as
$$f'(x) \;\equiv\; \frac{df}{dx} \;\equiv\; \lim_{\delta x \rightarrow 0} \, \frac{f(x + \delta x) - f(x)}{\delta x}.$$This kind of expression is called a limit expression because it involves a limit (in this case, the limit where $\delta x$ goes to zero).
If the derivative exists within some domain of $x$ (i.e., the above limit expression is mathematically well-defined), then we say $f$ is differentiable in that domain. It can be shown that a differentiable function is automatically continuous. (Try proving it!)
If $f$ is differentiable, we can define its second-order derivative $f''$ as the derivative of $f'$. Third-order and higher-order derivatives are defined similarly.
Graphically, the derivative represents the slope of the graph of $f(x)$, while the second derivative represents the curvature. The interactive graph below shows the function $f(x) = x^2 + 1$, illustrating how its derivative $f'(x) = 2x$ varies with $x$. Note that the curve is upward-curving, corresponding to the fact that the second derivative, $f''(x) = 2$, is positive.
In [3]:
## Run this code cell to generate an interactive graph illustrating the derivative concept.
%matplotlib inline
from ipywidgets import interact, FloatSlider
from numpy import linspace
import matplotlib.pyplot as plt
def plot_derivative(x0):
    xmin, xmax, ymin, ymax = 0., 4.5, 0., 18.
    x = linspace(xmin, xmax, 100)
    y = x**2 + 1
    ## Plot y=x^2+1, and a triangle at x = x0, with base length 1 and height dy/dx = 2x
    plt.figure(figsize=(10,5))
    plt.plot(x, y)
    x1, y0, dydx = x0+1, x0*x0+1, 2*x0
    plt.plot([x0, x1, x1, x0], [y0, y0, y0+dydx, y0], color='red')
    plt.plot([x0, x0], [0, y0], '--', color='grey')
    ## Axes, text labels, etc.
    plt.text(x1+0.05, y0+0.5*dydx, "f ' (x0) = {0:.2f}".format(dydx))
    plt.text(x0+0.5, y0-1.0, '1')
    plt.axis([xmin, xmax, ymin, ymax])
    plt.xlabel('x'); plt.ylabel('f(x)')
    plt.show()
interact(plot_derivative, x0=FloatSlider(min=1.0, max=2.8, step=0.1, value=1.0));
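As a quick numerical check of the statement that $f''(x) = 2$ for this function, the following sketch estimates the second derivative of $f(x) = x^2 + 1$ with a central finite difference (the step size $dx$ is chosen small, but not so small that rounding error dominates):
In [ ]:
## Rough numerical check that the second derivative of x^2 + 1 equals 2,
## using the central difference (f(x+dx) - 2 f(x) + f(x-dx)) / dx^2.
def f(x):
    return x**2 + 1

x, dx = 1.5, 1e-4
print((f(x+dx) - 2*f(x) + f(x-dx)) / dx**2)   # should be close to 2.0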
Since derivatives are defined using limit expressions, let us review the rules governing limits.
First, the limit of a linear superposition is equal to the linear superposition of limits. Given two constants $a_1$ and $a_2$ and two functions $f_1$ and $f_2$,
$$\lim_{x \rightarrow c} \big[a_1 \,f_1(x) \;+\; a_2\, f_2(x)\big] = a_1 \lim_{x \rightarrow c} f_1(x) \;+\; a_2 \lim_{x \rightarrow c} f_2(x).$$Second, limits obey a product rule and a quotient rule:
$$\begin{aligned}\lim_{x \rightarrow c} \big[\;f_1(x) \; f_2(x)\;\big] &= \Big[\lim_{x \rightarrow c} f_1(x)\Big]\;\Big[\lim_{x \rightarrow c} f_2(x)\Big] \\ \lim_{x \rightarrow c} \left[\;\frac{f_1(x)}{f_2(x)}\;\right] &= \frac{\lim_{x \rightarrow c} f_1(x)}{\lim_{x \rightarrow c} f_2(x)}. \end{aligned}$$As a special exception, the product rule and quotient rule are inapplicable if they result in $0 \times \infty$, $\infty/\infty$, or $0/0$, which are undefined. As an example of why such combinations are problematic, consider this:
$$\lim_{x \rightarrow 0} \;x = \lim_{x \rightarrow 0}\Big[\,x^2\;\frac{1}{x}\;\Big] \overset{?}{=} \lim_{x \rightarrow 0}\Big[\,x^2\,\Big]\; \lim_{x \rightarrow 0}\Big[\,\frac{1}{x}\,\Big] = 0 \, \times \, \infty \;\;(??)$$In fact, the limit expression has the value of 0; it was not correct to apply the product rule in the second step.
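To see numerically that a "$0 \times \infty$" combination can come out to anything, here is a small illustrative sketch that evaluates three such products at progressively smaller $x$; one tends to 0, one stays at 1, and one blows up:
In [ ]:
## Products of a factor tending to 0 and a factor tending to infinity
## can approach 0, a finite number, or infinity, depending on the functions.
for x in [1e-2, 1e-4, 1e-6]:
    print(x**2 * (1/x), x * (1/x), x * (1/x**2))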
It is often convenient to use the computer to check the correctness of a limit expression. This can be done by replacing infinitesimal quantities with numbers that are nonzero but very small. For example, here is a quick way to check that the derivative of $x^2 + 1$ is $2x$:
In [11]:
x = 2.0
dx = 1e-8
derivative_estimate = (((x+dx)**2 + 1) - (x**2 + 1)) / dx
print(derivative_estimate)
print(2*x)
Using the rules for limit expressions, we can derive the elementary composition rules for derivatives:
$$\begin{aligned}\frac{d}{dx}\big[\,\alpha\, f(x) + \beta\, g(x)\,\big] &= \alpha\, f'(x) + \beta\, g'(x) \quad &\textrm{(linearity)}& \\ \frac{d}{dx}\big[\,f(x) \, g(x)\,\big] &= f(x) \, g'(x) + f'(x) \, g(x) &\textrm{(product}\;\textrm{rule)}& \\ \frac{d}{dx}\big[\,f(g(x))\,\big] &= f'(g(x)) \, g'(x) &\textrm{(chain}\;\textrm{rule)}&\end{aligned}$$These can all be proven by direct substitution into the definition of the derivative, and taking appropriate orders of limits. With the aid of these rules, we can prove various standard results, such as the "power rule" for derivatives: $$\frac{d}{dx} \big[x^n\big] = n x^{n-1}, \;\;n \in \mathbb{N}.$$
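These rules are easy to spot-check numerically, in the same way as the finite-difference check above. For instance, the following sketch compares a finite-difference estimate of $\frac{d}{dx}\big[(x^2+1)\,x^3\big]$ against the product-rule prediction $(x^2+1)\cdot 3x^2 + 2x \cdot x^3$ (the choice of functions and of $x$ is arbitrary):
In [ ]:
## Numerical spot-check of the product rule for f(x) = x^2 + 1 and g(x) = x^3.
x = 2.0
dx = 1e-8
h = lambda t: (t**2 + 1) * t**3              # h(x) = f(x) g(x)
print((h(x+dx) - h(x)) / dx)                 # finite-difference estimate
print((x**2 + 1) * 3*x**2 + 2*x * x**3)      # product-rule prediction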
The linearity of the derivative operation implies that derivatives "commute" with sums, i.e. you can move them to the left or right of summation signs. This is a very useful feature. For example, we can use it to prove that the exponential function is its own derivative, as follows:
$$\frac{d}{dx} \left[\exp(x)\right] = \frac{d}{dx} \sum_{n=0}^\infty\frac{x^n}{n!} = \sum_{n=0}^\infty\frac{d}{dx} \, \frac{x^n}{n!} = \sum_{n=1}^\infty \frac{x^{n-1}}{(n-1)!} =\exp(x)$$Derivatives also commute with limits. For example, we can use this on the alternative definition of the exponential function:
$$\begin{aligned}\frac{d}{dx} \left[\exp(x)\right] &= \frac{d}{dx} \lim_{n\rightarrow\infty} \left(1+\frac{x}{n}\right)^n = \lim_{n\rightarrow\infty} \frac{d}{dx} \left(1+\frac{x}{n}\right)^n \\ &= \lim_{n\rightarrow\infty} \left(1+\frac{x}{n}\right)^{n-1} \;= \exp(x)\end{aligned}$$
In [4]:
## Check that the derivative of exp(x) is exp(x):
x = 2.0
dx = 1e-8
from numpy import exp
derivative_estimate = (exp(x+dx) - exp(x)) / dx
print(derivative_estimate)
print(exp(x))
A function is infinitely differentiable at a point $x_0$ if all orders of derivatives (i.e., the first derivative, the second derivative, etc.) are well-defined at $x_0$. If a function is infinitely differentiable at $x_0$, then near that point it can be expanded in a Taylor series:
$$\begin{aligned}f(x) \;&\leftrightarrow\; \sum_{n=0}^\infty \frac{(x-x_0)^n}{n!} \left[\frac{d^nf}{dx^n}\right](x_0) \\&=\; f(x_0) + (x-x_0)\, f'(x_0) + \frac{1}{2}\, (x-x_0)^2\, f''(x_0) + \cdots\end{aligned}$$Here, the "zeroth derivative" refers to the function itself. The Taylor series can be derived by assuming that $f(x)$ can be written as a general polynomial involving terms of the form $(x-x_0)^n$, and then using the definition of the derivative to find the series coefficients.
Many commonly encountered functions have Taylor series that are exact (i.e., the series is convergent and equal to the value of the function itself) over some portion of their domain. But beware: it is possible for a function to have a divergent Taylor series, or a Taylor series that converges to a different value than the function itself! The conditions under which this happens are a complicated topic that we will not delve into.
| Useful Taylor Series | |
| --- | --- |
| $$\displaystyle\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots \quad \mathrm{for} \; \lvert x\rvert < 1$$ | $$\displaystyle\ln(1-x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \cdots \quad \mathrm{for} \; \lvert x\rvert < 1$$ |
| $$\displaystyle\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$ | $$\displaystyle\sinh(x) = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \frac{x^7}{7!} + \cdots$$ |
| $$\displaystyle\cos(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$ | $$\displaystyle\cosh(x) = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \frac{x^6}{6!} + \cdots$$ |
In the above table, apart from the first row, the other four Taylor series converge to the value of the function for all $x\in\mathbb{R}$.
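If the sympy library is available (an assumption; it is not used elsewhere in these notes), the entries of this table can be reproduced automatically:
In [ ]:
## Reproduce a few entries of the table using sympy's series expansion
## (assumes sympy is installed).
from sympy import symbols, series, sin, cos, ln
x = symbols('x')
print(series(sin(x), x, 0, 8))      # x - x**3/6 + x**5/120 - x**7/5040 + O(x**8)
print(series(cos(x), x, 0, 8))      # 1 - x**2/2 + x**4/24 - x**6/720 + O(x**8)
print(series(ln(1 - x), x, 0, 5))   # -x - x**2/2 - x**3/3 - x**4/4 + O(x**5)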
The following interactive plots compare the functions $\sin(x)$ and $\ln(x+1)$ to their series expansions:
In [6]:
## Compare sin(x) to its series expansion
%matplotlib inline
from ipywidgets import interact, IntSlider
from numpy import linspace, zeros, sin
from scipy.special import factorial
import matplotlib.pyplot as plt
def plot_sine_series(N):
    x = linspace(0, 25, 200)
    y = sin(x)
    ys = zeros(len(x))
    for n in range(1, N+1, 2): # sum over 1, 3, 5, ..., N
        ys += x**n * (-1)**((n-1)//2) / factorial(n)
    plt.figure(figsize=(10,5))
    plt.plot(x, y, color='blue', label='Exact')
    plt.plot(x, ys, color='red', label='Power series')
    plt.title('Power series for sin(x), summed to order {}'.format(N))
    plt.axis([0, 25, -2, 2])
    plt.xlabel('x'); plt.ylabel('y')
    plt.legend(loc='lower right')
    plt.show()
interact(plot_sine_series, N=IntSlider(min=1, max=59, step=2, value=3));
In [7]:
## Compare ln(x+1) to its series expansion
%matplotlib inline
from ipywidgets import interact, IntSlider
from numpy import linspace, zeros, log
import matplotlib.pyplot as plt
def plot_log_series(N):
    x = linspace(-0.99, 1.5, 200)
    y = log(x+1)
    xs = linspace(-2, 2, 200)
    ys = zeros(len(xs))
    for n in range(1, N+1):
        ys -= (-xs)**n / n
    plt.figure(figsize=(10,5))
    plt.plot(x, y, color='blue', label='Exact')
    plt.plot(xs, ys, color='red', label='Power series')
    plt.title('Power series for ln(x+1), summed to order {}'.format(N))
    plt.axis([-2, 1.5, -4, 3])
    plt.xlabel('x'); plt.ylabel('y')
    plt.legend(loc='lower right')
    plt.show()
interact(plot_log_series, N=IntSlider(min=1, max=59, step=1, value=1));
A differential equation is an equation that involves derivatives of a function. For example, here is a differential equation involving $f$ and its first derivative:
$$\frac{df}{dx} = f(x)$$This is called an ordinary differential equation because it involves a derivative with respect to a single variable $x$, rather than multiple variables.
Finding a solution for the differential equation means finding a function that satisfies the equation. There is no single method for solving differential equations. In some cases, we can guess the solution; for example, by trying different elementary functions, we can discover that the above differential equation can be solved by
$$f(x) = A \exp(x).$$Certain classes of differential equation can be solved using techniques like Fourier transforms, Green's functions, etc., some of which will be taught in this course. On the other hand, many differential equations simply have no known exact analytic solution.
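Even when no analytic solution is known, a differential equation can be solved numerically. As a sketch (assuming scipy is installed, as it already is for scipy.special above), the following cell integrates $df/dx = f$ with $f(0) = 1$ and compares the result against $\exp(x)$:
In [ ]:
## Numerically integrate df/dx = f with f(0) = 1, and compare to exp(x)
## (assumes scipy is available).
from scipy.integrate import solve_ivp
from numpy import exp, linspace
sol = solve_ivp(lambda x, f: f, (0, 2), [1.0], t_eval=linspace(0, 2, 5), rtol=1e-8)
print(sol.y[0])      # numerical solution at x = 0, 0.5, 1.0, 1.5, 2.0
print(exp(sol.t))    # exact solution with A = 1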
| Example |
| --- |
| The following differential equation describes a damped harmonic oscillator: $$\frac{d^2 x}{dt^2} + 2\gamma\frac{dx}{dt} + \omega_0^2 x(t) = 0.$$ In this case, note that $x(t)$ is the function, and $t$ is the input variable. This is unlike our previous notation where $x$ was the input variable, so don't get confused! This equation is obtained by applying Newton's second law to an object moving in one dimension subject to both a damping force and a restoring force, with $x(t)$ representing the position as a function of time. |
When confronted with an ordinary differential equation, the first thing to look for is the highest derivative appearing in the equation. This is called the order of the differential equation. If the equation has order $N$, then its general solution contains $N$ free parameters that can be assigned any value (this is similar to the concept of integration constants, which we'll discuss shortly). Therefore, if you happen to guess a solution, but that solution does not contain $N$ free parameters, then you know the solution isn't the most general one.
For example, the ordinary differential equation
$$\frac{df}{dx} = f(x)$$has order one. We have previously guessed the solution $f(x) = A \exp(x)$, which has one free parameter, $A$. So we know our work is done: there is no solution more general than the one we found.
A specific solution to a differential equation is a solution containing no free parameters. One way to get a specific solution is to start from a general solution, and assign actual values to each of the free parameters. In physics problems, the assigned values are commonly determined by boundary conditions. For example, you may be asked to solve a second-order differential equation given the boundary conditions $f(0) = a$ and $f(1) = b$; alternatively, you might be given the boundary conditions $f(0) = c$ and $f'(0) = d$, or any other combination of two conditions. For an ordinary differential equation of order $N$, we need $N$ conditions to define a specific solution.
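For instance, the damped harmonic oscillator equation from the earlier example is of order two, so fixing $x(0)$ and $x'(0)$ picks out a specific solution. The sketch below does this numerically (the values of $\gamma$, $\omega_0$, and the two conditions are arbitrary illustrative choices; it again assumes scipy is available):
In [ ]:
## Specific solution of x'' + 2 gamma x' + omega0^2 x = 0,
## determined by the two conditions x(0) = 1 and x'(0) = 0.
## (Parameter values are arbitrary illustrative choices.)
from scipy.integrate import solve_ivp
from numpy import linspace
gamma, omega0 = 0.1, 1.0
def oscillator(t, y):
    x, v = y                                  # y = [position, velocity]
    return [v, -2*gamma*v - omega0**2 * x]
sol = solve_ivp(oscillator, (0, 10), [1.0, 0.0], t_eval=linspace(0, 10, 6))
print(sol.t)
print(sol.y[0])   # x(t) at the sampled times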
So far, we have focused on functions which take a single input. Functions can also take multiple inputs; for instance, a function $f(x,y)$ takes two input numbers, $x$ and $y$, and outputs a single number. In general, the inputs are allowed to vary independently of one another. The partial derivative of such a function is its derivative with respect to one of its inputs, keeping the others fixed. For example,
$$f(x,y) = \sin(2x - 3 y^2)$$has partial derivatives
$$\frac{\partial f}{\partial x} = 2\cos(2x-3y^2), \quad \frac{\partial f}{\partial y} = - 6y\cos(2x-3y^2).$$We have previously seen that single-variable functions obey a derivative composition rule,
$$\frac{d}{dx}\, f\big(g(x)\big) = g'(x) \, f'\big(g(x)\big).$$This composition rule has an important generalization for partial derivatives, which is related to the physical concept of a change of coordinates. Suppose a function $f(x,y)$ takes two inputs $x$ and $y$, and we wish to express them using a different coordinate system denoted by $u$ and $v$. In general, each coordinate in the old system depends on both coordinates in the new system:
$$x = x(u,v), \quad y = y(u,v).$$Expressed in the new coordinates, the function is
$$F(u,v) \equiv f\big(x(u,v), y(u,v)\big).$$It can be shown that the transformed function's partial derivatives obey the composition rule
$$\begin{aligned}\frac{\partial F}{\partial u} &= \frac{\partial f}{\partial x} \frac{\partial x}{\partial u} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial u}\\ \frac{\partial F}{\partial v} &= \frac{\partial f}{\partial x} \frac{\partial x}{\partial v} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial v}.\end{aligned}$$On the right-hand side of these equations, the partial derivatives are to be expressed in terms of the new coordinates $(u,v)$. For example,
$$\frac{\partial f}{\partial x} = \left.\frac{\partial f}{\partial x}\right|_{x = x(u,v), \;y= y(u,v)}$$The generalization of this rule to more than two inputs is straightforward. For a function $f(x_1, \dots, x_N)$, a change of coordinates $x_i = x_i(u_1, \dots, u_N)$ involves the composition
$$F(u_1, \dots, u_N) = f\big(x_1(u_1,\dots,u_N), \, \dots, \, x_N(u_1,\dots,u_N)\big), \quad \frac{\partial F}{\partial u_i} = \sum_{j=1}^N \frac{\partial x_j}{\partial u_i} \frac{\partial f}{\partial x_j}.$$
| Example |
| --- |
| In two dimensions, Cartesian and polar coordinates are related by $$x = r\cos\theta, \quad y = r\sin\theta.$$ Given a function $f(x,y)$, we can re-write it in polar coordinates as $F(r,\theta)$. The partial derivatives are related by $$\begin{aligned}\frac{\partial F}{\partial r} &= \frac{\partial f}{\partial x} \frac{\partial x}{\partial r} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial r} = \frac{\partial f}{\partial x} \cos\theta + \frac{\partial f}{\partial y} \sin\theta, \\ \frac{\partial F}{\partial \theta} &= \frac{\partial f}{\partial x} \frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y} \frac{\partial y}{\partial \theta} = -\frac{\partial f}{\partial x} r\,\sin\theta + \frac{\partial f}{\partial y} r\cos\theta.\end{aligned}$$ |
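This relation can be spot-checked symbolically for a specific test function. The sketch below (again assuming sympy is available) applies it to $f(x,y) = x^2 y$ and confirms that both chain-rule expressions match the direct derivatives of $F(r,\theta)$:
In [ ]:
## Spot-check the polar-coordinate chain rule on the test function f(x, y) = x^2 y
## (assumes sympy is available).
from sympy import symbols, sin, cos, diff, simplify
r, theta = symbols('r theta', positive=True)
x, y = r*cos(theta), r*sin(theta)        # x(r,theta), y(r,theta)
F = x**2 * y                             # F(r,theta) = f(x(r,theta), y(r,theta))
fx = 2*x*y                               # df/dx, evaluated at (x(r,theta), y(r,theta))
fy = x**2                                # df/dy, evaluated at (x(r,theta), y(r,theta))
print(simplify(diff(F, r)     - (fx*cos(theta) + fy*sin(theta))))        # 0
print(simplify(diff(F, theta) - (-fx*r*sin(theta) + fy*r*cos(theta))))   # 0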
A partial differential equation is a differential equation involving multiple partial derivatives (as opposed to an ordinary differential equation, which involves derivatives with respect to a single variable). An example of a partial differential equation encountered in physics is Laplace's equation,
$$\frac{\partial^2 \Phi}{\partial x^2} + \frac{\partial^2 \Phi}{\partial y^2} + \frac{\partial^2 \Phi}{\partial z^2}= 0,$$which describes the electrostatic potential $\Phi(x,y,z)$ at position $(x,y,z)$, in the absence of any electric charges.
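For example, $\Phi(x,y,z) = x^2 + y^2 - 2z^2$ is one of the (infinitely many) solutions, as the following sketch confirms (assuming sympy is available):
In [ ]:
## Check that Phi = x^2 + y^2 - 2 z^2 satisfies Laplace's equation
## (assumes sympy is available).
from sympy import symbols, diff
x, y, z = symbols('x y z')
Phi = x**2 + y**2 - 2*z**2
print(diff(Phi, x, 2) + diff(Phi, y, 2) + diff(Phi, z, 2))   # prints 0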
Partial differential equations are considerably harder to solve than ordinary differential equations. In particular, their boundary conditions are more complicated to specify: whereas each boundary condition for an ordinary differential equation consists of a single number (e.g., the value of $f(x)$ at some point $x = x_0$), each boundary condition for a partial differential equation consists of a function (e.g., the values of $\Phi(x,y,z)$ along some surface $g(x,y,z) = 0$).