Newton’s Method

(This post appeared on my previous blog, Not Just Numbers)

Let’s look at a very common problem where the use of math is needed, solving equations. We have an expression using some variable x on one side of the ‘=’ sign and another expression using x on the other. We want to find for which values of x the equation holds.

As you may recall from middle/high school, there are several methods of solving equations. if the equation is linear, so of the form ax+b=c, then you subtract b and divide by a. If the equation is quadratic, so of the form ax²+bx+c, you can use the quadratic formula. And if the equation is trignometric, say of the form sin(θ)=c, you hope you remembered the common sine identities rights.

But equations are not always that nice. In the meanest test by the meanest teacher, you may find an equation where you need to use dozens of methods to get a simpler equations, and hope you didn’t forget a minus sign along the way. Even worse, a lot of the time equations cannot be solved just with conventional methods, and finding the mathematical expression for their solution requires defining a completely new function!

This is, however, okay. Mathematics, unlike what some people might tell you, is not about solving equations and finding formulas, not at all. When the solutions of complicated equations do actually need to be found, whether in mathematics or its applications, it is usually better to approximate the solution instead.

Another benafit for approximating solutions to equations is that, unlike solving them mathematicaly, the result is an exact decimal expression, and not a mathematical one. So say we want to find the decimal value of the square root of two, so the solution to the equation x²=2. Then we get a decimal value for the solution, instead of the mathematical one √2 (which does not let us know, for example, whether is greater or less than 1.4).

Approximating has been part of mathematics dating back to the Babylonians thousands of years ago. Nowadays, approximation of constants and solutions of equation is one of the topics of mathematics called Numeric Analysis. Alongside finding methods to approximate solutions, numeric analysis also deals with finding when certain methods are more efficient and how efficient they are.

To be more precise, the part of numeric analysis I am referring to is finding roots of functions, but this is equivalent to finding solutions to equations. Indeed, every equation using one variable x can be converted to the form f(x)=0, where f(x) is some function, by subtracting one side of the equation for the other. So the problem of solving the equations becomes the problem of finding the root of the function f(x).

For those of you unfamiliar with what a function is, at its simplest function is a mathematical object f (usually written as f(x)) where you can plug in some number x and get a new number f(x). We want to find the root of f, so we want to find for which values of x we have f(x)=0.

A very common and very useful method for approximating roots of functions is, as this post’s title suggests, Newton’s Method (also known as the Newton-Raphson method). Named after the known scientist Isaac Newton, the method was initial used by him to find the roots of polynomials (functions that are sums of elements of the form axⁿ where a, n are constants, n is a positive integer). But over the years the method has been extended to all sorts of functions using (Newton’s own) calculus.

Overview of the method

Let’s look at a simple example to see how the method works. Say we want to find √2, so the root of f(x)=x²-2:

Let’s zoom in more at where the positive root is:

The first step in Newton’s method (and in any iterative method for approximating roots) is to take an initial guess for the root, x₀. For this example, let us take x₀=2 as the initial guess: (well, not that anyone in their right mind would say √2=2, but it is close enough to it for the method to work)

What we do now, and what defines Newton’s method, is to draw the tangent to the graph at x=x₀. Then we find out where the tangent intersects the x axis and choose x₁ to be the point of intersection:

(as it turns out, for this example x₁ is exactly 1.5)

We repeat the same step again, this time with x₁ instead of x₀, and get a new point x₂. We repeat this again but with x₂, and continue doing this so we get a sequence of points x₀, x₁, x₂, x₃, x₄,… . In most cases, it is guaranteed for all roots of the function, if we choose a starting point close enough to the root, the sequence will converge to the root!

graph5wd — It’s starting to get heard to see what’s going on since the graphs of the function and the tangent are so close to each other.

For this example, just doing the iteration twice gives us x₃=1.414215…, an error of less than five decimal places from √2=1.4142135…! Of course, if we do more and more iterations the number of correct decimal places will continue to increase.

How do we find the slope?

How exactly do we find the tangent line? Well, we can use calculus for help. As you may recall, the derivative of a function f at a certain point is also the slope of the tangent to the function at that point. So the slope of the tangent at point x* is f’(x), and as the line with slope f'(x*) passing through the point (x*,f(x*)), the equation for the tangent will be:

Putting in y=0, we can find the point of intersection of the tangent with the x axis, which we will call x**:
In particular, returning to the sequence x₀, x₁, x₂, x₃, x₄,… of the iteration, if we let the nth value in the iteration be x_n, we get the following relation:

This type of relation, where the next values in a sequence are given by the previous ones, is called in mathematics a recursive relation.

What happens if we land on a point where the derivative is zero. Well ~~the universe explodes because we divided by zero~~ we have to stop the process and move to another nearby point to start again. Luckily, for most applications of the method the points where the derivative is zero are far from each other and rarely will we run in to one. Should the root have a zero derivative, the method will still work, although at a slower pace than usual.

The Newton Fractal

For a blog aiming to show that math is more than just numbers and equations, that was a lot of…. well numbers and equations. But first of all, the idea is the method itself, not the equations surrounding it. Secondly, although explaining how the method works is important for me, my main motivation for this post is to show how this method leads to the Newton Fractal

Suppose we have some function f, and we want to use the Newton method to find its roots. Consider the following question: for which starting points (so for which x₀) does the method converge to a root of f? If it does, how many iterations does it require? More than that, we want to see what the set of points for which the method converges looks like. But since using the method on just the real number line will yield us a one dimesional picture, we will use the method in the complex plane.

For does of you unfamiliar, complex numbers are numbers of the form z=x+i*y, where x and y are real numbers (i.e, numbers on the number line, such as 0, 1.2, and π), and are called the real and imaginary components of z respectively. The number i is called the imaginary unit, and has the special property i^2=-1. Using these defenitions, we can add, multiply, and even define functions and derivatives on complex numbers. So, in particular, we can also use the Newton method on complex numbers.

Additionaly, complex numbers can be pictured as lying in the complex plane. The numbers real component will be on the x axis, and the imaginary component will be on the y axis.

So, returning to our question, and we plot the complex numbers for which the method converges, we get some beautiful results:

(The numbers on the side represent how many iterations it takes for the method to converge, or for it to get far enough from any root. Note that the points with the darkest shade of blue must be roots)

And this is the lesson I want you to learn from this post. Even though we started with a practical problem, solving (or rather, approximating solutions to) equations using the Newton method, we ended up here. By asking the question of for which points the method converges we ended up with the beauty of the Newton fractal.

So remember, even if you see something mathematical that seems ugly and overly complicated (say a really complex formula), what is important and the idea behind it. And behind that idea you might find something truly wonderful.