
Numerical methods challenge: Day 12

During October (2017) I will write a program per day for some well-known numerical methods in both Python and Julia. It is intended to be an exercise, so don't expect the code to be good enough for real use. Also, I should mention that I have almost no experience with Julia, so it probably won't be idiomatic Julia but more Python-like Julia.

Hermite interpolation: Inverting Vandermonde matrix

Today we are comparing Lagrange interpolation with Hermite interpolation. For this example we are using the confluent Vandermonde matrix \(V\). The code is based on some old code of mine that is available in this repo.

As in the case of Lagrange interpolation we solve the system

\begin{equation*} V\mathbf{c} = I \end{equation*}

where \(\mathbf{c}\) is the vector of coefficients and \(I\) is the identity matrix. This method is not stable and should not be used for the computation of higher-order interpolants, even for optimally chosen sampling. It will start failing at around 20 points. A better approach is to use the barycentric form of the interpolation.
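
For reference, here is a minimal sketch of that barycentric form in Python. It is not part of the original code, and bary_interp is just an illustrative name; it reproduces the Lagrange interpolant without ever forming the Vandermonde matrix.

import numpy as np


def bary_interp(x_nodes, f_nodes, x_eval):
    x_nodes = np.asarray(x_nodes, dtype=float)
    f_nodes = np.asarray(f_nodes, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    # Barycentric weights: w_j = 1 / prod_{m != j} (x_j - x_m)
    diff = x_nodes[:, None] - x_nodes[None, :]
    np.fill_diagonal(diff, 1.0)
    w = 1.0/np.prod(diff, axis=1)
    # Second (true) barycentric form of the interpolant
    d = x_eval[None, :] - x_nodes[:, None]
    exact = (d == 0.0)
    d[exact] = 1.0                      # avoid division by zero at the nodes
    num = (w[:, None]*f_nodes[:, None]/d).sum(axis=0)
    den = (w[:, None]/d).sum(axis=0)
    p = num/den
    rows, cols = np.nonzero(exact)      # evaluation points that hit a node
    p[cols] = f_nodes[rows]
    return p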

In the example below we use Chebyshev nodes. The nodes are given by

\begin{equation*} x_k = \cos\left(\frac{2k-1}{2n}\pi\right), \quad k = 1, \ldots, n \end{equation*}

where \(n\) is the number of nodes.
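
As a quick illustration (not part of the original post), these nodes can be generated directly with NumPy. Note that the script below actually uses the points \(-\cos(k\pi/(n-1))\), the Chebyshev extrema, which cluster near the endpoints in the same way.

import numpy as np

n = 7
k = np.arange(1, n + 1)
x_cheb = np.cos((2*k - 1)*np.pi/(2*n))     # Chebyshev nodes from the formula above
x_ext = -np.cos(np.linspace(0, np.pi, n))  # Chebyshev extrema, as used in the code below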

Following are the codes.

Python

from __future__ import division, print_function
import numpy as np
import matplotlib.pyplot as plt


def vander_mat(x):
    n = len(x)
    van = np.zeros((n, n))
    power = np.array(range(n))
    for row in range(n):
        van[row, :] = x[row]**power
    return van


def conf_vander_mat(x):
    n = len(x)
    conf_van = np.zeros((2*n, 2*n))
    power = np.array(range(2*n))
    for row in range(n):
        conf_van[row, :] = x[row]**power
        conf_van[row + n, :] = power*x[row]**(power - 1)
    return conf_van


def inter_coef(x, inter_type="lagrange"):
    if inter_type == "lagrange":
        vand_mat = vander_mat(x)
    elif inter_type == "hermite":
        vand_mat = conf_vander_mat(x)
    coef = np.linalg.solve(vand_mat, np.eye(vand_mat.shape[0]))
    return coef


def compute_interp(x, f, x_eval, df=None):
    n = len(x)
    if df is None:
        coef = inter_coef(x, inter_type="lagrange")
    else:
        coef = inter_coef(x, inter_type="hermite")
    f_eval = np.zeros_like(x_eval)
    nmat = coef.shape[0]
    for row in range(nmat):
        for col in range(nmat):
            if col < n or nmat == n:
                f_eval += coef[row, col]*x_eval**row*f[col]
            else:
                f_eval += coef[row, col]*x_eval**row*df[col - n]
    return f_eval




n = 7
x = -np.cos(np.linspace(0, np.pi, n))
f = lambda x: 1/(1 + 25*x**2)
df = lambda x: -50*x/(1 + 25*x**2)**2
x_eval = np.linspace(-1, 1, 500)
interp_f = compute_interp(x, f(x), x_eval, df=df(x))
plt.plot(x_eval, f(x_eval))
plt.plot(x_eval, interp_f)
plt.plot(x, f(x), ".")
plt.ylim(0, 1.2)
plt.show()

Julia

using PyPlot


function vander_mat(x)
    n = length(x)
    van = zeros(n, n)
    power = 0:n-1
    for row = 1:n
        van[row, :] = x[row].^power
    end
    return van
end


function conf_vander_mat(x)
    n = length(x)
    conf_van = zeros(2*n, 2*n)
    power = 0:2*n-1
    for row = 1:n
        conf_van[row, :] = x[row].^power
        conf_van[row + n, :] = power.*x[row].^(power - 1)
    end
    return conf_van
end


function inter_coef(x; inter_type="lagrange")
    if inter_type == "lagrange"
        vand_mat = vander_mat(x)
    elseif inter_type == "hermite"
        vand_mat = conf_vander_mat(x)
    end
    coef = vand_mat \ eye(size(vand_mat)[1])
    return coef
end


function compute_interp(x, f, x_eval; df=nothing)
    n = length(x)
    if df == nothing
        coef = inter_coef(x, inter_type="lagrange")
    else
        coef = inter_coef(x, inter_type="hermite")
    end
    f_eval = zeros(x_eval)
    nmat = size(coef)[1]
    for row = 1:nmat
        for col = 1:nmat
            if col <= n || nmat == n
                f_eval += coef[row, col]*x_eval.^(row - 1)*f[col]
            else
                f_eval += coef[row, col]*x_eval.^(row - 1)*df[col - n]
            end
        end
    end
    return f_eval
end


n = 7
x = -cos.(linspace(0, pi, n))
f = 1./(1 + 25*x.^2)
df = -50*x./(1 + 25*x.^2).^2
x_eval = linspace(-1, 1, 500)
interp_f = compute_interp(x, f, x_eval, df=df)
plot(x_eval, 1./(1 + 25*x_eval.^2))
plot(x_eval, interp_f)
plot(x, f, ".")
ylim(0, 1.2)
show()

In both cases the result is the plot below.

Hermite interpolation using Vandermonde matrix.

And, if we try with a high \(n\), say \(n=43\), we can see the problems.

Hermite interpolation using Vandermonde matrix.

Comparison Python/Julia

Regarding the number of lines, we have 61 in Python and 70 in Julia. The comparison in execution time is done with the %%timeit magic command in IPython and @benchmark in Julia.

For Python:

%%timeit -n 100
n = 7
x = -np.cos(np.linspace(0, np.pi, n))
f = lambda x: 1/(1 + 25*x**2)
df = lambda x: -50*x/(1 + 25*x**2)**2
x_eval = np.linspace(-1, 1, 500)
interp_f = compute_interp(x, f(x), x_eval, df=df(x))

with result

100 loops, best of 3: 18.1 ms per loop

For Julia:

function bench()
   n = 7
   x = -cos.(linspace(0, pi, n))
   f(x) = 1./(1 + 25*x.^2)
   df(x) = -50*x./(1 + 25*x.^2).^2
   x_eval = linspace(-1, 1, 500)
   interp_f = compute_interp(x, f(x), x_eval, df=df(x))
end
@benchmark bench()

with result

BenchmarkTools.Trial:
  memory estimate:  3.13 MiB
  allocs estimate:  836
  --------------
  minimum time:     10.318 ms (0.00% GC)
  median time:      10.449 ms (0.00% GC)
  mean time:        11.362 ms (1.74% GC)
  maximum time:     26.646 ms (0.00% GC)
  --------------
  samples:          100
  evals/sample:     1

In this case, we can say that the Python code is roughly as fast as the Julia one.

Comparison Hermite/Lagrange interpolation

We want to compare Hermite interpolation with Lagrange interpolation for the same number of degrees of freedom. We use the same function for test purposes

\begin{equation*} f(x) = \frac{1}{1 + 25x^2} \end{equation*}

This is the Python code that does that part

n_dof = np.array(range(1, 20))
error_herm = np.zeros(19)
error_lag = np.zeros(19)
for cont, n in enumerate(n_dof):
    f = lambda x: 1/(1 + 25*x**2)
    df = lambda x: -50*x/(1 + 25*x**2)**2
    x = -np.cos(np.linspace(0, np.pi, n))
    x2 = -np.cos(np.linspace(0, np.pi, 2*n))
    x_eval = np.linspace(-1, 1, 500)
    herm = compute_interp(x, f(x), x_eval, df=df(x))
    lag = compute_interp(x2, f(x2), x_eval)
    fun = f(x_eval)
    error_herm[cont] = np.linalg.norm(fun - herm)/np.linalg.norm(fun)
    error_lag[cont] = np.linalg.norm(fun - lag)/np.linalg.norm(fun)

plt.plot(2*n_dof, error_lag)
plt.plot(2*n_dof, error_herm)
plt.xlabel("Number of degrees of freedom")
plt.ylabel("Relative error")
plt.legend(["Lagrange", "Hermite"])
plt.show()

And this is the comparison of the relative errors

Comparison of errors between Lagrange and Hermite interpolation.

In general, the Lagrange approximation is closer for this function.

Numerical methods challenge: Day 11

During October (2017) I will write a program per day for some well-known numerical methods in both Python and Julia. It is intended to be an exercise, so don't expect the code to be good enough for real use. Also, I should mention that I have almost no experience with Julia, so it probably won't be idiomatic Julia but more Python-like Julia.

Lagrange interpolation: Inverting Vandermonde matrix

Today we have Lagrange interpolation, yet one more time. I will take a different approach to compute the interpolation: I will form the Vandermonde matrix \(V\) and solve the system

\begin{equation*} V\mathbf{c} = I \end{equation*}

where \(\mathbf{c}\) is the vector of coefficients and \(I\) is the identity matrix. This method, and the previous one, are not stable and should not be used for the computation of higher-order interpolants, even for optimally chosen sampling. It will start failing at around 40 points. A better approach is to use the barycentric form of the interpolation.
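
The instability can be quantified by the condition number of the Vandermonde matrix, which grows quickly with the number of nodes. Here is a minimal check, not part of the original post, that uses numpy.vander instead of the vander_mat function defined below.

import numpy as np

for n in (5, 10, 20, 40):
    x = -np.cos(np.linspace(0, np.pi, n))                # same nodes as in the script below
    print(n, np.linalg.cond(np.vander(x, increasing=True)))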

In the example below we use Chebyshev nodes. The nodes are given by

\begin{equation*} x_k = \cos\left(\frac{2k-1}{2n}\pi\right), \quad k = 1, \ldots, n \end{equation*}

where \(n\) is the number of nodes.

Following are the codes.

Python

from __future__ import division, print_function
import numpy as np
import matplotlib.pyplot as plt


def vander_mat(x):
    n = len(x)
    van = np.zeros((n, n))
    power = np.array(range(n))
    for row in range(n):
        van[row, :] = x[row]**power
    return van


def inter_coef(x):
    vand_mat = vander_mat(x)
    coef = np.linalg.solve(vand_mat, np.eye(len(x)))
    return coef


def compute_interp(x, f, x_eval):
    n = len(x)
    coef = inter_coef(x)
    f_eval = np.zeros_like(x_eval)
    for row in range(n):
        for col in range(n):
            f_eval += coef[row, col]*x_eval**row*f[col]
    return f_eval


n = 11
x = -np.cos(np.linspace(0, np.pi, n))
f = 1/(1 + 25*x**2)
x_eval = np.linspace(-1, 1, 500)
interp_f = compute_interp(x, f, x_eval)
plt.figure()
plt.plot(x_eval, 1/(1 + 25*x_eval**2))
plt.plot(x_eval, interp_f)
plt.plot(x, f, ".")
plt.ylim(0, 1.2)
plt.show()

Julia

using PyPlot


function vander_mat(x)
    n = length(x)
    van = zeros(n, n)
    power = 0:n-1
    for row = 1:n
        van[row, :] = x[row].^power
    end
    return van
end


function inter_coef(x)
    vand_mat = vander_mat(x)
    coef = vand_mat \ eye(length(x))
    return coef
end


function compute_interp(x, f, x_eval)
    n = length(x)
    coef = inter_coef(x)
    f_eval = zeros(x_eval)
    for row = 1:n
        for col = 1:n
            f_eval += coef[row, col]*x_eval.^(row - 1)*f[col]
        end
    end
    return f_eval
end


n = 11
x = - cos.(linspace(0, pi, n))
f = 1./(1 + 25*x.^2)
x_eval = linspace(-1, 1, 500)
interp_f = compute_interp(x, f, x_eval)
plot(x_eval, 1./(1 + 25*x_eval.^2))
plot(x_eval, interp_f)
plot(x, f, ".")
ylim(0, 1.2)
show()

In both cases the result is the plot below.

Lagrange interpolation using Vandermonde matrix.

And, if we try with a high \(n\), say \(n=45\), we can see the problems.

Lagrange interpolation using Vandermonde matrix.

Comparison Python/Julia

Regarding the number of lines, we have 41 in Python and 44 in Julia. The comparison in execution time is done with the %%timeit magic command in IPython and @benchmark in Julia.

For Python:

%%timeit -n 100
n = 11
x = -np.cos(np.linspace(0, np.pi, n))
f = 1/(1 + 25*x**2)
x_eval = np.linspace(-1, 1, 500)
interp_f = compute_interp(x, f, x_eval)

with result

100 loops, best of 3: 7.86 ms per loop

For Julia:

function bench()
   x = - cos.(linspace(0, pi, n))
   f = 1./(1 + 25*x.^2)
   x_eval = linspace(-1, 1, 500)
   interp_f = compute_interp(x, f, x_eval)
   return nothing
end
@benchmark bench()

with result

BenchmarkTools.Trial:
  memory estimate:  32.23 MiB
  allocs estimate:  8277
  --------------
  minimum time:     114.282 ms (1.50% GC)
  median time:      122.061 ms (1.46% GC)
  mean time:        129.733 ms (1.90% GC)
  maximum time:     163.716 ms (1.98% GC)
  --------------
  samples:          39
  evals/sample:     1

In this case, we can say that the Python code is roughly 16 times faster than the Julia one.

Numerical methods challenge: Day 10

During October (2017) I will write a program per day for some well-known numerical methods in both Python and Julia. It is intended to be an exercise, so don't expect the code to be good enough for real use. Also, I should mention that I have almost no experience with Julia, so it probably won't be idiomatic Julia but more Python-like Julia.

Lagrange interpolation: Runge phenomenon

Today we have Lagrange interpolation, again. Technically, I am not posting about a different method, but just using the same algorithm for interpolation. The difference is that I will change the sampling, that is, I will use non-uniform sampling.

The problem with uniform sampling is known as the Runge phenomenon and is illustrated in the following image.

Runge phenomenon.

One way to mitigate the problem is to use non-uniform sampling, such as Chebyshev nodes or Lobatto nodes. The former set minimizes the Runge phenomenon, while the latter maximizes the Vandermonde determinant.

In the example below we use Lobatto sampling. Lobatto nodes are the zeros of

\begin{equation*} (1 - x^2) P'_N(x) \end{equation*}

where \(P_N\) refers to the \(N\)th Legendre polynomial. These nodes are useful in numerical integration and spectral methods. Finding the zeros of these polynomials might be cumbersome in general. Nevertheless, we use an approach originally implemented in MATLAB by Greg von Winckel that uses Chebyshev nodes as a first guess and then updates this guess with the Newton-Raphson method.

Following are the codes.

Python

from __future__ import division
from numpy import (zeros_like, pi, cos, zeros, amax, abs,
                   array, linspace, prod)
import matplotlib.pyplot as plt


def lagrange(x_int, y_int, x_new):
    y_new = zeros_like(x_new)
    for xi, yi in zip(x_int, y_int):
        y_new += yi*prod([(x_new - xj)/(xi - xj)
                         for xj in x_int if xi!=xj], axis=0)
    return y_new


def gauss_lobatto(N, tol=1e-15):
    x = -cos(linspace(0, pi, N))
    P = zeros((N, N))  # Vandermonde Matrix
    x_old = 2
    while amax(abs(x - x_old)) > tol:
        x_old = x
        P[:, 0] = 1
        P[:, 1] = x
        for k in range(2, N):
            P[:, k] = ((2 * k - 1) * x * P[:, k - 1] -
                       (k - 1) * P[:, k - 2]) / k
        x = x_old - (x * P[:, N - 1] - P[:, N - 2]) / (N * P[:, N - 1])
    return array(x)


runge = lambda x: 1/(1 + 25*x**2)
x = linspace(-1, 1, 100)
x_int = linspace(-1, 1, 11)
x_int2 = gauss_lobatto(11)
x_new = linspace(-1, 1, 1000)
y_new = lagrange(x_int, runge(x_int), x_new)
y_new2 = lagrange(x_int2, runge(x_int2), x_new)
plt.plot(x, runge(x), "k")
plt.plot(x_new, y_new)
plt.plot(x_new, y_new2)
plt.legend(["Runge function",
            "Uniform interpolation",
            "Lobatto-sampling interpolation"])
plt.xlabel("x")
plt.ylabel("y")
plt.show()

Julia

using PyPlot


function lagrange(x_int, y_int, x_new)
    y_new = zeros(x_new)
    for (xi, yi) in zip(x_int, y_int)
        prod = ones(x_new)
        for xj in x_int
            if xi != xj
                prod = prod.* (x_new - xj)/(xi - xj)
            end
        end
        y_new += yi*prod
    end
    return y_new
end


function gauss_lobatto(N; tol=1e-15)
    x = -cos.(linspace(0, pi, N))
    P = zeros(N, N)  # Vandermonde Matrix
    x_old = 2
    while maximum(abs.(x - x_old)) > tol
        x_old = x
        P[:, 1] = 1
        P[:, 2] = x
        for k = 3:N
            P[:, k] = ((2 * k - 1) * x .* P[:, k - 1] -
                       (k - 1) * P[:, k - 2]) / k
        end
        x = x_old - (x .* P[:, N] - P[:, N - 1]) ./ (N* P[:, N])
    end
    return x
end


runge(x) =  1./(1 + 25*x.^2)
x = linspace(-1, 1, 100)
x_int = linspace(-1, 1, 11)
x_int2 = gauss_lobatto(11)
x_new = linspace(-1, 1, 1000)
y_new = lagrange(x_int, runge(x_int), x_new)
y_new2 = lagrange(x_int2, runge(x_int2), x_new)
plot(x, runge(x), "k")
plot(x_new, y_new)
plot(x_new, y_new2)
legend(["Runge function",
            "Uniform interpolation",
            "Lobatto-sampling interpolation"])
xlabel("x")
ylabel("y")

In both cases the result is the plot shown above.

Numerical methods challenge: Day 9

During October (2017) I will write a program per day for some well-known numerical methods in both Python and Julia. It is intended to be an exercise, so don't expect the code to be good enough for real use. Also, I should mention that I have almost no experience with Julia, so it probably won't be idiomatic Julia but more Python-like Julia.

Lagrange interpolation

Today we have Lagrange interpolation. This is defined by

\begin{equation*} L(x) = \sum_{j=0}^{k} y_j \prod_{m\neq j}\frac{x - x_m}{x_j - x_m} \end{equation*}

with \(x\) the point/points where we want to interpolate.

We will test the method with a sigmoid

\begin{equation*} f(x) = \frac{1}{1 + e^{-x}} \end{equation*}

Following are the codes.

Python

from __future__ import division
from numpy import zeros_like, linspace, exp, prod
import matplotlib.pyplot as plt


def lagrange(x_int, y_int, x_new):
    y_new = zeros_like(x_new)
    for xi, yi in zip(x_int, y_int):
        y_new += yi*prod([(x_new - xj)/(xi - xj)
                         for xj in x_int if xi!=xj], axis=0)
    return y_new


x_int = linspace(-5, 5, 11)
y_int = 1/(1 + exp(-x_int))
x_new = linspace(-5, 5, 1000)
y_new = lagrange(x_int, y_int, x_new)
plt.plot(x_int, y_int, "ok")
plt.plot(x_new, y_new)
plt.show()
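
As an optional cross-check (not in the original post), the result can be compared against scipy.interpolate.lagrange, continuing from the script above and assuming SciPy is installed.

import numpy as np
from scipy.interpolate import lagrange as scipy_lagrange

poly = scipy_lagrange(x_int, y_int)          # returns a numpy.poly1d
print(np.abs(poly(x_new) - y_new).max())     # should be close to round-off for 11 nodes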

Julia

using PyPlot

function lagrange(x_int, y_int, x_new)
    y_new = zeros(x_new)
    for (xi, yi) in zip(x_int, y_int)
        prod = ones(x_new)
        for xj in x_int
            if xi != xj
                prod = prod.* (x_new - xj)/(xi - xj)
            end
        end
        y_new += yi*prod
    end
    return y_new
end


x_int = linspace(-5, 5, 11)
y_int = 1./(1 + exp.(-x_int))
x_new = linspace(-5, 5, 1000)
y_new = lagrange(x_int, y_int, x_new)
plot(x_int, y_int, "ok")
plot(x_new, y_new)

In both cases the result is the following plot.

Lagrange interpolation.

Comparison Python/Julia

Regarding the number of lines, we have 34 in Python and 37 in Julia. The comparison in execution time is done with the %timeit magic command in IPython and @benchmark in Julia.

For Python:

%timeit lagrange(x_int, y_int, x_new)

with result

1000 loops, best of 3: 1.55 ms per loop

For Julia:

@benchmark lagrange(x_int, y_int, x_new)

with result

BenchmarkTools.Trial:
  memory estimate:  1.97 MiB
  allocs estimate:  254
  --------------
  minimum time:     737.665 μs (0.00% GC)
  median time:      811.633 μs (0.00% GC)
  mean time:        916.450 μs (10.77% GC)
  maximum time:     3.119 ms (64.40% GC)
  --------------
  samples:          5433
  evals/sample:     1

In this case, we can say that the Python code is roughly 2 times slower than the Julia one, although I am probably not using the best approach for Julia.

Numerical methods challenge: Day 8

During October (2017) I will write a program per day for some well-known numerical methods in both Python and Julia. It is intended to be an exercise, so don't expect the code to be good enough for real use. Also, I should mention that I have almost no experience with Julia, so it probably won't be idiomatic Julia but more Python-like Julia.

Newton's method for optimization

Today we have Newton's method for optimization. The main difference between this method and gradient descent is the use of higher derivatives, namely, the Hessian matrix of the objective function. The use of higher derivatives provides information about the curvature in addition to the slope information used in gradient descent. The following image compares the paths for gradient descent (green) and Newton's method (red).

Comparison of the path followed by gradient descent and Newton's method.

Mathematically, the update is written as

\begin{equation*} \mathbf{x}_k = \mathbf{x}_{k-1} - H(\mathbf{x}_{k-1})^{-1} \nabla f(\mathbf{x}_{k-1}) \end{equation*}

where \(H(\mathbf{x}_{k-1})\) is the Hessian matrix at step \(k-1\).

We will test the method with the Rosenbrock's function

\begin{equation*} f(x_1, x_2) = (1-x_1)^2 + 100(x_2-{x_1}^2)^2 \end{equation*}

Following are the codes.

Python

from __future__ import division, print_function
from numpy import array
from numpy.linalg import norm, solve


def newton_opt(fun, grad, hess, x, niter=50, gtol=1e-8, verbose=False):
    msg = "Maximum number of iterations reached."
    g = grad(x)
    for cont in range(niter):
        if verbose:
            print("n: {}, x: {}, g: {}".format(cont, x, g))
        x = x - solve(hess(x), g)
        g = grad(x)
        if norm(g) < gtol:
            msg = "Extremum found with desired accuracy."
            break
    return x, fun(x), msg


def rosen(x):
    return (1 - x[0])**2 + 100*(x[1] - x[0]**2)**2


def rosen_grad(x):
    return array([
        -2*(1 - x[0]) - 400*x[0]*(x[1] - x[0]**2),
        200*(x[1] - x[0]**2)])

def rosen_hess(x):
    return array([[-400*(x[1]-x[0]**2) + 800*x[0]**2 + 2, -400*x[0]],
                 [-400*x[0], 200]])


print(newton_opt(rosen, rosen_grad, rosen_hess, [2.0, 1.0], verbose=True))

with result

n: 0, x: [2.0, 1.0], g: [ 2402.  -600.]
n: 1, x: [ 1.99833611  3.99334443], g: [  1.99888520e+00  -5.53708323e-04]
n: 2, x: [ 1.00055248  0.0055331 ], g: [ 398.44998412 -199.11443262]
n: 3, x: [ 1.00054972  1.00109974], g: [  1.09944359e-03  -1.52451385e-09]
n: 4, x: [ 1.         0.9999997], g: [  1.20876952e-04  -6.04384753e-05]
(array([ 1.,  1.]), 0.0, 'Extremum found with desired accuracy.')

Julia

function newton_opt(fun, grad, hess, x; niter=50, gtol=1e-8, verbose=false)
    msg = "Maximum number of iterations reached."
    g = grad(x)
    for cont = 1:niter
        if verbose
            println("n: $(cont), x: $(x), g: $(g)")
        end
        x = x - hess(x)\g
        g = grad(x)
        if norm(g) < gtol
            msg = "Extremum found with desired accuracy."
            break
        end
    end
    return x, fun(x), msg
end


function rosen(x)
    return (1 - x[1])^2 + 100*(x[2] - x[1]^2)^2
end


function rosen_grad(x)
    return [-2*(1 - x[1]) - 400*x[1]*(x[2] - x[1]^2);
            200*(x[2] - x[1]^2)]
end


function rosen_hess(x)
    return [-400*(x[2] - x[1]^2) + 800*x[1]^2 + 2 -400*x[1];
            -400*x[1] 200]
end



println(newton_opt(rosen, rosen_grad, rosen_hess, [2.0, 1.0], verbose=true))

with result

n: 1, x: [2.0, 1.0], g: [2402.0, -600.0]
n: 2, x: [1.99834, 3.99334], g: [1.99889, -0.000553708]
n: 3, x: [1.00055, 0.0055331], g: [398.45, -199.114]
n: 4, x: [1.00055, 1.0011], g: [0.00109944, -1.52451e-9]
n: 5, x: [1.0, 1.0], g: [0.000120877, -6.04385e-5]
([1.0, 1.0], 0.0, "Extremum found with desired accuracy.")

Comparison Python/Julia

Regarding the number of lines, we have 34 in Python and 37 in Julia. The comparison in execution time is done with the %timeit magic command in IPython and @benchmark in Julia.

For Python:

%timeit newton_opt(rosen, rosen_grad, rosen_hess, [2.0, 1.0])

with result

1000 loops, best of 3: 247 µs per loop

For Julia:

@benchmark newton_opt(rosen, rosen_grad, rosen_hess, [2.0, 1.0])

with result

BenchmarkTools.Trial:
  memory estimate:  5.48 KiB
  allocs estimate:  120
  --------------
  minimum time:     5.784 μs (0.00% GC)
  median time:      6.030 μs (0.00% GC)
  mean time:        6.953 μs (10.00% GC)
  maximum time:     572.279 μs (95.96% GC)
  --------------
  samples:          10000
  evals/sample:     6

In this case, we can say that the Python code is roughly 40 times slower than the Julia one.

Comparison with gradient descent

We see an improvement in the number of iterations compared with gradient descent: we moved from 40 iterations to 4, even if we demand higher accuracy from the method, say \(10^{-12}\).

This faster convergence does not come for free, of course. When using Newton's method we have two major drawbacks:

  • We need to compute the Hessian of the function, which might prove really difficult even if we have the analytical expression for our function (a common workaround is sketched after this list).

  • We need to solve a system of equations in each iteration.
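
For the first point, one common workaround when only the gradient is available is to approximate the Hessian by finite differences of the gradient. Below is a minimal sketch, not part of the original post; fd_hessian is an illustrative name.

import numpy as np


def fd_hessian(grad, x, h=1e-6):
    # Central-difference approximation of the Hessian from the gradient
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = h
        H[:, i] = (grad(x + e) - grad(x - e))/(2*h)
    return 0.5*(H + H.T)   # symmetrize to remove round-off asymmetry

# e.g. fd_hessian(rosen_grad, [2.0, 1.0]) should be close to rosen_hess([2.0, 1.0])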

Numerical methods challenge: Day 7

During October (2017) I will write a program per day for some well-known numerical methods in both Python and Julia. It is intended to be an exercise, so don't expect the code to be good enough for real use. Also, I should mention that I have almost no experience with Julia, so it probably won't be idiomatic Julia but more Python-like Julia.

Nelder-Mead method

Today we have the Nelder-Mead method. This method uses a simplex over \(\mathbb{R}^n\), e.g., a triangle in \(\mathbb{R}^2\) or a tetrahedron in \(\mathbb{R}^3\).

Animation showing Nelder-Mead method.

In a single step of the method, we remove the vertex with the worst function value and replace it with another one with a better value. The new point is obtained by reflecting, expanding, or contracting the simplex along the line joining the worst vertex with the centroid of the remaining vertices. If we cannot find a better point in this manner, we retain only the vertex with the best function value, and we shrink the simplex by moving all other vertices towards this value (Nocedal and Wright, 2006, Numerical Optimization).

Following are the codes.

Python

from __future__ import division, print_function
import numpy as np
from numpy import array
from numpy.linalg import norm


def nelder_mead_step(fun, verts, alpha=1, gamma=2, rho=0.5,
                     sigma=0.5):
    """Nelder-Mead iteration according to Wikipedia _[1]


    References
    ----------
     .. [1] Wikipedia contributors. "Nelder–Mead method." Wikipedia,
         The Free Encyclopedia. Wikipedia, The Free Encyclopedia,
         1 Sep. 2016. Web. 20 Sep. 2016.
    """
    nverts, _ = verts.shape
    f = np.apply_along_axis(fun, 1, verts)
    # 1. Order
    order = np.argsort(f)
    verts = verts[order, :]
    f = f[order]
    # 2. Calculate xo, the centroid"
    xo = verts[:-1, :].mean(axis=0)
    # 3. Reflection
    xr = xo + alpha*(xo - verts[-1, :])
    fr = fun(xr)
    if f[0]<=fr and fr<f[-2]:
        new_verts = np.vstack((verts[:-1, :], xr))
    # 4. Expansion
    elif fr<f[0]:
        xe = xo + gamma*(xr - xo)
        fe = fun(xe)
        if fe < fr:
            new_verts = np.vstack((verts[:-1, :], xe))
        else:
            new_verts = np.vstack((verts[:-1, :], xr))
    # 5. Contraction
    else:
        xc = xo + rho*(verts[-1, :] - xo)
        fc = fun(xc)
        if fc < f[-1]:
            new_verts = np.vstack((verts[:-1, :], xc))
    # 6. Shrink
        else:
            new_verts = np.zeros_like(verts)
            new_verts[0, :] = verts[0, :]
            for k in range(1, nverts):
                new_verts[k, :] = verts[0, :] + sigma*(verts[k, :] - verts[0, :])

    return new_verts


def nelder_mead(fun, x, niter=200, atol=1e-8, verbose=False):
    msg = "Maximum number of iterations reached."
    f_old = fun(x.mean(0))
    for cont in range(niter):
        if verbose:
            print("n: {}, x: {}".format(cont, x.mean(0)))
        x = nelder_mead_step(fun, x)
        df = fun(x.mean(0)) - f_old
        f_old = fun(x.mean(0))
        if norm(df) < atol:
            msg = "Extremum found with desired accuracy."
            break
    return x.mean(0), f_old, msg


def rosen(x):
    return (1 - x[0])**2 + 100*(x[1] - x[0]**2)**2


x = array([[1, 0],
           [1, 1],
           [2, 0]])
print(nelder_mead(rosen, x))

with result

(array([0.99994674, 0.99987728]), 2.90769311467343e-08, 'Extremum found with desired accuracy.')

Julia

function nelder_mead_step(fun, verts; alpha=1, gamma=2, rho=0.5,
                     sigma=0.5)
    nverts, _ = size(verts)
    f = [fun(verts[row, :]) for row in 1:nverts]
    # 1. Order
    order = sortperm(f)
    verts = verts[order, :]
    f = f[order]
    # 2. Calculate xo, the centroid
    xo = mean(verts[1:end - 1, :], 1)
    # 3. Reflection
    xr = xo + alpha*(xo - verts[end, :]')
    fr = fun(xr)
    if f[1]<=fr && fr<f[end - 1]
        new_verts = [verts[1:end-1, :]; xr]
    # 4. Expansion
    elseif fr<f[1]
        xe = xo + gamma*(xr - xo)
        fe = fun(xe)
        if fe < fr
            new_verts = [verts[1:end-1, :]; xe]
        else
            new_verts = [verts[1:end-1, :]; xr]
        end
    # 5. Contraction
    else
        xc = xo + rho*(verts[end, :]' - xo)
        fc = fun(xc)
        if fc < f[end]
            new_verts = [verts[1:end-1, :]; xc]
    # 6. Shrink
        else
            new_verts = zeros(verts)
            new_verts[1, :] = verts[1, :]
            for k = 2:nverts
                new_verts[k, :] = verts[1, :] + sigma*(verts[k, :] - verts[1, :])
            end
        end
    end

    return new_verts
end


function nelder_mead(fun, x; niter=50, atol=1e-8, verbose=false)
    msg = "Maximum number of iterations reached."
    f_old = fun(mean(x, 1))
    for cont = 1:niter
        if verbose
            println("n: $(cont), x: $(mean(x, 1))")
        end
        x = nelder_mead_step(fun, x)
        df = fun(mean(x, 1)) - f_old
        f_old = fun(mean(x, 1))
        if norm(df) < atol
            msg = "Extremum found with desired accuracy."
            break
        end
    end
    return mean(x, 1), f_old, msg
end


function rosen(x)
    return (1 - x[1])^2 + 100*(x[2] - x[1]^2)^2
end


x = [1 0;
    1 1;
    2 0]
println(nelder_mead(rosen, x, verbose=false))

with result

([0.999947 0.999877], 2.9076931147093985e-8, "Extremum found with desired accuracy.")

Comparison Python/Julia

Regarding the number of lines, we have 38 in Python and 39 in Julia. The comparison in execution time is done with the %timeit magic command in IPython and @benchmark in Julia.

For Python:

%timeit nelder_mead(rosen, x)

with result

100 loops, best of 3: 7.82 ms per loop

For Julia:

@benchmark nelder_mead(rosen, x)

with result

BenchmarkTools.Trial:
  memory estimate:  162.23 KiB
  allocs estimate:  4780
  --------------
  minimum time:     462.926 μs (0.00% GC)
  median time:      506.511 μs (0.00% GC)
  mean time:        552.411 μs (3.86% GC)
  maximum time:     5.179 ms (80.31% GC)
  --------------
  samples:          9008
  evals/sample:     1

In this case, we can say that the Python code is roughly 15 times slower than the Julia one.

Numerical methods challenge: Day 6

During October (2017) I will write a program per day for some well-known numerical methods in both Python and Julia. It is intended to be an exercise, so don't expect the code to be good enough for real use. Also, I should mention that I have almost no experience with Julia, so it probably won't be idiomatic Julia but more Python-like Julia.

Gradient descent

Today we have the gradient descent method. The main idea of this method is to use the steepest descent direction to move from a point to a new one in the neighbourhood of a differentiable function. This is illustrated in the following image, where we can see that the steps are taken perpendicular to the isocontours of the function. If we think of the function as the topography of a certain region, it would probably be safe to think of the isocontours of the function as the paths that cows would like to walk, while the gradient direction is the path for a mountain goat.

Depiction of the gradient descent method.

Mathematically, the update is written as

\begin{equation*} \mathbf{x}_k = \mathbf{x}_{k-1} - \gamma_k \nabla f(\mathbf{x}_{k-1}) \end{equation*}

where \(\gamma_k\) is the size of the kth step. Intuitively, we can see that the size of the step should decrease as we approach the extremum of the function. This is normally achieved by a line search. For the gradient descent method, we could use the Barzilai-Borwein step:

\begin{equation*} \gamma_{n} = \frac{(\mathbf x_{n} - \mathbf x_{n-1})^T[\nabla f(\mathbf x_{n}) - \nabla f(\mathbf x_{n-1})]}{||\nabla f(\mathbf x_{n}) - \nabla f(\mathbf x_{n-1})||^2} \end{equation*}
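
In code, this step size is just a ratio of dot products. Here is a minimal NumPy restatement of the formula above, mirroring what the loop in the script below does; bb_step is an illustrative name, not part of the original code.

import numpy as np


def bb_step(x_new, x_old, g_new, g_old):
    # gamma_n = (x_n - x_{n-1})^T (g_n - g_{n-1}) / ||g_n - g_{n-1}||^2
    dx = np.asarray(x_new) - np.asarray(x_old)
    dg = np.asarray(g_new) - np.asarray(g_old)
    return dx.dot(dg)/dg.dot(dg)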

We will test the method with the Rosenbrock's function

\begin{equation*} f(x_1, x_2) = (1-x_1)^2 + 100(x_2-{x_1}^2)^2 \end{equation*}

Gradient descent presents some problems with this function, which is not convex. Nevertheless, if we use an initial point that is close enough to the minimum \([1, 1]\), we can achieve convergence.

Following are the codes.

Python

from __future__ import division, print_function
from numpy import array
from numpy.linalg import norm


def grad_descent(fun, grad, x, niter=50, gtol=1e-8, verbose=False):
    msg = "Maximum number of iterations reached."
    g_old = grad(x)
    gamma = 0.1
    for cont in range(niter):
        if verbose:
            print("n: {}, x: {}, g: {}".format(cont, x, g_old))
        dx = -gamma*g_old
        x = x + dx
        g = grad(x)
        dg = g - g_old
        g_old = g
        gamma = dx.dot(dg)/dg.dot(dg)
        if norm(g) < gtol:
            msg = "Extremum found with desired accuracy."
            break
    return x, fun(x), msg


def rosen(x):
    return (1 - x[0])**2 + 100*(x[1] - x[0]**2)**2


def rosen_grad(x):
    return array([
        -2*(1 - x[0]) - 400*x[0]*(x[1] - x[0]**2),
        200*(x[1] - x[0]**2)])


print(grad_descent(rosen, rosen_grad, [2.0, 1.0], verbose=True))

with result

n: 0, x: [2.0, 1.0], g: [ 2402.  -600.]
n: 1, x: [-238.2   61. ], g: [ -5.40030319e+09  -1.13356480e+07]
n: 2, x: [  1.87289769  61.50393131], g: [-43446.62297136  11599.23711122]
n: 3, x: [  1.87482914  61.50341566], g: [-43485.61077531  11597.68626767]
n: 4, x: [ -0.2532018   62.07096513], g: [  6277.59251909  12401.37079452]
n: 5, x: [  1.40217916e-02   6.25988648e+01], g: [  -353.0701476   12519.73363327]
n: 6, x: [  2.98797952e-04   6.30854769e+01], g: [ -9.53932693e+00   1.26170954e+04]
n: 7, x: [  3.49096082e-03   5.88633946e+01], g: [   -84.18892291  11772.67648837]
n: 8, x: [ 0.42114221  0.46054405], g: [-48.8618897  56.6366569]
n: 9, x: [ 0.66471507  0.17821457], g: [ 69.42537577 -52.72631034]
n: 10, x: [ 0.50504193  0.29948111], g: [-9.96223909  8.88275049]
n: 11, x: [ 0.52491812  0.28175867], g: [-2.25608389  1.24392746]
n: 12, x: [ 0.53044731  0.27871006], g: [-0.37379949 -0.53285773]
n: 13, x: [ 0.53133016  0.27996858], g: [-0.43934324 -0.46863181]
n: 14, x: [ 0.53252827  0.28124656], g: [-0.4365402  -0.46795943]
n: 15, x: [ 0.75411231  0.51877873], g: [ 14.56231057  -9.98132891]
n: 16, x: [ 0.7050077   0.55243611], g: [-16.21302574  11.08005   ]
n: 17, x: [ 0.73088975  0.53474821], g: [-0.69854407  0.10967699]
n: 18, x: [ 0.73204207  0.53456728], g: [-0.14989113 -0.26366293]
n: 19, x: [ 0.73228024  0.53498623], g: [-0.16984866 -0.24962496]
n: 20, x: [ 0.73260201  0.53545913], g: [-0.16949786 -0.24931553]
n: 21, x: [ 0.93339279  0.83080364], g: [ 14.95730865  -8.08369381]
n: 22, x: [ 0.89610321  0.85095683], g: [-17.39715957   9.59117536]
n: 23, x: [ 0.91610476  0.83992984], g: [-0.41766894  0.13638094]
n: 24, x: [ 0.91659561  0.83976956], g: [-0.02823623 -0.07559088]
n: 25, x: [ 0.91662795  0.83985612], g: [-0.03817128 -0.0701336 ]
n: 26, x: [ 0.91667285  0.83993863], g: [-0.03814167 -0.07009733]
n: 27, x: [ 0.99186712  0.97813179], g: [ 2.23273615 -1.13372137]
n: 28, x: [ 0.98342663  0.98241763], g: [-6.0476644   3.05793919]
n: 29, x: [ 0.98959509  0.97929862], g: [ -2.08791162e-02   3.50119013e-05]
n: 30, x: [ 0.98961644  0.97929858], g: [-0.00409146 -0.00842531]
n: 31, x: [ 0.9896206   0.97930713], g: [-0.00421464 -0.00835884]
n: 32, x: [ 0.98963284  0.97933141], g: [-0.0042095  -0.00834897]
n: 33, x: [ 0.99991222  0.99971914], g: [ 0.04194186 -0.02106056]
n: 34, x: [ 0.99597256  1.00169739], g: [-3.88678974  1.9472097 ]
n: 35, x: [ 0.99987194  0.99974387], g: [ -2.42382231e-04  -6.86507784e-06]
n: 36, x: [ 0.99987219  0.99974388], g: [ -5.01566886e-05  -1.02746923e-04]
n: 37, x: [ 0.99987224  0.99974398], g: [ -5.10336616e-05  -1.02258276e-04]
n: 38, x: [ 0.99987255  0.99974461], g: [ -5.09073070e-05  -1.02006794e-04]
n: 39, x: [ 0.99999998  0.99999996], g: [  5.05854337e-06  -2.54465444e-06]
n: 40, x: [ 0.99998735  1.00000631], g: [-0.01266881  0.00632184]
(array([ 1.,  1.]), 7.2961338114681859e-21, 'Extremum found with desired accuracy.')

Julia

function grad_descent(fun, grad, x; niter=50, gtol=1e-8, verbose=false)
    msg = "Maximum number of iterations reached."
    g_old = grad(x)
    gamma = 0.1
    for cont = 1:niter
        if verbose
            println("n: $(cont), x: $(x), g: $(g_old)")
        end
        dx = - gamma*g_old
        x = x + dx
        g = grad(x)
        dg = g - g_old
        g_old = g
        gamma = dx' * dg / (dg' * dg)
        if norm(g) < gtol
            msg = "Extremum found with desired accuracy."
            break
        end
    end
    return x, fun(x), msg
end


function rosen(x)
    return (1 - x[1])^2 + 100*(x[2] - x[1]^2)^2
end


function rosen_grad(x)
    return [-2*(1 - x[1]) - 400*x[1]*(x[2] - x[1]^2);
            200*(x[2] - x[1]^2)]
end


println(grad_descent(rosen, rosen_grad, [2.0, 1.0], verbose=true))

with result

n: 1, x: [2.0, 1.0], g: [2402.0, -600.0]
n: 2, x: [-238.2, 61.0], g: [-5.4003e9, -1.13356e7]
n: 3, x: [1.8729, 61.5039], g: [-43446.6, 11599.2]
n: 4, x: [1.87483, 61.5034], g: [-43485.6, 11597.7]
n: 5, x: [-0.253202, 62.071], g: [6277.59, 12401.4]
n: 6, x: [0.0140218, 62.5989], g: [-353.07, 12519.7]
n: 7, x: [0.000298798, 63.0855], g: [-9.53933, 12617.1]
n: 8, x: [0.00349096, 58.8634], g: [-84.1889, 11772.7]
n: 9, x: [0.421142, 0.460544], g: [-48.8619, 56.6367]
n: 10, x: [0.664715, 0.178215], g: [69.4254, -52.7263]
n: 11, x: [0.505042, 0.299481], g: [-9.96224, 8.88275]
n: 12, x: [0.524918, 0.281759], g: [-2.25608, 1.24393]
n: 13, x: [0.530447, 0.27871], g: [-0.373799, -0.532858]
n: 14, x: [0.53133, 0.279969], g: [-0.439343, -0.468632]
n: 15, x: [0.532528, 0.281247], g: [-0.43654, -0.467959]
n: 16, x: [0.754112, 0.518779], g: [14.5623, -9.98133]
n: 17, x: [0.705008, 0.552436], g: [-16.213, 11.08]
n: 18, x: [0.73089, 0.534748], g: [-0.698544, 0.109677]
n: 19, x: [0.732042, 0.534567], g: [-0.149891, -0.263663]
n: 20, x: [0.73228, 0.534986], g: [-0.169849, -0.249625]
n: 21, x: [0.732602, 0.535459], g: [-0.169498, -0.249316]
n: 22, x: [0.933393, 0.830804], g: [14.9573, -8.08369]
n: 23, x: [0.896103, 0.850957], g: [-17.3972, 9.59118]
n: 24, x: [0.916105, 0.83993], g: [-0.417669, 0.136381]
n: 25, x: [0.916596, 0.83977], g: [-0.0282362, -0.0755909]
n: 26, x: [0.916628, 0.839856], g: [-0.0381713, -0.0701336]
n: 27, x: [0.916673, 0.839939], g: [-0.0381417, -0.0700973]
n: 28, x: [0.991867, 0.978132], g: [2.23274, -1.13372]
n: 29, x: [0.983427, 0.982418], g: [-6.04766, 3.05794]
n: 30, x: [0.989595, 0.979299], g: [-0.0208791, 3.50119e-5]
n: 31, x: [0.989616, 0.979299], g: [-0.00409146, -0.00842531]
n: 32, x: [0.989621, 0.979307], g: [-0.00421464, -0.00835884]
n: 33, x: [0.989633, 0.979331], g: [-0.0042095, -0.00834897]
n: 34, x: [0.999912, 0.999719], g: [0.0419419, -0.0210606]
n: 35, x: [0.995973, 1.0017], g: [-3.88679, 1.94721]
n: 36, x: [0.999872, 0.999744], g: [-0.000242382, -6.86508e-6]
n: 37, x: [0.999872, 0.999744], g: [-5.01567e-5, -0.000102747]
n: 38, x: [0.999872, 0.999744], g: [-5.10337e-5, -0.000102258]
n: 39, x: [0.999873, 0.999745], g: [-5.09073e-5, -0.000102007]
n: 40, x: [1.0, 1.0], g: [5.05854e-6, -2.54465e-6]
n: 41, x: [0.999987, 1.00001], g: [-0.0126688, 0.00632184]
([1.0, 1.0], 7.296133811468186e-21, "Extremum found with desired accuracy.")

Comparison Python/Julia

Regarding the number of lines, we have 38 in Python and 39 in Julia. The comparison in execution time is done with the %timeit magic command in IPython and @benchmark in Julia.

For Python:

%timeit grad_descent(rosen, rosen_grad, [2.0, 1.0])

with result

1000 loops, best of 3: 386 µs per loop

For Julia:

@benchmark grad_descent(rosen, rosen_grad, [2.0, 1.0])

with result

BenchmarkTools.Trial:
  memory estimate:  16.91 KiB
  allocs estimate:  251
  --------------
  minimum time:     6.479 μs (0.00% GC)
  median time:      7.393 μs (0.00% GC)
  mean time:        13.437 μs (18.45% GC)
  maximum time:     2.029 ms (95.94% GC)
  --------------
  samples:          10000
  evals/sample:     5

In this case, we can say that the Python code is roughly 50 times slower than the Julia one.

Numerical methods challenge: Day 5

During October (2017) I will write a program per day for some well-known numerical methods in both Python and Julia. It is intended to be an exercise, so don't expect the code to be good enough for real use. Also, I should mention that I have almost no experience with Julia, so it probably won't be idiomatic Julia but more Python-like Julia.

Broyden's method

Today we have Broyden's method. The main idea of this method is to compute the Jacobian matrix just at the first iteration and then update it at each iteration with rank-1 corrections.

\begin{equation*} \mathbf{x}_n = \mathbf{x}_{n-1} - J_{n-1}^{-1} \mathbf{F}(\mathbf{x}_{n-1}) \end{equation*}

where we estimate the Jacobian matrix at step \(n\) by

\begin{equation*} \mathbf J_n = \mathbf J_{n - 1} + \frac{\Delta \mathbf f_n - \mathbf J_{n - 1} \Delta \mathbf x_n}{\|\Delta \mathbf x_n\|^2} \Delta \mathbf x_n^{\mathrm T} \end{equation*}

which corresponds to a rank-1 update of the Jacobian matrix.
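
A quick way to see what the update does is to check the secant condition \(J_n \Delta\mathbf{x}_n = \Delta\mathbf{f}_n\) after the rank-1 correction. The following small numerical check is not part of the original post and uses arbitrary illustrative values.

import numpy as np

J_old = np.array([[1.0, 2.0], [2.0, 16.0]])   # some previous Jacobian estimate
dx = np.array([0.3, -0.1])                    # step taken
df = np.array([0.1, 0.5])                     # observed change in F
J_new = J_old + np.outer(df - J_old.dot(dx), dx)/dx.dot(dx)
print(np.allclose(J_new.dot(dx), df))         # True: the secant condition holds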

We will test the method with the function \(\mathbf{F}(x, y) = (x + 2y - 2, x^2 + 4y^2 - 4)\) with solution \(\mathbf{x} = (0, 1)\).

Following are the codes.

Python

from __future__ import division, print_function
from numpy import array, outer, dot
from numpy.linalg import solve, norm, det

def broyden(fun, jaco, x, niter=50, ftol=1e-12, verbose=False):
    msg = "Maximum number of iterations reached."
    J = jaco(x)
    for cont in range(niter):
        if abs(det(J)) < ftol:
            x = None
            msg = "Derivative near to zero."
            break
        if verbose:
            print("n: {}, x: {}".format(cont, x))
        f_old = fun(x)
        dx = -solve(J, f_old)
        x = x + dx
        f = fun(x)
        df = f - f_old
        J = J + outer(df - dot(J, dx), dx)/dot(dx, dx)
        if norm(f) < ftol:
            msg = "Root found with desired accuracy."
            break
    return x, msg


def fun(x):
    return array([x[0] + 2*x[1] - 2, x[0]**2 + 4*x[1]**2 - 4])


def jaco(x):
    return array([
            [1, 2],
            [2*x[0], 8*x[1]]])


print(broyden(fun, jaco, [1.0, 2.0]))

Julia

function broyden(fun, jaco, x, niter=50, ftol=1e-12, verbose=false)
    msg = "Maximum number of iterations reached."
    J = jaco(x)
    for cont = 1:niter
        if abs(det(J)) < ftol
            x = nothing
            msg = "Derivative near to zero."
            break
        end
        if verbose
            println("n: $(cont), x: $(x)")
        end
        f_old = fun(x)
        dx = -J\f_old
        x = x + dx
        f = fun(x)
        df = f - f_old
        J = J + (df - J*dx) * dx'/ (dx' * dx)
        if norm(f) < ftol
            msg = "Root found with desired accuracy."
            break
        end
    end
    return x, msg
end

function fun(x)
    return [x[1] + 2*x[2] - 2, x[1]^2 + 4*x[2]^2 - 4]
end


function jaco(x)
    return [1 2;
           2*x[1] 8*x[2]]
end


println(broyden(fun, jaco, [1.0, 2.0]))

Comparison Python/Julia

Regarding the number of lines, we have 38 in Python and 39 in Julia. The comparison in execution time is done with the %timeit magic command in IPython and @benchmark in Julia.

For Python:

%timeit broyden(fun, jaco, [1.0, 2.0])

with result

1000 loops, best of 3: 703 µs per loop

For Julia:

@benchmark broyden(fun, jaco, [1.0, 2.0])

with result

BenchmarkTools.Trial:
  memory estimate:  14.41 KiB
  allocs estimate:  220
  --------------
  minimum time:     12.099 μs (0.00% GC)
  median time:      12.867 μs (0.00% GC)
  mean time:        15.378 μs (10.78% GC)
  maximum time:     3.511 ms (97.53% GC)
  --------------
  samples:          10000
  evals/sample:     1

In this case, we can say that the Python code is roughly 50 times slower than the Julia one.

Comparison Newton/Broyden

Next, we compare Newton's and Broyden's methods. We are using the function \(\mathbf{x}^T \mathbf{x} + \mathbf{x}\) for this test.

Python

The code for the function and Jacobian is

from numpy import diag
fun = lambda x: x**2 + x
jaco = lambda x: diag(2*x + 1)

and the results are:

n      Newton (μs)   Broyden (μs)
2      500           664
10     541           717
100    3450          4800

Julia

The code for the function and Jacobian is

fun(x) = x' * x + x
jaco(x) = diagm(2*x + 1)

and the results are:

n      Newton (μs)   Broyden (μs)
2      1.76          1.65
10     56.42         5.12
100    1782          367

In this case, we are comparing the mean values of the results from @benchmark.

Numerical methods challenge: Day 4

During October (2017) I will write a program per day for some well-known numerical methods in both Python and Julia. It is intended to be an exercise, so don't expect the code to be good enough for real use. Also, I should mention that I have almost no experience with Julia, so it probably won't be idiomatic Julia but more Python-like Julia.

Newton's method: vector case

Today we have Newton's method one more time. In this case, the function is vector-valued. This implies a slight modification with respect to the original post. The new approximation is computed from the old one using

\begin{equation*} \mathbf{x}_k = \mathbf{x}_{k-1} - J(\mathbf{x}_{k-1})^{-1} \mathbf{F}(\mathbf{x}_{k-1}) \end{equation*}

where we need to use the Jacobian matrix \(J\).

We will test the method with the function \(\mathbf{F}(x, y) = (x + 2y - 2, x^2 + 4y^2 - 4)\) with solution \(\mathbf{x} = (0, 1)\).

Following are the codes.

Python

from __future__ import division, print_function
from numpy import array
from numpy.linalg import solve, norm, det

def newton(fun, jaco, x, niter=50, ftol=1e-12, verbose=False):
    msg = "Maximum number of iterations reached."
    for cont in range(niter):
        J = jaco(x)
        f = fun(x)
        if abs(det(J)) < ftol:
            x = None
            msg = "Derivative near to zero."
            break
        if verbose:
            print("n: {}, x: {}".format(cont, x))
        x = x - solve(J, f)
        if norm(f) < ftol:
            msg = "Root found with desired accuracy."
            break
    return x, msg


def fun(x):
    return array([x[0] + 2*x[1] - 2, x[0]**2 + 4*x[1]**2 - 4])


def jaco(x):
    return array([
            [1, 2],
            [2*x[0], 8*x[1]]])


print(newton(fun, jaco, [1.0, 10.0]))

Julia

function newton(fun, jaco, x, niter=50, ftol=1e-12, verbose=false)
    msg = "Maximum number of iterations reached."
    for cont = 1:niter
        J = jaco(x)
        f = fun(x)
        if abs(det(J)) < ftol
            x = nothing
            msg = "Derivative near to zero."
            break
        end
        if verbose
            println("n: $(cont), x: $(x)")
        end
        x = x - J\f
        if norm(f) < ftol
            msg = "Root found with desired accuracy."
            break
        end
    end
    return x, msg
end


function fun(x)
    return [x[1] + 2*x[2] - 2, x[1]^2 + 4*x[2]^2 - 4]
end


function jaco(x)
    return [1.0 2.0;
            2*x[1] 8*x[2]]
end


println(newton(fun, jaco, [1.0, 10.0]))

Comparison

Regarding the number of lines, we have 31 in Python and 33 in Julia. The comparison in execution time is done with the %timeit magic command in IPython and @benchmark in Julia.

For Python:

%timeit newton(fun, jaco, [1.0, 10.0])

with result

1000 loops, best of 3: 284 µs per loop

For Julia:

@benchmark newton(fun, jaco, [1.0, 10.0])

with result

BenchmarkTools.Trial:
  memory estimate:  10.44 KiB
  allocs estimate:  192
  --------------
  minimum time:     6.818 μs (0.00% GC)
  median time:      7.167 μs (0.00% GC)
  mean time:        9.607 μs (16.53% GC)
  maximum time:     2.953 ms (97.40% GC)
  --------------
  samples:          10000
  evals/sample:     4

In this case, we can say that the Python code is roughly 40 times slower than the Julia one. This is an improvement compared to the previous examples, where the ratio was around 100. The reason for this "improvement" might be the solution of the linear system with the Jacobian, which calls a NumPy routine that does the heavy lifting for us.

Numerical methods challenge: Day 3

During October (2017) I will write a program per day for some well-known numerical methods in both Python and Julia. It is intended to be an exercise, so don't expect the code to be good enough for real use. Also, I should mention that I have almost no experience with Julia, so it probably won't be idiomatic Julia but more Python-like Julia.

Newton's method

Today's method is Newton's method. This method is used to solve the equation \(f(x) = 0\) for \(x\) real, with \(f\) a differentiable function. It starts with an initial guess \(x_0\) and successively refines it by finding the intercept of the tangent line to the function with the \(x\) axis. The new approximation is computed from the old one using

\begin{equation*} x_k = x_{k-1} - \frac{f(x_{k-1})}{f'(x_{k-1})} \end{equation*}

Convergence of this method is generally faster than that of the bisection method. Nevertheless, convergence is not guaranteed. Another drawback is the need for the derivative of the function.
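
If the derivative is not available, one common fallback (not used in the post) is to approximate it with a central difference, at the cost of extra function evaluations and some accuracy. A minimal sketch in Python; num_grad is an illustrative name.

def num_grad(fun, x, h=1e-7):
    # Central-difference approximation of f'(x)
    return (fun(x + h) - fun(x - h))/(2*h)

# e.g. newton(fun, lambda x: num_grad(fun, x), 1.0) should behave much like the exact-derivative version below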

We will use the function \(f(x) = \cos(x) - x\) to test the codes, and the initial guess is 1.0.

Following are the codes.

Python

from __future__ import division, print_function
from numpy import abs, cos, sin

def newton(fun, grad, x, niter=50, ftol=1e-12, verbose=False):
    msg = "Maximum number of iterations reached."
    for cont in range(niter):
        if abs(grad(x)) < ftol:
            x = None
            msg = "Derivative near to zero."
            break
        if verbose:
            print("n: {}, x: {}".format(cont, x))
        x = x - fun(x)/grad(x)
        if abs(fun(x)) < ftol:
            msg = "Root found with desired accuracy."
            break
    return x, msg


def fun(x):
    return cos(x) - x


def grad(x):
    return -sin(x) - 1.0


print(newton(fun, grad, 1.0))

Julia

function newton(fun, grad, x, niter=50, ftol=1e-12, verbose=false)
    msg = "Maximum number of iterations reached."
    for cont = 1:niter
        if abs(grad(x)) < ftol
            x = nothing
            msg = "Derivative near to zero."
            break
        end
        if verbose
            println("n: $(cont), x: $(x)")
        end
        x = x - fun(x)/grad(x)
        if abs(fun(x)) < ftol
            msg = "Root found with desired accuracy."
            break
        end
    end
    return x, msg
end


function fun(x)
    return cos(x) - x
end


function grad(x)
    return -sin(x) - 1.0
end


println(newton(fun, grad, 1.0))

Comparison

Regarding the number of lines, we have 28 in Python and 32 in Julia. The comparison in execution time is done with the %timeit magic command in IPython and @benchmark in Julia.

For Python:

%timeit newton(fun, grad, 1.0)

with result

10000 loops, best of 3: 27.3 µs per loop

For Julia:

@benchmark newton(fun, grad, 1.0)

with result

BenchmarkTools.Trial:
  memory estimate:  48 bytes
  allocs estimate:  2
  --------------
  minimum time:     327.925 ns (0.00% GC)
  median time:      337.956 ns (0.00% GC)
  mean time:        351.064 ns (0.80% GC)
  maximum time:     8.118 μs (92.60% GC)
  --------------
  samples:          10000
  evals/sample:     226