Implementing steepest descent algorithm, variable step size

Question

I am trying to implement steepest descent algorithm in programming languages (C/C++/fortran).

For example minimization of f(x1,x2) = x1^3 + x2^3 - 2*x1*x2

Estimate starting design point x0, iteration counter k0, convergence parameter tolerence = 0.1. Say this staring point is (1,0)
Compute gradient of f(x1,x2) at the current point x(k) as grad(f). I will use numerical differentiation here.

d/dx1 (f) = lim (h->0) (f(x1+h,x2) - f(x1,x2) )/h

This is grad(f)=(3*x1^2 - 2*x2, 3*x2^2 - 2*x1)

grad(f) at (0,1) is c0 = (3,-2)
since L2 norm of c0 > tolerence, we proceed for next step
direction d0 = -c0 = (-3,2)
Calculate step size a. Minimize f(a) = f(x0 + ad0) = (1-3a,2a) = (1-3a)^3 + (2a)^3 - 2(1-3a)*(2a). I am not keeping constant step size.
update: new[x1,x2] = old[x1,x2]x + a*d0.

I do not understand how to do step 5. I have a 1D minimization program with bisection method, and it looks like:

program main()
    ...
    ...
    define upper, lower interval
    call function value
    ...calculations
    ...
    ...


function value (input x1in) (output xout)
    ...function is x^4 - 2x^2 + x + 10 
    xout = (xin)^4 - 2*(xin)^2 + (xin) + 10

In this case, looking at step 5, I cannot pass symbolic a. Any ideas how to implement the algorithm in programming language, especially step 5? Please suggest if there is altogether different way to program this. I have seen many programs with constant step size, but I want to compute it at every step. This algorithm can be easy to implement in MATLAB ot python sympy using symbolics, but I do not want to use symbolics. Any suggestions appreciated. Thanks.

Are you asking how to calculate a, how to store it for use in every iteration or how to pass it as a parameter to the function? In order to help you, we need to see the actual relevant portion of code that you use. — o_weisman, May 30 '16 at 11:12
The code to calculate 1D optimum value has around 200 lines and several included other files. It may not be a good idea to give it here. So I gave a rough template how that code works. — de23edced, May 30 '16 at 12:00
@o_weisman Basically, I have a function code that takes in variable, plugs in in the equation and gives out the result of function. In my case of gradient method algo, there is this symbolic variable `a` that I don't know how to handle. — de23edced, May 30 '16 at 12:02
If you know how to mathematically compute a, just compute it and pass it as another parameter to the function. If you don't, you should probably ask on a different forum that deals with math. — o_weisman, May 31 '16 at 06:37

score 0 · Accepted Answer · edited May 23 '17 at 11:58

If C++ is an option, you can take advantage of functors and lambdas.

Let's consider a function we want to minimize, for example y = x² - x + 2. It can be represented as a function object, which is a class with an overloaded operator():

struct MyFunc {
    double operator()( double x ) const {
        return  x * x - x + 2.0;
    }
};

Now we can declare an object of this type, use it like a function and pass it to other templated function as a templated parameter.

// given this templated function:
template < typename F >
void tabulate_function( F func, double a, double b, int steps ) {
    //  the functor     ^^^^^^  is passed to the templated function
    double step = (b - a) / (steps - 1);

    std::cout << "    x          f(x)\n------------------------\n";
    for ( int i = 0; i < steps; ++i ) {
        double x = a + i * step,
               fx = func(x);
        //          ^^^^^^^ call the operator() of the functor
        std::cout << std::fixed << std::setw(8) << std::setprecision(3) << x
                  << std::scientific << std::setw(16) << std::setprecision(5)
                  << fx << '\n';
    }   
}

// we can use the previous functor like this:
MyFunc example;
tabulate_function(example, 0.0, 2.0, 21);

OP's function can be implemented (given an helper class to represent 2D points) in a similar way:

struct MyFuncVec {
    double operator()( const Point &p ) const {
        return p.x * p.x * p.x  +  p.y * p.y * p.y  -  2.0 * p.x * p.y;
    }
};

The gradient of that function can be represented (given a class which implement a 2D vector) by:

struct MyFuncGradient {
    Vector operator()( const Point &p ) {
        return Vector(3.0 * p.x * p.x  -  2.0 * p.y, 3.0 * p.y * p.y  -  2.0 * p.x);
    }
};

Now, the fifth step of OP question requests to minimize the first function along the direction of the gradient using a monodimensional optimization algorithm which requires a monodimensional function to be passed. We can solve this issue using a lambda:

    MyFuncVec funcOP;
    MyFuncGradient grad_funcOP; 
    Point p0(0.2, 0.8);
    Vector g = grad_funcOP(p0);

    // use a lambda to transform the OP function to 1D
    auto sliced_func = [&funcOP, &p0, &g] ( double t ) -> double {
        // those variables ^^^ ^^^ ^^ are captured and used
        return funcOP(p0 - t * g);
    };

    tabulate_function(sliced_func, 0, 0.5, 21);

Live example HERE.

Thanks for your detailed explanation. which compiler did you use for your 'live example'? I got several compile time errors with gcc. — de23edced, Jun 02 '16 at 23:06
@de23edced ideone.com should use g++ too. Have you tried setting the option `-std=c++0x` at command line? Lambdas are a C++11 feature. — Bob__, Jun 03 '16 at 07:19

Implementing steepest descent algorithm, variable step size

1 Answers1