Documentation

This is machine translation

Mouseover text to see original. Click the button below to return to the English version of the page.

Include Derivatives in Problem-Based Workflow

Why Include Derivatives?

The example shows how to include derivative information in nonlinear problem-based optimization. Including gradients or a Hessian in an optimization can give the following benefits:

• More robust results. Finite differencing steps sometimes reach points where the objective or a nonlinear constraint function is undefined, not finite, or complex.

• Analytic gradients can be more accurate than finite difference estimates.

• Including a Hessian can lead to faster convergence, meaning fewer iterations.

• Analytic gradients can be faster to calculate than finite difference estimates, especially for problems with a sparse structure. For complicated expressions, however, analytic gradients can be slower to calculate.

Despite these advantages, the problem-based approach currently does not use derivative information. To use derivatives in problem-based optimization, convert your problem using `prob2struct`, and edit the resulting objective and constraint functions.

Create Optimization Problem

Create a problem-based nonlinear optimization. With 2-D control variables `x` and `y`, use the objective function

Include the constraint that the sum of squares of `x` and `y` is no more than 4.

`fun2` is not a rational function of its arguments. Therefore, to include it in an optimization problem, you must convert it to an optimization expression using `fcn2optimexpr`.

```prob = optimproblem; x = optimvar('x',2); y = optimvar('y',2); fun1 = 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2; fun2 = @(x,y)-exp(-sum((x - y).^2))*exp(-exp(-y(1)))*sech(y(2)); prob.Objective = fun1 + fcn2optimexpr(fun2,x,y); prob.Constraints.cons = sum(x.^2 + y.^2) <= 4;```

Convert Problem to Solver-Based Form

To include derivatives, convert the problem to a structure using `prob2struct`.

`problem = prob2struct(prob);`

During the conversion, `prob2struct` creates function files that represent the objective and nonlinear constraint functions. By default, these functions have the names `'generatedObjective.m'` and `'generatedConstraints.m'`.

Calculate Derivatives and Keep Track of Variables

Calculate the derivatives associated with the objective and nonlinear constraint functions. If you have a Symbolic Math Toolbox™ license, you can use the `gradient` or `jacobian` function to help compute the derivatives. See Symbolic Math Toolbox Calculates Gradients and Hessians.

The solver-based approach has one control variable. Each optimization variable (`x` or `y`, in this example) is a portion of the control variable.

You can find the mapping between optimization variables and the single control variable in the generated objective function file, `'generatedObjective.m'`. The beginning of the file contains these lines of code:

```%% Variable indices. idx_x = [1 2]; idx_y = [3 4]; %% Map solver-based variables to problem-based. x = inputVariables(idx_x); x = x(:); y = inputVariables(idx_y); y = y(:);```

In this code, the control variable has the name `inputVariables`.

Alternatively, you can find the mapping by using `varindex`.

```idx = varindex(prob); disp(idx.x)```
` 1 2`
`disp(idx.y)`
` 3 4`

Once you know the mapping, you can use standard calculus to find the following expressions for the gradient `grad` of the objective function `objective = fun1 + fun2` with respect to the control variable `[x(:);y(:)]`.

`$\text{grad}=\left[\begin{array}{c}2{x}_{1}-400{x}_{1}\left({x}_{2}-{x}_{1}^{2}\right)+{\sigma }_{1}-2\\ 200{x}_{2}-200{x}_{1}^{2}+{\sigma }_{2}\\ -{\sigma }_{1}-d\mathrm{exp}\left(-{y}_{1}\right)\\ -{\sigma }_{2}+d\text{\hspace{0.17em}}\text{tanh}\left({y}_{2}\right)\end{array}\right],$`

where

`$\begin{array}{l}c=\mathrm{exp}\left(-{\left({x}_{1}-{y}_{1}\right)}^{2}-{\left({x}_{2}-{y}_{2}\right)}^{2}\right)\\ d=c\mathrm{exp}\left(-\mathrm{exp}\left(-{y}_{1}\right)\right)\text{sech}\left({y}_{2}\right)\\ {\sigma }_{1}=2\left({x}_{1}-{y}_{1}\right)d\\ {\sigma }_{2}=2\left({x}_{2}-{y}_{2}\right)d.\end{array}$`

Edit the Objective and Constraint Files

To include the calculated gradients in the objective function file, edit `'generatedObjective.m'` as follows.

```%% Insert gradient calculation here. % If you include a gradient, notify the solver by setting the % SpecifyObjectiveGradient option to true. if nargout > 1 c = exp(-sum((x - y).^2)); d = c*exp(-exp(-y(1)))*sech(y(2)); sigma1 = 2*(x(1) - y(1))*d; sigma2 = 2*(x(2) - y(2))*d; grad = zeros(4,1); grad(1) = 2*x(1) - 400*x(1)*(x(2) - x(1)^2) + sigma1 - 2; grad(2) = 200*x(2) - 200*x(1)^2 + sigma2; grad(3) = -sigma1 - d*exp(-y(1)); grad(4) = -sigma2 + d*tanh(y(2)); end```

Recall that the nonlinear constraint is ```x(1)^2 + x(2)^2 + y(1)^2 + y(2)^2 <= 4```. Clearly, the gradient of this constraint function is `2*[x;y]`. To include the calculated gradients of the nonlinear constraint, edit `'generatedConstraints.m'` as follows.

```%% Insert gradient calculation here. % If you include a gradient, notify the solver by setting the % SpecifyConstraintGradient option to true. if nargout > 2 cineqGrad = 2*[x;y]; ceqGrad = []; end```

Run Problem Using Two Methods

Run the problem using both the problem-based and solver-based methods to see the differences. To run the problem using derivative information, create appropriate options in the problem structure.

```options = optimoptions('fmincon','SpecifyObjectiveGradient',true,... 'SpecifyConstraintGradient',true); problem.options = options;```

Nonlinear problems require a nonempty `x0` field in the problem structure.

```x0 = [-1;2;1;-1]; problem.x0 = x0;```

Call `fmincon` on the problem structure.

`[xsolver,fvalsolver,exitflagsolver,outputsolver] = fmincon(problem)`
```Local minimum found that satisfies the constraints. Optimization completed because the objective function is non-decreasing in feasible directions, to within the value of the optimality tolerance, and constraints are satisfied to within the value of the constraint tolerance. <stopping criteria details> xsolver = 0.8671 0.7505 1.0433 0.5140 fvalsolver = -0.5500 exitflagsolver = 1 outputsolver = struct with fields: iterations: 46 funcCount: 77 constrviolation: 0 stepsize: 7.4091e-06 algorithm: 'interior-point' firstorderopt: 7.5203e-07 cgiterations: 9 message: '↵Local minimum found that satisfies the constraints.↵↵Optimization completed because the objective function is non-decreasing in ↵feasible directions, to within the value of the optimality tolerance,↵and constraints are satisfied to within the value of the constraint tolerance.↵↵<stopping criteria details>↵↵Optimization completed: The relative first-order optimality measure, 7.520304e-07,↵is less than options.OptimalityTolerance = 1.000000e-06, and the relative maximum constraint↵violation, 0.000000e+00, is less than options.ConstraintTolerance = 1.000000e-06.↵↵'```

Compare this solution with the one obtained from `solve`, which uses no derivative information.

```init.x = x0(1:2); init.y = x0(3:4); [xproblem,fvalproblem,exitflagproblem,outputproblem] = solve(prob,init)```
```Local minimum found that satisfies the constraints. Optimization completed because the objective function is non-decreasing in feasible directions, to within the value of the optimality tolerance, and constraints are satisfied to within the value of the constraint tolerance. <stopping criteria details> xproblem = struct with fields: x: [2×1 double] y: [2×1 double] fvalproblem = -0.5500 exitflagproblem = OptimalSolution outputproblem = struct with fields: iterations: 48 funcCount: 276 constrviolation: 0 stepsize: 7.9340e-07 algorithm: 'interior-point' firstorderopt: 6.5496e-07 cgiterations: 9 message: '↵Local minimum found that satisfies the constraints.↵↵Optimization completed because the objective function is non-decreasing in ↵feasible directions, to within the value of the optimality tolerance,↵and constraints are satisfied to within the value of the constraint tolerance.↵↵<stopping criteria details>↵↵Optimization completed: The relative first-order optimality measure, 6.549635e-07,↵is less than options.OptimalityTolerance = 1.000000e-06, and the relative maximum constraint↵violation, 0.000000e+00, is less than options.ConstraintTolerance = 1.000000e-06.↵↵' solver: 'fmincon'```

Both solutions report the same function value to display precision, and both require roughly the same number of iterations (46 using gradient information, 48 without). However, the solution using gradients requires only 77 function evaluations, compared to 276 for the solution without gradients.