Optimization using MATLAB

This tutorial is designed to help readers solve optimization problems in MATLAB through various examples and approaches. We will explore three widely used tools/interfaces: (i) MATLAB's Optimization Toolbox, (ii) YALMIP in conjunction with MATLAB, and (iii) CVX integrated with MATLAB. Additionally, the tutorial will delve into optimizing functions in the complex domain and demonstrate techniques for enhancing convergence rates by utilizing analytically computed gradients and Hessians. We will also learn an important trick: verifying the accuracy of these calculations using MATLAB's Symbolic Math Toolbox. Please note that all the examples presented in this tutorial are based on MATLAB R2021b.


Introduction

Optimization represents a fundamental mathematical concept involving the maximization or minimization of a specific utility function. The structure of this utility function varies based on the application's context and the specific field of study. Once an optimization problem is formulated, the subsequent challenge lies in obtaining a solution. Unfortunately, deriving a closed-form (analytical) solution is often not feasible. In such cases, numerical optimization methods come into play. Numerous solvers are available to tackle optimization problems, particularly if you know the problem's categorical type. This tutorial demonstrates several examples of how to model and solve optimization problems using MATLAB. It is essential to note that this post doesn't cover all available tools; instead, it focuses on a select few. Additionally, the tutorial touches on tools capable of handling nonconvex functions, even though they might yield suboptimal solutions.

Solving Linear Programming (LP)

The general Linear Programming (LP) problem can be formulated as follows:

$$ \begin{equation} \begin{aligned} \underset{x}{\rm{minimize}} \quad & f^Tx\\ \rm{s.t.} \quad & Ax\leq b\\ \quad & A_{\rm eq}x=b_{\rm eq} \end{aligned} \end{equation} $$

Consider the following specific example of an LP problem that we aim to solve: $$ \begin{equation} \begin{aligned} \underset{x}{\rm{minimize}} \quad & -x_1 - \frac{1}{3}x_2\\ \rm{s.t.} \quad & \begin{bmatrix} 1 & 1\\ 1 & \frac{1}{4}\\ 1 & -1\\ -\frac{1}{4} & -1\\ -1 & -1\\ -1 & 1 \end{bmatrix}x\leq \begin{bmatrix} 2\\ 1\\ 2\\ 1\\ -1\\ 2 \end{bmatrix}\\ \quad & x_1+\frac{1}{4}x_2= \frac{1}{2}\\ \quad &-1\leq x_1 \leq 1.5\\ \quad &-0.5 \leq x_2 \leq 1.25 \end{aligned} \end{equation} $$

First, let us define the parameters for the LP example described above:

clc;
clear;
close all;

% Defining inequality constraint matrix
A = [1,1;
    1,1/4;
    1,-1;
    -1/4, -1;
    -1,-1;
    -1,1];

% Defining inequality constraint vector
b = [2;1;2;1;-1;2];

% Defining equality constraint matrix and vector
Aeq = [1 1/4];
beq = 1/2;

% Defining lower bounds and upper bounds on the optimizing variables
lb = [-1,-0.5];
ub = [1.5,1.25];

Note that the lower and upper bounds on the variables can also be modeled as inequality constraints and included in the matrix $A$ and vector $b$.
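For instance, a minimal sketch of folding the bounds into the inequality system (A_aug and b_aug are names introduced here for illustration):

% Folding the bounds into the inequality system: x <= ub and -x <= -lb
A_aug = [A; eye(2); -eye(2)];
b_aug = [b; ub(:); -lb(:)];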

Example on how to model LP with linprog of MATLAB

We will use MATLAB's linprog, a dedicated solver for linear programming problems. The LP problem can be modeled and solved as follows:

% Weights of the objective
f = [-1;-1/3];

% Solving the LP
[x,fval] = linprog(f,A,b,Aeq,beq,lb,ub);

One can also solve an LP problem using the interior-point method, which can be invoked through MATLAB's optimoptions as follows:

% Setting the options for linprog with interior-point method
options = optimoptions('linprog','Algorithm','interior-point');

% Solving the LP
[x,fval] =  linprog(f,A,b,Aeq,beq,lb,ub,options);

The solution obtained with linprog for the example is (format of display is changed):

x = [0.1875;1.25]

fval = -0.6042

Example on how to model LP with YALMIP in MATLAB

We can solve the same LP problem by modeling it with YALMIP as follows:

% Defining the optimization variable
x = sdpvar(2,1);

% Defining the objective
obj = f'*x;

% Defining the constraints
cons = [A*x<=b,Aeq*x==beq,-1<=x(1)<=1.5,-0.5<=x(2)<=1.25];

% Setting up the options and setting verbose not to display
options = sdpsettings('verbose',0);

% Solving the optimization problem
diagnostics = optimize(cons,obj,options);

if diagnostics.problem == 0
 disp(value(x)) % use value() to read an sdpvar's optimal value
 disp(value(obj))

elseif diagnostics.problem == 1
 disp('Solver thinks it is infeasible')
else
 disp('Something else happened')
end

Example on how to model LP with CVX in MATLAB

Now, we shall solve the same LP problem by modeling it with CVX as follows:

cvx_begin quiet
    variable x(2,1)

    minimize dot(f,x);

    subject to
        A*x<=b;
        Aeq*x==beq;
        -1<=x(1)<=1.5;
        -0.5<=x(2)<=1.25;
cvx_end

Solving Quadratic Programming (QP)

The general form of a Quadratic Programming (QP) problem is given by:

$$ \begin{equation} \begin{aligned} \underset{x}{\rm{minimize}} \quad & \frac{1}{2}x^THx + f^Tx\\ \rm{s.t.} \quad & Ax\leq b\\ \quad & A_{\rm eq}x=b_{\rm eq} \end{aligned} \end{equation} $$

We will focus on the case where H is at least positive semidefinite, thus confining ourselves to convex QP problems. For the forthcoming example, we will consider positive definite matrices, as this ensures a unique solution and facilitates easy comparison with results from YALMIP and CVX. It’s important to note that MATLAB’s quadprog function and YALMIP can handle even indefinite matrices, but they may only provide suboptimal solutions. However, CVX does not accept indefinite matrices, as that would render the problem nonconvex.

Consider the following specific example of a QP problem that we intend to solve: $$ \begin{equation} \begin{aligned} \underset{x}{\rm{minimize}} \quad & \frac{1}{2}x_1^2+x_2^2-x_1x_2-2x_1-6x_2\\ \rm{s.t.} \quad & x_1 + x_2 \leq 2\\ & -x_1 + 2x_2 \leq 2\\ & 2x_1 + x_2 \leq 3\\ & x_1 + 2x_2 =0 \end{aligned} \end{equation} $$

One can reformulate the problem into a matrix-vector representation to obtain the following QP parameters:

$$ \begin{equation} \begin{aligned} H &= \begin{bmatrix} 1 & -1\\ -1 & 2 \end{bmatrix}\\ f &= \begin{bmatrix} -2\\ -6 \end{bmatrix}\\ A &= \begin{bmatrix} 1 & 1\\ -1 & 2\\ 2 & 1 \end{bmatrix}\\ b &= \begin{bmatrix} 2\\ 2\\ 3 \end{bmatrix}\\ A_{\rm eq} &= \begin{bmatrix} 1 & 2 \end{bmatrix}\\ b_{\rm eq} &= 0 \end{aligned} \end{equation} $$
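As a quick check, expanding the quadratic form with these parameters recovers the stated objective: $$ \begin{equation} \frac{1}{2}x^THx + f^Tx = \frac{1}{2}\left(x_1^2 - 2x_1x_2 + 2x_2^2\right) - 2x_1 - 6x_2 = \frac{1}{2}x_1^2+x_2^2-x_1x_2-2x_1-6x_2 \end{equation} $$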

Now we can begin modeling the QP problem in MATLAB. The first step will be to define the fixed parameters of the QP problem.

clc;
clear;
close all;

H = [1 -1; -1 2];
f = [-2; -6];
A = [1 1; -1 2; 2 1];
b = [2; 2; 3];

Aeq = [1 2];
beq = 0;
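Since the discussion above assumes $H$ is positive definite, a quick numerical sanity check (a minimal sketch) is to inspect its eigenvalues:

% Verifying H is positive definite (all eigenvalues strictly positive)
if min(eig(H)) > 0
    disp('H is positive definite, so the QP is strictly convex');
end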

Example on how to model QP with quadprog of MATLAB

We will use MATLAB's quadprog, a dedicated solver for QP. The problem can be solved as follows:

[x,fval] = quadprog(H,f,A,b,Aeq,beq);

The solution obtained with quadprog for the example is (format of display is changed):

x = [-0.4;0.2]

fval = -0.2

Note that quadprog can also accept the bounds on the variable as an input, similar to what we did in the LP example.
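For example, a sketch with illustrative bounds (these particular values do not change the solution above, since it already lies within them):

% quadprog with lower and upper bounds on the variables (illustrative values)
lb = [-1; -1];
ub = [1; 1];
[x,fval] = quadprog(H,f,A,b,Aeq,beq,lb,ub);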

Example on how to model QP with YALMIP in MATLAB

We can solve the same QP problem by modeling it with YALMIP as follows:

% Define the variable
x = sdpvar(2,1);

% Objective
obj = 0.5*(x'*H*x)+f'*x;

% Constraints
cons = [A*x<=b,Aeq*x==beq];

options = sdpsettings('verbose',0);

diagnostics = optimize(cons,obj,options);

if diagnostics.problem == 0
 x = value(x)
 fval = value(obj)

elseif diagnostics.problem == 1
 disp('Solver thinks it is infeasible')
else
 disp('Something else happened')
end

Example on how to model QP with CVX in MATLAB

We can now solve the same QP problem by modeling it with CVX as follows:

cvx_begin quiet
    cvx_solver mosek
    variable x(2,1)

    minimize 0.5*quad_form(x,H) + dot(f,x);

    subject to
        A*x<=b;
        Aeq*x==beq;
cvx_end

This example also demonstrates how to invoke the MOSEK solver with CVX to solve the problem.

Solving Quadratically Constrained Quadratic Programming (QCQP)

The general form of a Quadratically Constrained Quadratic Programming (QCQP) problem is given by:

$$ \begin{equation} \begin{aligned} \underset{x}{\rm{minimize}} \quad & x^TQ_0x + 2r_0^Tx + c_0\\ \rm{s.t.} \quad & x^TQ_ix + 2r_i^Tx + c_i \leq 0,\ i = 1,\ldots, n \end{aligned} \end{equation} $$

We will focus on examples where the matrices are positive definite. However, MATLAB's solvers, YALMIP, and CVX can handle semidefinite matrices, and the MATLAB toolbox and YALMIP can sometimes even handle nonconvex quadratic functions, though they may provide suboptimal solutions. CVX, in particular, does not accept nonconvex models.

We shall generate the fixed data for the QCQP example randomly as follows:

clc;
clear;
close all;

% Set the seed for repeatability
rng(5);

% Generate sample data
M = 4; % Size of matrices
n = 4; % Number of constraints + objective

Q = zeros(M,M,n); % Including objective
r = zeros(M,n);
c = zeros(n,1);

% Generating all data at one place
% (this is to ensure data is same across all examples)
Qfull = randn(M,M,n);
rfull = randn(M,n);
cfull = randn(n,1);

% Defining objective parameters
Q(:,:,1) = Qfull(:,:,1)*Qfull(:,:,1)' + eye(M);
r(:,1) = rfull(:,1);
c(1) = cfull(1);

% Defining constraint parameters
for i = 2:n
    Qtmp = Qfull(:,:,i);

    Q(:,:,i) = Qtmp*Qtmp';
    r(:,i) = rfull(:,i);
    c(i) = cfull(i);
end

Example on how to model QCQP with fmincon of MATLAB

MATLAB does not have a dedicated solver for QCQP. However, we can use the interior-point method via the fmincon function, which is a powerful general-purpose solver.

As a first step, we need to define functions for the objective and its (nonlinear) constraints. We can define these as anonymous functions as follows:

% Defining the function as anonymous function
fun = @(x)(x'*Q(:,:,1)*x + 2*r(:,1)'*x + c(1));

% Defining non-linear constraints
noncol = @(x)consfun(x,Q,r,c,n);

Since the objective function is concise, it has been defined inline, and fun serves as the handle for the objective function. The constraints, however, are multiple, so we have simply defined a function handle noncol, whose definition is provided below:

% Function to compute equality and inequality nonlinear constraints
function [cons,conseq] = consfun(x,Q,r,c,n)
cons = zeros(n-1,1); % Number of constraints
for i = 2:n

    cons(i-1) = x'*Q(:,:,i)*x + 2*r(:,i)'*x + c(i);

end
conseq = [ ];
end

One can define this user-defined function at the end of the script for better readability. Now, let’s proceed to see how to solve the problem.

% Defining the initial point
x0 = zeros(M,1);

% Setting the options for algorithm: interior-point method
options = optimoptions('fmincon','Algorithm','interior-point');

% Solving the optimization problem
[x,fval,exitflag,output] = fmincon(fun,x0,[],[],[],[],[],[],noncol,options);

The solution obtained with fmincon for the example is (format of display is changed):

x = [-0.013;-0.0689;0.1170;0.0579]

fval = -0.5080

The variables exitflag and output contain additional useful information about the results, such as constraint violations and more. It is important to note that fmincon can sometimes slightly violate constraints. Although these violations may be minimal, one should examine these extra variables to assess the reliability of the results for the specific problem at hand.
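For instance, a minimal sketch of such a check (exitflag > 0 indicates convergence, and output.constrviolation reports the maximum constraint violation at the returned point):

% Inspecting the solver diagnostics
if exitflag <= 0
    warning('fmincon did not converge to a solution');
end
disp(output.constrviolation); % maximum constraint violation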

Note that fmincon can also accept linear inequality and equality constraints as its third, fourth, fifth, and sixth arguments, respectively. Moreover, one can define lower and upper bounds on the variables as the seventh and eighth arguments. It is also worth noting that fmincon can accept nonlinear equality constraints. However, we did not consider that case in this example to maintain focus on the convex QCQP problem.
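For reference, a hedged sketch of the full call, where Alin, blin, Aeqlin, beqlin, lb, and ub are hypothetical placeholders for those arguments:

% fmincon with linear constraints and bounds (placeholder names)
[x,fval,exitflag,output] = fmincon(fun,x0,Alin,blin,Aeqlin,beqlin,lb,ub,noncol,options);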

Example on how to model QCQP with YALMIP in MATLAB

We can now solve the same QCQP problem by modeling it with YALMIP as follows:

% Defining the sdp variables
x = sdpvar(M,1);

% Defining the objective function
obj = (x'*Q(:,:,1)*x + 2*r(:,1)'*x + c(1));

% Defining all constraints iteratively
cons = [];
for i = 2:n

    cons = [cons,x'*Q(:,:,i)*x + 2*r(:,i)'*x + c(i)<=0];

end

% Setting up the options
options = sdpsettings('verbose',0,'solver','mosek');

% Solving the problem
diagnostics = optimize(cons,obj,options);

if diagnostics.problem == 0
 disp(value(x))
 disp(value(obj))

elseif diagnostics.problem == 1
 disp('Solver thinks it is infeasible')
else
 disp('Something else happened')
end

With YALMIP, we can use check(cons) to determine how well the constraints are satisfied. In this example, we also learned how to invoke the MOSEK solver through YALMIP.

Example on how to model QCQP with CVX in MATLAB

We can now solve the same QCQP problem by modeling it with CVX as follows:

cvx_begin quiet
    cvx_solver mosek
    variable x(M,1)

    minimize quad_form(x,Q(:,:,1)) + 2*dot(r(:,1),x)+ c(1);

    subject to
        for i = 2:n
            quad_form(x,Q(:,:,i)) + 2*dot(r(:,i),x)+ c(i)<=0;
        end
cvx_end

Solving Second Order Conic Programming (SOCP)

The general form of a Second Order Conic Programming (SOCP) problem is given by:

$$ \begin{equation} \begin{aligned} \underset{x}{\rm{minimize}} \quad & f^Tx\\ \rm{s.t.} \quad & \Vert A_ix + b_i \Vert \leq \gamma^T_i x + d_i,\ i = 1,\ldots, n\\ &Bx \leq q\\ & B_{\rm eq}x = q_{\rm eq} \end{aligned} \end{equation} $$

To provide an example, we shall solve the previously discussed QCQP problem as an SOCP. This requires some mathematical manipulation to reformulate the QCQP into an SOCP; the reformulation converts the quadratic constraints into conic form. We skip the derivation and provide the final form:

$$ \begin{equation} \begin{aligned} \underset{x,t}{\rm{minimize}} \quad & t\\ \rm{s.t.} \quad & \Vert Q_0^{\frac{1}{2}}x + Q_0^{-\frac{1}{2}}r_0 \Vert \leq t,\\ & \Vert Q_i^{\frac{1}{2}}x + Q_i^{-\frac{1}{2}}r_i \Vert \leq d_i, \ i = 1,\ldots, n\\ \end{aligned} \end{equation} $$ where $d_i = \sqrt{\Vert Q_i^{-\frac{1}{2}}r_i \Vert^2 - c_i}$.
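For completeness, the key manipulation behind this form is completing the square in each quadratic constraint:

$$ \begin{equation} x^TQ_ix + 2r_i^Tx + c_i = \Vert Q_i^{\frac{1}{2}}x + Q_i^{-\frac{1}{2}}r_i \Vert^2 - \Vert Q_i^{-\frac{1}{2}}r_i \Vert^2 + c_i \leq 0 \end{equation} $$

which rearranges to the norm bound above with the stated $d_i$; the objective is handled the same way after introducing the epigraph variable $t$.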

Note that in the previous example, we considered $Q_i, i=0,\ldots, n$ to be positive definite. However, the problem can also be reformulated for positive semidefinite matrices. For a comprehensive reference on this reformulation, please refer here.

Also, note that for the general SOCP problem, there are no structural assumptions on the $A_i, i =1,\ldots, n$ matrices, i.e., they need not be PSD or PD.

Example on how to model SOCP with coneprog of MATLAB

MATLAB offers a dedicated solver, coneprog, for conic programming. The first step is understanding how to define a Second Order Cone (SOC) constraint in MATLAB. The syntax to model an SOC is as follows:

socConstraints = secondordercone(A,-b,gamma,-d);

Note that the variable names are adapted to align with how MATLAB's documentation defines them. Additionally, the signs of $b_i$ and $d_i$ are flipped due to the way MATLAB formulates the SOCP constraints. For a detailed reference on these conventions, please refer to MATLAB's coneprog documentation.

Now let's solve the QCQP problem using MATLAB's coneprog. The first step involves defining the matrices $A$, vectors $b$ and $\gamma$, and scalars $d$ as required by the SOCP format:

% Preallocating storage for the SOCP parameters
A = zeros(M,M+1,n); % Including objective
b = zeros(M,n);
d = zeros(n,1);

% Defining the SOCP parameters for MATLAB

A(:,:,1) = [sqrtm(Q(:,:,1)),zeros(M,1)]; % Extra zero column for slack variable t
b(:,1) = sqrtm(Q(:,:,1))\r(:,1);
d(1) = sqrt(norm(b(:,1))^2 - c(1));

% Defining constraint parameters
for i = 2:n

    A(:,:,i) = [sqrtm(Q(:,:,i)),zeros(M,1)];
    b(:,i) = sqrtm(Q(:,:,i))\r(:,i);
    d(i) = sqrt(norm(b(:,i))^2 - c(i));

    if ~isreal(d(i))
        error('Infeasible constraints');
    end
end

Here, we have defined our SOCP problem using two variables, $x$ and $t$. However, MATLAB allows us to define only one variable array. Therefore, we treat the new variable array $x$ as representing $x := [x;t]$ (this is not a mathematical transformation but a representation for coding purposes). Consequently, we need to add an extra zero column to the matrices $A_i, i = 0,\ldots,n$ to accommodate the additional variable $t$. Additionally, it is mathematically important to ensure that the $d_i, i = 0,\ldots,n$, are real (i.e., the radicands are nonnegative) to maintain feasibility.

The variables $\gamma_i$ and $f$ will be defined directly in the function call, as they are straightforward to formulate. If required, we can define them beforehand as well. Now we are ready to solve the problem using MATLAB's coneprog. Here's how you can set up and solve the SOCP:

% Defining the SOC constraints

% Objective
socConstraints(n) = secondordercone(A(:,:,1),-b(:,1),[zeros(M,1);1],0);

% Constraints
for i = 2:n

    socConstraints(i-1) = secondordercone(A(:,:,i),-b(:,i),zeros(M+1,1),-d(i));

end

[x,fval] = coneprog([zeros(M,1);1],socConstraints,[],[],[],[],[],[]);

Note that coneprog can also accept linear inequalities as the third and fourth arguments, and equality constraints can be provided as the fifth and sixth arguments. Additionally, lower and upper bounds on the variables can be defined using the seventh and eighth arguments. This flexibility allows for comprehensive specification of the problem constraints and variable limits within the coneprog framework.
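A hedged sketch of such a call, with hypothetical placeholder names for the extra arguments:

% coneprog with linear constraints and bounds (Alin, blin, Aeqlin, beqlin, lb, ub are placeholders)
[x,fval] = coneprog([zeros(M,1);1],socConstraints,Alin,blin,Aeqlin,beqlin,lb,ub);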

The solution obtained with coneprog for the example is (format of display is changed):

x(1:M,1) = [0.0135;-0.0691;0.1175;0.0578]

x(M+1,1) = 2.0375

fval = 2.0375

Example on how to model SOCP with YALMIP in MATLAB

With YALMIP, one can define multiple optimization variables, making it more flexible than MATLAB’s Optimization Toolbox in that regard. The syntax used in YALMIP closely resembles the mathematical formulation of the problem. Let’s first define the parameters of the SOCP:

% Preallocating storage for the SOCP parameters
A = zeros(M,M,n); % Including objective
b = zeros(M,n);
d = zeros(n,1);

% Defining objective parameters

A(:,:,1) = sqrtm(Q(:,:,1));
b(:,1) = sqrtm(Q(:,:,1))\r(:,1);
d(1) = sqrt(norm(b(:,1))^2 - c(1));

% Defining constraint parameters
for i = 2:n

    A(:,:,i) = sqrtm(Q(:,:,i));
    b(:,i) = sqrtm(Q(:,:,i))\r(:,i);
    d(i) = sqrt(norm(b(:,i))^2 - c(i));

    if ~isreal(d(i))
        error('Infeasible constraints');
    end
end

Now we solve the problem as follows:

% Defining optimizing variables
x = sdpvar(M,1);
t = sdpvar(1,1);

% Defining the objective
obj = t;

% Defining the constraints
cons = cone(A(:,:,1)*x + b(:,1),t);
for i = 2:n

    cons = [cons,cone(A(:,:,i)*x + b(:,i),d(i))];

end

options = sdpsettings('verbose',0,'solver','mosek');

diagnostics = optimize(cons,obj,options);

if diagnostics.problem == 0
 disp(value(x))
 disp(value(obj))

elseif diagnostics.problem == 1
 disp('Solver thinks it is infeasible')
else
 disp('Something else happened')
end

Example on how to model SOCP with CVX in MATLAB

Now we can solve the same problem using CVX. CVX is also flexible in terms of defining the number of optimization variables, similar to YALMIP. For defining the parameters of the SOCP problem, one can use the same code as in the YALMIP version of the SOCP example. Once the parameters are defined, we can solve the problem as follows:

cvx_begin quiet
    cvx_solver mosek
    variable x(M,1)
    variable t

    minimize t;

    subject to
        norm(A(:,:,1)*x + b(:,1))<=t;
        for i = 2:n
            norm(A(:,:,i)*x + b(:,i))<=d(i);
        end
cvx_end

Solving QCQP/SOCP for real-valued complex domain functions

In signal processing and communications, complex-valued variables are commonly encountered. However, all the utility functions, such as error and rate, are typically real-valued. To address this, we will explore how to solve a real-valued problem that involves complex variables, specifically in the context of a QCQP problem. By employing strategies similar to those used for SOCP or other real-valued optimization techniques, we can effectively handle complex variables.

Let’s start by defining some random data to formulate a convex QCQP problem involving real-valued complex variables:

clc;
clear;
close all;

rng(1); % Set the seed for repeatability

% Generate sample data
M = 4; % Size of matrices
n = 4; % Number of constraints + objective

Q = zeros(M,M,n); % Including objective
r = zeros(M,n);
c = zeros(n,1);

% Generating all data at one place
% (this is to ensure data is same across all examples)
Qfull = randn(M,M,n) + 1j*randn(M,M,n);
rfull = randn(M,n) + 1j* randn(M,n);
cfull = randn(n,1);

% Defining objective parameters
Q(:,:,1) = Qfull(:,:,1)*Qfull(:,:,1)' + eye(M);
r(:,1) = rfull(:,1);
c(1) = cfull(1);

% Defining constraint parameters
for i = 2:n
    Qtmp = Qfull(:,:,i);

    Q(:,:,i) = Qtmp*Qtmp';
    r(:,i) = rfull(:,i);
    c(i) = cfull(i);
end

We will begin by solving the problem using YALMIP and CVX, as both of these tools support optimization involving complex variables. Afterward, we will demonstrate how to adapt and solve these types of problems using MATLAB’s optimization toolbox, which typically handles real variables. Here’s how we proceed:

Example on how to solve convex QCQP problem with complex variables with YALMIP

% Defining the complex vector variable
z = sdpvar(M,1,'full','complex');

% Defining the objective
obj = (z'*Q(:,:,1)*z + 2*real(r(:,1)'*z) + c(1));

% Defining the constraints
cons = [];
for i = 2:n

    cons = [cons,z'*Q(:,:,i)*z + 2*real(r(:,i)'*z) + c(i)<=0];

end

% Setting the options
options = sdpsettings('verbose',0);

% Solving the problem
diagnostics = optimize(cons,obj,options);

if diagnostics.problem == 0
 disp(value(z))
 disp(value(obj))

elseif diagnostics.problem == 1
 disp('Solver thinks it is infeasible')
else
 disp('Something else happened')
end

The solution obtained with YALMIP for the above example is (format of display is changed):

z = [-0.0131 + 0.0438i;0.0713 - 0.1038i;-0.0295 - 0.0003i;0.0868 + 0.0375i]

obj = -0.8042

Example on how to solve convex QCQP problem with complex variables with CVX

cvx_begin quiet
    cvx_solver mosek
    variable x(M,1) complex

    minimize quad_form(x,Q(:,:,1)) + 2*real(dot(r(:,1),x))+ c(1);

    subject to
        for i = 2:n
            quad_form(x,Q(:,:,i)) + 2*real(dot(r(:,i),x))+ c(i)<=0;
        end
cvx_end

Example on how to solve convex QCQP problem with complex variables with MATLAB optimization toolbox

The MATLAB Optimization Toolbox does not directly support complex variables. However, any real-valued function of complex variables can be reformulated into an equivalent function of real variables. This transformation involves splitting each complex variable into its real and imaginary components, effectively doubling the number of variables but yielding a formulation compatible with MATLAB's real-variable optimization functions.

For example, consider the general convex QCQP:

$$ \begin{equation} \begin{aligned} \underset{z\in \mathbb{C}^N}{\rm{minimize}} \quad & z^HQ_0z + 2\mathcal{R}\{r_0^Hz\} + c_0\\ \rm{s.t.} \quad & z^HQ_iz + 2\mathcal{R}\{r_i^Hz\} + c_i \leq 0,\ i = 1,\ldots, n \end{aligned} \end{equation} $$

One can rewrite it equivalently as follows:

$$ \begin{equation} \begin{aligned} \underset{x\in \mathbb{R}^{2N}}{\rm{minimize}} \quad & x^T\bar{Q}_0x + 2\bar{r}_0^Tx + c_0\\ \rm{s.t.} \quad & x^T\bar{Q}_ix + 2\bar{r}_i^Tx + c_i \leq 0,\ i = 1,\ldots, n \end{aligned} \end{equation} $$

where

$$ \begin{equation} \begin{aligned} x &= \begin{bmatrix} \mathcal{R}\{z\}\\ \mathcal{I}\{z\} \end{bmatrix}\\ \bar{Q}_i &= \begin{bmatrix} \mathcal{R}\{Q_i\} & -\mathcal{I}\{Q_i\}\\ \mathcal{I}\{Q_i\} & \mathcal{R}\{Q_i\} \end{bmatrix}\\ \bar{r}_i &= \begin{bmatrix} \mathcal{R}\{r_i\}\\ \mathcal{I}\{r_i\} \end{bmatrix} \end{aligned} \end{equation} $$ where $\mathcal{R}\{\cdot\}$ and $\mathcal{I}\{\cdot\}$ denote the real and imaginary parts of the argument. After solving, the objective's value remains the same as in the complex-variable model. The point of optimality can be retrieved as $z := x(1:N) + i\, x(N+1:2N)$ (a representation for code purposes only). Now that we understand the approach, we can proceed to solve the problem.

% Reformulating to real variables
Qb = zeros(2*M,2*M,n);
rb = zeros(2*M,n);
for i = 1:n
    Qr = real(Q(:,:,i));
    Qi = imag(Q(:,:,i));

    rr =  real(r(:,i));
    ri = imag(r(:,i));

    Qb(:,:,i) = [Qr,-1*Qi;Qi,Qr];
    rb(:,i) = [rr;ri];
end

% Defining the objective function
fun = @(x)(x'*Qb(:,:,1)*x + 2*rb(:,1)'*x + c(1));

% Defining the nonlinear constraints
noncol = @(x)consfun(x,Qb,rb,c,n);

% Defining the initial value
x0 = zeros(2*M,1);

% Setting up the options
options = optimoptions('fmincon','Algorithm','interior-point');

% Solving the optimization problem
[x,fval,exitflag,output] = fmincon(fun,x0,[],[],[],[],[],[],noncol,options);

% Obtaining the complex variable from x
z = x(1:M) + 1j*x(M+1:end);

% Function to compute nonlinear constraint
function [cons,conseq] = consfun(x,Qb,rb,c,n)
cons = zeros(n-1,1); % Number of constraints
for i = 2:n

    cons(i-1) = x'*Qb(:,:,i)*x + 2*rb(:,i)'*x + c(i);

end
conseq = [ ];
end

General Convex Programming

We will now explore some general convex functions that do not fit neatly into the representations discussed above. Although it is sometimes possible to reformulate these into standard forms by introducing additional variables, we will avoid this to maintain our focus on solving optimization problems in MATLAB. For general convex optimization, we can use one of three tools: MATLAB's fmincon, YALMIP, or CVX, depending on which you are more comfortable with.

However, it is important to note that with CVX, the objective function must be convex across its entire domain, including regions outside the feasible set. If this condition is not met, we cannot directly solve such problems with CVX, as it is strictly designed for problems with convex objectives and constraints. To use CVX for such cases, the objective might need restructuring, but we will not cover that here.

One significant advantage of using fmincon is that you can supply analytical gradients and Hessians (both of which are optional). This can enhance the reliability of the results. We will now present an example where the objective function is not convex over its entire domain, but it is convex within the feasibility region.

Consider the following example:

$$ \begin{equation} \begin{aligned} \underset{x}{\rm{minimize}} \quad & c(x_2 + x_1^2)^2 + (1 - x_1)^2\\ \rm{s.t.} \quad & x_1^2 + (x_2 - a)^2 \leq 2,\\ & x_2 \geq 0 \end{aligned} \end{equation} $$

for $c = 100, a = 3$. This problem is convex in the feasibility region. Let us compute the gradient and Hessian of the objective, say $f(x)$, and of the constraint, say $f_1(x)$.

$$ \begin{equation} \begin{aligned} f(x) &= c(x_2 + x_1^2)^2 + (1-x_1)^2\\ \nabla f(x) &= \begin{bmatrix} 4c(x_2 + x_1^2)x_1 - 2(1-x_1)\\ 2c(x_2 + x_1^2) \end{bmatrix}\\ \nabla^2 f(x) &= \begin{bmatrix} 4c(x_2 + 3x_1^2) + 2 & 4cx_1\\ 4cx_1 & 2c \end{bmatrix} \end{aligned} \end{equation} $$

Gradient and Hessian of the constraint are: $$ \begin{equation} \begin{aligned} f_1(x) &= x_1^2 + (x_2 - a)^2 - 2\\ \nabla f_1(x) &= \begin{bmatrix} 2x_1\\ 2(x_2 - a) \end{bmatrix}\\ \nabla^2 f_1(x) &= \begin{bmatrix} 2 & 0\\ 0 & 2 \end{bmatrix} \end{aligned} \end{equation} $$ One can verify from the Hessians that the problem is convex within the feasibility region. We will solve the above problem using fmincon and YALMIP only, as CVX is suitable for problems that are convex across their entire domain.
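To make this verification explicit for the objective: its Hessian has a positive leading entry for $c>0, x_2 \geq 0$, and its determinant is $$ \begin{equation} \det \nabla^2 f(x) = 2c\left(4c(x_2 + 3x_1^2) + 2\right) - 16c^2x_1^2 = 8c^2(x_1^2 + x_2) + 4c > 0, \end{equation} $$ so $f$ is convex on the half-plane $x_2 \geq 0$, which contains the feasible set.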

Solving the above example with YALMIP

clc;
clear;
close all;

% Parameters of the problem
c = 100;
a = 3;

% Number of variables
x = sdpvar(2,1);

% Defining the objective
obj = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;

% Defining the constraints
cons = x(2)>=0;
% The quadratic constraint is scaled by 10 on both sides (an equivalent
% reformulation; see the scaling tips at the end of this tutorial)
cons = [cons,10*(x(1)^2 + (x(2)-a)^2)<=2*10];

% Setting the options
options = sdpsettings('verbose',0);

% Solving the problem
diagnostics = optimize(cons,obj,options);

if diagnostics.problem == 0
 disp(value(x))
 disp(value(obj))

elseif diagnostics.problem == 1
 disp('Solver thinks it is infeasible')
else
 disp('Something else happened')
end

and the solution obtained is (format of display is changed)

x = [0.0023;1.5858]
obj = 252.469541

There are many other examples on the YALMIP website.

Let us now solve the same problem using MATLAB in three different ways: first directly, and then in two ways that supply the theoretical gradients and Hessians.

Case 1: Solving a general optimization problem. In this case, we will solve the problem directly using MATLAB's optimization tools without providing derivatives. This approach relies on MATLAB's numerical methods to estimate gradients and Hessians, which can be less efficient but requires less preparation.

clc;
clear;
close all;

% Linear constraints
A = [0,-1];
b = 0;

% Initial value
x0 = [0;3];

% Setting the options
options = optimoptions('fmincon','Algorithm','interior-point');

% Solving the problem
[x,fval,exitflag,output] = fmincon(@computeFun,x0,A,b,[],[],[],[],@computenonlincons,options);

% Function to compute function value
function f = computeFun(x)

% Calculate objective f
f = 100*(x(2) + x(1)^2)^2 + (1-x(1))^2;

end


% Function to compute nonlinear equality and inequality constraints
function [c,ceq] = computenonlincons(x)
c = x(1)^2 + (x(2)-3)^2 - 2;
ceq = [ ];
end

Note that fmincon can handle problems where both the objective and constraint functions are continuous and have continuous first derivatives, although these functions may be nonconvex. On the other hand, YALMIP can manage any convex function, whether smooth or not, and in some cases, even nonconvex functions. In scenarios involving nonconvexity, both fmincon and YALMIP will potentially provide a suboptimal solution.

Optimization of smooth function with fmincon using theoretical gradient and Hessian

In some cases, providing theoretically computed gradients and Hessians to the interior-point method-based fmincon can enhance its reliability, making the code more robust and improving the algorithm’s convergence rate. This feature distinguishes fmincon from YALMIP and CVX, which can be particularly slow, especially when dealing with exponential cones.

Case 2: Solving the same problem using the theoretical gradient and Hessian. In this case, we will solve the problem using fmincon while supplying the theoretical gradients and Hessians. This approach can lead to faster convergence and more reliable results compared to numerical approximations.

clc;
clear;
close all;

% Parameters of the problem
c = 100;
a = 3;

% Function handles for fun, gradient and Hessian
fun = @(x)computeFunGrad(x,c);
noncol = @(x)computenonlinConsGrad(x,a);
Hess = @(x,lambda)computeHessianfcn(x,lambda,c);

% Linear constraints
A = [0,-1];
b = 0;

x0 = [0;0];
options = optimoptions('fmincon','Algorithm','interior-point',...
    'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
    'HessianFcn',Hess);

[x,fval,exitflag,output] = fmincon(fun,x0,A,b,[],[],[],[],noncol,options);

% Function to compute function value and gradient value at the point
function [f,gradf] = computeFunGrad(x,c)
% Calculate objective f
f = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;

gradf = [4*c*(x(2)+x(1)^2)*x(1)-2*(1-x(1));
    2*c*(x(2)+x(1)^2)];

end

% Function to compute the Hessian of the Lagrangian at the point
function Hout = computeHessianfcn(x,lambda,c)
% Hessian of objective
H = [12*c*x(1)^2+4*c*x(2)+2, 4*c*x(1);
    4*c*x(1), 2*c];

% Hessian of nonlinear inequality constraint
Hg = 2*eye(2);
Hout = H + lambda.ineqnonlin*Hg;
end

% Function to compute gradient for all equality and inequality constraints
function [c,ceq,gc,gceq] = computenonlinConsGrad(x,a)
c = (x(1)^2 + (x(2)-a)^2 - 2);
ceq = [ ];

gc = [2*x(1);2*(x(2)-a)];
gceq = [];
end

Case 3: Solving with the theoretical gradient and a Hessian-multiply function

clc;
clear;
close all;

% Parameters of the problem
c = 100;
a = 3;

% Function handles for fun, gradient and Hessian
fun = @(x)computeFunGrad(x,c);
noncol = @(x)computenonlinConsGrad(x,a);
HessMult = @(x,lambda,v)computeHessMultFcn(x,lambda,v,c);

% Linear constraints
A = [0,-1];
b = 0;

x0 = [0;0];
options = optimoptions('fmincon','Algorithm','interior-point',...
    'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
    'SubproblemAlgorithm','cg','HessianMultiplyFcn',HessMult);

[x,fval,exitflag,output] = fmincon(fun,x0,A,b,[],[],[],[],noncol,options);

% Function to compute function value and gradient value at the point
function [f,gradf] = computeFunGrad(x,c)
% Calculate objective f
f = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;

gradf = [4*c*(x(2)+x(1)^2)*x(1)-2*(1-x(1));
    2*c*(x(2)+x(1)^2)];

end

% Function to compute the Hessian-times-vector product of the Lagrangian at the point
function W = computeHessMultFcn(x,lambda,v,c)
% Hessian of objective
H = [12*c*x(1)^2+4*c*x(2)+2, 4*c*x(1);
    4*c*x(1), 2*c];

% Hessian of nonlinear inequality constraint
Hg = 2*eye(2);
Hout = H + lambda.ineqnonlin*Hg;

W = Hout*v;
end

% Function to compute gradient for all equality and inequality constraints
function [c,ceq,gc,gceq] = computenonlinConsGrad(x,a)
c = x(1)^2 + (x(2)-a)^2 - 2;
ceq = [ ];

gc = [2*x(1);2*(x(2)-a)];
gceq = [];
end

Case 3 is helpful especially when you have a large sparse Hessian matrix, as it speeds up convergence.

Note that when we compute the Hessian, we are actually computing the Hessian of the Lagrangian, which is as follows:

$$ \nabla^2 L(x,\lambda) = \nabla^2 f(x) + \sum_{i}\lambda_{\rm ineq}(i)\nabla^2 c_i(x) + \sum_{j}\lambda_{\rm eq}(j)\nabla^2 ceq_j(x) $$

where $c_i(x), i =1,\ldots,N$ are the inequality constraints, $ceq_j(x), j =1,\ldots,M$ are the equality constraints, and the $\lambda$'s are the Lagrange multipliers. The notation may seem unconventional, but it is chosen to align with MATLAB's syntax for easier understanding. In the previous example, we had only one inequality constraint. However, when dealing with multiple constraints, you can employ a loop to add them iteratively. Here's a syntactical representation:

Hobj % Hessian of the objective

% Adding Hessians of the inequality nonlinear constraints
for i = 1:N
    Hobj = Hobj + lambda.ineqnonlin(i)*Hc(:,:,i); % where Hc(:,:,i) is the Hessian of c_i(x)
end

% Adding Hessians of the equality nonlinear constraints
% (note: if an equality constraint is not affine, the problem is nonconvex)
for j = 1:M
    Hobj = Hobj + lambda.eqnonlin(j)*Hceq(:,:,j); % where Hceq(:,:,j) is the Hessian of ceq_j(x)
end

This approach enables efficient handling of multiple constraints in the optimization problem.

Computing Gradient/Hessian using the Symbolic Math Toolbox of MATLAB

To err is human, and hence there is a high probability of making small mistakes when computing gradients and Hessians of complicated functions. One way to verify the correctness of these expressions is with the Symbolic Math Toolbox in MATLAB.

The following code computes symbolic gradients and Hessians, which we used in the example above:

clc;
clear;
close all;

% Defining data
x = sym('x',[2,1]);

% Parameter variable
syms c a;

% Defining function symbolically
fs = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;

% Symbolic gradient
gradfs = gradient(fs,x);

% Symbolic hessian
hessfs = hessian(fs,x);

% Defining constraint symbolically
gs = x(1)^2 + (x(2)-a)^2 - 2;

% Symbolic gradient
gradgs = gradient(gs,x);

% Symbolic hessian
hessgs = hessian(gs,x);

Note that we have even parameterized $c$ and $a$ to demonstrate that we can use the symbolic toolbox in a more general scenario where parameters can vary. However, if desired, one can fix them to some real value. These symbolic expressions can be compared against the analytical expressions derived manually to ensure correctness.

One way to check is to display the expression in the command line and verify it by observation. Another method would be to evaluate the expression at several sample points and check, as follows:

% Sample checking over various values
M = 1e2;
eps = 1e-7;
for i = 1:M
    x0 = randn(2,1);
    c0  = randn;
    a0  = randn;

    % Symbolic function value
    gradfsVal = double(subs(subs(gradfs,x,x0),c,c0));
    hessfsval = double(subs(subs(hessfs,x,x0),c,c0));

    % Computing theoretical
    %f = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;

    gradfVal = [4*c0*(x0(2)+x0(1)^2)*x0(1)-2*(1-x0(1));
             2*c0*(x0(2)+x0(1)^2)];

    hessfVal = [12*c0*x0(1)^2+4*c0*x0(2)+2, 4*c0*x0(1);
    4*c0*x0(1), 2*c0];

    % Checking the accuracy
    if max(abs(gradfsVal - gradfVal)) >= eps
        error('Gradient is in error');
    end

    if sum(abs(hessfsval - hessfVal),'all')>=eps
        error('Hessian is in error');
    end

    % Symbolic constraint value
    gradgsVal = double(subs(subs(gradgs,x,x0),a,a0));
    hessgsval = double(subs(subs(hessgs,x,x0),a,a0));

    % Computing theoretical
    gradgVal = [2*x0(1);2*(x0(2)-a0)];

    hessgVal = 2*eye(2);

    % Checking the accuracy
    if max(abs(gradgsVal - gradgVal)) >= eps
        error('Gradient of constraint is in error');
    end

    if sum(abs(hessgsval - hessgVal),'all')>=eps
        error('Hessian of constraint is in error');
    end

end

Another way to check is with the help of fmincon, as discussed here.
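One such mechanism is the CheckGradients option, which makes fmincon compare the supplied derivatives against finite-difference estimates:

% Enabling derivative checking against finite differences
options = optimoptions('fmincon','Algorithm','interior-point',...
    'SpecifyObjectiveGradient',true,'SpecifyConstraintGradient',true,...
    'CheckGradients',true);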

Converting symbolically computed Gradient/Hessian to a MATLAB function

In the final step, it's worth noting that one doesn't always need to derive analytical expressions manually. MATLAB offers a feature to convert symbolically computed gradients and Hessians into functions, which can be utilized directly in your code. This prevents errors from creeping in when typing out these expressions by hand. However, it's essential to define the symbolic function itself correctly.

clc;
clear;
close all;

% Defining data
x = sym('x',[2,1]);

% Parameter variable
syms c;

% Defining function symbolically
fs = c*(x(2) + x(1)^2)^2 + (1-x(1))^2;

% Symbolic gradient
gradfs = gradient(fs,x);

% Symbolic hessian
hessfs = hessian(fs,x);

matlabFunction(gradfs,'File','myGrad','vars',{x,c});
matlabFunction(hessfs,'File','myHess','vars',{x,c});

The provided code creates two .m files in your folder: myGrad.m and myHess.m. These files take inputs x, c and output the gradient and Hessian, respectively, at the given point x and parameterized by c. This feature of MATLAB, which generates function files from symbolic expressions, is incredibly useful. It allows us to avoid repeatedly using the symbolic toolbox, which can be quite slow.
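For example, a minimal usage sketch (assuming myGrad.m and myHess.m are on the MATLAB path):

% Evaluating the generated gradient and Hessian at a point, with c = 100
x0 = [0.5; -1];
g = myGrad(x0,100);
H = myHess(x0,100);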

Some general discussion

To maximize an objective with the MATLAB Optimization Toolbox, simply multiply the objective function by $-1$ and minimize. Note that YALMIP and CVX support a maximize operation without having to flip the sign.
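For instance, a minimal sketch with fmincon, reusing fun, x0, noncol, and options from the earlier examples:

% Maximizing f(x) by minimizing -f(x)
negfun = @(x)-fun(x);
[x,negfval] = fmincon(negfun,x0,[],[],[],[],[],[],noncol,options);
fval = -negfval; % maximum value of the original objective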

A few general comments on the tools

The three popular choices with MATLAB to solve an optimization problem are:

  • MATLAB Optimization Toolbox
    • Fast and efficient, as all functions are optimized for MATLAB.
    • Offers more flexibility in debugging the code.
    • All MATLAB functions can be used while modeling, with no incompatibility issues.
    • The best aspect of fmincon is that it can accept analytical gradients and Hessians of the objective and constraints, which increases reliability, makes the results more robust, and leads to convergence in fewer iterations.
    • Can handle most convex and nonconvex problems.
    • Cannot handle real-valued complex-domain functions without remodeling.
  • YALMIP to prototype optimization problems in MATLAB
    • The syntax is straightforward for following and modeling the problem.
    • Almost all MATLAB functions are compatible with YALMIP.
    • Can handle convex and nonconvex problems.
    • Can interface with the most popular solvers, such as MOSEK, SeDuMi, Gurobi, etc.
    • However, it requires a license for these solvers, which is free for academics.
    • Does not have the option to input analytically computed gradients and Hessians.
    • Can handle real-valued complex-domain functions without remodeling.
  • CVX to prototype convex optimization problems in MATLAB
    • Handles only convex optimization problems.
    • One needs to adhere to the so-called disciplined convex programming (DCP) ruleset; otherwise, CVX may not recognize the model as convex and may reject it even if the problem is convex.
    • If CVX accepts the model, one can be assured that the problem is convex.
    • CVX does not have the option to input analytically computed gradients and Hessians.
    • Can handle real-valued complex-domain convex functions without remodeling.
    • Can interface with the most popular solvers; however, it requires a license, similar to YALMIP.

YALMIP and CVX can solve many other convex problems that are not directly supported by MATLAB, such as SDPs. See the YALMIP and CVX documentation for more information.
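As a flavor of what that looks like, here is a minimal YALMIP SDP sketch, where C is hypothetical symmetric cost data:

% Minimal SDP: minimize trace(C*X) over unit-trace PSD matrices X
C = randn(3); C = (C + C')/2; % hypothetical symmetric cost matrix
X = sdpvar(3,3,'symmetric');
cons = [X >= 0, trace(X) == 1]; % for a symmetric matrix sdpvar, X >= 0 is a semidefinite constraint
optimize(cons, trace(C*X));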

Some tips when solving optimization problems using MATLAB

  • Whenever the constraint values are very small or very large, precision issues may arise. To address such situations, one can artificially scale the constraints. For example, the following two constraints are equivalent: $$ \begin{equation} f_i(x)\leq 0 \end{equation} $$ and $$ \begin{equation} c_i f_i(x)\leq 0 \end{equation} $$ where $c_i$ is some positive constant.

  • Likewise, one can scale the objective value before optimization and then rescale it after the optimization is solved. This ensures numerical stability and helps mitigate precision issues caused by extremely large or small objective values.

  • Sometimes, even scaling the variable of interest may help. For instance, one can substitute $x = ay$ for some positive constant $a$, replace each function $f_i(x)$ by $g_i(y)=f_i(ay)$, solve for $y$, and once the results are obtained, rescale via $x = ay$.

  • When dealing with logarithmic functions, the performance of YALMIP/CVX may vary depending on the specific function involved. These solvers typically rely on exponential cones, which can be slower compared to other standard convex problems. In such cases, based on my experience, using fmincon with analytically computed gradients and Hessians can offer significant benefits in terms of both accuracy and speed of convergence.
