Calculation of the conjugate of piecewise functions

Conjugate of piecewise function

In this post, we’ll show how to calculate the conjugate of a piecewise function f(x),
such as

    \begin{align*} f(x) = \begin{cases} f_1(x) & l_1 \leq x \leq l_2 \\ f_2(x) & l_2 < x \leq l_3 \end{cases} \end{align*}

The conjugate for the above is:

    \begin{align*} f^\ast(y) & = \sup_x xy - f(x), \\ & = \max\left(\sup_{l_1 \leq x \leq l_2} xy - f_1(x), \sup_{l_2 < x \leq l_3} xy - f_2(x)\right), \\ & = \max\left(f_1^\ast(y), f_2^\ast(y) \right), \end{align*}

where f_1^\ast(y) is the conjugate of the function f_1(x) constrained to l_1 \leq x \leq l_2. So, computing the conjugate of a piecewise function is simple, as long as the conjugates of the constrained sub-functions can be computed easily.
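As a quick worked example (added here for illustration), take f(x) = |x| on -1 \leq x \leq 1, split as f_1(x) = -x for -1 \leq x \leq 0 and f_2(x) = x for 0 < x \leq 1. The constrained conjugates are

    \begin{align*} f_1^\ast(y) = \sup_{-1 \leq x \leq 0} x(y+1) = \max(0, -y-1), \qquad f_2^\ast(y) = \sup_{0 < x \leq 1} x(y-1) = \max(0, y-1), \end{align*}

so that f^\ast(y) = \max\left(f_1^\ast(y), f_2^\ast(y)\right) = \max(0, |y| - 1).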

Conjugate of constrained function

In this section, we’ll show how to calculate the convex conjugate of a function \bar{f}(x), defined as the function f(x) constrained to the domain l \leq x \leq u.
The conjugate is defined as:

    \begin{align*} \bar{f}^\ast(y) = \sup_{l \leq x \leq u} xy - f(x). \end{align*}

For the unconstrained case, the solution would be given by:

    \begin{align*} x = \left[f' \right]^{-1}(y), \end{align*}

where f'(x) is the derivative of f(x). Since f(x) is convex, f'(x) is a monotonically increasing function. Denote the derivative values at the end points of the domain by

    \begin{align*} y_l = f'(l), \quad  y_u = f'(u). \end{align*}

For y_l \leq y \leq y_u the unconstrained maximizer lies inside [l, u]; outside this range the supremum is attained at one of the end points. The conjugate of the constrained function is therefore:

    \begin{align*} \bar{f}^\ast(y) = \begin{cases} ly - f(l) & y < y_l, \\  f^\ast(y) & y_l \leq y \leq y_u, \\ uy - f(u) & y > y_u. \end{cases} \end{align*}
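As a quick check, take f(x) = \frac{1}{2}x^2 constrained to -1 \leq x \leq 1. Then f'(x) = x, so y_l = -1 and y_u = 1, and

    \begin{align*} \bar{f}^\ast(y) = \begin{cases} -y - \frac{1}{2} & y < -1, \\ \frac{1}{2}y^2 & -1 \leq y \leq 1, \\ y - \frac{1}{2} & y > 1, \end{cases} \end{align*}

which is the Simple Quadratic row of the table below with a = 1.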

The convex conjugate for the unconstrained function can be automatically calculated in Python — as described in the previous post.

Code that does the above is given below:

import sympy as sp
from sympy import solve
from sympy.parsing.sympy_parser import parse_expr

def calc_const_conj(fun_str, lstr='l', ustr='u', varnames='x l u', fname='plot.png'):
    # set the symbols
    vars = sp.symbols(varnames)
    x = vars[0] if isinstance(vars, tuple) else vars
    y = sp.symbols('y', real=True)

    # set the function and objective
    fun = parse_expr(fun_str)
    obj = x*y - fun
    fun_diff = sp.diff(fun, x)
    obj_diff = sp.diff(obj, x)

    # lower and upper bounds of the domain
    l = parse_expr(lstr)
    u = parse_expr(ustr)

    # function values f(l) and f(u) at the end points
    fl = sp.simplify(fun.subs(x, l))
    fu = sp.simplify(fun.subs(x, u))

    # derivative values y_l = f'(l) and y_u = f'(u) at the end points
    dyl = sp.simplify(fun_diff.subs(x, l))
    dyu = sp.simplify(fun_diff.subs(x, u))

    # calculate derivative of obj and solve for zero (unconstrained maximizer)
    sol = solve(obj_diff, x)

    # substitute solution into objective (the unconstrained conjugate)
    solfun = sp.collect(sp.simplify(obj.subs(x, sol[0])), x)

    # print the three branches of the constrained conjugate
    print('{0:45} y < {1}'.format(str(sp.simplify(l*y - fl)), dyl))
    print('{0:45} {1} <= y <= {2}'.format(str(solfun), dyl, dyu))
    print('{0:45} y > {1}'.format(str(sp.simplify(u*y - fu)), dyu))

# Example: calc_const_conj('1/2*x**2', lstr='-1', ustr='1', varnames='x')
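As a further illustration (a hypothetical call, relying on parse_expr creating the extra symbols a, b and c automatically), the general quadratic row of the table below should be reproducible with:

# symbols a, b, c are created by parse_expr; x, l, u come from the default varnames
calc_const_conj('a*x**2 + b*x + c')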

Table of conjugates

Function | Constraint | Conjugate
Linear: ax + b | l \leq x \leq u | \begin{cases} uy - ua - b & y \geq a, \\ ly - la - b & y < a. \end{cases}
Simple Quadratic: \frac{1}{2}ax^2 | l \leq x \leq u | \begin{cases} \frac{l}{2}\left(2y - al\right) & y < al, \\ \frac{1}{2a}y^2 & al \leq y \leq au, \\ \frac{u}{2}(2y - au) & y > au. \end{cases}
Quadratic: ax^2 + bx + c | l \leq x \leq u | \begin{cases} ly - al^2 - bl - c & y < 2al + b, \\ \frac{y^2 - 2by + b^2 - 4ac}{4a} & 2al + b \leq y \leq 2au + b, \\ uy - au^2 - bu - c & y > 2au + b. \end{cases}
PE-like: \frac{1}{2}(x-1)^2 | 0 \leq x | \begin{cases} -\frac{1}{2} & y < -1, \\ \frac{1}{2}y^2 + y & y \geq -1. \end{cases}
Hinge: \max\left(0, \epsilon - x\right) | (none) | \begin{cases} \epsilon y & -1 \leq y \leq 0, \\ \infty & \textrm{otherwise}. \end{cases}
Scaled Hinge: a\max\left(0, \epsilon - x\right) | (none) | \begin{cases} \epsilon y & -a \leq y \leq 0, \\ \infty & \textrm{otherwise}. \end{cases}

Automatically calculating the convex conjugate (Fenchel dual) using Python/Sympy

The convex conjugate (or Fenchel dual) for a single variable function f(x) is defined as:

    \begin{align*} f^\ast(y) = \sup_{x} xy - f(x). \end{align*}

We assume that f(x) is convex, differentiable and simple (a closed-form solution exists for the minimum). The conjugate can then be calculated symbolically using the following Python code:

import sympy as sp
from sympy import lambdify, solve
from sympy.parsing.sympy_parser import parse_expr

def calc_conjugate(fun_str, varnames='x'):

    # set the symbols
    vars = sp.symbols(varnames)
    x = vars[0] if isinstance(vars, tuple) else vars
    y = sp.symbols('y', real=True)

    # set the function and objective
    fun = parse_expr(fun_str)
    obj = x*y - fun

    # calculate derivative of obj and solve for zero
    sol = solve(sp.diff(obj, x), x)

    # substitute solution into objective
    solfun = sp.simplify(obj.subs(x, sol[0]))

    # the conjugate is a function of y (plus any extra symbols that were passed)
    conj_vars = [y] + list(vars[1:]) if isinstance(vars, tuple) else y

    return (sp.sstr(solfun), lambdify(vars, fun, 'numpy'), lambdify(conj_vars, solfun, 'numpy'))
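For example (a minimal usage sketch, assuming the imports above and that SymPy and NumPy are installed), the conjugate of the exponential can be obtained with:

conj_str, f_np, fconj_np = calc_conjugate('exp(x)')
print(conj_str)   # expected to be equivalent to y*(log(y) - 1)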

The full source code is here. The above code was used to calculate and plot the conjugates of several functions, which are given below.

Function | Conjugate
Quadratic: x^2 | \frac{1}{4}y^2
Exponential: \exp(x) | y(\log(y) - 1)
Log: -\log(x) | -\log(-y) - 1
Log-Exponential: \log(1+\exp(x)) | (1-y)\log(1-y) + y\log y
x \log(x) | \exp(y - 1)
[Plots of the conjugates: quadratic, exponential, log, log-exp, x log x]

Fast projection onto a Simplex (Python)

(work in progress)

Many convex optimization problems can be solved fairly simply if the projection onto the constraint set can be calculated quickly. By projection onto the constraint set \mathcal{C}, we mean the solution to the following optimization problem:

(1)   \begin{align*} \begin{array}{lll} P_\mathcal{C}(y) = & \displaystyle \arg \min_{x} & \frac{1}{2}\left\|x-y \right\|^2, \\ & \textrm{s.t.} & x \in \mathcal{C}. \end{array} \end{align*}

As an example, the subgradient descent method can incorporate the projection operator to deal with constraints. Here we focus on projection onto a simplex: a^\top x = 1, x \succeq 0. The optimization problem that we wish to solve is:

(2)   \begin{align*} \begin{array}{lll} P_\mathcal{C}(y) = & \displaystyle \arg \min_{x} & \frac{1}{2}\left\|x-y \right\|^2, \\ & \textrm{s.t.} & x \succeq 0, \\ & & a^\top x = 1. \end{array} \end{align*}

As a test case, we can implement the projection in cvxpy:

import cvxpy as cvx
import numpy as np

def proj_simplex_cvxpy(a, y):
    '''
    Returns the point in the simplex a^T x = 1, x >= 0 that is
    closest to y (according to Euclidean distance).
    '''
    d = len(a)

    # setup the objective and constraints and solve the problem
    x = cvx.Variable(d)
    obj = cvx.Minimize(cvx.sum_squares(x - y))
    constr = [x >= 0, a @ x == 1]
    prob = cvx.Problem(obj, constr)
    prob.solve()

    return np.array(x.value).squeeze()
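A quick sanity check of the reference implementation (a hypothetical example; any installed QP-capable solver will do):

a = np.ones(4)
y = np.array([0.5, -1.0, 2.0, 0.1])
x = proj_simplex_cvxpy(a, y)
print(x, a @ x)   # x should be elementwise non-negative with a^T x close to 1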

We can write the dual as:

    \begin{align*} L(\lambda) & = \min_{x} \sum_{i=1}^{n}\left[\frac{1}{2}(x_i - y_i)^2 + \lambda a_i x_i \right] -\lambda\qquad \textrm{s.t.} \quad x \succeq 0, \\ & =\min_{x} \sum_{i=1}^{n}\left[\frac{1}{2}x_i^2 - x_i(y_i - \lambda a_i) + \frac{1}{2}y_i^2\right] - \lambda \qquad \textrm{s.t.} \quad x \succeq 0. \end{align*}

Since the x_i decouple (for fixed \lambda), each one can be solved for separately:

    \begin{align*} x_i = \max\left(0, y_i-\lambda a_i \right). \end{align*}

The dual objective (which is to be maximized) is therefore:

    \begin{align*} L(\lambda) & = -\sum_{i=1}^n \frac{1}{2}\max(0, y_i - \lambda a_i)^2 + \frac{1}{2}\sum_{i=1}^n y_i^2 - \lambda. \end{align*}
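The dual is cheap to evaluate numerically; a minimal sketch (the helper name dual_value is ours, not from the original source, and numpy is assumed to be imported as np as above):

def dual_value(lam, a, y):
    # minimiser of the Lagrangian for fixed lambda
    x = np.maximum(0.0, y - lam*a)
    # L(lambda) = sum_i [y_i^2/2 - x_i^2/2] - lambda
    return 0.5*np.sum(y**2) - 0.5*np.sum(x**2) - lam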

An example of the dual function is given below (source code is here):


As \lambda is decreased from a large value, the values of x_i will “switch on” from 0. To determine where the variables will switch on we sort the values according to y_i/a_i.
Suppose that we have an index \mathcal{I} (a permutation) that sorts the values y_i/a_i. In other words:

    \begin{align*} y_{I_1}/a_{I_1} \leq  y_{I_2}/a_{I_2} \leq  \ldots \leq  y_{I_n}/a_{I_n}. \end{align*}

We can refer to these as our indexed \lambda values,

    \begin{align*} \lambda_i = \frac{y_{\mathcal{I}_i}}{a_{\mathcal{I}_i}}. \end{align*}

Suppose that we have a \lambda value. We then want to find an index k, so that

    \begin{align*} \lambda_{k-1} < \lambda \leq \lambda_k. \end{align*}

We can then write the objective function as:

    \begin{align*} L(\lambda) & = -\sum_{i=k}^{n} \frac{1}{2}(y_{\mathcal{I}_i} - \lambda a_{\mathcal{I}_i})^2 - \lambda + \frac{1}{2}\sum_{j=1}^n y_j^2. \end{align*}

The derivative of the above is therefore:

    \begin{align*} \partial L(\lambda) & = \sum_{i=k}^n a_{\mathcal{I}_i}(y_{\mathcal{I}_i} - \lambda a_{\mathcal{I}_i}) - 1. \end{align*}

By searching on the sorted list, we get an index k, such that \partial L(\lambda_k) \leq 0, but \partial L(\lambda_{k-1}) > 0.
Setting \partial L(\lambda) = 0 on this interval, the optimal \lambda value can then be calculated as:

    \begin{align*} \lambda = \frac{\sum_{i=k}^n a_{\mathcal{I}_i}y_{\mathcal{I}_i} - 1}{\sum_{i=k}^n a_{\mathcal{I}_i}^2}. \end{align*}

The final algorithm is:

def proj(a, y):
    '''
    Projects y onto the simplex a^T x = 1, x >= 0 by maximising the dual.
    '''
    # breakpoints lambda_i = y_i/a_i, and an index that sorts them
    l = y/a
    idx = np.argsort(l)
    d = len(l)

    # dual derivative dL evaluated at the k-th sorted breakpoint
    evalpL = lambda k: np.sum(a[idx[k:]]*(y[idx[k:]] - l[idx[k]]*a[idx[k:]])) - 1

    def bisectsearch():
        # bisection for an index k with dL(lambda_k) <= 0 but dL(lambda_{k-1}) > 0
        idxL, idxH = 0, d-1
        L = evalpL(idxL)
        H = evalpL(idxH)

        if L < 0:
            return idxL

        while (idxH - idxL) > 1:
            iMid = int((idxL + idxH)/2)
            M = evalpL(iMid)

            if M > 0:
                idxL, L = iMid, M
            else:
                idxH, H = iMid, M

        return idxH

    k = bisectsearch()

    # solve dL(lambda) = 0 on the active set {I_k, ..., I_n}
    lam = (np.sum(a[idx[k:]]*y[idx[k:]]) - 1)/np.sum(a[idx[k:]]**2)

    x = np.maximum(0, y - lam*a)

    return x
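To sanity check the fast implementation against the cvxpy version above (a hypothetical test; both functions and their imports are assumed to be in scope):

a = np.abs(np.random.randn(6)) + 0.1
y = np.random.randn(6)
print(proj(a, y))
print(proj_simplex_cvxpy(a, y))   # the two projections should agree closely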

Other resources:

  • MATLAB implementation of projection onto unit simplex (description here)
  • Michelot, C. (1986). A finite algorithm for finding the projection of a point onto the canonical simplex of \mathbb{R}^n. Journal of Optimization Theory and Applications, 50(1), 195-200.

 

Todo:

  1. Implement the code in C/C++
  2. Give a better test case
  3. Simplify the explanation and implementation
  4. Give a practical example (e.g. estimation problem with subgradient descent)