Calculation of the conjugate of piecewise functions

Conjugate of piecewise function

In this post, we’ll show how to calculate the conjugate of a piecewise function f(x),
such as

    \begin{align*} f(x) = \begin{cases} f_1(x) & l_1 \leq x \leq l_2 \\ f_2(x) & l_2 < x \leq l_3 \end{cases} \end{align*}

The conjugate for the above is:

    \begin{align*} f^\ast(y) & = \sup_x xy - f(x), \\ & = \max\left(\sup_{l_1 \leq x \leq l_2} xy - f_1(x), \sup_{l_2 < x \leq l_3} xy - f_2(x)\right), \\ & = \max\left(f_1^\ast(y), f_2^\ast(y) \right), \end{align*}

where f_1^\ast(y) is the conjugate of the function f_1(x) constrained to l_1 \leq x \leq l_2. So, computing the conjugate of a piecewise function is simple, as long as the conjugates of the constrained sub-functions can be computed easily.
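As a quick worked example (added here for illustration), take f(x) = |x| on -1 \leq x \leq 1, split as f_1(x) = -x for -1 \leq x \leq 0 and f_2(x) = x for 0 < x \leq 1. The constrained conjugates are

    \begin{align*} f_1^\ast(y) = \sup_{-1 \leq x \leq 0} x(y+1) = \max(0, -y-1), \qquad f_2^\ast(y) = \sup_{0 < x \leq 1} x(y-1) = \max(0, y-1), \end{align*}

so that f^\ast(y) = \max\left(f_1^\ast(y), f_2^\ast(y)\right) = \max(0, |y| - 1).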

Conjugate of constrained function

In this section, we’ll show how to calculate the convex conjugate of a function \bar{f}(x), defined as the function f(x) constrained to the domain l \leq x \leq u.
The conjugate is defined as:

    \begin{align*} \bar{f}^\ast(y) = \sup_{l \leq x \leq u} xy - f(x). \end{align*}

For the unconstrained case, the solution would be given by:

    \begin{align*} x = \left[f' \right]^{-1}(y), \end{align*}

where f'(x) is the derivative of f(x). Since f(x) is convex, f'(x) is a monotonically increasing function. Denote the derivative values at the end points of the domain by

    \begin{align*} y_l = f'(l), \quad  y_u = f'(u). \end{align*}

For y_l \leq y \leq y_u the unconstrained maximizer lies inside [l, u]; outside this range the supremum is attained at one of the end points. The conjugate of the constrained function is therefore:

    \begin{align*} \bar{f}^\ast(y) = \begin{cases} ly - f(l) & y < y_l, \\  f^\ast(y) & y_l \leq y \leq y_u, \\ uy - f(u) & y > y_u. \end{cases} \end{align*}
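As a quick check, take f(x) = \frac{1}{2}x^2 constrained to -1 \leq x \leq 1. Then f'(x) = x, so y_l = -1 and y_u = 1, and

    \begin{align*} \bar{f}^\ast(y) = \begin{cases} -y - \frac{1}{2} & y < -1, \\ \frac{1}{2}y^2 & -1 \leq y \leq 1, \\ y - \frac{1}{2} & y > 1, \end{cases} \end{align*}

which is the Simple Quadratic row of the table below with a = 1.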

The convex conjugate for the unconstrained function can be automatically calculated in Python — as described in the previous post.

Code that does the above is given below:

import sympy as sp
from sympy import solve
from sympy.parsing.sympy_parser import parse_expr

def calc_const_conj(fun_str, lstr='l', ustr='u', varnames='x l u', fname='plot.png'):
    # set the symbols
    vars = sp.symbols(varnames)
    x = vars[0] if isinstance(vars, tuple) else vars
    y = sp.symbols('y', real=True)

    # set the function and objective
    fun = parse_expr(fun_str)
    obj = x*y - fun
    fun_diff = sp.diff(fun, x)
    obj_diff = sp.diff(obj, x)

    # lower and upper bounds of the domain
    l = parse_expr(lstr)
    u = parse_expr(ustr)

    # function values f(l) and f(u) at the end points
    fl = sp.simplify(fun.subs(x, l))
    fu = sp.simplify(fun.subs(x, u))

    # derivative values y_l = f'(l) and y_u = f'(u) at the end points
    dyl = sp.simplify(fun_diff.subs(x, l))
    dyu = sp.simplify(fun_diff.subs(x, u))

    # calculate derivative of obj and solve for zero (unconstrained maximizer)
    sol = solve(obj_diff, x)

    # substitute solution into objective (the unconstrained conjugate)
    solfun = sp.collect(sp.simplify(obj.subs(x, sol[0])), x)

    # print the three branches of the constrained conjugate
    print('{0:45} y < {1}'.format(str(sp.simplify(l*y - fl)), dyl))
    print('{0:45} {1} <= y <= {2}'.format(str(solfun), dyl, dyu))
    print('{0:45} y > {1}'.format(str(sp.simplify(u*y - fu)), dyu))

# Example: calc_const_conj('1/2*x**2', lstr='-1', ustr='1', varnames='x')
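As a further illustration (a hypothetical call, relying on parse_expr creating the extra symbols a, b and c automatically), the general quadratic row of the table below should be reproducible with:

# symbols a, b, c are created by parse_expr; x, l, u come from the default varnames
calc_const_conj('a*x**2 + b*x + c')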

Table of conjugates

Function | Constraint | Conjugate
Linear: ax + b | l \leq x \leq u | \begin{cases} uy - ua - b & y \geq a, \\ ly - la - b & y < a. \end{cases}
Simple Quadratic: \frac{1}{2}ax^2 | l \leq x \leq u | \begin{cases} \frac{l}{2}\left(2y - al\right) & y < al, \\ \frac{1}{2a}y^2 & al \leq y \leq au, \\ \frac{u}{2}(2y - au) & y > au. \end{cases}
Quadratic: ax^2 + bx + c | l \leq x \leq u | \begin{cases} ly - al^2 - bl - c & y < 2al + b, \\ \frac{y^2 - 2by + b^2 - 4ac}{4a} & 2al + b \leq y \leq 2au + b, \\ uy - au^2 - bu - c & y > 2au + b. \end{cases}
PE-like: \frac{1}{2}(x-1)^2 | 0 \leq x | \begin{cases} -\frac{1}{2} & y < -1, \\ \frac{1}{2}y^2 + y & y \geq -1. \end{cases}
Hinge: \max\left(0, \epsilon - x\right) | (none) | \begin{cases} \epsilon y & -1 \leq y \leq 0, \\ \infty & \textrm{otherwise}. \end{cases}
Scaled Hinge: a\max\left(0, \epsilon - x\right) | (none) | \begin{cases} \epsilon y & -a \leq y \leq 0, \\ \infty & \textrm{otherwise}. \end{cases}

Automatically calculating the convex conjugate (Fenchel dual) using Python/Sympy

The convex conjugate (or Fenchel dual) for a single variable function f(x) is defined as:

    \begin{align*} f^\ast(y) = \sup_{x} xy - f(x). \end{align*}

We assume that f(x) is convex, differentiable and simple (a closed-form solution exists for the minimum). The conjugate can then be calculated symbolically using the following Python code:

import sympy as sp
from sympy import lambdify, solve
from sympy.parsing.sympy_parser import parse_expr

def calc_conjugate(fun_str, varnames='x'):

    # set the symbols
    vars = sp.symbols(varnames)
    x = vars[0] if isinstance(vars, tuple) else vars
    y = sp.symbols('y', real=True)

    # set the function and objective
    fun = parse_expr(fun_str)
    obj = x*y - fun

    # calculate derivative of obj and solve for zero
    sol = solve(sp.diff(obj, x), x)

    # substitute solution into objective
    solfun = sp.simplify(obj.subs(x, sol[0]))

    # the conjugate is a function of y (plus any extra symbols that were passed)
    conj_vars = [y] + list(vars[1:]) if isinstance(vars, tuple) else y

    return (sp.sstr(solfun), lambdify(vars, fun, 'numpy'), lambdify(conj_vars, solfun, 'numpy'))
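For example (a minimal usage sketch, assuming the imports above and that SymPy and NumPy are installed), the conjugate of the exponential can be obtained with:

conj_str, f_np, fconj_np = calc_conjugate('exp(x)')
print(conj_str)   # expected to be equivalent to y*(log(y) - 1)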

The full source code is here. The above code was used to calculate and plot the conjugates of several functions, which are given below.

Function | Conjugate
Quadratic: x^2 | \frac{1}{4}y^2
Exponential: \exp(x) | y(\log(y) - 1)
Log: -\log(x) | -\log(-y) - 1
Log-Exponential: \log(1+\exp(x)) | (1-y)\log(1-y) + y\log y
x \log(x) | \exp(y - 1)
[Plots of the conjugates: quadratic, exponential, log, log-exp, x log x]

Fast projection onto a Simplex (Python)

(work in progress)

Many convex optimization problems can be solved fairly simply if the projection onto the constraint set can be calculated quickly. By projection onto the constraint set \mathcal{C}, we mean the solution to the following optimization problem:

(1)   \begin{align*} \begin{array}{lll} P_\mathcal{C}(y) = & \displaystyle \arg \min_{x} & \frac{1}{2}\left\|x-y \right\|^2, \\ & \textrm{s.t.} & x \in \mathcal{C}. \end{array} \end{align*}

As an example, the subgradient descent method can incorporate the projection operator to deal with constraints. Here we focus on projection onto a simplex: a^\top x = 1, x \succeq 0. The optimization problem that we wish to solve is:

(2)   \begin{align*} \begin{array}{lll} P_\mathcal{C}(y) = & \displaystyle \arg \min_{x} & \frac{1}{2}\left\|x-y \right\|^2, \\ & \textrm{s.t.} & x \succeq 0, \\ & & a^\top x = 1. \end{array} \end{align*}

As a test case, we can implement the projection in cvxpy:

import cvxpy as cvx
import numpy as np

def proj_simplex_cvxpy(a, y):
    '''
    Returns the point in the simplex a^T x = 1, x >= 0 that is
    closest to y (according to Euclidean distance).
    '''
    d = len(a)

    # setup the objective and constraints and solve the problem
    x = cvx.Variable(d)
    obj = cvx.Minimize(cvx.sum_squares(x - y))
    constr = [x >= 0, a @ x == 1]
    prob = cvx.Problem(obj, constr)
    prob.solve()

    return np.array(x.value).squeeze()
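A quick sanity check of the reference implementation (a hypothetical example; any installed QP-capable solver will do):

a = np.ones(4)
y = np.array([0.5, -1.0, 2.0, 0.1])
x = proj_simplex_cvxpy(a, y)
print(x, a @ x)   # x should be elementwise non-negative with a^T x close to 1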

We can write the dual as:

    \begin{align*} L(\lambda) & = \min_{x} \sum_{i=1}^{n}\left[\frac{1}{2}(x_i - y_i)^2 + \lambda a_i x_i \right] -\lambda\qquad \textrm{s.t.} \quad x \succeq 0, \\ & =\min_{x} \sum_{i=1}^{n}\left[\frac{1}{2}x_i^2 - x_i(y_i - \lambda a_i) + \frac{1}{2}y_i^2\right] - \lambda \qquad \textrm{s.t.} \quad x \succeq 0. \end{align*}

Since the x_i decouple (for fixed \lambda), each one can be solved for separately:

    \begin{align*} x_i = \max\left(0, y_i-\lambda a_i \right). \end{align*}

The dual objective (which is to be maximized) is therefore:

    \begin{align*} L(\lambda) & = -\sum_{i=1}^n \frac{1}{2}\max(0, y_i - \lambda a_i)^2 + \frac{1}{2}\sum_{i=1}^n y_i^2 - \lambda. \end{align*}
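The dual is cheap to evaluate numerically; a minimal sketch (the helper name dual_value is ours, not from the original source, and numpy is assumed to be imported as np as above):

def dual_value(lam, a, y):
    # minimiser of the Lagrangian for fixed lambda
    x = np.maximum(0.0, y - lam*a)
    # L(lambda) = sum_i [y_i^2/2 - x_i^2/2] - lambda
    return 0.5*np.sum(y**2) - 0.5*np.sum(x**2) - lam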

An example of the dual function is given below (source code is here):


As \lambda is decreased from a large value, the values of x_i will “switch on” from 0. To determine where the variables will switch on we sort the values according to y_i/a_i.
Suppose that we have an index \mathcal{I} (a permutation) that sorts the values y_i/a_i. In other words:

    \begin{align*} y_{I_1}/a_{I_1} \leq  y_{I_2}/a_{I_2} \leq  \ldots \leq  y_{I_n}/a_{I_n}. \end{align*}

We can refer to these as our indexed \lambda values,

    \begin{align*} \lambda_i = \frac{y_{\mathcal{I}_i}}{a_{\mathcal{I}_i}}. \end{align*}

Suppose that we have a \lambda value. We then want to find an index k, so that

    \begin{align*} \lambda_{k-1} < \lambda \leq \lambda_k. \end{align*}

We can then write the objective function as:

    \begin{align*} L(\lambda) & = -\sum_{i=k}^{n} \frac{1}{2}(y_{\mathcal{I}_i} - \lambda a_{\mathcal{I}_i})^2 - \lambda + \frac{1}{2}\sum_{j=1}^n y_j^2. \end{align*}

The derivative of the above is therefore:

    \begin{align*} \partial L(\lambda) & = \sum_{i=k}^n a_{\mathcal{I}_i}(y_{\mathcal{I}_i} - \lambda a_{\mathcal{I}_i}) - 1. \end{align*}

By searching on the sorted list, we get an index k, such that \partial L(\lambda_k) \leq 0, but \partial L(\lambda_{k-1}) > 0.
Setting \partial L(\lambda) = 0 on this interval, the optimal \lambda value can then be calculated as:

    \begin{align*} \lambda = \frac{\sum_{i=k}^n a_{\mathcal{I}_i}y_{\mathcal{I}_i} - 1}{\sum_{i=k}^n a_{\mathcal{I}_i}^2}. \end{align*}

The final algorithm is:

def proj(a, y):
    '''
    Projects y onto the simplex a^T x = 1, x >= 0 by maximising the dual.
    '''
    # breakpoints lambda_i = y_i/a_i, and an index that sorts them
    l = y/a
    idx = np.argsort(l)
    d = len(l)

    # dual derivative dL evaluated at the k-th sorted breakpoint
    evalpL = lambda k: np.sum(a[idx[k:]]*(y[idx[k:]] - l[idx[k]]*a[idx[k:]])) - 1

    def bisectsearch():
        # bisection for an index k with dL(lambda_k) <= 0 but dL(lambda_{k-1}) > 0
        idxL, idxH = 0, d-1
        L = evalpL(idxL)
        H = evalpL(idxH)

        if L < 0:
            return idxL

        while (idxH - idxL) > 1:
            iMid = int((idxL + idxH)/2)
            M = evalpL(iMid)

            if M > 0:
                idxL, L = iMid, M
            else:
                idxH, H = iMid, M

        return idxH

    k = bisectsearch()

    # solve dL(lambda) = 0 on the active set {I_k, ..., I_n}
    lam = (np.sum(a[idx[k:]]*y[idx[k:]]) - 1)/np.sum(a[idx[k:]]**2)

    x = np.maximum(0, y - lam*a)

    return x
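To sanity check the fast implementation against the cvxpy version above (a hypothetical test; both functions and their imports are assumed to be in scope):

a = np.abs(np.random.randn(6)) + 0.1
y = np.random.randn(6)
print(proj(a, y))
print(proj_simplex_cvxpy(a, y))   # the two projections should agree closely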

Other resources:

  • MATLAB implementation of projection onto unit simplex (description here)
  • Michelot, C. (1986). A finite algorithm for finding the projection of a point onto the canonical simplex of \mathbb{R}^n. Journal of Optimization Theory and Applications, 50(1), 195-200.

 

Todo:

  1. Implement the code in C/C++
  2. Give a better test case
  3. Simplify the explanation and implementation
  4. Give a practical example (e.g. estimation problem with subgradient descent)