
Convex Geometry

Important Convex Functions

Why This Matters

Convex functions are the backbone of optimization theory, and you'll encounter them constantly in problems involving minimization, duality, and constraint handling. The reason convexity matters so much is that any local minimum of a convex function is automatically a global minimum, which makes optimization tractable. When you're working through problems in convex geometry, you're being tested on your ability to recognize convexity, verify it using second derivatives or other criteria, and apply the right function to model a given scenario.

This guide organizes important convex functions by their structural properties and typical applications: elementary functions with simple curvature, functions built from set geometry, and specialized functions used in optimization algorithms. Don't just memorize definitions: know what makes each function convex and when you'd reach for it in a problem.


Elementary Convex Functions

These are the foundational functions you'll use to build intuition about convexity. Their convexity follows directly from calculus: for twice-differentiable functions, a non-negative second derivative guarantees convexity.

Linear Functions

  • Form: f(x) = ax + b; both convex and concave simultaneously, making them affine functions
  • Second derivative equals zero, which satisfies the convexity condition f''(x) ≥ 0 trivially
  • Define hyperplanes and half-spaces in convex optimization; serve as objective functions in linear programming

Quadratic Functions

  • Form: f(x) = ax^2 + bx + c with a > 0; the positive leading coefficient ensures the parabola opens upward
  • Constant positive second derivative f''(x) = 2a > 0 confirms strict convexity throughout the domain
  • Central to quadratic programming, where objectives are quadratic and constraints are linear; appears in least-squares problems

Exponential Functions

  • Form: f(x) = e^{ax}; convex for all real values of a, with f''(x) = a^2 e^{ax} ≥ 0
  • Its rapid growth makes it essential for modeling compound growth, decay processes, and risk in economics
  • The log-sum-exp function log(∑ e^{x_i}) is a smooth convex approximation to the max function (a short numerical sketch follows this list)
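
To see the log-sum-exp approximation in action, the minimal sketch below compares it against the exact max on a small vector. The vector entries and the scaling values are illustrative choices, not from the guide.

```python
import numpy as np

def log_sum_exp(x):
    """Smooth convex overestimate of max(x): log(sum_i exp(x_i))."""
    m = np.max(x)                        # shift by the max for numerical stability
    return m + np.log(np.sum(np.exp(x - m)))

x = np.array([1.0, 2.5, 3.0])            # illustrative values (assumption)
print(np.max(x), log_sum_exp(x))         # max = 3.0, log-sum-exp ~ 3.55

# Scaling sharpens the approximation: (1/t) * log_sum_exp(t*x) -> max(x) as t grows.
for t in [1.0, 10.0, 100.0]:
    print(t, log_sum_exp(t * x) / t)
```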

Compare: Linear vs. Quadratic. Both have constant second derivatives, but linear functions have f'' = 0 (affine) while quadratics have f'' = 2a > 0 (strictly convex). If asked to find a unique minimizer, you need strict convexity; linear functions won't give you that.
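
The sketch below makes that contrast concrete: it checks the convexity inequality at a midpoint for an affine function and a quadratic, and notes the quadratic's unique minimizer x* = -b/(2a). The coefficient values are arbitrary illustrative choices.

```python
def f_lin(x, a=2.0, b=1.0):
    """Affine: f'' = 0, convex and concave at once."""
    return a * x + b

def f_quad(x, a=2.0, b=-4.0, c=1.0):
    """Strictly convex for a > 0: f'' = 2a."""
    return a * x**2 + b * x + c

x, y, theta = -1.0, 3.0, 0.5
for f in (f_lin, f_quad):
    lhs = f(theta * x + (1 - theta) * y)        # value at the midpoint
    rhs = theta * f(x) + (1 - theta) * f(y)     # value on the chord
    print(f.__name__, lhs, rhs, lhs <= rhs)

# Affine: lhs == rhs (equality), so there is no unique minimizer.
# Quadratic: strict inequality; the unique minimizer is x* = -b / (2a) = 1.0.
```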


Functions Derived from Concave Building Blocks

Some important convex functions arise by negating concave functions or applying transformations. Understanding this relationship helps you convert between convex and concave formulations in optimization problems.

Negative Log Functions

  • Form: f(x) = -log(x); convex on x > 0 since f''(x) = 1/x^2 > 0
  • Logarithm itself is concave, but its negative gives you a convex function useful for barrier methods
  • Models diminishing returns in economics; appears in maximum likelihood estimation and information-theoretic objectives

Entropy Functions

  • Form: f(p) = ∑ p_i log(p_i); this is negative entropy, which is convex in the probability vector p
  • Strict convexity follows from the strict convexity of x log(x) on x > 0; the probability simplex is a convex domain, so the separable sum stays strictly convex there
  • Fundamental in information theory for channel capacity problems; used in maximum entropy methods and statistical mechanics

Compare: Negative Log vs. Entropy. Both trace their convexity back to the curvature of the logarithm, but the negative log operates on scalars while negative entropy operates on probability distributions. Entropy problems typically include the constraint ∑ p_i = 1.
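
As a quick sanity check on the convexity of negative entropy, the sketch below evaluates ∑ p_i log(p_i) at two probability vectors and at their mixture. The distributions are made-up illustration values; this is a spot check, not a proof.

```python
import numpy as np

def neg_entropy(p):
    """Negative entropy sum_i p_i * log(p_i); assumes strictly positive entries."""
    p = np.asarray(p, dtype=float)
    return float(np.sum(p * np.log(p)))

p = np.array([0.7, 0.2, 0.1])            # illustrative distributions (assumption)
q = np.array([0.2, 0.3, 0.5])
theta = 0.5

lhs = neg_entropy(theta * p + (1 - theta) * q)                 # value at the mixture
rhs = theta * neg_entropy(p) + (1 - theta) * neg_entropy(q)    # mixed values
print(lhs, rhs, lhs <= rhs)                                    # convexity: lhs <= rhs
```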


Norm-Based Functions

Norms provide a natural way to measure size and distance, and their convexity follows from the triangle inequality and positive homogeneity, not from second derivatives.

Norm Functions

  • Form: f(x) = ‖x‖; convex for any valid norm, including L_1, L_2, and L_∞
  • The triangle inequality ‖x + y‖ ≤ ‖x‖ + ‖y‖, combined with homogeneity ‖θx‖ = |θ|‖x‖, gives convexity directly from the definition
  • Essential for regularization in machine learning: L_1 promotes sparsity (LASSO), L_2 prevents large weights (ridge regression); see the numerical check after this list
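
The minimal check below evaluates the convexity inequality for the L_1, L_2, and L_∞ norms at randomly generated points. The test vectors and the mixing weight are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=3), rng.normal(size=3)   # arbitrary test vectors (assumption)
theta = 0.3

for p in (1, 2, np.inf):                         # L1, L2, and L-infinity norms
    lhs = np.linalg.norm(theta * x + (1 - theta) * y, ord=p)
    rhs = theta * np.linalg.norm(x, ord=p) + (1 - theta) * np.linalg.norm(y, ord=p)
    print(p, round(lhs, 4), round(rhs, 4), lhs <= rhs + 1e-12)
```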

Support Functions

  • Form: σ_C(x) = sup{⟨x, y⟩ : y ∈ C}; the supremum of linear functions over a convex set C
  • Convex because a pointwise supremum of convex functions is convex; each ⟨x, y⟩ is linear (hence convex) in x
  • Characterizes convex sets completely through duality; critical in Fenchel conjugate theory and geometric optimization

Compare: Norms vs. Support Functions. Both are convex and positively homogeneous, but norms measure the "size" of a vector while support functions measure how far a set extends in a given direction. Support functions encode geometric information about sets.
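
For a set given as the convex hull of finitely many points, the supremum in the support function is attained at a vertex, so σ_C(x) is just the largest inner product over those points. The sketch below uses the unit square as an illustrative set; the vertices and directions are assumptions.

```python
import numpy as np

def support_function(x, vertices):
    """sigma_C(x) = max over the vertices y of <x, y>, where C is their convex hull."""
    return max(float(np.dot(x, y)) for y in vertices)

# Vertices of the square [-1, 1] x [-1, 1] (illustrative choice).
square = [np.array(v, dtype=float) for v in [(1, 1), (1, -1), (-1, 1), (-1, -1)]]

print(support_function(np.array([1.0, 0.0]), square))   # 1.0: extent of C in direction e1
print(support_function(np.array([1.0, 1.0]), square))   # 2.0: extent in direction (1, 1)
print(support_function(np.array([2.0, 0.0]), square))   # 2.0: positive homogeneity, sigma_C(2x) = 2 sigma_C(x)
```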


Set-Based Convex Functions

These functions encode constraints geometrically, allowing you to incorporate feasibility conditions directly into the objective. They're essential for understanding constrained optimization through an unconstrained lens.

Indicator Functions

  • Form: δ_C(x) = 0 if x ∈ C, and δ_C(x) = +∞ otherwise; an extended real-valued function
  • Convex when C is convex because the epigraph {(x, t) : x ∈ C, t ≥ 0} is a convex set
  • Converts constrained problems to unconstrained ones: minimizing f(x) + δ_C(x) enforces x ∈ C automatically (see the sketch after this list)
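
A direct, if impractical, way to see the indicator construction is to evaluate f(x) + δ_C(x) on a grid: infeasible points get the value +∞, so any minimizer of the sum is automatically feasible. The objective, the interval constraint C = [0, 1], and the grid below are all illustrative assumptions.

```python
import numpy as np

def indicator(x, lo=0.0, hi=1.0):
    """delta_C(x) for C = [lo, hi]: 0 inside C, +infinity outside."""
    return 0.0 if lo <= x <= hi else float("inf")

def f(x):
    return (x - 2.0) ** 2        # unconstrained minimizer x = 2 lies outside C = [0, 1]

grid = np.linspace(-1.0, 3.0, 401)
values = [f(x) + indicator(x) for x in grid]
best = grid[int(np.argmin(values))]
print(best)                       # ~1.0: minimizing f + delta_C recovers the constrained optimum
```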

Barrier Functions

  • Approach +∞ as x approaches the boundary of a feasible region; example: -∑ log(x_i) for x > 0
  • Self-concordant barriers satisfy special smoothness conditions enabling efficient interior-point algorithms
  • Keep iterates strictly feasible during optimization, avoiding boundary issues that plague other methods

Compare: Indicator vs. Barrier. Both enforce feasibility, but indicator functions are discontinuous (hard constraints) while barrier functions are smooth (soft constraints that strengthen near boundaries). Interior-point methods use barriers; projected gradient methods work with indicators.
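
The sketch below illustrates the barrier idea on a toy one-variable problem: minimize x subject to x ≥ 1. Adding the log-barrier term -μ log(x - 1) gives a strictly feasible minimizer x = 1 + μ in closed form, which approaches the constrained optimum x* = 1 as μ shrinks. The problem and the grid search are assumptions made for illustration.

```python
import numpy as np

# Toy problem: minimize x subject to x >= 1 (constrained optimum x* = 1).
# Barrier subproblem: minimize phi_mu(x) = x - mu * log(x - 1) over x > 1.
# Setting phi_mu'(x) = 1 - mu / (x - 1) = 0 gives the barrier minimizer x = 1 + mu.
for mu in [1.0, 0.1, 0.01, 0.001]:
    xs = np.linspace(1.0 + 1e-6, 3.0, 200_001)           # strictly feasible grid
    phi = xs - mu * np.log(xs - 1.0)
    x_numeric = xs[np.argmin(phi)]
    print(mu, 1.0 + mu, round(float(x_numeric), 4))       # closed form vs. grid search agree
# As mu decreases, the strictly feasible iterates approach the boundary optimum x* = 1.
```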


Transformation-Preserving Functions

These functions maintain convexity under specific operations, allowing you to build complex convex functions from simpler ones while preserving optimization tractability.

Perspective Functions

  • Form: g(x, t) = t · f(x/t) for t > 0; extends a convex function f to a higher-dimensional domain
  • Preserves convexity: if f is convex, then g is jointly convex in (x, t)
  • Appears in the relative entropy KL(p ‖ q) = ∑ p_i log(p_i / q_i) and in portfolio optimization with scaling (see the numerical check after this list)
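
The sketch below takes f(x) = x^2, whose perspective is g(x, t) = x^2 / t, and spot-checks the joint convexity inequality at midpoints of randomly chosen points with t > 0. The choice of f, the random seed, and the sampling ranges are assumptions.

```python
import numpy as np

def g(x, t):
    """Perspective of f(x) = x^2: g(x, t) = t * f(x / t) = x**2 / t, defined for t > 0."""
    return x**2 / t

rng = np.random.default_rng(1)
theta = 0.5
for _ in range(5):
    x1, x2 = rng.normal(size=2)
    t1, t2 = rng.uniform(0.1, 2.0, size=2)    # keep the scaling variable strictly positive
    lhs = g(theta * x1 + (1 - theta) * x2, theta * t1 + (1 - theta) * t2)
    rhs = theta * g(x1, t1) + (1 - theta) * g(x2, t2)
    print(round(lhs, 4), round(rhs, 4), lhs <= rhs + 1e-12)   # joint convexity: lhs <= rhs
```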

Compare: Perspective vs. Original Function. The perspective transformation adds a scaling variable t while maintaining convexity. This is useful when your problem naturally involves ratios or when you need to homogenize a function.


Quick Reference Table

  • Calculus-based convexity (f'' ≥ 0): Quadratic, Exponential, Negative log
  • Affine functions: Linear
  • Norm-based convexity: L_1 norm, L_2 norm, Support functions
  • Set-encoding functions: Indicator, Barrier
  • Probability/information theory: Entropy, Negative log
  • Convexity-preserving operations: Perspective functions
  • Regularization in ML: Norm functions (L_1, L_2)

Self-Check Questions

  1. Both linear and quadratic functions have constant second derivatives. Why does a quadratic function have a unique minimizer while a linear function (with a ≠ 0) does not?

  2. Which two functions from this guide would you use to model a constrained optimization problem where you want iterates to stay strictly inside the feasible region? How do they differ in approach?

  3. The logarithm is concave, yet entropy-based objectives appear in convex optimization. Explain how convexity is achieved in entropy minimization problems.

  4. Compare norm functions and support functions: what geometric property do they share, and what different information do they encode about vectors versus sets?

  5. You're given a convex function f(x) and need to create a joint function g(x, t) that remains convex while allowing the input to be scaled by t. Which function type accomplishes this, and what condition must t satisfy?