📊 Mathematical Methods for Optimization

Key Concepts of Karush-Kuhn-Tucker Conditions


Why This Matters

The Karush-Kuhn-Tucker (KKT) conditions are the backbone of constrained optimization—and if you're studying optimization methods, you're going to see them everywhere. These conditions tell you exactly what must be true at an optimal solution when you're dealing with inequality constraints, which is the realistic scenario in most applications. Whether you're optimizing a portfolio with budget limits, training a support vector machine, or allocating resources in an engineering system, KKT conditions provide the mathematical machinery to find and verify optimal solutions.

Here's what you're really being tested on: understanding why each condition exists, how they work together, and when they guarantee optimality. The four core conditions—stationarity, primal feasibility, dual feasibility, and complementary slackness—aren't arbitrary rules; they encode deep geometric and economic intuitions about what "optimal" means at a constrained boundary. Don't just memorize the formulas—know what principle each condition captures and how to apply them in problem-solving.


The Four Core KKT Conditions

These conditions must all hold simultaneously at an optimal point. Think of them as a checklist: miss one, and you can't claim optimality.

Stationarity Condition

  • The gradient of the Lagrangian must equal zero—mathematically, $\nabla f(x^*) + \sum \mu_i \nabla g_i(x^*) + \sum \lambda_j \nabla h_j(x^*) = 0$ at the optimal point (checked numerically in the sketch below)
  • No improving direction exists that remains feasible; the objective function's gradient is "balanced" by the constraint gradients
  • Connects objective and constraints through multipliers, showing how tightly the constraints bind the solution
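
To make this concrete, here is a minimal numerical sketch. The problem and candidate point are illustrative, not from any particular source: minimize $f(x) = x_1^2 + x_2^2$ subject to $g(x) = 1 - x_1 - x_2 \leq 0$, whose optimum is $x^* = (0.5, 0.5)$ with multiplier $\mu = 1$.

```python
import numpy as np

# Illustrative problem: minimize f(x) = x1^2 + x2^2
# subject to g(x) = 1 - x1 - x2 <= 0.
# Candidate optimum: x* = (0.5, 0.5) with multiplier mu = 1.

def grad_f(x):
    return 2 * x                    # gradient of the objective

def grad_g(x):
    return np.array([-1.0, -1.0])   # gradient of the constraint

x_star = np.array([0.5, 0.5])
mu = 1.0

# Stationarity: grad f(x*) + mu * grad g(x*) should vanish.
residual = grad_f(x_star) + mu * grad_g(x_star)
print(residual)                     # [0. 0.]
print(np.allclose(residual, 0.0))   # True: no feasible improving direction
```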

Primal Feasibility Condition

  • The solution must satisfy all original constraints—both $g_i(x) \leq 0$ for inequalities and $h_j(x) = 0$ for equalities
  • Defines the feasible region where any valid solution must live; a point outside this region is meaningless regardless of its objective value
  • Foundation for all other conditions—you can't discuss optimality if you're not even in the feasible set

Dual Feasibility Condition

  • Lagrange multipliers for inequality constraints must be non-negative—that is, $\mu_i \geq 0$ for all inequality constraints
  • Reflects the "shadow price" interpretation; a positive multiplier means relaxing that constraint would improve the objective
  • Critical for distinguishing KKT from basic Lagrange multipliers—equality constraints have unrestricted multipliers, but inequalities require $\mu_i \geq 0$

Complementary Slackness Condition

  • Either a constraint is active OR its multiplier is zero—formally, $\mu_i \cdot g_i(x^*) = 0$ for each inequality constraint
  • Identifies binding constraints at the optimum; if $g_i(x^*) < 0$ (slack exists), then $\mu_i = 0$
  • Powerful problem-solving tool—lets you systematically test cases by assuming which constraints are active (a worked enumeration sketch follows the comparison below)

Compare: Primal Feasibility vs. Dual Feasibility—both are "feasibility" conditions, but primal concerns the original variables $x$ while dual concerns the multipliers $\mu$. On exams, mixing these up is a common error. Remember: primal = constraints satisfied, dual = multipliers non-negative.
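
To see all four conditions working together, here is a hedged sketch of the active-set enumeration strategy. The quadratic problem is hypothetical: minimize $f(x) = (x_1 - 2)^2 + (x_2 - 1)^2$ subject to $g_1(x) = x_1 + x_2 - 2 \leq 0$ and $g_2(x) = -x_1 \leq 0$. For each guess of which constraints are active, stationarity plus the active equalities form a linear system; primal and dual feasibility then filter out the wrong guesses, and complementary slackness holds by construction because inactive constraints get $\mu_i = 0$.

```python
import itertools
import numpy as np

# Hypothetical QP: minimize ||x - c||^2 with c = (2, 1),
# subject to g1(x) = x1 + x2 - 2 <= 0 and g2(x) = -x1 <= 0.
# grad f(x) = 2(x - c); each constraint is linear: g_i(x) = a_i @ x - b_i.
c = np.array([2.0, 1.0])
A = np.array([[1.0, 1.0],    # a_1 for g1
              [-1.0, 0.0]])  # a_2 for g2
b = np.array([2.0, 0.0])

def solve_case(active):
    """Solve stationarity + 'these constraints hold with equality'."""
    k = len(active)
    # Unknowns: x (2 components), then mu_i for each i in `active`.
    # Stationarity: 2(x - c) + sum_i mu_i * a_i = 0
    # Activity:     a_i @ x = b_i for each i in `active`
    M = np.zeros((2 + k, 2 + k))
    rhs = np.zeros(2 + k)
    M[:2, :2] = 2 * np.eye(2)
    rhs[:2] = 2 * c
    for j, i in enumerate(active):
        M[:2, 2 + j] = A[i]
        M[2 + j, :2] = A[i]
        rhs[2 + j] = b[i]
    sol = np.linalg.solve(M, rhs)
    x, mu = sol[:2], np.zeros(len(b))
    mu[list(active)] = sol[2:]      # inactive constraints keep mu_i = 0
    return x, mu

# Enumerate all active sets: {}, {g1}, {g2}, {g1, g2}.
for r in range(3):
    for active in itertools.combinations(range(2), r):
        x, mu = solve_case(active)
        primal_ok = np.all(A @ x - b <= 1e-9)   # primal feasibility
        dual_ok = np.all(mu >= -1e-9)           # dual feasibility
        if primal_ok and dual_ok:
            print(f"active set {active}: x = {x}, mu = {mu}")
            # Only {g1} survives: x* = (1.5, 0.5), mu = (1, 0).
```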


Theoretical Foundations

Understanding why KKT conditions work requires grasping their relationship to simpler optimization theory and the conditions that make them valid.

Relationship to Lagrange Multipliers

  • KKT generalizes Lagrange multipliers from equality-only constraints to include inequalities—the original Lagrange method is a special case
  • Multipliers measure constraint sensitivity; $\mu_i$ tells you how much the optimal objective would change per unit relaxation of constraint $i$ (a numerical check follows this list)
  • Gradient alignment principle remains central—at optimality, $\nabla f$ must be expressible as a combination of constraint gradients
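
A quick check of this sensitivity interpretation, using a toy problem with a closed-form value function (an illustrative assumption): minimize $x_1^2 + x_2^2$ subject to $x_1 + x_2 \geq t$. By symmetry the optimum is $x^* = (t/2, t/2)$, so $f^*(t) = t^2/2$, and stationarity gives $\mu = t$.

```python
import numpy as np

# Toy problem: minimize x1^2 + x2^2 subject to x1 + x2 >= t,
# i.e. g(x) = t - x1 - x2 <= 0. Closed form: f*(t) = t^2 / 2, mu = t.

def f_star(t):
    return t**2 / 2.0

t, eps = 1.0, 1e-6
mu = t   # multiplier at the optimum (here mu = 1)

# Relaxing the constraint by eps (t -> t - eps) should change the
# optimal value by approximately -mu * eps.
observed = f_star(t - eps) - f_star(t)
predicted = -mu * eps
print(observed, predicted)   # both approximately -1e-06
```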

Regularity Conditions (Constraint Qualifications)

  • Ensure KKT conditions are actually necessary—without regularity, pathological cases can have optima that violate KKT
  • Slater's condition is the most common: requires existence of a strictly feasible point where all inequality constraints hold strictly ($g_i(x) < 0$)
  • LICQ (Linear Independence Constraint Qualification) requires active constraint gradients to be linearly independent—another frequently tested qualification
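
Numerically, checking LICQ amounts to a rank test on the active constraint gradients. The sketch below uses a classic illustrative failure case: a line tangent to a circle, where both constraints are active at the candidate point but their gradients are parallel.

```python
import numpy as np

# LICQ check (illustrative): stack the gradients of the *active*
# constraints at a candidate point and test linear independence.
# Constraints: g1(x) = x1^2 + x2^2 - 1 <= 0
#              g2(x) = x1 + x2 - sqrt(2) <= 0
# Both are active at x = (1/sqrt(2), 1/sqrt(2)).

x = np.array([1.0, 1.0]) / np.sqrt(2.0)
grad_g1 = 2 * x                    # gradient of g1 at x: (sqrt(2), sqrt(2))
grad_g2 = np.array([1.0, 1.0])     # gradient of g2 at x

G = np.vstack([grad_g1, grad_g2])  # rows = active constraint gradients
licq_holds = np.linalg.matrix_rank(G) == G.shape[0]
print(licq_holds)   # False: the gradients are parallel, so LICQ fails here
```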

Necessary vs. Sufficient Conditions

  • KKT conditions are necessary for local optima under regularity—any local minimum must satisfy them
  • Sufficiency requires convexity; for convex problems (convex objective, convex feasible region), KKT points are global optima
  • Non-convex problems may have multiple KKT points, and additional analysis determines which are true optima
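
The sketch below (a hypothetical convex problem, not a prescribed method) puts this to work: it solves a small convex program with scipy, recovers the multiplier from stationarity by least squares, and checks the remaining KKT conditions. Because the objective and feasible region are convex and a strictly feasible point exists, passing these checks certifies a global minimum.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical convex problem: minimize (x1-1)^2 + (x2-2)^2
# subject to x1 + x2 <= 2. Analytic optimum: x* = (0.5, 1.5), mu = 1.

f = lambda x: (x[0] - 1)**2 + (x[1] - 2)**2
grad_f = lambda x: np.array([2 * (x[0] - 1), 2 * (x[1] - 2)])
grad_g = np.array([1.0, 1.0])   # gradient of g(x) = x1 + x2 - 2

# SLSQP expects inequality constraints written as fun(x) >= 0.
res = minimize(f, x0=np.zeros(2), method="SLSQP",
               constraints=[{"type": "ineq",
                             "fun": lambda x: 2 - x[0] - x[1]}])
x = res.x

# Recover mu from stationarity (grad f + mu * grad g = 0) by least squares.
mu = np.linalg.lstsq(grad_g.reshape(-1, 1), -grad_f(x), rcond=None)[0][0]

print(np.round(x, 4), round(float(mu), 4))   # ~ [0.5 1.5] and ~1.0
print(mu >= 0)                               # dual feasibility
print(abs(mu * (x[0] + x[1] - 2)) < 1e-5)    # complementary slackness
```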

Compare: Slater's Condition vs. LICQ—both are constraint qualifications, but Slater's applies to convex problems and requires strict feasibility, while LICQ applies more generally and requires gradient independence. If an FRQ asks you to verify KKT applicability, identify which qualification is appropriate for the problem type.


Interpretation and Application

These concepts bridge the abstract conditions to practical problem-solving and real-world meaning.

Geometric Interpretation

  • Optimality occurs at tangency—the level curve of $f$ is tangent to the constraint boundary at the solution
  • Active constraints define the binding boundary; the optimal point lies on these constraints, not in the interior
  • Gradient cone condition—$-\nabla f$ must lie in the cone generated by the active constraint gradients
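
One way to test the cone condition numerically is non-negative least squares: if $-\nabla f(x^*)$ can be written as a non-negative combination of the active constraint gradients with zero residual, the condition holds. A sketch, reusing the hypothetical problem from the enumeration example earlier:

```python
import numpy as np
from scipy.optimize import nnls

# Cone check: at a KKT point, -grad f(x*) must be a non-negative
# combination of the active constraint gradients.
# Reusing the toy problem min (x1-2)^2 + (x2-1)^2 s.t. x1 + x2 <= 2,
# where x* = (1.5, 0.5) and the single active gradient is (1, 1).

x_star = np.array([1.5, 0.5])
minus_grad_f = -2 * (x_star - np.array([2.0, 1.0]))   # = (1, 1)

A = np.array([[1.0],     # columns of A are the active gradients
              [1.0]])
mu, residual = nnls(A, minus_grad_f)   # least squares subject to mu >= 0

print(mu)         # [1.] -- the recovered multiplier
print(residual)   # ~0   -- so -grad f lies in the cone
```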

Applications in Constrained Optimization

  • Support vector machines use KKT conditions to identify support vectors—points where the margin constraint is active
  • Economic resource allocation interprets multipliers as prices; $\mu_i > 0$ means resource $i$ is scarce and valuable (see the budget sketch below)
  • Engineering design applies KKT to optimize performance subject to material, safety, or geometric constraints

Compare: Active vs. Inactive Constraints—an active constraint has $g_i(x^*) = 0$ and potentially $\mu_i > 0$, while an inactive constraint has $g_i(x^*) < 0$ and necessarily $\mu_i = 0$. When solving KKT problems, systematically testing active set combinations is your main strategy.
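
As a closing sketch, the hypothetical budget problem below estimates a shadow price numerically: re-solve with a perturbed budget and difference the optimal values. For this problem the multiplier works out to $\mu = 2/b$, so with $b = 4$ the marginal utility of budget should be about $0.5$.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical allocation: maximize log(x1) + log(x2)
# subject to the budget x1 + 2*x2 <= b. The multiplier on the budget
# constraint is its shadow price: the marginal utility of more budget.

def best_utility(b):
    res = minimize(lambda x: -(np.log(x[0]) + np.log(x[1])),
                   x0=np.array([1.0, 1.0]), method="SLSQP",
                   bounds=[(1e-9, None), (1e-9, None)],
                   constraints=[{"type": "ineq",
                                 "fun": lambda x: b - x[0] - 2 * x[1]}])
    return -res.fun

b, eps = 4.0, 1e-3
# Central difference approximates dV/db, which equals the multiplier.
shadow_price = (best_utility(b + eps) - best_utility(b - eps)) / (2 * eps)
print(round(shadow_price, 3))   # ~0.5, matching mu = 2/b
```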


Quick Reference Table

| Concept | Key Formula/Requirement |
| --- | --- |
| Stationarity | $\nabla_x L = 0$ at the optimum |
| Primal Feasibility | $g_i(x) \leq 0$, $h_j(x) = 0$ |
| Dual Feasibility | $\mu_i \geq 0$ for all inequality multipliers |
| Complementary Slackness | $\mu_i \cdot g_i(x^*) = 0$ |
| Slater's Condition | There exists $x$ with $g_i(x) < 0$ strictly |
| LICQ | Active constraint gradients linearly independent |
| Convex Sufficiency | KKT + convexity $\Rightarrow$ global optimum |
| Shadow Price Interpretation | $\mu_i = -\frac{\partial f^*}{\partial b_i}$ |

Self-Check Questions

  1. Which two KKT conditions specifically involve the Lagrange multipliers $\mu_i$, and how do their requirements differ?

  2. A constraint has $g_3(x^*) = -2$ at the optimal point. What can you immediately conclude about $\mu_3$, and which KKT condition tells you this?

  3. Compare and contrast Slater's condition and LICQ: when would you use each, and what does each guarantee about the KKT conditions?

  4. If you're solving a convex optimization problem and find a point satisfying all KKT conditions, what can you conclude about that point? How does your answer change for a non-convex problem?

  5. In an FRQ asking you to solve a constrained optimization problem with two inequality constraints, describe the systematic approach using complementary slackness to identify the optimal solution.