Line search is a mathematical optimization technique used in iterative descent methods to find an appropriate step size.
The intuition is to think of:
- Fixing a direction vector d at the current point x.
- Minimizing the objective function f along the line x + αd, where α is the step size.
- Finding the optimal step size α that sufficiently reduces the function value along the direction d.
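The restriction of f to the line can be sketched as a one-dimensional function of α. The objective below and the sampling grid are illustrative choices, not part of any particular algorithm:

```python
import numpy as np

# Illustrative objective: f(x) = x1^2 + 10*x2^2
def f(x):
    return x[0]**2 + 10.0 * x[1]**2

x = np.array([2.0, 1.0])            # current point
d = -np.array([2 * x[0], 20 * x[1]])  # descent direction (negative gradient)

# Restricting f to the line x + alpha*d gives a 1-D function of alpha:
def phi(alpha):
    return f(x + alpha * d)

# Line search = minimize phi over alpha >= 0; here we just sample coarsely.
alphas = np.linspace(0.0, 0.12, 7)
best = min(alphas, key=phi)
```

Any one-dimensional minimization of phi — exact or approximate — is a line search; the descent method only ever has to solve this scalar subproblem.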
Usage: In Descent Methods
Line search is typically integrated into descent methods (e.g., gradient descent, the Newton-Raphson method, conjugate gradient) to determine the step size α at each iteration:
- Initialize at the starting point x_0.
- Compute the descent direction d_k, typically the negative gradient -∇f(x_k) in gradient descent, or another calculated direction in other methods.
- Perform line search to find the step size α_k that minimizes the function along the direction: α_k = argmin_{α > 0} f(x_k + α d_k).
- Update the point: x_{k+1} = x_k + α_k d_k.
- Repeat until convergence.
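The loop above can be sketched with an inexact (backtracking) line search. The quadratic objective and all parameter values here are illustrative assumptions:

```python
import numpy as np

# Toy objective and its gradient (illustrative choices).
def f(x):
    return 0.5 * x[0]**2 + 5.0 * x[1]**2

def grad_f(x):
    return np.array([x[0], 10.0 * x[1]])

def backtracking(f, grad, x, d, alpha=1.0, rho=0.5, c=1e-4):
    # Shrink alpha until the Armijo sufficient-decrease condition holds:
    #   f(x + alpha*d) <= f(x) + c * alpha * grad.d
    while f(x + alpha * d) > f(x) + c * alpha * (grad @ d):
        alpha *= rho
    return alpha

x = np.array([5.0, 2.0])            # starting point x_0
for _ in range(100):
    g = grad_f(x)
    if np.linalg.norm(g) < 1e-8:    # convergence check
        break
    d = -g                          # steepest-descent direction
    alpha = backtracking(f, g, x, d)
    x = x + alpha * d               # update x_{k+1} = x_k + alpha_k * d_k
```

Each iteration is guaranteed to decrease f, since backtracking only accepts steps satisfying the sufficient-decrease condition.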
Why Line Search is Important
- Stability & improved convergence: Without line search, a fixed step size can lead to overshooting or slow convergence. Line search adapts the step size to the current landscape of the objective function, ensuring more stable updates.
- Efficiency: By choosing the optimal step size, line search ensures that the algorithm makes meaningful progress at each iteration, avoiding unnecessarily small steps (slow convergence) or overly large steps (which might overshoot the minimum).
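The overshooting problem can be made concrete on a toy one-dimensional objective (f(x) = x² here is an assumed example): a fixed step that is too large diverges, while even a crude backtracking rule stays stable:

```python
def f(x):
    return x * x

def grad(x):
    return 2 * x

# Fixed step size 1.1 on f(x) = x^2: each update maps x to -1.2x,
# so the iterate grows in magnitude and diverges.
x_fixed = 1.0
for _ in range(20):
    x_fixed -= 1.1 * grad(x_fixed)

# Same starting step, but halve it until the step actually decreases f.
x_ls = 1.0
for _ in range(20):
    g = grad(x_ls)
    alpha = 1.1
    while f(x_ls - alpha * g) >= f(x_ls) and alpha > 1e-12:
        alpha *= 0.5
    x_ls -= alpha * g
```

After 20 iterations the fixed-step iterate has blown up, while the line-search iterate has converged toward the minimizer at 0.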
Types of Line Search
| Aspect | Exact line search | Inexact line search |
| --- | --- | --- |
| Definition | Finds the step size α that exactly minimizes f(x + αd) along the direction d. | Finds an approximate step size α that achieves a sufficient decrease in f. |
| Precision | Solves the line search problem precisely by determining the exact optimal α. | Approximates the solution to the line search problem, typically using heuristics or conditions such as the Wolfe conditions. |
| Convergence | Ensures optimality of the step size at every iteration, but this may not be necessary for fast overall convergence. | Balances progress and efficiency, often converging more quickly in practice by accepting good-enough step sizes. |
| Method example | Can be solved using numerical methods such as the bisection method or the Newton-Raphson method to minimize f(x + αd) over α. | Typically uses methods like backtracking line search for an efficient estimate of the step size. |
| Applicability | Best used when evaluating f along the line is cheap or the one-dimensional subproblem has a closed-form solution. | Commonly used in high-dimensional or complex problems where exact line search is too costly. Often used in algorithms like gradient descent and conjugate gradient. |
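One case where exact line search is cheap: for a quadratic f(x) = ½ xᵀAx − bᵀx with A symmetric positive definite, minimizing φ(α) = f(x + αd) has the closed form α* = (dᵀr) / (dᵀAd), where r = b − Ax is the negative gradient. The matrix and vector below are illustrative:

```python
import numpy as np

# Illustrative SPD quadratic: f(x) = 0.5 * x^T A x - b^T x
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 1.0])

def grad(x):
    return A @ x - b

x = np.zeros(2)
for _ in range(50):
    r = -grad(x)                      # steepest-descent direction
    if np.linalg.norm(r) < 1e-12:
        break
    # Exact line search: alpha* minimizes phi(alpha) = f(x + alpha*r),
    # obtained by setting phi'(alpha) = -r.r + alpha * r.(A r) = 0.
    alpha = (r @ r) / (r @ (A @ r))
    x = x + alpha * r
```

With the exact step at every iteration, the iterates converge to the solution of Ax = b, i.e., the minimizer of the quadratic.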