| Type: | Package |
| Title: | Derivative-Based Optimization with User-Defined Convergence Criteria |
| Version: | 0.1.0 |
| Description: | Provides a derivative-based optimization framework that allows users to combine eight convergence criteria. Unlike standard optimization functions, this package includes a built-in mechanism to verify the positive definiteness of the Hessian matrix at the point of convergence. This additional check helps prevent the solver from falsely identifying non-optimal solutions, such as saddle points, as valid minima. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.3 |
| Imports: | numDeriv, utils |
| Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| Packaged: | 2026-02-04 07:35:53 UTC; bene |
| Author: | Eunseong Cho [aut, cre] |
| Maintainer: | Eunseong Cho <bene@kw.ac.kr> |
| Repository: | CRAN |
| Date/Publication: | 2026-02-07 12:40:07 UTC |
Broyden-Fletcher-Goldfarb-Shanno (BFGS) Optimization
Description
Implements the damped BFGS Quasi-Newton algorithm with a Strong Wolfe line search for non-linear optimization, specifically tailored for structural equation modeling (SEM).
Usage
bfgs(
start,
objective,
gradient = NULL,
hessian = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Starting values for the optimization parameters. |
| objective | Function. The objective function to minimize. |
| gradient | Function (optional). Gradient of the objective function. |
| hessian | Function (optional). Hessian matrix of the objective function. |
| lower | Numeric vector. Lower bounds for box constraints. |
| upper | Numeric vector. Upper bounds for box constraints. |
| control | List. Control parameters including convergence flags (see Details). |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Details
bfgs is a Quasi-Newton method that maintains an approximation of the
inverse Hessian matrix. It is widely considered the most robust and
efficient member of the Broyden family of optimization methods.
BFGS vs. DFP:
While both bfgs and dfp update the inverse Hessian using
rank-two formulas, BFGS is generally more tolerant of inaccuracies in the
line search. This implementation uses the Sherman-Morrison formula to
update the inverse Hessian directly, avoiding the need for matrix inversion
at each step.
Strong Wolfe Line Search:
To maintain the positive definiteness of the Hessian approximation and
ensure global convergence, this algorithm employs a Strong Wolfe line search.
This search identifies a step length \alpha that satisfies both sufficient
decrease (Armijo condition) and the curvature condition.
Damping for Non-Convexity:
In Structural Equation Modeling (SEM), objective functions often exhibit
non-convex regions. When use_damped = TRUE, Powell's damping
strategy is applied to the update vectors to preserve the positive
definiteness of the Hessian approximation even when the curvature condition
is not naturally met.
Value
A list containing optimization results and iteration metadata.
References
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. Springer.
Fletcher, R. (1987). Practical Methods of Optimization. Wiley.
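Examples
An illustrative sketch, not drawn from the package itself: the Rosenbrock test function and its gradient are defined here only for demonstration, use_damped is the control flag described in Details, and the result is simply inspected with str() because the element names of the returned list are not specified in this manual.
## Rosenbrock function and its analytic gradient (defined here for illustration)
rosenbrock <- function(x) (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
rosen_grad <- function(x) {
  c(-2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2),
    200 * (x[2] - x[1]^2))
}

## Damped BFGS run from the standard starting point
fit <- bfgs(start = c(-1.2, 1),
            objective = rosenbrock,
            gradient = rosen_grad,
            control = list(use_damped = TRUE))
str(fit)  # inspect the optimization results and iteration metadata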
Davidon-Fletcher-Powell (DFP) Quasi-Newton Optimization
Description
Implements the DFP Quasi-Newton algorithm with a Strong Wolfe line search and optional Powell's damping for non-linear optimization.
Usage
dfp(
start,
objective,
gradient = NULL,
hessian = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Starting values for the optimization parameters. |
| objective | Function. The objective function to minimize. |
| gradient | Function (optional). Gradient of the objective function. |
| hessian | Function (optional). Hessian matrix for final verification. |
| lower | Numeric vector. Lower bounds for box constraints. |
| upper | Numeric vector. Upper bounds for box constraints. |
| control | List. Control parameters including convergence flags (see Details). |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Details
dfp is a Quasi-Newton method that maintains and updates an approximation
of the inverse Hessian matrix. Historically, it was the first Quasi-Newton
method discovered (Davidon, 1959) and later refined by Fletcher and Powell (1963).
DFP vs. BFGS: Both DFP and BFGS belong to the Broyden family of Quasi-Newton methods. While BFGS is generally preferred for its self-correcting properties regarding inaccuracies in the line search, DFP remains a fundamental algorithm that can be more sensitive to the local curvature of the objective function. In certain Structural Equation Modeling (SEM) contexts, DFP can provide alternative convergence paths when BFGS reaches a plateau.
Strong Wolfe Line Search:
To ensure the positive definiteness of the inverse Hessian update and guarantee
global convergence, this implementation employs a Strong Wolfe line search.
This identifies a step length \alpha that satisfies both:
- Sufficient Decrease (Armijo Rule): f(x + \alpha p) \le f(x) + c_1 \alpha \nabla f(x)^T p.
- Curvature Condition: | \nabla f(x + \alpha p)^T p | \le c_2 | \nabla f(x)^T p |.
Powell's Damping Strategy:
Structural Equation Models often involve non-convex fitting functions.
When use_damped = TRUE, the algorithm applies Powell's damping
to the y vector used in the update formula. This ensures that the curvature
condition s^T y > 0 is maintained even in non-convex regions, preserving
the stability of the inverse Hessian approximation.
Value
A list containing optimization results and iteration metadata.
References
Davidon, W. C. (1959). Variable Metric Method for Minimization. AEC Research and Development Report, ANL-5990.
Fletcher, R., & Powell, M. J. D. (1963). A Rapidly Convergent Descent Method for Minimization. The Computer Journal, 6(2), 163-168.
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. Springer.
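Examples
A minimal sketch comparing dfp() with bfgs() on a strictly convex quadratic; the quad and gquad helpers are defined here only for demonstration, and the element names of the returned lists are not assumed.
## Simple strictly convex quadratic with its minimum at c(1, -2)
quad  <- function(x) (x[1] - 1)^2 + 4 * (x[2] + 2)^2
gquad <- function(x) c(2 * (x[1] - 1), 8 * (x[2] + 2))

fit_dfp  <- dfp(start = c(5, 5), objective = quad, gradient = gquad)
fit_bfgs <- bfgs(start = c(5, 5), objective = quad, gradient = gquad)
str(fit_dfp)   # both runs should end near c(1, -2)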
Dogleg Trust-Region Optimization
Description
Implements the standard Powell's Dogleg Trust-Region algorithm for non-linear optimization.
Usage
dogleg(
start,
objective,
gradient = NULL,
hessian = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Starting values for the optimization parameters. |
| objective | Function. The objective function to minimize. |
| gradient | Function (optional). Gradient of the objective function. |
| hessian | Function (optional). Hessian matrix of the objective function. |
| lower | Numeric vector. Lower bounds for box constraints. |
| upper | Numeric vector. Upper bounds for box constraints. |
| control | List. Control parameters including convergence flags (see Details). |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Details
This function implements the classic Dogleg method within a Trust-Region framework, based on the strategy proposed by Powell (1970).
Trust-Region vs. Line Search:
Trust-Region methods define a neighborhood around the current point (the trust region
with radius \Delta) where a local quadratic model is assumed to be reliable.
Unlike Line Search methods that first determine a search direction and then
find an appropriate step length, this approach constrains the step size first
and then finds the optimal update within that boundary.
Powell's Dogleg Trajectory: The "Dogleg" trajectory is a piecewise linear path connecting:
- The current point.
- The Cauchy Point (p_C): the minimizer of the quadratic model along the steepest descent direction.
- The Newton Point (p_N): the unconstrained minimizer of the quadratic model (-B^{-1} g).
The algorithm selects a step along this path such that it minimizes the quadratic
model while remaining within the radius \Delta.
Relationship to Double Dogleg:
While the double_dogleg algorithm (Dennis and Mei, 1979) introduces a bias
point to follow the Newton direction more closely, this standard Dogleg follows
the original two-segment trajectory.
Value
A list containing optimization results and iteration metadata.
References
Powell, M. J. D. (1970). A Hybrid Method for Nonlinear Equations. Numerical Methods for Nonlinear Algebraic Equations.
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. Springer.
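Examples
An illustrative sketch supplying analytic first and second derivatives; the rosenbrock, rosen_grad, and rosen_hess helpers are defined here only for demonstration and are not part of the package.
## Rosenbrock function with analytic gradient and Hessian
rosenbrock <- function(x) (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
rosen_grad <- function(x) c(-2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2),
                            200 * (x[2] - x[1]^2))
rosen_hess <- function(x) matrix(c(2 - 400 * x[2] + 1200 * x[1]^2, -400 * x[1],
                                   -400 * x[1],                     200),
                                 nrow = 2, byrow = TRUE)

fit <- dogleg(start = c(-1.2, 1), objective = rosenbrock,
              gradient = rosen_grad, hessian = rosen_hess)
str(fit)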
Double Dogleg Trust-Region Optimization
Description
Implements the Double Dogleg Trust-Region algorithm for non-linear optimization.
Usage
double_dogleg(
start,
objective,
gradient = NULL,
hessian = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Starting values for the optimization parameters. |
| objective | Function. The objective function to minimize. |
| gradient | Function (optional). Gradient of the objective function. |
| hessian | Function (optional). Hessian matrix of the objective function. |
| lower | Numeric vector. Lower bounds for box constraints. |
| upper | Numeric vector. Upper bounds for box constraints. |
| control | List. Control parameters including convergence flags starting with 'use_'. |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Details
This function implements the Double Dogleg method within a Trust-Region framework, primarily based on the work of Dennis and Mei (1979).
Trust-Region vs. Line Search:
While Line Search methods (like BFGS) first determine a search direction and then
find an appropriate step length, Trust-Region methods define a neighborhood
around the current point (the trust region with radius \Delta) where a local
quadratic model is assumed to be reliable. The algorithm then finds a step that
minimizes this model within the radius. This approach is generally more robust,
especially when the Hessian is not positive definite.
Powell's Dogleg vs. Double Dogleg:
Powell's original Dogleg method (1970) constructs a trajectory consisting of
two line segments: one from the current point to the Cauchy point, and another
from the Cauchy point to the Newton point. The "Double Dogleg" modification
by Dennis and Mei (1979) introduces an intermediate "bias" point (p_W)
along the Newton direction.
- Cauchy Point (p_C): the minimizer of the quadratic model along the steepest descent direction.
- Newton Point (p_N): the minimizer of the quadratic model (-B^{-1} g).
- Double Dogleg Point (p_W): a point defined as \gamma \cdot p_N, where \gamma is a scaling factor (bias) that ensures the path stays closer to the Newton direction while maintaining monotonic descent in the model.
This modification allows the algorithm to perform more like a Newton method earlier in the optimization process compared to the standard Dogleg.
Value
A list containing optimization results and iteration metadata.
References
Dennis, J. E., & Mei, H. H. (1979). Two New Unconstrained Optimization Algorithms which use Function and Gradient Values. Journal of Optimization Theory and Applications, 28(4), 453-482.
Powell, M. J. D. (1970). A Hybrid Method for Nonlinear Equations. Numerical Methods for Nonlinear Algebraic Equations.
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. Springer.
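Examples
A minimal sketch using only an objective function; the himmelblau helper is defined here for demonstration, and leaving gradient and hessian as NULL assumes the solver then falls back to numerical differentiation (as suggested by the package's numDeriv import and the fast_grad/fast_hess utilities below).
## Himmelblau-style test function; derivatives left NULL
himmelblau <- function(x) (x[1]^2 + x[2] - 11)^2 + (x[1] + x[2]^2 - 7)^2

fit <- double_dogleg(start = c(1, 1), objective = himmelblau)
str(fit)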
Fast Numerical Gradient
Description
Provides a high-speed numerical gradient using forward or central differences.
Usage
fast_grad(f, x, diff_method = c("forward", "central"), ...)
Arguments
| f | Function. The objective function. |
| x | Numeric vector. Parameters at which to evaluate the gradient. |
| diff_method | String. Differentiation method: "forward" or "central". |
| ... | Additional arguments passed to f. |
Value
A numeric vector of gradients.
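Examples
A short sketch comparing the numerical gradient with an analytic one; the test function and tolerance are illustrative choices, not package defaults.
f  <- function(x) sum(x^2) + prod(x)
x0 <- c(1, 2, 3)

g_num <- fast_grad(f, x0, diff_method = "central")
g_ana <- 2 * x0 + prod(x0) / x0   # analytic gradient, valid because no element is zero
all.equal(g_num, g_ana, tolerance = 1e-5)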
Fast Numerical Hessian
Description
High-speed numerical Hessian calculation using finite differences.
Usage
fast_hess(f, x, diff_method = c("forward", "central"), ...)
Arguments
| f | Function. The objective function. |
| x | Numeric vector. Parameters at which to evaluate the Hessian. |
| diff_method | String. Differentiation method: "forward" or "central". |
| ... | Additional arguments passed to f. |
Value
A symmetric Hessian matrix.
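Examples
A short sketch checking the numerical Hessian against an analytic one; the test function and the fairly loose tolerance (finite-difference Hessians are less accurate than gradients) are illustrative choices.
f  <- function(x) x[1]^2 * x[2] + exp(x[2])
x0 <- c(1.5, 0.5)

H_num <- fast_hess(f, x0, diff_method = "central")
H_ana <- matrix(c(2 * x0[2], 2 * x0[1],
                  2 * x0[1], exp(x0[2])),
                nrow = 2, byrow = TRUE)
all.equal(H_num, H_ana, tolerance = 1e-4, check.attributes = FALSE)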
Fast Numerical Jacobian
Description
Calculates the Jacobian matrix for a vector-valued function (e.g., residuals).
Usage
fast_jac(f_res, x, diff_method = c("forward", "central"), ...)
Arguments
| f_res | Function. A function returning a vector of residuals. |
| x | Numeric vector. Parameters at which to evaluate the Jacobian. |
| diff_method | String. Differentiation method: "forward" or "central". |
| ... | Additional arguments passed to f_res. |
Value
A Jacobian matrix of dimension m x n (rows correspond to residuals, columns to parameters).
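Examples
An illustrative sketch using the residuals of the Rosenbrock problem; res_fun and the analytic Jacobian are defined here only for demonstration.
## Residuals of the Rosenbrock problem in least-squares form
res_fun <- function(x) c(1 - x[1], 10 * (x[2] - x[1]^2))
x0 <- c(-1.2, 1)

J_num <- fast_jac(res_fun, x0, diff_method = "forward")
J_ana <- matrix(c(-1,            0,
                  -20 * x0[1],  10),
                nrow = 2, byrow = TRUE)
all.equal(J_num, J_ana, tolerance = 1e-4, check.attributes = FALSE)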
Gauss-Newton Optimization
Description
Implements a full-featured Gauss-Newton algorithm for non-linear optimization, specifically optimized for Structural Equation Modeling (SEM).
Usage
gauss_newton(
start,
objective,
residual = NULL,
gradient = NULL,
hessian = NULL,
jac = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Starting values for the optimization parameters. |
| objective | Function. The objective function to minimize. |
| residual | Function (optional). Function that returns the residuals vector. |
| gradient | Function (optional). Gradient of the objective function. |
| hessian | Function (optional). Hessian matrix of the objective function. |
| jac | Function (optional). Jacobian matrix of the residuals. |
| lower | Numeric vector. Lower bounds for box constraints. |
| upper | Numeric vector. Upper bounds for box constraints. |
| control | List. Control parameters including convergence flags (see Details). |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Details
gauss_newton is a specialized optimization algorithm for least-squares
and Maximum Likelihood problems where the objective function can be
expressed as a sum of squared residuals.
Scaling and SEM Consistency:
To ensure consistent simulation results and standard error (SE) calculations,
this implementation adjusts the Gradient (2J^T r) and the Approximate
Hessian (2J^T J) to match the scale of the Maximum Likelihood (ML)
fitting function F_{ML}. This alignment is critical when calculating
asymptotic covariance matrices using the formula \frac{2}{n} H^{-1}.
Comparison with Newton-Raphson:
Unlike newton_raphson or modified_newton, which require the full
second-order Hessian, Gauss-Newton approximates the Hessian using the
Jacobian of the residuals. This is computationally more efficient and
provides a naturally positive-semidefinite approximation, though a ridge
adjustment is still provided for numerical stability.
Ridge Adjustment Strategy:
The function includes a "Ridge Rescue" mechanism. If the approximate Hessian
is singular or poorly conditioned for Cholesky decomposition, it iteratively
adds a diagonal ridge (\tau I) until numerical stability is achieved.
Value
A list containing optimization results and iteration metadata.
References
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. Springer.
Bollen, K. A. (1989). Structural Equations with Latent Variables. Wiley.
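Examples
A generic least-squares sketch (the SEM-specific scaling noted in Details is not reflected here); the residual, objective, and Jacobian helpers are defined only for demonstration.
## Rosenbrock expressed as a sum of squared residuals: f(x) = sum(r(x)^2)
res_fun <- function(x) c(1 - x[1], 10 * (x[2] - x[1]^2))
obj_fun <- function(x) sum(res_fun(x)^2)
jac_fun <- function(x) matrix(c(-1,           0,
                                -20 * x[1],  10),
                              nrow = 2, byrow = TRUE)

fit <- gauss_newton(start = c(-1.2, 1),
                    objective = obj_fun,
                    residual = res_fun,
                    jac = jac_fun)
str(fit)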
Fast Positive Definiteness Check
Description
Efficiently checks if a matrix is positive definite using Cholesky decomposition.
Usage
is_pd_fast(M)
Arguments
M |
Matrix. The matrix to verify. |
Value
Logical. TRUE if positive definite, FALSE otherwise.
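Examples
A brief sketch with one matrix that is positive definite by construction and one that is indefinite.
A <- crossprod(matrix(rnorm(9), 3, 3)) + diag(3)  # positive definite by construction
B <- diag(c(1, -1, 2))                            # indefinite

is_pd_fast(A)  # TRUE
is_pd_fast(B)  # FALSE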
Limited-memory BFGS with Box Constraints (L-BFGS-B)
Description
Performs bound-constrained minimization using the L-BFGS-B algorithm. This implementation handles box constraints via Generalized Cauchy Point (GCP) estimation and subspace minimization, featuring a limited-memory (two-loop recursion) inverse Hessian approximation.
Usage
l_bfgs_b(
start,
objective,
gradient = NULL,
hessian = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Initial values for the parameters. |
| objective | Function. The scalar objective function to be minimized. |
| gradient | Function (optional). Returns the gradient vector. If NULL, the gradient is approximated numerically. |
| hessian | Function (optional). Returns the Hessian matrix. Used for final positive definiteness verification when that check is enabled in control. |
| lower, upper | Numeric vectors. Lower and upper bounds for the parameters. Can be scalars if all parameters share the same bounds. |
| control | A list of control parameters, including convergence flags. |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Value
A list containing optimization results and metadata.
Comparison with Existing Functions
This function adds three features for rigorous convergence control. First, it
applies an AND rule: all selected convergence criteria must be satisfied
simultaneously. Second, users can choose among eight distinct criteria
(e.g., changes in f, x, gradient, or predicted decrease) instead of relying
on fixed defaults. Third, it provides an optional verification using the
Hessian computed from derivatives (analytically when provided, or via numerical
differentiation). Checking the positive definiteness of this Hessian at the final
solution reduces the risk of declaring convergence at non-minimizing stationary
points, such as saddle points.
References
Byrd, R. H., Lu, P., Nocedal, J., & Zhu, C. (1995). A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16(5), 1190-1208.
Morales, J. L., & Nocedal, J. (2011). Remark on Algorithm 778: L-BFGS-B, Fortran subroutines for large-scale bound-constrained optimization. ACM Transactions on Mathematical Software, 38(1), 1-4.
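Examples
A bound-constrained sketch; the quadratic test problem is defined here for demonstration, and the closing positive definiteness check only illustrates the verification idea from the comparison above (this quadratic's Hessian is constant and positive definite).
## Convex quadratic whose unconstrained minimum c(1, -2) violates the bound x[2] >= 0
quad  <- function(x) (x[1] - 1)^2 + 4 * (x[2] + 2)^2
gquad <- function(x) c(2 * (x[1] - 1), 8 * (x[2] + 2))

fit <- l_bfgs_b(start = c(3, 3), objective = quad, gradient = gquad,
                lower = c(-Inf, 0), upper = Inf)
str(fit)  # the solution should sit on the active bound, near c(1, 0)

## Post-hoc verification of positive definiteness at a candidate solution
is_pd_fast(fast_hess(quad, c(1, 0)))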
Modified Newton-Raphson Optimization
Description
Implements an optimized Newton-Raphson algorithm for non-linear optimization featuring dynamic ridge adjustment and backtracking line search.
Usage
modified_newton(
start,
objective,
gradient = NULL,
hessian = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Starting values for the optimization parameters. |
| objective | Function. The objective function to minimize. |
| gradient | Function (optional). Gradient of the objective function. |
| hessian | Function (optional). Hessian matrix of the objective function. |
| lower | Numeric vector. Lower bounds for box constraints. |
| upper | Numeric vector. Upper bounds for box constraints. |
| control | List. Control parameters including convergence flags (see Details). |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Details
modified_newton is a line search optimization algorithm that utilizes
second-order curvature information (the Hessian matrix) to find the minimum
of an objective function.
Modified Newton vs. Trust-Region:
Unlike the dogleg and double_dogleg functions which use a
Trust-Region approach to constrain the step size, this function uses a
Line Search approach. It first determines the Newton direction
(the solution to H \Delta x = -g) and then performs a backtracking line
search to find a step length \alpha that satisfies the sufficient decrease
condition (Armijo condition).
Dynamic Ridge Adjustment:
If the Hessian matrix H is not positive definite (making it unsuitable for
Cholesky decomposition), the algorithm applies a dynamic ridge adjustment.
A diagonal matrix \tau I is added to the Hessian, where \tau is
increased until the matrix becomes positive definite. This ensures the
search direction always remains a descent direction.
Differentiation Methods: The function allows for independent selection of differentiation methods for the gradient and Hessian:
- forward: standard forward-difference numerical differentiation.
- central: central-difference (more accurate but slower).
- complex: complex-step differentiation (highly accurate for gradients).
- richardson: Richardson extrapolation via the numDeriv package.
Value
A list containing optimization results and iteration metadata.
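Examples
An illustrative sketch: the objective has a saddle point at the origin (its Hessian there is indefinite), so starting nearby exercises the dynamic ridge adjustment described above. The helper functions are defined here for demonstration only.
## Objective with a saddle point at the origin
f_sad <- function(x) x[1]^4 - 2 * x[1]^2 + x[1] * x[2] + x[2]^2
g_sad <- function(x) c(4 * x[1]^3 - 4 * x[1] + x[2], x[1] + 2 * x[2])

fit <- modified_newton(start = c(0.1, 0.1), objective = f_sad, gradient = g_sad)
str(fit)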
Pure Newton-Raphson Optimization
Description
Implements the standard Newton-Raphson algorithm for non-linear optimization without Hessian modifications or ridge adjustments.
Usage
newton_raphson(
start,
objective,
gradient = NULL,
hessian = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Starting values for the optimization parameters. |
| objective | Function. The objective function to minimize. |
| gradient | Function (optional). Gradient of the objective function. |
| hessian | Function (optional). Hessian matrix of the objective function. |
| lower | Numeric vector. Lower bounds for box constraints. |
| upper | Numeric vector. Upper bounds for box constraints. |
| control | List. Control parameters including convergence flags (see Details). |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Details
newton_raphson provides a classic second-order optimization approach.
Comparison with Modified Newton:
Unlike modified_newton, this function does not apply dynamic ridge
adjustments (Levenberg-Marquardt style) to the Hessian. If the Hessian is
singular or cannot be inverted via solve(), the algorithm will
terminate. This "pure" implementation is often preferred in simulation
studies where the behavior of the exact Newton step is of interest.
Predicted Decrease:
This function explicitly calculates the Predicted Decrease (pred_dec),
which is the expected reduction in the objective function value based on
the local quadratic model:
m(p) = f + g^T p + \frac{1}{2} p^T H p
Stability and Simulations:
All return values are explicitly cast to scalars (e.g., as.numeric,
as.logical) to ensure stability when the function is called within
large-scale simulation loops or packaged into data frames.
Value
A list containing optimization results and iteration metadata.
References
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. Springer.
Bollen, K. A. (1989). Structural Equations with Latent Variables. Wiley.
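Examples
A minimal sketch on a strictly convex objective, where the exact Newton step is well defined everywhere; the helper functions are defined here only for demonstration.
## Strictly convex objective with analytic gradient and Hessian
f_cvx <- function(x) exp(x[1]) + x[1]^2 + (x[2] - 1)^2
g_cvx <- function(x) c(exp(x[1]) + 2 * x[1], 2 * (x[2] - 1))
h_cvx <- function(x) matrix(c(exp(x[1]) + 2, 0,
                              0,             2), nrow = 2)

fit <- newton_raphson(start = c(1, 3), objective = f_cvx,
                      gradient = g_cvx, hessian = h_cvx)
str(fit)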