| Type: | Package |
| Title: | Derivative-Based Optimization with User-Defined Convergence Criteria |
| Version: | 0.1.0 |
| Description: | Provides a derivative-based optimization framework that allows users to combine eight convergence criteria. Unlike standard optimization functions, this package includes a built-in mechanism to verify the positive definiteness of the Hessian matrix at the point of convergence. This additional check helps prevent the solver from falsely identifying non-optimal solutions, such as saddle points, as valid minima. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.3 |
| Imports: | numDeriv, utils |
| Suggests: | knitr, rmarkdown, testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| Packaged: | 2026-02-04 07:35:53 UTC; bene |
| Author: | Eunseong Cho [aut, cre] |
| Maintainer: | Eunseong Cho <bene@kw.ac.kr> |
| Repository: | CRAN |
| Date/Publication: | 2026-02-07 12:40:07 UTC |
Broyden-Fletcher-Goldfarb-Shanno (BFGS) Optimization
Description
Implements the damped BFGS Quasi-Newton algorithm with a Strong Wolfe line search for non-linear optimization, specifically tailored for structural equation modeling (SEM).
Usage
bfgs(
start,
objective,
gradient = NULL,
hessian = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Starting values for the optimization parameters. |
| objective | Function. The objective function to minimize. |
| gradient | Function (optional). Gradient of the objective function. |
| hessian | Function (optional). Hessian matrix of the objective function. |
| lower | Numeric vector. Lower bounds for box constraints. |
| upper | Numeric vector. Upper bounds for box constraints. |
| control | List. Control parameters including convergence flags (see Details). |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Details
bfgs is a Quasi-Newton method that maintains an approximation of the
inverse Hessian matrix. It is widely considered the most robust and
efficient member of the Broyden family of optimization methods.
BFGS vs. DFP:
While both bfgs and dfp update the inverse Hessian using
rank-two formulas, BFGS is generally more tolerant of inaccuracies in the
line search. This implementation uses the Sherman-Morrison formula to
update the inverse Hessian directly, avoiding the need for matrix inversion
at each step.
Strong Wolfe Line Search:
To maintain the positive definiteness of the Hessian approximation and
ensure global convergence, this algorithm employs a Strong Wolfe line search.
This search identifies a step length \alpha that satisfies both sufficient
decrease (Armijo condition) and the curvature condition.
Damping for Non-Convexity:
In Structural Equation Modeling (SEM), objective functions often exhibit
non-convex regions. When use_damped = TRUE, Powell's damping
strategy is applied to the update vectors to preserve the positive
definiteness of the Hessian approximation even when the curvature condition
is not naturally met.
Value
A list containing optimization results and iteration metadata.
References
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. Springer.
Fletcher, R. (1987). Practical Methods of Optimization. Wiley.
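Examples
An illustrative sketch, not drawn from the package itself: the Rosenbrock test function and its gradient are defined here only for demonstration, use_damped is the control flag described in Details, and the result is simply inspected with str() because the element names of the returned list are not specified in this manual.
## Rosenbrock function and its analytic gradient (defined here for illustration)
rosenbrock <- function(x) (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
rosen_grad <- function(x) {
  c(-2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2),
    200 * (x[2] - x[1]^2))
}

## Damped BFGS run from the standard starting point
fit <- bfgs(start = c(-1.2, 1),
            objective = rosenbrock,
            gradient = rosen_grad,
            control = list(use_damped = TRUE))
str(fit)  # inspect the optimization results and iteration metadata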
Davidon-Fletcher-Powell (DFP) Quasi-Newton Optimization
Description
Implements the DFP Quasi-Newton algorithm with a Strong Wolfe line search and optional Powell's damping for non-linear optimization.
Usage
dfp(
start,
objective,
gradient = NULL,
hessian = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Starting values for the optimization parameters. |
| objective | Function. The objective function to minimize. |
| gradient | Function (optional). Gradient of the objective function. |
| hessian | Function (optional). Hessian matrix for final verification. |
| lower | Numeric vector. Lower bounds for box constraints. |
| upper | Numeric vector. Upper bounds for box constraints. |
| control | List. Control parameters including convergence flags (see Details). |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Details
dfp is a Quasi-Newton method that maintains and updates an approximation
of the inverse Hessian matrix. Historically, it was the first Quasi-Newton
method discovered (Davidon, 1959) and later refined by Fletcher and Powell (1963).
DFP vs. BFGS: Both DFP and BFGS belong to the Broyden family of Quasi-Newton methods. While BFGS is generally preferred for its self-correcting properties regarding inaccuracies in the line search, DFP remains a fundamental algorithm that can be more sensitive to the local curvature of the objective function. In certain Structural Equation Modeling (SEM) contexts, DFP can provide alternative convergence paths when BFGS reaches a plateau.
Strong Wolfe Line Search:
To ensure the positive definiteness of the inverse Hessian update and guarantee
global convergence, this implementation employs a Strong Wolfe line search.
This identifies a step length \alpha that satisfies both:
- Sufficient Decrease (Armijo Rule): f(x + \alpha p) \le f(x) + c_1 \alpha \nabla f(x)^T p.
- Curvature Condition: | \nabla f(x + \alpha p)^T p | \le c_2 | \nabla f(x)^T p |.
Powell's Damping Strategy:
Structural Equation Models often involve non-convex fitting functions.
When use_damped = TRUE, the algorithm applies Powell's damping
to the y vector used in the update formula. This ensures that the curvature
condition s^T y > 0 is maintained even in non-convex regions, preserving
the stability of the inverse Hessian approximation.
Value
A list containing optimization results and iteration metadata.
References
Davidon, W. C. (1959). Variable Metric Method for Minimization. AEC Research and Development Report, ANL-5990.
Fletcher, R., & Powell, M. J. D. (1963). A Rapidly Convergent Descent Method for Minimization. The Computer Journal, 6(2), 163-168.
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. Springer.
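Examples
A minimal sketch comparing dfp() with bfgs() on a strictly convex quadratic; the quad and gquad helpers are defined here only for demonstration, and the element names of the returned lists are not assumed.
## Simple strictly convex quadratic with its minimum at c(1, -2)
quad  <- function(x) (x[1] - 1)^2 + 4 * (x[2] + 2)^2
gquad <- function(x) c(2 * (x[1] - 1), 8 * (x[2] + 2))

fit_dfp  <- dfp(start = c(5, 5), objective = quad, gradient = gquad)
fit_bfgs <- bfgs(start = c(5, 5), objective = quad, gradient = gquad)
str(fit_dfp)   # both runs should end near c(1, -2)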
Dogleg Trust-Region Optimization
Description
Implements the standard Powell's Dogleg Trust-Region algorithm for non-linear optimization.
Usage
dogleg(
start,
objective,
gradient = NULL,
hessian = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Starting values for the optimization parameters. |
| objective | Function. The objective function to minimize. |
| gradient | Function (optional). Gradient of the objective function. |
| hessian | Function (optional). Hessian matrix of the objective function. |
| lower | Numeric vector. Lower bounds for box constraints. |
| upper | Numeric vector. Upper bounds for box constraints. |
| control | List. Control parameters including convergence flags (see Details). |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Details
This function implements the classic Dogleg method within a Trust-Region framework, based on the strategy proposed by Powell (1970).
Trust-Region vs. Line Search:
Trust-Region methods define a neighborhood around the current point (the trust region
with radius \Delta) where a local quadratic model is assumed to be reliable.
Unlike Line Search methods that first determine a search direction and then
find an appropriate step length, this approach constrains the step size first
and then finds the optimal update within that boundary.
Powell's Dogleg Trajectory: The "Dogleg" trajectory is a piecewise linear path connecting:
- The current point.
- The Cauchy Point (p_C): the minimizer of the quadratic model along the steepest descent direction.
- The Newton Point (p_N): the unconstrained minimizer of the quadratic model (-B^{-1} g).
The algorithm selects a step along this path such that it minimizes the quadratic
model while remaining within the radius \Delta.
Relationship to Double Dogleg:
While the double_dogleg algorithm (Dennis and Mei, 1979) introduces a bias
point to follow the Newton direction more closely, this standard Dogleg follows
the original two-segment trajectory.
Value
A list containing optimization results and iteration metadata.
References
Powell, M. J. D. (1970). A Hybrid Method for Nonlinear Equations. Numerical Methods for Nonlinear Algebraic Equations.
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. Springer.
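Examples
An illustrative sketch supplying analytic first and second derivatives; the rosenbrock, rosen_grad, and rosen_hess helpers are defined here only for demonstration and are not part of the package.
## Rosenbrock function with analytic gradient and Hessian
rosenbrock <- function(x) (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
rosen_grad <- function(x) c(-2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2),
                            200 * (x[2] - x[1]^2))
rosen_hess <- function(x) matrix(c(2 - 400 * x[2] + 1200 * x[1]^2, -400 * x[1],
                                   -400 * x[1],                     200),
                                 nrow = 2, byrow = TRUE)

fit <- dogleg(start = c(-1.2, 1), objective = rosenbrock,
              gradient = rosen_grad, hessian = rosen_hess)
str(fit)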
Double Dogleg Trust-Region Optimization
Description
Implements the Double Dogleg Trust-Region algorithm for non-linear optimization.
Usage
double_dogleg(
start,
objective,
gradient = NULL,
hessian = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Starting values for the optimization parameters. |
| objective | Function. The objective function to minimize. |
| gradient | Function (optional). Gradient of the objective function. |
| hessian | Function (optional). Hessian matrix of the objective function. |
| lower | Numeric vector. Lower bounds for box constraints. |
| upper | Numeric vector. Upper bounds for box constraints. |
| control | List. Control parameters including convergence flags starting with 'use_'. |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Details
This function implements the Double Dogleg method within a Trust-Region framework, primarily based on the work of Dennis and Mei (1979).
Trust-Region vs. Line Search:
While Line Search methods (like BFGS) first determine a search direction and then
find an appropriate step length, Trust-Region methods define a neighborhood
around the current point (the trust region with radius \Delta) where a local
quadratic model is assumed to be reliable. The algorithm then finds a step that
minimizes this model within the radius. This approach is generally more robust,
especially when the Hessian is not positive definite.
Powell's Dogleg vs. Double Dogleg:
Powell's original Dogleg method (1970) constructs a trajectory consisting of
two line segments: one from the current point to the Cauchy point, and another
from the Cauchy point to the Newton point. The "Double Dogleg" modification
by Dennis and Mei (1979) introduces an intermediate "bias" point (p_W)
along the Newton direction.
- Cauchy Point (p_C): the minimizer of the quadratic model along the steepest descent direction.
- Newton Point (p_N): the minimizer of the quadratic model (-B^{-1} g).
- Double Dogleg Point (p_W): a point defined as \gamma \cdot p_N, where \gamma is a scaling factor (bias) that ensures the path stays closer to the Newton direction while maintaining monotonic descent in the model.
This modification allows the algorithm to perform more like a Newton method earlier in the optimization process compared to the standard Dogleg.
Value
A list containing optimization results and iteration metadata.
References
Dennis, J. E., & Mei, H. H. (1979). Two New Unconstrained Optimization Algorithms which use Function and Gradient Values. Journal of Optimization Theory and Applications, 28(4), 453-482.
Powell, M. J. D. (1970). A Hybrid Method for Nonlinear Equations. Numerical Methods for Nonlinear Algebraic Equations.
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. Springer.
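Examples
A minimal sketch using only an objective function; the himmelblau helper is defined here for demonstration, and leaving gradient and hessian as NULL assumes the solver then falls back to numerical differentiation (as suggested by the package's numDeriv import and the fast_grad/fast_hess utilities below).
## Himmelblau-style test function; derivatives left NULL
himmelblau <- function(x) (x[1]^2 + x[2] - 11)^2 + (x[1] + x[2]^2 - 7)^2

fit <- double_dogleg(start = c(1, 1), objective = himmelblau)
str(fit)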
Fast Numerical Gradient
Description
Provides a high-speed numerical gradient using forward or central differences.
Usage
fast_grad(f, x, diff_method = c("forward", "central"), ...)
Arguments
| f | Function. The objective function. |
| x | Numeric vector. Parameters at which to evaluate the gradient. |
| diff_method | String. Differentiation method: "forward" or "central". |
| ... | Additional arguments passed to f. |
Value
A numeric vector of gradients.
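Examples
A short sketch comparing the numerical gradient with an analytic one; the test function and tolerance are illustrative choices, not package defaults.
f  <- function(x) sum(x^2) + prod(x)
x0 <- c(1, 2, 3)

g_num <- fast_grad(f, x0, diff_method = "central")
g_ana <- 2 * x0 + prod(x0) / x0   # analytic gradient, valid because no element is zero
all.equal(g_num, g_ana, tolerance = 1e-5)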
Fast Numerical Hessian
Description
High-speed numerical Hessian calculation using finite differences.
Usage
fast_hess(f, x, diff_method = c("forward", "central"), ...)
Arguments
| f | Function. The objective function. |
| x | Numeric vector. Parameters at which to evaluate the Hessian. |
| diff_method | String. Differentiation method: "forward" or "central". |
| ... | Additional arguments passed to f. |
Value
A symmetric Hessian matrix.
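Examples
A short sketch checking the numerical Hessian against an analytic one; the test function and the fairly loose tolerance (finite-difference Hessians are less accurate than gradients) are illustrative choices.
f  <- function(x) x[1]^2 * x[2] + exp(x[2])
x0 <- c(1.5, 0.5)

H_num <- fast_hess(f, x0, diff_method = "central")
H_ana <- matrix(c(2 * x0[2], 2 * x0[1],
                  2 * x0[1], exp(x0[2])),
                nrow = 2, byrow = TRUE)
all.equal(H_num, H_ana, tolerance = 1e-4, check.attributes = FALSE)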
Fast Numerical Jacobian
Description
Calculates the Jacobian matrix for a vector-valued function (e.g., residuals).
Usage
fast_jac(f_res, x, diff_method = c("forward", "central"), ...)
Arguments
| f_res | Function. A function returning a vector of residuals. |
| x | Numeric vector. Parameters at which to evaluate the Jacobian. |
| diff_method | String. Differentiation method: "forward" or "central". |
| ... | Additional arguments passed to f_res. |
Value
A Jacobian matrix of dimension m x n (rows correspond to residuals, columns to parameters).
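Examples
An illustrative sketch using the residuals of the Rosenbrock problem; res_fun and the analytic Jacobian are defined here only for demonstration.
## Residuals of the Rosenbrock problem in least-squares form
res_fun <- function(x) c(1 - x[1], 10 * (x[2] - x[1]^2))
x0 <- c(-1.2, 1)

J_num <- fast_jac(res_fun, x0, diff_method = "forward")
J_ana <- matrix(c(-1,            0,
                  -20 * x0[1],  10),
                nrow = 2, byrow = TRUE)
all.equal(J_num, J_ana, tolerance = 1e-4, check.attributes = FALSE)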
Gauss-Newton Optimization
Description
Implements a full-featured Gauss-Newton algorithm for non-linear optimization, specifically optimized for Structural Equation Modeling (SEM).
Usage
gauss_newton(
start,
objective,
residual = NULL,
gradient = NULL,
hessian = NULL,
jac = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Starting values for the optimization parameters. |
| objective | Function. The objective function to minimize. |
| residual | Function (optional). Function that returns the residuals vector. |
| gradient | Function (optional). Gradient of the objective function. |
| hessian | Function (optional). Hessian matrix of the objective function. |
| jac | Function (optional). Jacobian matrix of the residuals. |
| lower | Numeric vector. Lower bounds for box constraints. |
| upper | Numeric vector. Upper bounds for box constraints. |
| control | List. Control parameters including convergence flags (see Details). |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Details
gauss_newton is a specialized optimization algorithm for least-squares
and Maximum Likelihood problems where the objective function can be
expressed as a sum of squared residuals.
Scaling and SEM Consistency:
To ensure consistent simulation results and standard error (SE) calculations,
this implementation adjusts the Gradient (2J^T r) and the Approximate
Hessian (2J^T J) to match the scale of the Maximum Likelihood (ML)
fitting function F_{ML}. This alignment is critical when calculating
asymptotic covariance matrices using the formula \frac{2}{n} H^{-1}.
Comparison with Newton-Raphson:
Unlike newton_raphson or modified_newton, which require the full
second-order Hessian, Gauss-Newton approximates the Hessian using the
Jacobian of the residuals. This is computationally more efficient and
provides a naturally positive-semidefinite approximation, though a ridge
adjustment is still provided for numerical stability.
Ridge Adjustment Strategy:
The function includes a "Ridge Rescue" mechanism. If the approximate Hessian
is singular or poorly conditioned for Cholesky decomposition, it iteratively
adds a diagonal ridge (\tau I) until numerical stability is achieved.
Value
A list containing optimization results and iteration metadata.
References
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. Springer.
Bollen, K. A. (1989). Structural Equations with Latent Variables. Wiley.
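Examples
A generic least-squares sketch (the SEM-specific scaling noted in Details is not reflected here); the residual, objective, and Jacobian helpers are defined only for demonstration.
## Rosenbrock expressed as a sum of squared residuals: f(x) = sum(r(x)^2)
res_fun <- function(x) c(1 - x[1], 10 * (x[2] - x[1]^2))
obj_fun <- function(x) sum(res_fun(x)^2)
jac_fun <- function(x) matrix(c(-1,           0,
                                -20 * x[1],  10),
                              nrow = 2, byrow = TRUE)

fit <- gauss_newton(start = c(-1.2, 1),
                    objective = obj_fun,
                    residual = res_fun,
                    jac = jac_fun)
str(fit)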
Fast Positive Definiteness Check
Description
Efficiently checks if a matrix is positive definite using Cholesky decomposition.
Usage
is_pd_fast(M)
Arguments
M |
Matrix. The matrix to verify. |
Value
Logical. TRUE if positive definite, FALSE otherwise.
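Examples
A brief sketch with one matrix that is positive definite by construction and one that is indefinite.
A <- crossprod(matrix(rnorm(9), 3, 3)) + diag(3)  # positive definite by construction
B <- diag(c(1, -1, 2))                            # indefinite

is_pd_fast(A)  # TRUE
is_pd_fast(B)  # FALSE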
Limited-memory BFGS with Box Constraints (L-BFGS-B)
Description
Performs bound-constrained minimization using the L-BFGS-B algorithm. This implementation handles box constraints via Generalized Cauchy Point (GCP) estimation and subspace minimization, featuring a limited-memory (two-loop recursion) inverse Hessian approximation.
Usage
l_bfgs_b(
start,
objective,
gradient = NULL,
hessian = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Initial values for the parameters. |
| objective | Function. The scalar objective function to be minimized. |
| gradient | Function (optional). Returns the gradient vector. If NULL, the gradient is approximated numerically. |
| hessian | Function (optional). Returns the Hessian matrix. Used for final positive definiteness verification when that check is enabled in control. |
| lower, upper | Numeric vectors. Lower and upper bounds for the parameters. Can be scalars if all parameters share the same bounds. |
| control | A list of control parameters, including convergence flags. |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Value
A list containing optimization results and metadata.
Comparison with Existing Functions
This function adds three features for rigorous convergence control. First, it
applies an AND rule: all selected convergence criteria must be satisfied
simultaneously. Second, users can choose among eight distinct criteria
(e.g., changes in f, x, gradient, or predicted decrease) instead of relying
on fixed defaults. Third, it provides an optional verification using the
Hessian computed from derivatives (analytically when provided, or via numerical
differentiation). Checking the positive definiteness of this Hessian at the final
solution reduces the risk of declaring convergence at non-minimizing stationary
points, such as saddle points.
References
Byrd, R. H., Lu, P., Nocedal, J., & Zhu, C. (1995). A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16(5), 1190-1208.
Morales, J. L., & Nocedal, J. (2011). Remark on Algorithm 778: L-BFGS-B, Fortran subroutines for large-scale bound-constrained optimization. ACM Transactions on Mathematical Software, 38(1), 1-4.
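Examples
A bound-constrained sketch; the quadratic test problem is defined here for demonstration, and the closing positive definiteness check only illustrates the verification idea from the comparison above (this quadratic's Hessian is constant and positive definite).
## Convex quadratic whose unconstrained minimum c(1, -2) violates the bound x[2] >= 0
quad  <- function(x) (x[1] - 1)^2 + 4 * (x[2] + 2)^2
gquad <- function(x) c(2 * (x[1] - 1), 8 * (x[2] + 2))

fit <- l_bfgs_b(start = c(3, 3), objective = quad, gradient = gquad,
                lower = c(-Inf, 0), upper = Inf)
str(fit)  # the solution should sit on the active bound, near c(1, 0)

## Post-hoc verification of positive definiteness at a candidate solution
is_pd_fast(fast_hess(quad, c(1, 0)))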
Modified Newton-Raphson Optimization
Description
Implements an optimized Newton-Raphson algorithm for non-linear optimization featuring dynamic ridge adjustment and backtracking line search.
Usage
modified_newton(
start,
objective,
gradient = NULL,
hessian = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Starting values for the optimization parameters. |
| objective | Function. The objective function to minimize. |
| gradient | Function (optional). Gradient of the objective function. |
| hessian | Function (optional). Hessian matrix of the objective function. |
| lower | Numeric vector. Lower bounds for box constraints. |
| upper | Numeric vector. Upper bounds for box constraints. |
| control | List. Control parameters including convergence flags (see Details). |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Details
modified_newton is a line search optimization algorithm that utilizes
second-order curvature information (the Hessian matrix) to find the minimum
of an objective function.
Modified Newton vs. Trust-Region:
Unlike the dogleg and double_dogleg functions which use a
Trust-Region approach to constrain the step size, this function uses a
Line Search approach. It first determines the Newton direction
(the solution to H \Delta x = -g) and then performs a backtracking line
search to find a step length \alpha that satisfies the sufficient decrease
condition (Armijo condition).
Dynamic Ridge Adjustment:
If the Hessian matrix H is not positive definite (making it unsuitable for
Cholesky decomposition), the algorithm applies a dynamic ridge adjustment.
A diagonal matrix \tau I is added to the Hessian, where \tau is
increased until the matrix becomes positive definite. This ensures the
search direction always remains a descent direction.
Differentiation Methods: The function allows for independent selection of differentiation methods for the gradient and Hessian:
- forward: standard forward-difference numerical differentiation.
- central: central-difference (more accurate but slower).
- complex: complex-step differentiation (highly accurate for gradients).
- richardson: Richardson extrapolation via the numDeriv package.
Value
A list containing optimization results and iteration metadata.
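Examples
An illustrative sketch: the objective has a saddle point at the origin (its Hessian there is indefinite), so starting nearby exercises the dynamic ridge adjustment described above. The helper functions are defined here for demonstration only.
## Objective with a saddle point at the origin
f_sad <- function(x) x[1]^4 - 2 * x[1]^2 + x[1] * x[2] + x[2]^2
g_sad <- function(x) c(4 * x[1]^3 - 4 * x[1] + x[2], x[1] + 2 * x[2])

fit <- modified_newton(start = c(0.1, 0.1), objective = f_sad, gradient = g_sad)
str(fit)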
Pure Newton-Raphson Optimization
Description
Implements the standard Newton-Raphson algorithm for non-linear optimization without Hessian modifications or ridge adjustments.
Usage
newton_raphson(
start,
objective,
gradient = NULL,
hessian = NULL,
lower = -Inf,
upper = Inf,
control = list(),
...
)
Arguments
| start | Numeric vector. Starting values for the optimization parameters. |
| objective | Function. The objective function to minimize. |
| gradient | Function (optional). Gradient of the objective function. |
| hessian | Function (optional). Hessian matrix of the objective function. |
| lower | Numeric vector. Lower bounds for box constraints. |
| upper | Numeric vector. Upper bounds for box constraints. |
| control | List. Control parameters including convergence flags (see Details). |
| ... | Additional arguments passed to objective, gradient, and Hessian functions. |
Details
newton_raphson provides a classic second-order optimization approach.
Comparison with Modified Newton:
Unlike modified_newton, this function does not apply dynamic ridge
adjustments (Levenberg-Marquardt style) to the Hessian. If the Hessian is
singular or cannot be inverted via solve(), the algorithm will
terminate. This "pure" implementation is often preferred in simulation
studies where the behavior of the exact Newton step is of interest.
Predicted Decrease:
This function explicitly calculates the Predicted Decrease (pred_dec),
which is the expected reduction in the objective function value based on
the local quadratic model:
m(p) = f + g^T p + \frac{1}{2} p^T H p
Stability and Simulations:
All return values are explicitly cast to scalars (e.g., as.numeric,
as.logical) to ensure stability when the function is called within
large-scale simulation loops or packaged into data frames.
Value
A list containing optimization results and iteration metadata.
References
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization. Springer.
Bollen, K. A. (1989). Structural Equations with Latent Variables. Wiley.
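Examples
A minimal sketch on a strictly convex objective, where the exact Newton step is well defined everywhere; the helper functions are defined here only for demonstration.
## Strictly convex objective with analytic gradient and Hessian
f_cvx <- function(x) exp(x[1]) + x[1]^2 + (x[2] - 1)^2
g_cvx <- function(x) c(exp(x[1]) + 2 * x[1], 2 * (x[2] - 1))
h_cvx <- function(x) matrix(c(exp(x[1]) + 2, 0,
                              0,             2), nrow = 2)

fit <- newton_raphson(start = c(1, 3), objective = f_cvx,
                      gradient = g_cvx, hessian = h_cvx)
str(fit)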