| Title: | Geometry-Adaptive Lyapunov-Assured Hybrid Optimizer with Softplus Reparameterization and Trust-Region Control |
|---|---|
| Description: | Implements the GALAHAD algorithm (Geometry-Adaptive Lyapunov-Assured Hybrid Optimizer), updated in version 2 to replace the hard-clamp positivity constraint of v1 with a numerically smooth softplus reparameterization, add rho-based trust-region adaptation (actual vs. predicted objective reduction), extend convergence detection to include both absolute and relative function-stall criteria, and enrich the per-iteration history with Armijo backtrack counts and trust-region quality ratios. Parameters constrained to be positive (rates, concentrations, scale parameters) are handled in a transformed z-space via the softplus map so that gradients remain well-defined at the constraint boundary. A two-partition API (positive / euclidean) replaces the three-way T/P/E partition of v1; the legacy form is still accepted for backwards compatibility. Designed for biological modeling problems (germination, dose-response, prion RT-QuIC, survival) where rates, concentrations, and unconstrained coefficients coexist. Developed at the Minnesota Center for Prion Research and Outreach (MNPRO), University of Minnesota. Based on Conn et al. (2000) <doi:10.1137/1.9780898719857>, Barzilai and Borwein (1988) <doi:10.1093/imanum/8.1.141>, Xu and An (2024) <doi:10.48550/arXiv.2409.14383>, Polyak (1969) <doi:10.1016/0041-5553(69)90035-4>, Nocedal and Wright (2006, ISBN:978-0-387-30303-1), and Dugas et al. (2009) <https://www.jmlr.org/papers/v10/dugas09a.html>. |
| Authors: | Richard A. Feiss [aut, cre] (ORCID: <https://orcid.org/0009-0008-0409-6042>) |
| Maintainer: | Richard A. Feiss <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 2.0.0 |
| Built: | 2026-06-06 07:33:26 UTC |
| Source: | https://github.com/cran/GALAHAD |
Minimises a smooth objective over a mixed-geometry
parameter space. Parameters declared as positive are handled
via a softplus reparameterization so that positivity is guaranteed
analytically and gradients remain smooth at the constraint boundary.
Parameters declared as euclidean are unconstrained.
The algorithm combines:
Diagonal Hessian preconditioning via a secant-based exponential moving average.
Adaptive step-size selection: Polyak (when V_star is
supplied) -> Barzilai-Borwein -> constant default.
Armijo backtracking line-search with a configurable monotone or rho-accept mode.
Trust-region projection with rho-based radius adaptation (actual vs. predicted reduction).
Dual convergence criterion: gradient + step tolerance (primary) and absolute or relative function-stall over five iterations (secondary).
GALAHAD(V, gradV, theta0, parts, control = list(), callback = NULL)GALAHAD(V, gradV, theta0, parts, control = list(), callback = NULL)
V |
Objective function: |
gradV |
Gradient function: |
theta0 |
Initial parameter vector, numeric of length |
parts |
Named list specifying the parameter partition. See section Parameter partitions above. |
control |
Optional named list of control parameters. See section Control parameters above. |
callback |
Optional function called at the end of each iteration.
Receives a list with elements |
A named list with the following elements:
thetaNumeric vector; final parameter estimate in the
original space (theta[positive] > 0 is
guaranteed).
valueScalar; at the final iterate.
grad_infSup-norm of the gradient at the final iterate.
convergedLogical; TRUE if any convergence
criterion was satisfied before max_iter.
statusCharacter; "converged" or
"max_iter".
reasonCharacter; one of "GRAD_TOL",
"FUNC_STALL_ABS", "FUNC_STALL_REL",
"MAX_ITER".
iterationsInteger; number of iterations performed.
historyA data.frame with one row per iteration
and columns iter, f, g_inf, step_norm,
df (change in ), eta, method
(step-size method: "POLYAK", "BB", or
"DEFAULT"), armijo_iters, pred_red (predicted
reduction), and rho (actual / predicted reduction).
diagnosticsList with mean_L_hat (mean diagonal
Hessian estimate at termination), final_delta (final trust
radius), enforce_monotone, use_rho_accept, and
parameterization ("softplus").
certificateList with converged and reason;
a compact convergence certificate.
The parts argument partitions the indices 1:p (where
) into two groups.
Preferred form:
positiveInteger indices of parameters constrained to
. Typical examples: rate constants, concentrations,
standard deviations, Hill coefficients. Internally reparameterized
via .
euclideanInteger indices of unconstrained parameters. Typical examples: location parameters, regression coefficients, log-EC50.
Legacy form (v1 compatibility):
list(T = ..., P = ..., E = ...) is also accepted.
Indices in T and P are mapped to positive;
indices in E are mapped to euclidean.
All indices must appear exactly once and cover 1:p.
Pass as named elements of the control list. Any omitted element
takes the default shown below.
max_iterMaximum iterations (default: 2000).
tol_gConvergence: sup-norm of gradient
tol_g (default: 1e-6).
tol_xConvergence: step norm tol_x
(default: 1e-9).
tol_fStall: absolute change in over last five
iterations tol_f (default: 1e-12).
tol_f_relStall: relative change in over last
five iterations tol_f_rel (default: 1e-9).
deltaInitial trust-region radius (default: 1.0).
delta_minMinimum trust-region radius
(default: 1e-6).
delta_maxMaximum trust-region radius
(default: 1e3).
eta0Initial / fallback step size (default: 1.0).
eta_minMinimum step size (default: 1e-8).
eta_maxMaximum step size (default: 10.0).
c1Armijo sufficient-decrease constant
(default: 1e-4).
armijo_maxMaximum Armijo backtrack steps
(default: 20).
L_estInitial diagonal Hessian estimate
(default: 1.0).
l_min, l_max
Bounds for Hessian diagonal entries
(defaults: 1e-8, 1e8).
V_starKnown or estimated minimum value of ,
used to compute Polyak step sizes. NULL disables Polyak
steps (default: NULL).
lambdaL2 regularization coefficient; adds
to the objective (default: 0).
enforce_monotoneLogical; if TRUE (default),
the Armijo condition must be satisfied. Set to FALSE together
with use_rho_accept = TRUE to use rho-accept mode instead.
use_rho_acceptLogical; if TRUE, accept a step
when rho_accept rather than requiring Armijo
decrease (default: FALSE).
rho_acceptMinimum acceptable rho for step acceptance in
rho-accept mode (default: 0.1).
rho_shrinkTrust-radius shrinks when
rho_shrink (default: 0.25).
rho_growTrust-radius grows when
rho_grow and step fills the region
(default: 0.75).
shrink_factorMultiplicative shrink factor for trust
radius (default: 0.5).
grow_factorMultiplicative growth factor for trust
radius (default: 1.5).
softplus_capUpper threshold in the softplus overflow
guard (default: 700). Rarely needs changing.
Barzilai, J., & Borwein, J. M. (1988). Two-point step size gradient methods. IMA Journal of Numerical Analysis, 8(1), 141–148. doi:10.1093/imanum/8.1.141
Conn, A. R., Gould, N. I. M., & Toint, P. L. (2000). Trust-Region Methods. SIAM. doi:10.1137/1.9780898719857
Dugas, C., Bengio, Y., Belisle, F., Nadeau, C., & Garcia, R. (2009). Incorporating functional knowledge in neural networks. Journal of Machine Learning Research, 10(42), 1239–1262. https://www.jmlr.org/papers/v10/dugas09a.html
Nocedal, J., & Wright, S. J. (2006). Numerical Optimization (2nd ed.). Springer. ISBN 978-0-387-30303-1.
Polyak, B. T. (1969). The conjugate gradient method in extremal problems. USSR Computational Mathematics and Mathematical Physics, 9(4), 94–112. doi:10.1016/0041-5553(69)90035-4
Xu, X., & An, C. (2024). A trust region method with regularized Barzilai-Borwein step-size for large-scale unconstrained optimization. arXiv preprint. doi:10.48550/arXiv.2409.14383
galahad_numgrad for finite-difference gradient
approximation; galahad_parts for a helper that builds
the parts list.
## -- Example 1: purely positive parameters (exponential decay) ---------- set.seed(1) t <- seq(0, 5, by = 0.5) y <- 3 * exp(-0.8 * t) + rnorm(length(t), sd = 0.05) obj <- function(theta) sum((y - theta[1] * exp(-theta[2] * t))^2) grd <- function(theta) { r <- y - theta[1] * exp(-theta[2] * t) dA <- -2 * sum(r * exp(-theta[2] * t)) dk <- -2 * sum(r * (-t) * theta[1] * exp(-theta[2] * t)) c(dA, dk) } fit <- GALAHAD(V = obj, gradV = grd, theta0 = c(A = 2, k = 0.5), parts = list(positive = c(1L, 2L), euclidean = integer(0))) fit$theta # close to c(3, 0.8) fit$converged fit$reason ## -- Example 2: mixed geometry (4PL dose-response) ----------------------- set.seed(42) dose <- c(0.01, 0.1, 1, 5, 10, 50, 100) y4pl <- 10 + (90 - 10) / (1 + exp(-1.5 * (log(dose) - log(5)))) + rnorm(length(dose), sd = 2) ## theta = c(bottom, hill, logEC50, top) ## hill > 0 is positive; bottom, logEC50, top are unconstrained obj4 <- function(th) { pred <- th[1] + (th[4] - th[1]) / (1 + exp(-th[2] * (log(dose) - th[3]))) sum((y4pl - pred)^2) } grd4 <- function(th) galahad_numgrad(obj4, th) fit4 <- GALAHAD(V = obj4, gradV = grd4, theta0 = c(bottom = 15, hill = 1, logEC50 = log(5), top = 80), parts = list(positive = 2L, # hill must be > 0 euclidean = c(1L, 3L, 4L))) fit4$theta fit4$converged ## -- Example 3: legacy v1 partition syntax (backwards compatible) -------- set.seed(7) p <- 10 theta_true <- abs(rnorm(p)) + 0.5 V_q <- function(th) sum((th - theta_true)^2) gV_q <- function(th) 2 * (th - theta_true) ## Old-style T / P / E partition -- still accepted in v2 fit_legacy <- GALAHAD(V = V_q, gradV = gV_q, theta0 = rep(1, p), parts = list(T = 1:3, P = 4:7, E = 8:10)) fit_legacy$converged## -- Example 1: purely positive parameters (exponential decay) ---------- set.seed(1) t <- seq(0, 5, by = 0.5) y <- 3 * exp(-0.8 * t) + rnorm(length(t), sd = 0.05) obj <- function(theta) sum((y - theta[1] * exp(-theta[2] * t))^2) grd <- function(theta) { r <- y - theta[1] * exp(-theta[2] * t) dA <- -2 * sum(r * exp(-theta[2] * t)) dk <- -2 * sum(r * (-t) * theta[1] * exp(-theta[2] * t)) c(dA, dk) } fit <- GALAHAD(V = obj, gradV = grd, theta0 = c(A = 2, k = 0.5), parts = list(positive = c(1L, 2L), euclidean = integer(0))) fit$theta # close to c(3, 0.8) fit$converged fit$reason ## -- Example 2: mixed geometry (4PL dose-response) ----------------------- set.seed(42) dose <- c(0.01, 0.1, 1, 5, 10, 50, 100) y4pl <- 10 + (90 - 10) / (1 + exp(-1.5 * (log(dose) - log(5)))) + rnorm(length(dose), sd = 2) ## theta = c(bottom, hill, logEC50, top) ## hill > 0 is positive; bottom, logEC50, top are unconstrained obj4 <- function(th) { pred <- th[1] + (th[4] - th[1]) / (1 + exp(-th[2] * (log(dose) - th[3]))) sum((y4pl - pred)^2) } grd4 <- function(th) galahad_numgrad(obj4, th) fit4 <- GALAHAD(V = obj4, gradV = grd4, theta0 = c(bottom = 15, hill = 1, logEC50 = log(5), top = 80), parts = list(positive = 2L, # hill must be > 0 euclidean = c(1L, 3L, 4L))) fit4$theta fit4$converged ## -- Example 3: legacy v1 partition syntax (backwards compatible) -------- set.seed(7) p <- 10 theta_true <- abs(rnorm(p)) + 0.5 V_q <- function(th) sum((th - theta_true)^2) gV_q <- function(th) 2 * (th - theta_true) ## Old-style T / P / E partition -- still accepted in v2 fit_legacy <- GALAHAD(V = V_q, gradV = gV_q, theta0 = rep(1, p), parts = list(T = 1:3, P = 4:7, E = 8:10)) fit_legacy$converged
Computes a central-difference numerical gradient of f at
theta. Useful when an analytical gradient is not available.
galahad_numgrad(f, theta, eps = 1e-07)galahad_numgrad(f, theta, eps = 1e-07)
f |
Scalar-valued function of a numeric vector. |
theta |
Numeric vector at which to evaluate the gradient. |
eps |
Step size scaling factor (default |
Numeric gradient vector of the same length as theta.
f <- function(th) sum((th - c(2, 3))^2) galahad_numgrad(f, c(0, 0)) # should be close to c(-4, -6)f <- function(th) sum((th - c(2, 3))^2) galahad_numgrad(f, c(0, 0)) # should be close to c(-4, -6)
Convenience constructor that validates and assembles the parts
argument for GALAHAD.
galahad_parts(positive = integer(0), euclidean = integer(0), p = NULL)galahad_parts(positive = integer(0), euclidean = integer(0), p = NULL)
positive |
Integer vector of indices (1-based) for parameters that
must be positive. May be |
euclidean |
Integer vector of indices for unconstrained parameters.
May be |
p |
Total number of parameters. If supplied, indices are validated
against |
A named list list(positive = ..., euclidean = ...).
galahad_parts(positive = c(1L, 2L), euclidean = 3L, p = 3)galahad_parts(positive = c(1L, 2L), euclidean = 3L, p = 3)