optimizer

Optimisation wrapper for conditional transformation models.

This module contains no mathematical logic — it only orchestrates calls to negative_log_likelihood() and build_constraints().

Analogous to R’s mltoptim.R (sequential solver attempts) and the maxtry restart mechanism in mlt().

class mltpy.optimizer.OptimizationResult(theta, log_likelihood, converged, n_iter, n_restarts, solver_message, n_outer_iter=None, kkt_residual=None, rho_final=None, mu_ineq=None, lambda_eq=None, constraint_A_ineq=None, constraint_C_eq=None)

Bases: object

Result of an optimize() call.

Parameters:

theta (ndarray[tuple[Any, ...], dtype[double]]) – Optimised parameter vector [theta_basis | beta].
log_likelihood (float) – Log-likelihood value at the optimum (i.e. −nll).
converged (bool) – Whether scipy reported successful convergence on at least one attempt.
n_iter (int) – Number of iterations used by the final (best) scipy run.
n_restarts (int) – Number of restarts that were needed (0 if first attempt succeeded).
solver_message (str) – scipy’s result.message from the best attempt, unchanged.
n_outer_iter (int | None) – Number of Powell-Hestenes-Rockafellar (PHR) outer iterations the auglag solver used. None for SLSQP / trust-constr fits (those solvers have no outer loop). Appears in repr() so users can see at a glance how many Lagrange-multiplier updates the fit required.
kkt_residual (float | None) – Final KKT residual reported by the auglag solver — the max(‖h(θ)‖∞, ‖min(g(θ), μ/ρ)‖∞, ‖∇L_A(θ)‖∞) value at the returned theta. None for SLSQP / trust-constr fits. Useful when converged=False to judge how close the run got before exhausting its outer-iteration budget.
rho_final (float | None) – Final penalty parameter ρ from the PHR auglag solver. None for SLSQP / trust-constr fits.
mu_ineq (ndarray[tuple[Any, ...], dtype[double]] | None) – Final inequality multipliers (≥ 0), shape (m_ineq,), from the auglag solver. For shift models m_ineq = order; for interaction models m_ineq = order * q. None for SLSQP / trust-constr.
lambda_eq (ndarray[tuple[Any, ...], dtype[double]] | None) – Final equality multipliers, shape (m_eq,), from the auglag solver. Shape (0,) when no equality constraints are imposed, i.e. when none of lower, upper, or fixed_params is set. None for SLSQP / trust-constr fits.
constraint_A_ineq (ndarray[tuple[Any, ...], dtype[double]] | None) – Inequality constraint matrix used during optimisation, shape (m_ineq, total_params). Set for auglag fits; None for SLSQP / trust-constr. Consumed by model.fit() to stash _A_ineq_ for downstream inference (e.g. penalty-augmented Hessian in vcov(regularize='active')).
constraint_C_eq (ndarray[tuple[Any, ...], dtype[double]] | None) – Equality constraint matrix used during optimisation, shape (m_eq, total_params). Non-None for auglag fits whenever any equality row is imposed — by lower / upper (boundary pins) or by fixed_params (arbitrary-index pins, issue #85), stacked in that order. None when no equality constraints were imposed or when the solver is not auglag.
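The kkt_residual formula above can be evaluated directly from its three ingredients. The following is a minimal sketch, not the library's implementation: the function name kkt_residual and its arguments are hypothetical, and it assumes the sign convention that inequality constraints are feasible when g(θ) ≥ 0.

```python
import numpy as np


def kkt_residual(h, g, mu, rho, grad_lagrangian):
    """Evaluate max(||h||_inf, ||min(g, mu/rho)||_inf, ||grad L_A||_inf).

    Hypothetical helper. Assumes inequalities are feasible when g >= 0.
    """
    def inf_norm(v):
        v = np.asarray(v, dtype=float)
        # An empty block (e.g. no equality constraints) contributes zero.
        return float(np.max(np.abs(v))) if v.size else 0.0

    g = np.asarray(g, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # Complementarity term: vanishes when each constraint is either
    # active (g = 0) or has a zero multiplier (mu = 0).
    complementarity = np.minimum(g, mu / rho)
    return max(inf_norm(h), inf_norm(complementarity), inf_norm(grad_lagrangian))


# No equality constraints; one active inequality (g = 0, mu > 0)
# and one slack inequality (g > 0, mu = 0).
residual = kkt_residual(
    h=np.empty(0),
    g=[0.0, 0.5],
    mu=[2.0, 0.0],
    rho=10.0,
    grad_lagrangian=[1e-9, -3e-9],
)
```

In this example the feasibility and complementarity terms are exactly zero, so the residual is driven by the gradient term alone.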

constraint_A_ineq: ndarray[tuple[Any, ...], dtype[float64]] | None = None

constraint_C_eq: ndarray[tuple[Any, ...], dtype[float64]] | None = None

converged: bool

kkt_residual: float | None = None

lambda_eq: ndarray[tuple[Any, ...], dtype[float64]] | None = None

log_likelihood: float

mu_ineq: ndarray[tuple[Any, ...], dtype[float64]] | None = None

n_iter: int

n_outer_iter: int | None = None

n_restarts: int

rho_final: float | None = None

solver_message: str

theta: ndarray[tuple[Any, ...], dtype[float64]]

class mltpy.optimizer.OptimizerConfig(solver='auglag', max_iter=1000, tol=1e-08, max_restarts=3, use_gradient=True, verbose=False, random_state=None, auglag_options=None, lower=None, upper=None, polish=True, fixed_params=None)

Bases: object

Settings for the optimisation run.

Parameters:

solver (Literal['auglag', 'slsqp', 'trust-constr']) – "auglag" (default), "slsqp", or "trust-constr". Auglag is the PHR augmented Lagrangian (matches R mlt / alabama::auglag and gives the best parity with the reference implementation). SLSQP and trust-constr remain opt-in alternatives — SLSQP is faster on easy problems, trust-constr handles ill-conditioned ones better.
max_iter (int) – Maximum number of iterations passed to scipy.
tol (float) – Convergence tolerance. Mapped to ftol (SLSQP) or gtol (trust-constr).
max_restarts (int) – Number of additional attempts after the first one. Analogous to maxtry in R’s mlt(). On each restart the starting point is perturbed and projected back to the feasible region.
use_gradient (bool) – If True (default), the analytical gradient from negative_log_likelihood() is passed to scipy. Set to False only for debugging.
verbose (bool) – If True, print a warning on each failed attempt.
random_state (int | Generator | None) – If an int, seeds the RNG used to perturb restart starting points so repeated fits with the same config and data are bit-identical. If a numpy.random.Generator, it is used directly. If None (default), draws are non-reproducible across runs.
auglag_options (AugLagOptions | None) – AugLagOptions controlling the PHR outer loop. Only consulted when solver="auglag"; ignored otherwise. None (default) uses AugLagOptions defaults (alabama parity).
lower (float | None) – If not None, fixes θ[0] = lower as an equality constraint (pins the lower-boundary Bernstein coefficient). Honoured by every solver: passes through to build_constraints() for SLSQP/trust-constr and build_constraint_matrices() for auglag.
upper (float | None) – If not None, fixes θ[n_params−1] = upper analogously.
polish (bool) – If True (default), run a Newton-CG polish step after auglag converges when no monotonicity constraints are active (interior-MLE fits). Uses trust-ncg seeded at auglag’s θ-hat with the analytical Hessian from hessian(). The polished θ is accepted only when NLL does not increase by more than 1e-12 and the monotonicity cone is preserved. Has no effect on slsqp / trust-constr solvers.
fixed_params (dict[int, float] | None) –
Optional {index: value} mapping that pins arbitrary entries of the full parameter vector [theta_b | beta | gamma] at the given values during optimisation. Useful for profile likelihood, score tests, and nested-model fits.
- solver="auglag" (issue #85) — each entry is appended as an equality row e_i · θ = value on the C_eq/d_eq block, stacked under any lower/upper rows. The pin holds to the auglag KKT tolerance (~1e-8); the equality row remains visible on OptimizationResult.constraint_C_eq so downstream consumers (vcov(regularize='active')) see it.
- solver="slsqp" / "trust-constr" (issue #86) — the pinned indices are eliminated from the optimisation problem entirely: scipy sees the smaller free-subvector objective and constraint matrix sliced to the free columns. The pin therefore holds to machine precision regardless of solver tolerance. constraint_C_eq is None on this path (no equality row exists).
- InteractionBasis is not yet supported — generalising to vec_C(Θ) indices needs an explicit ADR decision and raises NotImplementedError.
Indices outside [0, total_params) raise ValueError.
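The two pinning strategies described for lower, upper, and fixed_params can be sketched as follows. This is an illustration under stated assumptions, not the library's code: build_equality_rows and expand_free are hypothetical helpers, and n_params is assumed to be the number of basis coefficients at the front of the parameter vector.

```python
import numpy as np


def build_equality_rows(total_params, n_params, lower=None, upper=None,
                        fixed_params=None):
    """Auglag-style path: stack pins as rows of C_eq @ theta = d_eq.

    Hypothetical helper. Row order follows the text: lower, upper,
    then fixed_params entries.
    """
    pins = []
    if lower is not None:
        pins.append((0, lower))                # theta[0] = lower
    if upper is not None:
        pins.append((n_params - 1, upper))     # theta[n_params - 1] = upper
    for index, value in (fixed_params or {}).items():
        if not 0 <= index < total_params:
            raise ValueError(
                f"fixed_params index {index} outside [0, {total_params})"
            )
        pins.append((index, value))            # e_index . theta = value
    C_eq = np.zeros((len(pins), total_params))
    d_eq = np.zeros(len(pins))
    for row, (index, value) in enumerate(pins):
        C_eq[row, index] = 1.0
        d_eq[row] = value
    return C_eq, d_eq


def expand_free(theta_free, total_params, fixed_params):
    """Elimination path: rebuild the full vector from the free subvector.

    Hypothetical helper. Pinned entries are written exactly, so they hold
    to machine precision regardless of solver tolerance.
    """
    fixed_idx = np.array(sorted(fixed_params), dtype=int)
    free_mask = np.ones(total_params, dtype=bool)
    free_mask[fixed_idx] = False
    theta = np.empty(total_params)
    theta[free_mask] = theta_free
    theta[fixed_idx] = [fixed_params[int(i)] for i in fixed_idx]
    return theta


# Four basis coefficients plus one regression coefficient pinned at zero.
C_eq, d_eq = build_equality_rows(5, 4, lower=-3.0, fixed_params={4: 0.0})
theta_full = expand_free([1.0, 2.0, 3.0, 4.0], 5, {4: 0.0})
```

In the first path the solver still sees all five parameters and enforces the pins as constraints; in the second it only ever sees the four free ones.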

auglag_options: AugLagOptions | None = None

fixed_params: dict[int, float] | None = None

lower: float | None = None

max_iter: int = 1000

max_restarts: int = 3

polish: bool = True

random_state: int | Generator | None = None

solver: Literal['auglag', 'slsqp', 'trust-constr'] = 'auglag'

tol: float = 1e-08

upper: float | None = None

use_gradient: bool = True

verbose: bool = False
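How random_state and the restart perturbation interact can be sketched with NumPy alone. The helper perturbed_start, the noise scale, and the use of a running maximum to restore monotone basis coefficients are all assumptions made for illustration; the actual perturbation and projection used by the module are not documented here.

```python
import numpy as np


def perturbed_start(theta0, random_state=None, scale=0.1, n_basis=None):
    """Perturb a starting point, then restore monotone basis coefficients.

    Hypothetical helper. np.random.default_rng accepts an int seed, an
    existing Generator (returned unaltered), or None (fresh entropy),
    which matches the three documented random_state cases.
    """
    rng = np.random.default_rng(random_state)
    theta = np.asarray(theta0, dtype=float) + scale * rng.standard_normal(len(theta0))
    n_basis = len(theta) if n_basis is None else n_basis
    # Assumed feasible region: non-decreasing basis coefficients.
    # A running maximum gives a feasible point; it is a simple repair,
    # not the Euclidean projection onto the monotone cone.
    theta[:n_basis] = np.maximum.accumulate(theta[:n_basis])
    return theta


theta0 = np.array([-1.0, -0.5, 0.0, 0.5, 1.0, 0.3])  # 5 basis coefs + 1 beta
a = perturbed_start(theta0, random_state=42, n_basis=5)
b = perturbed_start(theta0, random_state=42, n_basis=5)
```

With the same integer seed the two starting points are identical, while passing a shared Generator advances its state so successive restarts draw different perturbations.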

mltpy.optimizer.optimize(basis, y, X=None, censoring=CensoringType.NONE, config=None, base_distribution='normal', weights=None, offset=None, scaling=None)

Fit Bernstein transformation model parameters by maximising log-likelihood.

Parameters:

basis (BernsteinBasis | InteractionBasis) – BernsteinBasis or InteractionBasis instance defining the response transformation.
y (ndarray[tuple[Any, ...], dtype[double]] | CensoredData) – Observations — plain NDArray for exact data, or CensoredData for censored data.
X (ndarray[tuple[Any, ...], dtype[double]] | None) – Optional covariate matrix, shape (n, q). If given, the last q entries of the returned theta are regression coefficients.
censoring (CensoringType) – Censoring type; passed through to the likelihood.
config (OptimizerConfig | None) – Optimisation settings. Defaults to OptimizerConfig with all defaults.
weights (ndarray[tuple[Any, ...], dtype[double]] | None) – Optional per-observation weights, shape (n,). Passed unchanged to the likelihood; no normalisation is applied.
offset (ndarray[tuple[Any, ...], dtype[double]] | None) – Optional per-observation offset, shape (n,). Added to h before distribution calls on every likelihood evaluation.
base_distribution (Literal['normal', 'logistic', 'min_extreme_value', 'max_extreme_value', 'exponential', 'laplace', 'cauchy'])
scaling (ndarray[tuple[Any, ...], dtype[double]] | None)

Returns:

Contains the optimised parameters, convergence status, and diagnostics. If all restarts fail, the best result found so far is returned with converged=False. The caller (model.py) decides whether to raise or warn.

Return type:

OptimizationResult
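The restart behaviour described under Returns (first attempt plus up to max_restarts retries, falling back to the best failed attempt) can be sketched in plain Python. Attempt and run_with_restarts are hypothetical names, and ranking failed attempts by lowest negative log-likelihood is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Attempt:
    """Outcome of one solver run (hypothetical stand-in for a solver result)."""
    nll: float
    success: bool
    message: str


def run_with_restarts(
    solve: Callable[[int], Attempt], max_restarts: int = 3
) -> Tuple[Attempt, bool, int]:
    """Return (attempt, converged, n_restarts)."""
    best: Optional[Attempt] = None
    for attempt_no in range(max_restarts + 1):  # first try + max_restarts retries
        attempt = solve(attempt_no)
        if attempt.success:
            # n_restarts is 0 when the very first attempt succeeds.
            return attempt, True, attempt_no
        # Assumed ranking: keep the failed attempt with the lowest NLL.
        if best is None or attempt.nll < best.nll:
            best = attempt
    assert best is not None
    return best, False, max_restarts


# Fake solver: fails twice, then succeeds on the second restart.
outcomes = [
    Attempt(12.0, False, "iteration limit"),
    Attempt(10.5, False, "iteration limit"),
    Attempt(10.1, True, "converged"),
]
result, converged, n_restarts = run_with_restarts(lambda k: outcomes[k], max_restarts=3)
```

When every attempt fails, the caller still receives a usable result with converged set to False and can decide whether to raise or warn.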