optimizer
Optimisation wrapper for conditional transformation models.
This module contains no mathematical logic: it only orchestrates calls to negative_log_likelihood() and build_constraints().
Analogous to R’s mltoptim.R (sequential solver attempts) and the maxtry restart mechanism in mlt().
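The restart idea described above can be illustrated with a minimal sketch. This is not the module's implementation: `minimize_with_restarts`, its `scale` argument, and the Gaussian perturbation are hypothetical choices made for illustration, and the projection back to the feasible region that the module performs is omitted here.

```python
import numpy as np
from scipy.optimize import minimize


def minimize_with_restarts(fun, x0, max_restarts=3, scale=0.1, seed=0):
    """Hypothetical helper: run SLSQP from x0, then from perturbed copies of x0."""
    rng = np.random.default_rng(seed)
    x0 = np.asarray(x0, dtype=float)
    best = None
    for attempt in range(max_restarts + 1):
        # Attempt 0 uses x0 as given; later attempts add Gaussian noise (assumption).
        start = x0 if attempt == 0 else x0 + scale * rng.standard_normal(x0.shape)
        res = minimize(fun, start, method="SLSQP", options={"maxiter": 200})
        if res.success:
            # Stop at the first successful attempt and report how many restarts it took.
            return res, attempt
        if best is None or res.fun < best.fun:
            best = res
    # Every attempt failed: hand back the lowest-objective result seen.
    return best, max_restarts
```

Returning the best failed result rather than raising mirrors the behaviour documented for optimize(), where the caller decides whether to raise or warn.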
- class mltpy.optimizer.OptimizationResult(theta, log_likelihood, converged, n_iter, n_restarts, solver_message, n_outer_iter=None, kkt_residual=None, rho_final=None, mu_ineq=None, lambda_eq=None, constraint_A_ineq=None, constraint_C_eq=None)
Bases: object
Result of an optimize() call.
- Parameters:
  - theta (ndarray[tuple[Any, ...], dtype[double]]) – Optimised parameter vector [theta_basis | beta].
  - log_likelihood (float) – Log-likelihood value at the optimum (i.e. −nll).
  - converged (bool) – Whether scipy reported successful convergence on at least one attempt.
  - n_iter (int) – Number of iterations used by the final (best) scipy run.
  - n_restarts (int) – Number of restarts that were needed (0 if first attempt succeeded).
  - solver_message (str) – scipy’s result.message from the best attempt, unchanged.
  - n_outer_iter (int | None) – Number of PHR outer iterations the auglag solver used. None for SLSQP / trust-constr fits (those solvers have no outer loop). Appears in repr() so users can see at a glance how many Lagrange-multiplier updates the fit required.
  - kkt_residual (float | None) – Final KKT residual reported by the auglag solver: the max(‖h(θ)‖∞, ‖min(g(θ), μ/ρ)‖∞, ‖∇L_A(θ)‖∞) value at the returned theta. None for SLSQP / trust-constr fits. Useful when converged=False to judge how close the run got before exhausting its outer-iteration budget.
  - rho_final (float | None) – Final penalty parameter ρ from the PHR auglag solver. None for SLSQP / trust-constr fits.
  - mu_ineq (ndarray[tuple[Any, ...], dtype[double]] | None) – Final inequality multipliers (≥ 0), shape (m_ineq,), from the auglag solver. For shift models m_ineq = order; for interaction models m_ineq = order * q. None for SLSQP / trust-constr.
  - lambda_eq (ndarray[tuple[Any, ...], dtype[double]] | None) – Final equality multipliers, shape (m_eq,), from the auglag solver. Currently always shape (0,) for all built-in models (no equality constraints are imposed). None for SLSQP / trust-constr fits.
  - constraint_A_ineq (ndarray[tuple[Any, ...], dtype[double]] | None) – Inequality constraint matrix used during optimisation, shape (m_ineq, total_params). Set for auglag fits; None for SLSQP / trust-constr. Consumed by model.fit() to stash _A_ineq_ for downstream inference (e.g. penalty-augmented Hessian in vcov(regularize='active')).
  - constraint_C_eq (ndarray[tuple[Any, ...], dtype[double]] | None) – Equality constraint matrix used during optimisation, shape (m_eq, total_params). Non-None for auglag fits whenever any equality row is imposed, by lower/upper (boundary pins) or by fixed_params (arbitrary-index pins, issue #85), stacked in that order. None when no equality constraints were imposed or when the solver is not auglag.
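The kkt_residual expression quoted in the parameter list can be evaluated with a few lines of NumPy. The sketch below is illustrative only: `inf_norm` and `kkt_residual` are hypothetical helper names, the inputs are toy values, and the sign convention g(θ) ≥ 0 for feasible inequalities is an assumption rather than something the documentation states.

```python
import numpy as np


def inf_norm(v):
    """Infinity norm that returns 0.0 for an empty vector (e.g. no equality rows)."""
    return float(np.max(np.abs(np.asarray(v, dtype=float)), initial=0.0))


def kkt_residual(h, g, mu, rho, grad_lagrangian):
    """Evaluate max(||h||_inf, ||min(g, mu/rho)||_inf, ||grad L_A||_inf)."""
    g = np.asarray(g, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # Elementwise min(g, mu/rho) is zero when each row is either active with a
    # non-negative multiplier or slack with a zero multiplier (assumed g >= 0 convention).
    complementarity = np.minimum(g, mu / rho)
    return max(inf_norm(h), inf_norm(complementarity), inf_norm(grad_lagrangian))


# Toy values: no equality constraints, one active and one slack inequality.
h = np.zeros(0)
g = np.array([0.0, 0.5])
mu = np.array([2.0, 0.0])
residual = kkt_residual(h, g, mu, rho=10.0, grad_lagrangian=np.array([1e-9, -3e-9]))
```

In this toy case the feasibility and complementarity terms vanish, so the residual is driven entirely by the gradient term.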
- class mltpy.optimizer.OptimizerConfig(solver='auglag', max_iter=1000, tol=1e-08, max_restarts=3, use_gradient=True, verbose=False, random_state=None, auglag_options=None, lower=None, upper=None, polish=True, fixed_params=None)
Bases: object
Settings for the optimisation run.
- Parameters:
  - solver (Literal['auglag', 'slsqp', 'trust-constr']) – "auglag" (default), "slsqp", or "trust-constr". Auglag is the PHR augmented Lagrangian (matches R mlt/alabama::auglag and gives the best parity with the reference implementation). SLSQP and trust-constr remain opt-in alternatives: SLSQP is faster on easy problems, trust-constr handles ill-conditioned ones better.
  - max_iter (int) – Maximum number of iterations passed to scipy.
  - tol (float) – Convergence tolerance. Mapped to ftol (SLSQP) or gtol (trust-constr).
  - max_restarts (int) – Number of additional attempts after the first one. Analogous to maxtry in R’s mlt(). On each restart the starting point is perturbed and projected back to the feasible region.
  - use_gradient (bool) – If True (default), the analytical gradient from negative_log_likelihood() is passed to scipy. Set to False only for debugging.
  - verbose (bool) – If True, print a warning on each failed attempt.
  - random_state (int | Generator | None) – If an int, seeds the RNG used to perturb restart starting points so repeated fits with the same config and data are bit-identical. If a numpy.random.Generator, it is used directly. If None (default), draws are non-reproducible across runs.
  - auglag_options (AugLagOptions | None) – AugLagOptions controlling the PHR outer loop. Only consulted when solver="auglag"; ignored otherwise. None (default) uses AugLagOptions defaults (alabama parity).
  - lower (float | None) – If not None, fixes θ[0] = lower as an equality constraint (pins the lower-boundary Bernstein coefficient). Honoured by every solver: passes through to build_constraints() for SLSQP/trust-constr and build_constraint_matrices() for auglag.
  - upper (float | None) – If not None, fixes θ[n_params−1] = upper analogously.
  - polish (bool) – If True (default), run a Newton-CG polish step after auglag converges when no monotonicity constraints are active (interior-MLE fits). Uses trust-ncg seeded at auglag’s θ-hat with the analytical Hessian from hessian(). The polished θ is accepted only when NLL does not increase by more than 1e-12 and the monotonicity cone is preserved. Has no effect on slsqp / trust-constr solvers.
  - fixed_params (dict[int, float] | None) – Optional {index: value} mapping that pins arbitrary entries of the full parameter vector [theta_b | beta | gamma] at the given values during optimisation. Useful for profile likelihood, score tests, and nested-model fits.
    - solver="auglag" (issue #85): each entry is appended as an equality row e_i · θ = value on the C_eq/d_eq block, stacked under any lower/upper rows. The pin holds to the auglag KKT tolerance (~1e-8); the equality row remains visible on OptimizationResult.constraint_C_eq so downstream consumers (vcov(regularize='active')) see it.
    - solver="slsqp" / "trust-constr" (issue #86): the pinned indices are eliminated from the optimisation problem entirely: scipy sees the smaller free-subvector objective and constraint matrix sliced to the free columns. The pin therefore holds to machine precision regardless of solver tolerance. constraint_C_eq is None on this path (no equality row exists).
    - InteractionBasis is not yet supported: generalising to vec_C(Θ) indices needs an explicit ADR decision and raises NotImplementedError.
    Indices outside [0, total_params) raise ValueError.
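The two fixed_params strategies described above can be contrasted in a short sketch. Both helpers are hypothetical and written only for illustration: `equality_rows` builds one unit row per pinned index in the spirit of the auglag path, and `minimize_with_pins` eliminates the pinned entries and optimises the free subvector in the spirit of the SLSQP / trust-constr path. Neither reproduces the library's internals, and monotonicity constraints are left out.

```python
import numpy as np
from scipy.optimize import minimize


def equality_rows(fixed_params, total_params):
    """Hypothetical helper: one unit row e_i per pinned index, plus target values."""
    idx = np.array(sorted(fixed_params), dtype=int)
    for i in idx:
        if not 0 <= i < total_params:
            raise ValueError(f"index {i} outside [0, {total_params})")
    C_eq = np.zeros((idx.size, total_params))
    C_eq[np.arange(idx.size), idx] = 1.0
    d_eq = np.array([fixed_params[int(i)] for i in idx], dtype=float)
    return C_eq, d_eq


def minimize_with_pins(fun, x0, fixed_params):
    """Hypothetical helper: optimise only the free entries, pinned ones never move."""
    x0 = np.asarray(x0, dtype=float)
    pinned = np.array(sorted(fixed_params), dtype=int)
    free = np.setdiff1d(np.arange(x0.size), pinned)

    def expand(x_free):
        # Rebuild the full vector from the free entries and the pinned constants.
        full = np.empty_like(x0)
        full[free] = x_free
        full[pinned] = [fixed_params[int(i)] for i in pinned]
        return full

    res = minimize(lambda x_free: fun(expand(x_free)), x0[free], method="SLSQP")
    return expand(res.x)
```

Because the pinned entries are written back as constants in `expand`, they equal the requested values exactly, which is the machine-precision property the elimination path is described as having.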
- mltpy.optimizer.optimize(basis, y, X=None, censoring=CensoringType.NONE, config=None, base_distribution='normal', weights=None, offset=None, scaling=None)
Fit Bernstein transformation model parameters by maximising log-likelihood.
- Parameters:
  - basis (BernsteinBasis | InteractionBasis) – BernsteinBasis instance defining the response transformation.
  - y (ndarray[tuple[Any, ...], dtype[double]] | CensoredData) – Observations: plain NDArray for exact data, or CensoredData for censored data.
  - X (ndarray[tuple[Any, ...], dtype[double]] | None) – Optional covariate matrix, shape (n, q). If given, the last q entries of the returned theta are regression coefficients.
  - censoring (CensoringType) – Censoring type; passed through to the likelihood.
  - config (OptimizerConfig | None) – Optimisation settings. Defaults to OptimizerConfig with all defaults.
  - weights (ndarray[tuple[Any, ...], dtype[double]] | None) – Optional per-observation weights, shape (n,). Passed unchanged to the likelihood; no normalisation is applied.
  - offset (ndarray[tuple[Any, ...], dtype[double]] | None) – Optional per-observation offset, shape (n,). Added to h before distribution calls on every likelihood evaluation.
  - base_distribution (Literal['normal', 'logistic', 'min_extreme_value', 'max_extreme_value', 'exponential', 'laplace', 'cauchy'])
- Returns:
  Contains the optimised parameters, convergence status, and diagnostics. If all restarts fail, the best result found so far is returned with converged=False. The caller (model.py) decides whether to raise or warn.
- Return type: