variables
Variable types and censoring classes for conditional transformation models.
- class mltpy.variables.CensoredData(exact, lower, upper, trunc_lower=None, trunc_upper=None)
Bases: object
Encodes n observations with optional censoring and truncation.
For observation i, exactly one censoring pattern is valid:
Exact: exact[i] is finite
Right-censored: exact[i] is NaN, lower[i] finite, upper[i] = +inf
Left-censored: exact[i] is NaN, lower[i] = -inf, upper[i] finite
Interval-censored: exact[i] is NaN, both bounds finite
Truncation bounds constrain the observable range: only observations inside [trunc_lower[i], trunc_upper[i]] can appear in the sample.
- Parameters:
exact (ndarray[tuple[Any, ...], dtype[double]]) – Length-n array. Use np.nan for censored observations.
lower (ndarray[tuple[Any, ...], dtype[double]]) – Length-n array of lower bounds. Use -np.inf for left-censored.
upper (ndarray[tuple[Any, ...], dtype[double]]) – Length-n array of upper bounds. Use +np.inf for right-censored.
trunc_lower (ndarray[tuple[Any, ...], dtype[double]] | None) – Optional length-n array of left truncation points.
trunc_upper (ndarray[tuple[Any, ...], dtype[double]] | None) – Optional length-n array of right truncation points.
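The four censoring patterns can be illustrated with plain NumPy. The sketch below does not call mltpy: the helper name classify_patterns is hypothetical, and setting lower[i] = upper[i] = exact[i] on exact rows is an assumption, since the text above only constrains exact[i] for that pattern.

```python
import numpy as np

def classify_patterns(exact, lower, upper):
    """Label each row with the single censoring pattern it matches."""
    exact = np.asarray(exact, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)

    is_exact = np.isfinite(exact)
    censored = np.isnan(exact)
    right = censored & np.isfinite(lower) & np.isposinf(upper)
    left = censored & np.isneginf(lower) & np.isfinite(upper)
    interval = censored & np.isfinite(lower) & np.isfinite(upper)

    # Rows are observations, stacked masks are patterns: each column must
    # contain exactly one True.
    matches = np.stack([is_exact, right, left, interval])
    if not np.all(matches.sum(axis=0) == 1):
        raise ValueError("every row must match exactly one censoring pattern")

    names = np.array(["exact", "right", "left", "interval"])
    return names[matches.argmax(axis=0)]

# One observation per pattern (bounds on the exact row are an assumption).
exact = np.array([2.5, np.nan, np.nan, np.nan])
lower = np.array([2.5, 4.0, -np.inf, 1.0])
upper = np.array([2.5, np.inf, 3.0, 2.0])

labels = classify_patterns(exact, lower, upper)
print(labels.tolist())  # ['exact', 'right', 'left', 'interval']
```

A row such as exact = NaN with lower = NaN matches no pattern, so this sketch rejects it with a ValueError.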
- classmethod interval_censored(lower, upper)
All observations interval-censored with known bounds [lower, upper].
- property is_exact_mask: ndarray[tuple[Any, ...], dtype[bool]]
True where the observation is exact.
- Type: Boolean mask
- property is_interval_censored_mask: ndarray[tuple[Any, ...], dtype[bool]]
True where the observation is interval-censored.
- Type: Boolean mask
- property is_left_censored_mask: ndarray[tuple[Any, ...], dtype[bool]]
True where the observation is left-censored.
- Type: Boolean mask
- property is_right_censored_mask: ndarray[tuple[Any, ...], dtype[bool]]
True where the observation is right-censored.
- Type: Boolean mask
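Masks like these are typically used to route each observation to its own likelihood term. The sketch below recomputes the four masks with plain NumPy rather than reading them from a CensoredData instance, and evaluates the textbook censored log-likelihood under a standard logistic distribution. The distribution and the local variable names are assumptions chosen for a closed-form example; this is not mltpy's implementation.

```python
import numpy as np

def logistic_cdf(x):
    # Standard logistic CDF, used only because it has a closed form.
    return 1.0 / (1.0 + np.exp(-x))

exact = np.array([0.3, np.nan, np.nan, np.nan, -1.2])
lower = np.array([0.3, 1.0, -np.inf, -0.5, -1.2])
upper = np.array([0.3, np.inf, 0.0, 0.5, -1.2])

# Local stand-ins for the four mask properties documented above.
is_exact_mask = np.isfinite(exact)
is_censored = np.isnan(exact)
is_right_censored_mask = is_censored & np.isfinite(lower) & np.isposinf(upper)
is_left_censored_mask = is_censored & np.isneginf(lower) & np.isfinite(upper)
is_interval_censored_mask = is_censored & np.isfinite(lower) & np.isfinite(upper)

# The masks partition the sample: each row belongs to exactly one of them.
counts = (is_exact_mask.astype(int) + is_right_censored_mask
          + is_left_censored_mask + is_interval_censored_mask)
assert np.all(counts == 1)

F = logistic_cdf
loglik = np.empty(exact.shape)
# Exact: log density f(y) = F(y) * (1 - F(y)) for the logistic distribution.
y = exact[is_exact_mask]
loglik[is_exact_mask] = np.log(F(y) * (1.0 - F(y)))
# Right-censored: log P(Y > lower).
loglik[is_right_censored_mask] = np.log(1.0 - F(lower[is_right_censored_mask]))
# Left-censored: log P(Y <= upper).
loglik[is_left_censored_mask] = np.log(F(upper[is_left_censored_mask]))
# Interval-censored: log P(lower < Y <= upper).
loglik[is_interval_censored_mask] = np.log(
    F(upper[is_interval_censored_mask]) - F(lower[is_interval_censored_mask])
)
total = loglik.sum()
```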
- classmethod left_truncated(y, trunc_lower, censored=None)
Left-truncated (delayed-entry) data, optionally with right censoring.
Mirrors R's Surv(start, stop, event) counting-process encoding used by the survival package: each observation is only at risk starting from trunc_lower[i]. When censored is given, the same boolean convention as right_censored() applies: True means the actual event time is above y[i].
- Parameters:
y (ndarray[tuple[Any, ...], dtype[double]]) – Observed value (exact event time, or right-censoring threshold).
trunc_lower (ndarray[tuple[Any, ...], dtype[double]]) – Length-n array of left-truncation points (delayed-entry times).
censored (ndarray[tuple[Any, ...], dtype[bool]] | None) – Optional boolean array of right-censoring indicators. None (default) treats all observations as exactly observed.
- Return type:
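The delayed-entry encoding described above can be sketched in plain NumPy. The function encode_left_truncated and its dictionary return value are hypothetical stand-ins for illustration, not the classmethod itself. The check that y[i] is not below trunc_lower[i], and the choice of lower[i] = upper[i] = y[i] on exact rows, are assumptions.

```python
import numpy as np

def encode_left_truncated(y, trunc_lower, censored=None):
    """Build exact/lower/upper arrays for left-truncated data (illustrative)."""
    y = np.asarray(y, dtype=float)
    trunc_lower = np.asarray(trunc_lower, dtype=float)
    if censored is None:
        # Default: every observation is exactly observed.
        censored = np.zeros(y.shape, dtype=bool)
    censored = np.asarray(censored, dtype=bool)

    # Assumption: a subject cannot be observed before it enters the risk set.
    if np.any(y < trunc_lower):
        raise ValueError("observed value lies below its left-truncation point")

    exact = np.where(censored, np.nan, y)   # NaN marks censored rows
    lower = y.copy()                        # event time is at least y[i]
    upper = np.where(censored, np.inf, y)   # +inf marks right censoring
    return {"exact": exact, "lower": lower, "upper": upper,
            "trunc_lower": trunc_lower}

enc = encode_left_truncated(
    y=[5.0, 8.0, 3.0],
    trunc_lower=[1.0, 2.0, 0.0],
    censored=[False, True, False],
)
```

When porting data from R, note that the event argument of Surv is 1 for an observed event and 0 for censoring, which is the opposite of the censored flag here, so censored would correspond to event == 0.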
- class mltpy.variables.CensoringType(*values)
Bases: Enum
Censoring regime for a dataset passed to the log-likelihood.
- INTERVAL = 4
- LEFT = 2
- NONE = 1
- RIGHT = 3
- class mltpy.variables.OrderedVariable(levels)
Bases: object
Ordered categorical response with K levels and K-1 transformation cutpoints.
Used by mltpy.tram.Polr (proportional-odds ordinal regression). A level-k observation (1 <= k <= K) is mapped to interval-censored bounds on a synthetic integer cut scale:
    level 1 → (-∞, 1]
    level k → (k-1, k] for 1 < k < K
    level K → (K-1, +∞)
Combined with mltpy.basis.OrdinalBasis, the cut position k selects one of K-1 Bernstein-like coefficients θ_k so that h(y_k) = θ_k exactly.
- Parameters:
levels (tuple[Any, ...]) – Tuple of ordered category labels (any hashable values). Must contain at least two distinct levels.
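The level-to-bounds mapping above is small enough to write out directly. The following is a minimal sketch using only the standard library; level_bounds is a hypothetical helper name and is not part of mltpy.

```python
import math

def level_bounds(k, K):
    """Return (lower, upper) for level k of K on the synthetic integer cut scale."""
    if K < 2 or not 1 <= k <= K:
        raise ValueError("need K >= 2 and 1 <= k <= K")
    lower = -math.inf if k == 1 else float(k - 1)  # first level is open below
    upper = math.inf if k == K else float(k)       # last level is open above
    return lower, upper

levels = ("low", "medium", "high")
bounds = {
    label: level_bounds(k, len(levels))
    for k, label in enumerate(levels, start=1)
}
# low    -> (-inf, 1.0)
# medium -> (1.0, 2.0)
# high   -> (2.0, inf)
```

With K = 3 there are two finite cut positions (1 and 2), matching the K-1 cutpoints mentioned above.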
- decode(codes)
Inverse of encode(): map 1..K codes back to labels.
- Parameters:
codes (ndarray[tuple[Any, ...], dtype[int_]]) – Integer codes of shape (n,) with values in {1, ..., K}.
- Returns:
Labels in their original dtype (object array for non-numeric).
- Return type:
- Raises:
ValueError – If any code is outside the valid range, or if any floating-point code is not integer-valued (e.g. 1.7).
TypeError – If codes has a non-numeric dtype (object, complex, …).
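The validation rules listed under Raises can be approximated as follows. This is a standalone sketch, not the method's source: decode_codes is a hypothetical function that takes the level tuple explicitly, and the dtype checks are one plausible reading of the documented behaviour.

```python
import numpy as np

def decode_codes(codes, levels):
    """Map 1..K integer codes back to labels, with the documented checks."""
    codes = np.asarray(codes)
    # Accept signed/unsigned integers and floats; reject everything else.
    if codes.dtype.kind not in "iuf":
        raise TypeError(f"codes must be integer or float, got dtype {codes.dtype}")
    if codes.dtype.kind == "f":
        # Floats are allowed only when they hold whole numbers (2.0, not 1.7).
        if not np.all(np.isfinite(codes)) or np.any(codes != np.round(codes)):
            raise ValueError("floating-point codes must be integer-valued")
        codes = codes.astype(np.int64)

    K = len(levels)
    if np.any(codes < 1) or np.any(codes > K):
        raise ValueError(f"codes must lie in 1..{K}")

    labels = np.asarray(levels)
    if labels.dtype.kind not in "biuf":
        # Non-numeric labels are kept as Python objects.
        labels = np.asarray(levels, dtype=object)
    return labels[codes - 1]  # codes are 1-based, NumPy indexing is 0-based

decoded = decode_codes([1, 3, 2], ("low", "medium", "high"))
print(decoded.tolist())  # ['low', 'high', 'medium']
```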
- classmethod from_labels(y, levels=None)
Coerce raw observations into (OrderedVariable, CensoredData).
Level inference order:
1. If levels is given explicitly, use it.
2. Else if y is a pandas ordered Categorical, use y.cat.categories.
3. Else: sorted unique values (deterministic ordering).
Validates that every observation lies in the resolved level set.
- Parameters:
- Returns:
variable carries the level vocabulary; censored_data has one row per observation with synthetic integer-cut bounds suitable for OrdinalBasis and the interval-censored likelihood path.
- Return type:
- Raises:
ValueError – If levels is empty or not unique, or if any observation in y is missing from the resolved level set.
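The level inference order and the validation step can be sketched without mltpy or pandas. The helpers resolve_levels and encode_labels below are hypothetical. The pandas branch is handled by duck typing on the .cat accessor, which is an assumption about how such a check might be written rather than a description of the actual code.

```python
def resolve_levels(y, levels=None):
    """Resolve the ordered level set following the three-step order above."""
    if levels is not None:
        resolved = tuple(levels)                 # 1. explicit levels win
    else:
        cat = getattr(y, "cat", None)            # present on categorical Series
        if cat is not None and getattr(cat, "ordered", False):
            resolved = tuple(cat.categories)     # 2. ordered Categorical
        else:
            resolved = tuple(sorted(set(y)))     # 3. sorted unique values

    if len(resolved) == 0 or len(set(resolved)) != len(resolved):
        raise ValueError("levels must be non-empty and unique")
    missing = set(y) - set(resolved)
    if missing:
        raise ValueError(f"observations not in level set: {sorted(missing, key=repr)}")
    return resolved

def encode_labels(y, levels):
    """Map each label to its 1-based position in levels."""
    index = {label: k for k, label in enumerate(levels, start=1)}
    return [index[v] for v in y]

y = ["b", "a", "c", "a"]
inferred = resolve_levels(y)                          # ('a', 'b', 'c')
codes = encode_labels(y, inferred)                    # [2, 1, 3, 1]
explicit = resolve_levels(y, levels=("c", "b", "a"))  # order comes from the caller
codes_explicit = encode_labels(y, explicit)           # [2, 3, 1, 3]
```

Passing levels explicitly is the only way in this sketch to get an ordering other than the sort order for plain lists, which is why the explicit argument is checked first.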