variables

Variable types and censoring classes for conditional transformation models.

class mltpy.variables.CensoredData(exact, lower, upper, trunc_lower=None, trunc_upper=None)

Bases: object

Encodes n observations with optional censoring and truncation.

Each observation i must match exactly one of the following censoring patterns:

  • Exact: exact[i] is finite

  • Right-censored: exact[i] is NaN, lower[i] finite, upper[i] = +inf

  • Left-censored: exact[i] is NaN, lower[i] = -inf, upper[i] finite

  • Interval-censored: exact[i] is NaN, both bounds finite

Truncation bounds constrain the observable range: only observations inside [trunc_lower[i], trunc_upper[i]] can appear in the sample.
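
The patterns and the truncation rule above can be illustrated with a small standalone sketch. It uses only the Python standard library and does not call mltpy; the helper names censoring_pattern and observable are hypothetical, and the sketch ignores the bounds of an exact observation because the text above does not say what they hold.

```python
import math


def censoring_pattern(exact, lower, upper):
    """Classify one observation using the four patterns listed above."""
    if math.isfinite(exact):
        return "exact"  # bounds are not inspected in this sketch
    lower_finite = math.isfinite(lower)
    upper_finite = math.isfinite(upper)
    if lower_finite and upper == math.inf:
        return "right"
    if lower == -math.inf and upper_finite:
        return "left"
    if lower_finite and upper_finite:
        return "interval"
    raise ValueError("observation matches no valid censoring pattern")


def observable(value, trunc_lower=-math.inf, trunc_upper=math.inf):
    """True if value lies inside the closed truncation range."""
    return trunc_lower <= value <= trunc_upper


nan, inf = math.nan, math.inf
rows = [
    (2.5, nan, nan),   # exact
    (nan, 1.0, inf),   # right-censored
    (nan, -inf, 4.0),  # left-censored
    (nan, 1.0, 4.0),   # interval-censored
]
patterns = [censoring_pattern(*row) for row in rows]
```

A row such as (nan, nan, nan) matches none of the patterns and raises ValueError in this sketch.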

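Parameters:
  • exact (ndarray[tuple[Any, ...], dtype[float64]]) – Exact observed values; NaN where the observation is censored.

  • lower (ndarray[tuple[Any, ...], dtype[float64]]) – Lower censoring bounds; -inf for left-censored observations.

  • upper (ndarray[tuple[Any, ...], dtype[float64]]) – Upper censoring bounds; +inf for right-censored observations.

  • trunc_lower (ndarray[tuple[Any, ...], dtype[float64]] | None) – Optional lower truncation bounds.

  • trunc_upper (ndarray[tuple[Any, ...], dtype[float64]] | None) – Optional upper truncation bounds.

exact: ndarray[tuple[Any, ...], dtype[float64]]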
classmethod from_exact(y)

All observations exact (no censoring).

Parameters:

y (ndarray[tuple[Any, ...], dtype[double]])

Return type:

CensoredData

classmethod interval_censored(lower, upper)

All observations interval-censored with known bounds [lower, upper].

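Parameters:
  • lower – Lower interval bounds, one per observation.

  • upper – Upper interval bounds, one per observation.

Return type: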

CensoredData

property is_exact_mask: ndarray[tuple[Any, ...], dtype[bool]]

Boolean mask that is True where the observation is exact.

property is_interval_censored_mask: ndarray[tuple[Any, ...], dtype[bool]]

Boolean mask that is True where the observation is interval-censored.

property is_left_censored_mask: ndarray[tuple[Any, ...], dtype[bool]]

Boolean mask that is True where the observation is left-censored.

property is_right_censored_mask: ndarray[tuple[Any, ...], dtype[bool]]

Boolean mask that is True where the observation is right-censored.

classmethod left_censored(y, censored)

Left-censored data.

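Parameters:
  • y – Observed values, one per observation.

  • censored – Boolean flags marking which observations are left-censored.

Return type: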

CensoredData

classmethod left_truncated(y, trunc_lower, censored=None)

Left-truncated (delayed-entry) data, optionally with right censoring.

Mirrors the Surv(start, stop, event) counting-process encoding from R’s survival package: each observation is at risk only from trunc_lower[i] onward. When censored is given, it follows the same boolean convention as right_censored(): True means the actual event time is above y[i].
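
As a rough illustration of this encoding, the sketch below builds the exact, lower, upper and trunc_lower columns by hand. It is a standard-library stand-in, not mltpy's implementation: the function name encode_left_truncated is hypothetical, and setting lower = upper = y for exact rows is a choice made only for this sketch. In R's Surv, event marks an observed event, so censored is taken here as the negation of event.

```python
import math


def encode_left_truncated(y, trunc_lower, censored=None):
    """Hypothetical stand-in: encode delayed-entry data as plain lists."""
    if censored is None:
        censored = [False] * len(y)  # no censoring: every row is exact
    if not (len(y) == len(trunc_lower) == len(censored)):
        raise ValueError("y, trunc_lower and censored must have equal length")
    exact, lower, upper = [], [], []
    for value, is_censored in zip(y, censored):
        if is_censored:
            # Right-censored: the true event time lies above y[i].
            exact.append(math.nan)
            lower.append(value)
            upper.append(math.inf)
        else:
            # Exact row; the bounds are an assumption of this sketch.
            exact.append(value)
            lower.append(value)
            upper.append(value)
    return {
        "exact": exact,
        "lower": lower,
        "upper": upper,
        "trunc_lower": list(trunc_lower),
    }


# Counting-process style input: (start, stop, event).
start = [0.0, 2.0, 1.0]
stop = [5.0, 6.0, 3.0]
event = [True, False, True]
encoded = encode_left_truncated(stop, start, censored=[not e for e in event])
```

Here the second subject enters at time 2.0 and is still event-free at 6.0, so its row is right-censored with lower bound 6.0.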

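Parameters:
  • y – Observed (stop) times, one per observation.

  • trunc_lower – Entry (start) times; observation i is at risk only from trunc_lower[i].

  • censored – Optional boolean flags; True means the actual event time is above y[i].

Return type: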

CensoredData

lower: ndarray[tuple[Any, ...], dtype[float64]]
property n: int

Number of observations.

property n_censored: int

Number of censored observations.

property n_exact: int

Number of exact observations.

classmethod right_censored(y, censored)

Right-censored data.

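Parameters:
  • y – Observed values, one per observation.

  • censored – Boolean flags; True means the actual event time is above y[i].

Return type: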

CensoredData

trunc_lower: ndarray[tuple[Any, ...], dtype[float64]] | None = None
trunc_upper: ndarray[tuple[Any, ...], dtype[float64]] | None = None
upper: ndarray[tuple[Any, ...], dtype[float64]]
class mltpy.variables.CensoringType(*values)

Bases: Enum

Censoring regime for a dataset passed to the log-likelihood.

INTERVAL = 4
LEFT = 2
NONE = 1
RIGHT = 3
class mltpy.variables.OrderedVariable(levels)

Bases: object

Ordered categorical response with K levels and K-1 transformation cutpoints.

Used by mltpy.tram.Polr (proportional-odds ordinal regression). A level-k observation (1 <= k <= K) is mapped to interval-censored bounds on a synthetic integer cut scale:

level 1   → (-∞, 1]
level k   → (k-1, k]      for 1 < k < K
level K   → (K-1, +∞)

Combined with mltpy.basis.OrdinalBasis, the cut position k selects one of K-1 Bernstein-like coefficients θ_k so that h(y_k) = θ_k exactly.
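
The level-to-interval mapping above can be sketched in a few lines of standard-library Python. The helper name level_bounds is hypothetical and the function is only an illustration of the table, not code taken from mltpy.

```python
import math


def level_bounds(k, K):
    """Return the (lower, upper) cut bounds for level k out of K levels."""
    if K < 2:
        raise ValueError("need at least two levels")
    if not 1 <= k <= K:
        raise ValueError(f"level {k} is outside 1..{K}")
    lower = -math.inf if k == 1 else float(k - 1)  # first level is open below
    upper = math.inf if k == K else float(k)       # last level is open above
    return lower, upper


# With K = 4 levels there are K - 1 = 3 finite cutpoints: 1, 2 and 3.
bounds = [level_bounds(k, 4) for k in range(1, 5)]
# bounds == [(-inf, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, inf)]
```

Every finite bound is one of the integers 1..K-1, which is what lets each cut position pick out a single coefficient.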

Parameters:

levels (tuple[Any, ...]) – Tuple of ordered category labels (any hashable values). Must contain at least two distinct levels.

property K: int

Number of levels.

decode(codes)

Inverse of encode(): maps codes 1..K back to labels.

Parameters:

codes (ndarray[tuple[Any, ...], dtype[int_]]) – Integer codes of shape (n,) with values in {1, ..., K}.

Returns:

Labels in their original dtype (object array for non-numeric).

Return type:

ndarray[tuple[Any, ...], dtype[Any]]

Raises:
  • ValueError – If any code is outside the valid range, or if any floating-point code is not integer-valued (e.g. 1.7).

  • TypeError – If codes has a non-numeric dtype (object, complex, …).

encode(y)

Map labels to 1-based integer codes 1..K.

Parameters:

y (Sequence[Any] | ndarray[tuple[Any, ...], dtype[Any]]) – Sequence of category labels.

Returns:

Integer codes of shape (n,).

Return type:

ndarray[tuple[Any, ...], dtype[int_]]

Raises:

ValueError – If any label is not in levels.
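
The round trip between encode() and decode(), including the documented error cases, might look roughly like the following. This is a standalone sketch on plain Python lists rather than NumPy arrays; the free functions encode and decode below are hypothetical stand-ins that take levels as their first argument, not the methods of OrderedVariable.

```python
def encode(levels, y):
    """Map labels to 1-based codes; unknown labels raise ValueError."""
    index = {label: code for code, label in enumerate(levels, start=1)}
    try:
        return [index[label] for label in y]
    except KeyError as err:
        raise ValueError(f"label {err.args[0]!r} is not in levels") from None


def decode(levels, codes):
    """Map 1-based codes back to labels, validating each code."""
    K = len(levels)
    labels = []
    for code in codes:
        if not isinstance(code, (int, float)):
            raise TypeError(f"non-numeric code: {code!r}")
        if isinstance(code, float) and not code.is_integer():
            raise ValueError(f"code {code!r} is not integer-valued")
        position = int(code)
        if not 1 <= position <= K:
            raise ValueError(f"code {position} is outside 1..{K}")
        labels.append(levels[position - 1])  # codes are 1-based
    return labels


levels = ("low", "mid", "high")
codes = encode(levels, ["mid", "low", "high"])
labels = decode(levels, codes)
```

Integer-valued floats such as 2.0 are accepted by this sketch, while 1.7 is rejected, matching the ValueError condition described for decode().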

classmethod from_labels(y, levels=None)

Coerce raw observations into (OrderedVariable, CensoredData).

Level inference order:

  1. If levels is given explicitly, use it.

  2. Else if y is a pandas ordered Categorical, use y.cat.categories.

  3. Otherwise, use the sorted unique values of y (deterministic ordering).

Validates that every observation lies in the resolved level set.
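
The inference order and the validation step can be sketched as below. The helper name resolve_levels is hypothetical and the code does not import pandas; step 2 is duck-typed on a .cat accessor exposing ordered and categories, which is an assumption about how an ordered categorical would be detected rather than a statement about mltpy's code.

```python
def resolve_levels(y, levels=None):
    """Resolve the ordered level tuple for y and validate the observations."""
    if levels is not None:
        resolved = tuple(levels)                 # 1. explicit levels win
    else:
        cat = getattr(y, "cat", None)
        if cat is not None and getattr(cat, "ordered", False):
            resolved = tuple(cat.categories)     # 2. ordered categorical
        else:
            resolved = tuple(sorted(set(y)))     # 3. sorted unique values
    if not resolved:
        raise ValueError("levels is empty")
    if len(set(resolved)) != len(resolved):
        raise ValueError("levels are not unique")
    missing = [value for value in y if value not in resolved]
    if missing:
        raise ValueError(f"observations not in levels: {missing!r}")
    return resolved


inferred = resolve_levels(["b", "a", "b"])
explicit = resolve_levels(["b", "a"], levels=["b", "a", "c"])
```

Explicit levels may include categories that never occur in y, as in the second call, but every value in y must appear in the resolved tuple.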

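Parameters:
  • y – Raw observations (category labels), one per observation.

  • levels – Optional explicit ordered levels; if omitted, levels are inferred as described above.

Returns: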

A pair (variable, censored_data). variable carries the level vocabulary; censored_data has one row per observation, with synthetic integer-cut bounds suitable for OrdinalBasis and the interval-censored likelihood path.

Return type:

tuple[OrderedVariable, CensoredData]

Raises:

ValueError – If levels is empty or not unique, or if any observation in y is missing from the resolved level set.

levels: tuple[Any, ...]