tanat.visualization.sequence.type.distribution package#

Submodules#

tanat.visualization.sequence.type.distribution.builder module#

DistributionVizBuilder: state-occupancy distribution visualizer for a SequencePool.

class tanat.visualization.sequence.type.distribution.builder.DistributionVizBuilder(settings: Any | None = None, *, allow_large: bool = False)[source]#

Bases: BaseSequenceVizBuilder

Builds state-occupancy distribution charts from a StateSequencePool.

Each time bin on the x-axis counts (or normalises) how many sequences are in each state at that point in time, using occupancy-based binning: a segment contributes to every bin it overlaps, not just the one containing its start.

Only compatible with state sequence types.

Typical usage via SequenceVisualizer:

SequenceVisualizer.distribution(mode="percentage", bin_size="1d") \
    .title("State distribution over time") \
    .colors("Set2") \
    .draw(pool, entity_feature="status") \
    .show()
SETTINGS_CLASS[source]#

alias of DistributionSettings

__init__(settings: Any | None = None, *, allow_large: bool = False) None[source]#
marker(*, alpha: float | None = None, line_width: float | None = None) DistributionVizBuilder[source]#

Configure fill visual properties. Chainable.

Parameters:
  • alpha – Opacity of the filled areas (0-1).

  • line_width – Width of the area boundary line.

x_axis(*, show: bool | None = None, label: str | None = None, rotation: int | None = None, limit_min: float | None = None, limit_max: float | None = None, autofmt_xdate: bool | None = None) DistributionVizBuilder[source]#

Configure the time (x) axis. Chainable.

Parameters:
  • show – Hide the axis entirely when False.

  • label – Axis label text.

  • rotation – Tick label rotation in degrees.

  • limit_min – Left bound (zooms into a time window).

  • limit_max – Right bound (zooms into a time window).

  • autofmt_xdate – Auto-rotate date tick labels.

y_axis(*, show: bool | None = None, label: str | None = None, rotation: int | None = None, limit_min: float | None = None, limit_max: float | None = None) DistributionVizBuilder[source]#

Configure the value (y) axis. Chainable.

Parameters:
  • show – Hide the axis entirely when False.

  • label – Axis label text.

  • rotation – Tick label rotation in degrees.

  • limit_min – Minimum y value.

  • limit_max – Maximum y value.
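
The "Chainable" methods above all take keyword-only arguments that default to None. A minimal sketch of that pattern follows, assuming that each call returns the builder itself and that an argument left as None keeps the current value. MarkerSketch and BuilderSketch are hypothetical stand-ins written for this illustration, not tanat classes; only the default values 0.7 and 0.8 come from this reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MarkerSketch:
    # Defaults copied from DistributionMarkerSettings in this reference.
    alpha: float = 0.7
    line_width: float = 0.8


class BuilderSketch:
    """Hypothetical stand-in for a chainable builder; not the tanat code."""

    def __init__(self) -> None:
        self.marker_settings = MarkerSketch()

    def marker(self, *, alpha: Optional[float] = None,
               line_width: Optional[float] = None) -> "BuilderSketch":
        # Assumption: only arguments that were passed override the stored value.
        if alpha is not None:
            self.marker_settings.alpha = alpha
        if line_width is not None:
            self.marker_settings.line_width = line_width
        # Returning self is what allows the next call to be chained.
        return self


builder = BuilderSketch().marker(alpha=0.4).marker(line_width=1.5)
print(builder.marker_settings)
# MarkerSketch(alpha=0.4, line_width=1.5)
```

Under this assumption, the second call leaves alpha at 0.4 because it only passes line_width.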

tanat.visualization.sequence.type.distribution.data module#

Pure-Polars data preparation for DistributionVizBuilder.

tanat.visualization.sequence.type.distribution.data.aggregate_distribution(lf: LazyFrame, mode: str, *, facet_col: str | None = None) LazyFrame[source]#

Count label occurrences per time bin and compute the requested metric.

After grouping by [__TIME_BIN__, __LABEL__] (or [facet_col, __TIME_BIN__, __LABEL__] when facet_col is set), the raw count is either kept as-is ("count") or normalised within each bin ("proportion" / "percentage") using a window expression so no additional collect is required.

Parameters:
  • lf – LazyFrame with __TIME_BIN__ and __LABEL__ columns.

  • mode – One of "count", "proportion", or "percentage".

  • facet_col – When set, include this column as the leading group-by dimension so that counts and normalisations are computed per facet × bin rather than globally.

Returns:

LazyFrame with columns __TIME_BIN__, __LABEL__, __VALUE__ (plus facet_col when set).
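
The three modes can be illustrated with a plain-Python sketch. It is not the Polars implementation (which works lazily with a window expression); it only mirrors the arithmetic described above on (time bin, label) pairs. The function name aggregate_sketch and the sample rows are made up for this example, and the facet dimension is left out for brevity.

```python
from collections import Counter, defaultdict


def aggregate_sketch(rows, mode):
    """rows: iterable of (time_bin, label) pairs.

    Returns {(time_bin, label): value} for the requested mode.
    """
    if mode not in ("count", "proportion", "percentage"):
        raise ValueError(f"unknown mode: {mode!r}")

    # Step 1: count occurrences per (time_bin, label) group.
    counts = Counter(rows)
    if mode == "count":
        return dict(counts)

    # Step 2: total per time bin, used to normalise within each bin.
    totals = defaultdict(int)
    for (time_bin, _label), n in counts.items():
        totals[time_bin] += n

    scale = 100.0 if mode == "percentage" else 1.0
    return {key: scale * n / totals[key[0]] for key, n in counts.items()}


rows = [(0, "A"), (0, "A"), (0, "B"), (0, "B"), (1, "A")]
counts = aggregate_sketch(rows, "count")
shares = aggregate_sketch(rows, "proportion")
percent = aggregate_sketch(rows, "percentage")

print(counts[(0, "A")], shares[(0, "A")], percent[(0, "A")])
# 2 0.5 50.0
```

Within bin 0 the proportions sum to 1 and the percentages to 100, matching the per-bin normalisation described for "proportion" and "percentage".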

tanat.visualization.sequence.type.distribution.data.assign_time_bins(lf: pl.LazyFrame, bin_size: str | int | float, *, is_datetime: bool, display_unit: DisplayUnit | None = None) pl.LazyFrame[source]#

Add __TIME_BIN__ via occupancy-based binning.

A segment contributes to every bin it overlaps, i.e., every bin b where __START__ <= b < __END__. This avoids the start-bin artefact where a long segment spanning many bins would be counted in only the first one.

Implementation note: a single .collect() is performed to read the global temporal bounds (two scalars). The cross-join and filter remain lazy.

Parameters:
  • lf – LazyFrame with __START__, __END__, and __LABEL__ columns.

  • bin_size – Polars duration string (e.g. "1d" or "12h") for datetime pools, or a numeric step for timestep pools.

  • is_datetime – True when the pool uses Datetime time columns.

  • display_unit – When set (relative mode with a datetime pool), bin_size is parsed and converted to the target unit via _parse_bin_size_to_unit() and np.arange generates numeric bins. None (default) keeps the original datetime / numeric behaviour.

Returns:

LazyFrame with __TIME_BIN__ added and one row per (original row, bin) pair.
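
Occupancy-based binning can be sketched in plain Python for a timestep pool. This is an illustration of the rule stated above (a row is emitted for every bin b with __START__ <= b < __END__), not the lazy cross-join used by the function. The name assign_bins_sketch, the tuple layout, and the choice of the global minimum start as the first bin are assumptions made for this example.

```python
def assign_bins_sketch(segments, step):
    """segments: list of (start, end, label) with numeric times.

    Returns one (start, end, label, time_bin) tuple per (segment, bin) pair.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if not segments:
        return []

    # Global temporal bounds (the two scalars mentioned in the note above).
    lo = min(start for start, _end, _label in segments)
    hi = max(end for _start, end, _label in segments)

    # Bin starts lo, lo + step, ... strictly below hi.
    # Assumption: bins are anchored at the global minimum start.
    bins = []
    i = 0
    while lo + i * step < hi:
        bins.append(lo + i * step)
        i += 1

    # Keep every (segment, bin) pair where start <= bin < end.
    return [
        (start, end, label, b)
        for start, end, label in segments
        for b in bins
        if start <= b < end
    ]


segments = [(0, 3, "A"), (1, 2, "B")]
binned = assign_bins_sketch(segments, 1)
for row in binned:
    print(row)
# (0, 3, 'A', 0)
# (0, 3, 'A', 1)
# (0, 3, 'A', 2)
# (1, 2, 'B', 1)
```

The long segment "A" appears in all three bins it spans, whereas start-bin counting would have placed it in bin 0 only.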

tanat.visualization.sequence.type.distribution.settings module#

DistributionVizBuilder settings.

class tanat.visualization.sequence.type.distribution.settings.DistributionAesthetics(*, mode: DistributionMode = 'percentage', bin_size: str | int | float = '1d', stacked: bool = True, time_mode: TimeMode = 'absolute', display_unit: DisplayUnit | None = None)[source]#

Bases: object

Visual aesthetics for the distribution chart.

mode[source]#

What each stacked area represents.

  • "count": raw number of IDs occupying each state per bin.

  • "proportion": fraction of IDs in each state (sums to 1 per bin).

  • "percentage": same as proportion expressed as 0-100 (default).

Type:

Literal['count', 'proportion', 'percentage']

bin_size[source]#

Width of each time bin.

  • Datetime pools: a Polars duration string such as "1d", "12h", "1w", "1mo".

  • Timestep pools: a numeric step value (int or float).

Type:

str | int | float

stacked[source]#

When True (default) render a stacked area chart. When False render overlapping transparent fills, one per label.

Type:

bool
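
The difference between the two rendering styles can be sketched as the lower and upper boundary of each filled band. This is an assumption about how a stacked area chart is usually built (each label sits on the cumulative total of the labels drawn before it), not a description of the tanat drawing code. The function names and sample values are made up for this example.

```python
def stacked_bands(values_by_label):
    """values_by_label: {label: [value per bin]} in drawing order.

    Returns {label: (lower, upper)} where each band starts on top of
    the previous one.
    """
    if not values_by_label:
        return {}
    n_bins = len(next(iter(values_by_label.values())))
    baseline = [0.0] * n_bins
    bands = {}
    for label, values in values_by_label.items():
        upper = [b + v for b, v in zip(baseline, values)]
        bands[label] = (baseline, upper)
        baseline = upper  # the next label is drawn on top of this one
    return bands


def overlapping_bands(values_by_label):
    """Every band starts at zero, so fills overlap and need transparency."""
    return {
        label: ([0.0] * len(values), list(values))
        for label, values in values_by_label.items()
    }


values = {"A": [50.0, 25.0], "B": [50.0, 75.0]}
stacked = stacked_bands(values)
overlaid = overlapping_bands(values)

print(stacked["B"])   # ([50.0, 25.0], [100.0, 100.0])
print(overlaid["B"])  # ([0.0, 0.0], [50.0, 75.0])
```

With percentage values, the top of the last stacked band reaches 100 in every bin, while the overlapping form lets each label be read against a common zero baseline.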

time_mode[source]#

"absolute" (default) uses real timestamps on the x-axis. "relative" aligns each sequence to its per-ID T0 reference date, so the x-axis shows a numeric offset from T0. In relative mode, T0 is taken from set_t0() when it has been called; otherwise the lazy default of position=0 is used.

Type:

Literal['absolute', 'relative']

display_unit[source]#

Target unit for the x-axis when time_mode="relative" and the pool is datetime-based. One of "days" (default), "hours", "minutes", or "seconds". Ignored for timestep pools (a UserWarning is emitted if explicitly set).

Type:

Literal['days', 'hours', 'minutes', 'seconds'] | None
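
For a datetime pool in relative mode, the x-axis value is a numeric offset from T0 expressed in the display unit. A minimal standard-library sketch of that conversion follows. The function relative_offset and the UNIT table are hypothetical names for this illustration, and the sketch does not show how tanat itself resolves T0 or parses bin_size.

```python
from datetime import datetime, timedelta

# One timedelta per supported display unit.
UNIT = {
    "days": timedelta(days=1),
    "hours": timedelta(hours=1),
    "minutes": timedelta(minutes=1),
    "seconds": timedelta(seconds=1),
}


def relative_offset(timestamp, t0, display_unit="days"):
    """Return the offset of timestamp from t0 as a float in display_unit."""
    # Dividing one timedelta by another yields a plain float.
    return (timestamp - t0) / UNIT[display_unit]


t0 = datetime(2024, 1, 1)
event = datetime(2024, 1, 2, 12)

print(relative_offset(event, t0))           # 1.5
print(relative_offset(event, t0, "hours"))  # 36.0
```

Timestamps before T0 give negative offsets, which is why the relative axis is numeric rather than a date axis.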

__init__(*args: Any, **kwargs: Any) None[source]#
bin_size: str | int | float = '1d'[source]#
display_unit: Literal['days', 'hours', 'minutes', 'seconds'] | None = None[source]#
mode: Literal['count', 'proportion', 'percentage'] = 'percentage'[source]#
model_dump(*, mode='python', **dump_kwargs)[source]#

Dump settings to a dict via Pydantic serialization.

stacked: bool = True[source]#
time_mode: Literal['absolute', 'relative'] = 'absolute'[source]#
class tanat.visualization.sequence.type.distribution.settings.DistributionMarkerSettings(*, alpha: float = 0.7, line_width: float = 0.8)[source]#

Bases: object

Marker visual properties for the distribution chart.

alpha[source]#

Opacity of the filled areas (0-1).

Type:

float

line_width[source]#

Width of the area boundary line.

Type:

float

__init__(*args: Any, **kwargs: Any) None[source]#
alpha: float = 0.7[source]#
line_width: float = 0.8[source]#
model_dump(*, mode='python', **dump_kwargs)[source]#

Dump settings to a dict via Pydantic serialization.

class tanat.visualization.sequence.type.distribution.settings.DistributionSettings(*, title: TitleSettings = <factory>, colors: str | dict | list | None = None, figsize: tuple[float, float]=(10.0, 5.0), grid: GridSettings = <factory>, x_axis: XAxisSettings = <factory>, y_axis: YAxisSettings = <factory>, legend: LegendSettings = <factory>, facet: FacetSettings = <factory>, aesthetics: DistributionAesthetics = <factory>, null_handling: NullHandling = <factory>, marker: DistributionMarkerSettings = <factory>)[source]#

Bases: BaseVizSettings

Complete settings bundle for DistributionVizBuilder.

aesthetics[source]#

High-level visual choices (mode, bin_size, stacked, time_mode, display_unit).

Type:

tanat.visualization.sequence.type.distribution.settings.DistributionAesthetics

marker[source]#

Low-level fill/line properties.

Type:

tanat.visualization.sequence.type.distribution.settings.DistributionMarkerSettings

legend[source]#

Legend visibility and placement. Shown by default.

Type:

tanat.visualization.style.legend.LegendSettings

__init__(*args: Any, **kwargs: Any) None[source]#
aesthetics: DistributionAesthetics[source]#
legend: LegendSettings[source]#
marker: DistributionMarkerSettings[source]#
model_dump(*, mode='python', **dump_kwargs)[source]#

Dump settings to a dict via Pydantic serialization.

null_handling: NullHandling[source]#

Module contents#

Distribution visualization package.

class tanat.visualization.sequence.type.distribution.DistributionAesthetics(*, mode: DistributionMode = 'percentage', bin_size: str | int | float = '1d', stacked: bool = True, time_mode: TimeMode = 'absolute', display_unit: DisplayUnit | None = None)[source]#

Bases: object

Visual aesthetics for the distribution chart.

mode[source]#

What each stacked area represents.

  • "count": raw number of IDs occupying each state per bin.

  • "proportion": fraction of IDs in each state (sums to 1 per bin).

  • "percentage": same as proportion expressed as 0-100 (default).

Type:

Literal['count', 'proportion', 'percentage']

bin_size[source]#

Width of each time bin.

  • Datetime pools: a Polars duration string such as "1d", "12h", "1w", "1mo".

  • Timestep pools: a numeric step value (int or float).

Type:

str | int | float

stacked[source]#

When True (default) render a stacked area chart. When False render overlapping transparent fills, one per label.

Type:

bool

time_mode[source]#

"absolute" (default) uses real timestamps on the x-axis. "relative" aligns each sequence to its per-ID T0 reference date, so the x-axis shows a numeric offset from T0. In relative mode, T0 is taken from set_t0() when it has been called; otherwise the lazy default of position=0 is used.

Type:

Literal['absolute', 'relative']

display_unit[source]#

Target unit for the x-axis when time_mode="relative" and the pool is datetime-based. One of "days" (default), "hours", "minutes", or "seconds". Ignored for timestep pools (a UserWarning is emitted if explicitly set).

Type:

Literal['days', 'hours', 'minutes', 'seconds'] | None

__init__(*args: Any, **kwargs: Any) None[source]#
bin_size: str | int | float = '1d'[source]#
display_unit: Literal['days', 'hours', 'minutes', 'seconds'] | None = None[source]#
mode: Literal['count', 'proportion', 'percentage'] = 'percentage'[source]#
model_dump(*, mode='python', **dump_kwargs)[source]#

Dump settings to a dict via Pydantic serialization.

stacked: bool = True[source]#
time_mode: Literal['absolute', 'relative'] = 'absolute'[source]#
class tanat.visualization.sequence.type.distribution.DistributionMarkerSettings(*, alpha: float = 0.7, line_width: float = 0.8)[source]#

Bases: object

Marker visual properties for the distribution chart.

alpha[source]#

Opacity of the filled areas (0-1).

Type:

float

line_width[source]#

Width of the area boundary line.

Type:

float

__init__(*args: Any, **kwargs: Any) None[source]#
alpha: float = 0.7[source]#
line_width: float = 0.8[source]#
model_dump(*, mode='python', **dump_kwargs)[source]#

Dump settings to a dict via Pydantic serialization.

class tanat.visualization.sequence.type.distribution.DistributionSettings(*, title: TitleSettings = <factory>, colors: str | dict | list | None = None, figsize: tuple[float, float]=(10.0, 5.0), grid: GridSettings = <factory>, x_axis: XAxisSettings = <factory>, y_axis: YAxisSettings = <factory>, legend: LegendSettings = <factory>, facet: FacetSettings = <factory>, aesthetics: DistributionAesthetics = <factory>, null_handling: NullHandling = <factory>, marker: DistributionMarkerSettings = <factory>)[source]#

Bases: BaseVizSettings

Complete settings bundle for DistributionVizBuilder.

aesthetics[source]#

High-level visual choices (mode, bin_size, stacked, time_mode, display_unit).

Type:

tanat.visualization.sequence.type.distribution.settings.DistributionAesthetics

marker[source]#

Low-level fill/line properties.

Type:

tanat.visualization.sequence.type.distribution.settings.DistributionMarkerSettings

legend[source]#

Legend visibility and placement. Shown by default.

Type:

tanat.visualization.style.legend.LegendSettings

__init__(*args: Any, **kwargs: Any) None[source]#
aesthetics: DistributionAesthetics[source]#
legend: LegendSettings[source]#
marker: DistributionMarkerSettings[source]#
model_dump(*, mode='python', **dump_kwargs)[source]#

Dump settings to a dict via Pydantic serialization.

null_handling: NullHandling[source]#
class tanat.visualization.sequence.type.distribution.DistributionVizBuilder(settings: Any | None = None, *, allow_large: bool = False)[source]#

Bases: BaseSequenceVizBuilder

Builds state-occupancy distribution charts from a StateSequencePool.

Each time bin on the x-axis counts (or normalises) how many sequences are in each state at that point in time, using occupancy-based binning: a segment contributes to every bin it overlaps, not just the one containing its start.

Only compatible with state sequence types.

Typical usage via SequenceVisualizer:

SequenceVisualizer.distribution(mode="percentage", bin_size="1d") \
    .title("State distribution over time") \
    .colors("Set2") \
    .draw(pool, entity_feature="status") \
    .show()
SETTINGS_CLASS[source]#

alias of DistributionSettings

__init__(settings: Any | None = None, *, allow_large: bool = False) None[source]#
marker(*, alpha: float | None = None, line_width: float | None = None) DistributionVizBuilder[source]#

Configure fill visual properties. Chainable.

Parameters:
  • alpha – Opacity of the filled areas (0-1).

  • line_width – Width of the area boundary line.

x_axis(*, show: bool | None = None, label: str | None = None, rotation: int | None = None, limit_min: float | None = None, limit_max: float | None = None, autofmt_xdate: bool | None = None) DistributionVizBuilder[source]#

Configure the time (x) axis. Chainable.

Parameters:
  • show – Hide the axis entirely when False.

  • label – Axis label text.

  • rotation – Tick label rotation in degrees.

  • limit_min – Left bound (zooms into a time window).

  • limit_max – Right bound (zooms into a time window).

  • autofmt_xdate – Auto-rotate date tick labels.

y_axis(*, show: bool | None = None, label: str | None = None, rotation: int | None = None, limit_min: float | None = None, limit_max: float | None = None) DistributionVizBuilder[source]#

Configure the value (y) axis. Chainable.

Parameters:
  • show – Hide the axis entirely when False.

  • label – Axis label text.

  • rotation – Tick label rotation in degrees.

  • limit_min – Minimum y value.

  • limit_max – Maximum y value.