tanat.visualization.sequence.type.distribution package#
Submodules#
tanat.visualization.sequence.type.distribution.builder module#
DistributionVizBuilder: state-occupancy distribution visualizer for a SequencePool.
- class tanat.visualization.sequence.type.distribution.builder.DistributionVizBuilder(settings: Any | None = None, *, allow_large: bool = False)[source]#
Bases: BaseSequenceVizBuilder
Builds state-occupancy distribution charts from a StateSequencePool.
Each time bin on the x-axis counts (or normalises) how many sequences are in each state at that point in time, using occupancy-based binning: a segment contributes to every bin it overlaps, not just the one containing its start.
Only compatible with state sequence types.
Typical usage via SequenceVisualizer:
    SequenceVisualizer.distribution(mode="percentage", bin_size="1d") \
        .title("State distribution over time") \
        .colors("Set2") \
        .draw(pool, entity_feature="status") \
        .show()
- SETTINGS_CLASS[source]#
alias of DistributionSettings
- marker(*, alpha: float | None = None, line_width: float | None = None) DistributionVizBuilder[source]#
Configure fill visual properties. Chainable.
- Parameters:
alpha – Opacity of the filled areas (0-1).
line_width – Width of the area boundary line.
- x_axis(*, show: bool | None = None, label: str | None = None, rotation: int | None = None, limit_min: float | None = None, limit_max: float | None = None, autofmt_xdate: bool | None = None) DistributionVizBuilder[source]#
Configure the time (x) axis. Chainable.
- Parameters:
show – Hide the axis entirely when False.
label – Axis label text.
rotation – Tick label rotation in degrees.
limit_min – Left bound (zooms into a time window).
limit_max – Right bound (zooms into a time window).
autofmt_xdate – Auto-rotate date tick labels.
- y_axis(*, show: bool | None = None, label: str | None = None, rotation: int | None = None, limit_min: float | None = None, limit_max: float | None = None) DistributionVizBuilder[source]#
Configure the value (y) axis. Chainable.
- Parameters:
show – Hide the axis entirely when False.
label – Axis label text.
rotation – Tick label rotation in degrees.
limit_min – Minimum y value.
limit_max – Maximum y value.
tanat.visualization.sequence.type.distribution.data module#
Pure-Polars data preparation for DistributionVizBuilder.
- tanat.visualization.sequence.type.distribution.data.aggregate_distribution(lf: LazyFrame, mode: str, *, facet_col: str | None = None) LazyFrame[source]#
Count label occurrences per time bin and compute the requested metric.
After grouping by [__TIME_BIN__, __LABEL__] (or [facet_col, __TIME_BIN__, __LABEL__] when facet_col is set), the raw count is either kept as-is ("count") or normalised within each bin ("proportion" / "percentage") using a window expression, so no additional collect is required.
- Parameters:
lf – LazyFrame with __TIME_BIN__ and __LABEL__ columns.
mode – One of "count", "proportion", or "percentage".
facet_col – When set, include this column as the leading group-by dimension so that counts and normalisations are computed per facet × bin rather than globally.
- Returns:
LazyFrame with columns __TIME_BIN__, __LABEL__, __VALUE__ (plus facet_col when set).
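The count-then-normalise step described above can be illustrated with a small pure-Python sketch. It is not the Polars implementation: the function name aggregate_distribution_sketch and the plain (time_bin, label) tuples are assumptions made for illustration, and facet handling is left out.

```python
from collections import Counter, defaultdict


def aggregate_distribution_sketch(rows, mode="percentage"):
    """Hypothetical stand-in: rows is an iterable of (time_bin, label) pairs.

    Returns a dict mapping (time_bin, label) to the requested metric.
    """
    if mode not in {"count", "proportion", "percentage"}:
        raise ValueError(f"unknown mode: {mode!r}")

    # Raw count of occurrences per (bin, label) group.
    counts = Counter(rows)
    if mode == "count":
        return dict(counts)

    # Total per bin, used as the normalisation denominator.
    totals = defaultdict(int)
    for (time_bin, _label), n in counts.items():
        totals[time_bin] += n

    scale = 100.0 if mode == "percentage" else 1.0
    return {key: scale * n / totals[key[0]] for key, n in counts.items()}


rows = [(0, "A"), (0, "A"), (0, "B"), (1, "B")]
counts = aggregate_distribution_sketch(rows, mode="count")
proportions = aggregate_distribution_sketch(rows, mode="proportion")
percentages = aggregate_distribution_sketch(rows, mode="percentage")
```

Within each bin the proportions sum to 1 and the percentages to 100, which is what makes the stacked chart fill the full height in those modes.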
- tanat.visualization.sequence.type.distribution.data.assign_time_bins(lf: pl.LazyFrame, bin_size: str | int | float, *, is_datetime: bool, display_unit: DisplayUnit | None = None) pl.LazyFrame[source]#
Add __TIME_BIN__ via occupancy-based binning.
A segment contributes to every bin it overlaps, i.e., every bin b where __START__ <= b < __END__. This avoids the start-bin artefact where a long segment spanning many bins would be counted in only the first one.
Implementation note: a single .collect() is performed to read the global temporal bounds (two scalars). The cross-join and filter remain lazy.
- Parameters:
lf – LazyFrame with __START__, __END__, and __LABEL__ columns.
bin_size – Polars duration string (e.g. "1d" or "12h") for datetime pools, or a numeric step for timestep pools.
is_datetime – True when the pool uses Datetime time columns.
display_unit – When set (relative mode with a datetime pool), bin_size is parsed and converted to the target unit via _parse_bin_size_to_unit() and np.arange generates numeric bins. None (default) keeps the original datetime / numeric behaviour.
- Returns:
LazyFrame with __TIME_BIN__ added and one row per (original row, bin) pair.
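The occupancy rule can be sketched for a timestep pool in plain Python. This is an illustration under stated assumptions, not the lazy Polars code: the name assign_time_bins_sketch, the (start, end, label) tuples, and the choice to start the bin grid at the global minimum start are all hypothetical.

```python
def assign_time_bins_sketch(segments, bin_size):
    """Hypothetical stand-in: segments is a list of (start, end, label) tuples.

    Returns one (start, end, label, time_bin) row per (segment, bin) pair
    that satisfies start <= time_bin < end.
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    if not segments:
        return []

    # Global temporal bounds (the two scalars mentioned in the note above).
    lo = min(start for start, _end, _label in segments)
    hi = max(end for _start, end, _label in segments)

    # Candidate bin positions: lo, lo + bin_size, ... strictly below hi.
    bins = []
    i = 0
    while lo + i * bin_size < hi:
        bins.append(lo + i * bin_size)
        i += 1

    # Cross-join every segment with every bin, then keep the overlapping pairs.
    return [
        (start, end, label, b)
        for (start, end, label) in segments
        for b in bins
        if start <= b < end
    ]


segments = [(0, 3, "A"), (3, 5, "B"), (1, 2, "C")]
binned = assign_time_bins_sketch(segments, bin_size=1)
bins_for_a = [b for (_s, _e, label, b) in binned if label == "A"]
```

Segment "A" spans three timesteps and therefore appears in three bins, whereas start-only binning would have counted it once.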
tanat.visualization.sequence.type.distribution.settings module#
DistributionVizBuilder settings.
- class tanat.visualization.sequence.type.distribution.settings.DistributionAesthetics(*, mode: DistributionMode = 'percentage', bin_size: str | int | float = '1d', stacked: bool = True, time_mode: TimeMode = 'absolute', display_unit: DisplayUnit | None = None)[source]#
Bases: object
Visual aesthetics for the distribution chart.
- mode[source]#
What each stacked area represents.
"count": raw number of IDs occupying each state per bin.
"proportion": fraction of IDs in each state (sums to 1 per bin).
"percentage": same as proportion expressed as 0-100 (default).
- Type:
Literal['count', 'proportion', 'percentage']
- bin_size[source]#
Width of each time bin.
Datetime pools: a Polars duration string such as "1d", "12h", "1w", "1mo".
Timestep pools: a numeric step value (int or float).
- Type:
str | int | float
- stacked[source]#
When True (default) render a stacked area chart. When False render overlapping transparent fills, one per label.
- Type:
bool
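The difference between the two rendering styles comes down to where each label's filled band starts. The sketch below computes the lower and upper boundaries per label without plotting anything. It assumes the usual meaning of a stacked area chart; the helper names stacked_bounds and overlapping_bounds are hypothetical and do not come from the library.

```python
def stacked_bounds(values_per_label):
    """Each label's band sits on top of the previous label's upper boundary."""
    bounds = {}
    baseline = None
    for label, values in values_per_label.items():
        if baseline is None:
            baseline = [0.0] * len(values)
        upper = [b + v for b, v in zip(baseline, values)]
        bounds[label] = (baseline, upper)
        baseline = upper  # the next label starts where this one ends
    return bounds


def overlapping_bounds(values_per_label):
    """Each label's band starts at zero, so the fills overlap."""
    return {
        label: ([0.0] * len(values), list(values))
        for label, values in values_per_label.items()
    }


# Percentages for two labels over two time bins.
values = {"A": [60.0, 25.0], "B": [40.0, 75.0]}
stacked = stacked_bounds(values)
overlapping = overlapping_bounds(values)
```

In the stacked case the top band reaches 100 in every bin when the mode is "percentage"; in the overlapping case each band can be read directly against the y-axis, which is why transparency is needed.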
- time_mode[source]#
"absolute" (default) uses real timestamps on the x-axis. "relative" aligns all sequences to their per-ID T0 reference date; the x-axis shows a numeric offset from T0. Requires set_t0() to have been called (or uses the lazy default of position=0).
- Type:
Literal['absolute', 'relative']
- display_unit[source]#
Target unit for the x-axis when time_mode="relative" and the pool is datetime-based. One of "days" (default), "hours", "minutes", or "seconds". Ignored for timestep pools (a UserWarning is emitted if explicitly set).
- Type:
Literal['days', 'hours', 'minutes', 'seconds'] | None
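As a rough illustration of how time_mode="relative" and display_unit interact for a datetime pool, the sketch below turns a timestamp into a numeric offset from T0. It uses only the standard library; the helper relative_offset and the _UNIT_SECONDS table are hypothetical, and the library's own conversion may differ in detail.

```python
from datetime import datetime

# Seconds per display unit (hypothetical lookup table for this sketch).
_UNIT_SECONDS = {
    "days": 86400.0,
    "hours": 3600.0,
    "minutes": 60.0,
    "seconds": 1.0,
}


def relative_offset(timestamp, t0, display_unit="days"):
    """Numeric offset of timestamp from the per-ID reference date t0."""
    if display_unit not in _UNIT_SECONDS:
        raise ValueError(f"unknown display_unit: {display_unit!r}")
    return (timestamp - t0).total_seconds() / _UNIT_SECONDS[display_unit]


t0 = datetime(2024, 1, 1)
after_t0 = datetime(2024, 1, 2, 12, 0)
before_t0 = datetime(2023, 12, 31)

offset_days = relative_offset(after_t0, t0)            # default unit: days
offset_hours = relative_offset(after_t0, t0, "hours")
offset_before = relative_offset(before_t0, t0, "days")
```

Events before T0 get negative offsets, so sequences from different calendar periods line up on a shared x-axis centred on their own reference date.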
- class tanat.visualization.sequence.type.distribution.settings.DistributionMarkerSettings(*, alpha: float = 0.7, line_width: float = 0.8)[source]#
Bases: object
Marker visual properties for the distribution chart.
- class tanat.visualization.sequence.type.distribution.settings.DistributionSettings(*, title: TitleSettings = <factory>, colors: str | dict | list | None = None, figsize: tuple[float, float]=(10.0, 5.0), grid: GridSettings = <factory>, x_axis: XAxisSettings = <factory>, y_axis: YAxisSettings = <factory>, legend: LegendSettings = <factory>, facet: FacetSettings = <factory>, aesthetics: DistributionAesthetics = <factory>, null_handling: NullHandling = <factory>, marker: DistributionMarkerSettings = <factory>)[source]#
Bases: BaseVizSettings
Complete settings bundle for DistributionVizBuilder.
- aesthetics: DistributionAesthetics[source]#
- legend: LegendSettings[source]#
- model_dump(*, mode='python', **dump_kwargs)[source]#
Dump settings to a dict via Pydantic serialization.
- null_handling: NullHandling[source]#
Module contents#
Distribution visualization package.
- class tanat.visualization.sequence.type.distribution.DistributionAesthetics(*, mode: DistributionMode = 'percentage', bin_size: str | int | float = '1d', stacked: bool = True, time_mode: TimeMode = 'absolute', display_unit: DisplayUnit | None = None)[source]#
Bases: object
Visual aesthetics for the distribution chart.
- mode[source]#
What each stacked area represents.
"count": raw number of IDs occupying each state per bin.
"proportion": fraction of IDs in each state (sums to 1 per bin).
"percentage": same as proportion expressed as 0-100 (default).
- Type:
Literal['count', 'proportion', 'percentage']
- bin_size[source]#
Width of each time bin.
Datetime pools: a Polars duration string such as "1d", "12h", "1w", "1mo".
Timestep pools: a numeric step value (int or float).
- Type:
str | int | float
- stacked[source]#
When True (default) render a stacked area chart. When False render overlapping transparent fills, one per label.
- Type:
bool
- time_mode[source]#
"absolute" (default) uses real timestamps on the x-axis. "relative" aligns all sequences to their per-ID T0 reference date; the x-axis shows a numeric offset from T0. Requires set_t0() to have been called (or uses the lazy default of position=0).
- Type:
Literal['absolute', 'relative']
- display_unit[source]#
Target unit for the x-axis when time_mode="relative" and the pool is datetime-based. One of "days" (default), "hours", "minutes", or "seconds". Ignored for timestep pools (a UserWarning is emitted if explicitly set).
- Type:
Literal['days', 'hours', 'minutes', 'seconds'] | None
- class tanat.visualization.sequence.type.distribution.DistributionMarkerSettings(*, alpha: float = 0.7, line_width: float = 0.8)[source]#
Bases: object
Marker visual properties for the distribution chart.
- class tanat.visualization.sequence.type.distribution.DistributionSettings(*, title: TitleSettings = <factory>, colors: str | dict | list | None = None, figsize: tuple[float, float]=(10.0, 5.0), grid: GridSettings = <factory>, x_axis: XAxisSettings = <factory>, y_axis: YAxisSettings = <factory>, legend: LegendSettings = <factory>, facet: FacetSettings = <factory>, aesthetics: DistributionAesthetics = <factory>, null_handling: NullHandling = <factory>, marker: DistributionMarkerSettings = <factory>)[source]#
Bases: BaseVizSettings
Complete settings bundle for DistributionVizBuilder.
- aesthetics: DistributionAesthetics[source]#
- legend: LegendSettings[source]#
- model_dump(*, mode='python', **dump_kwargs)[source]#
Dump settings to a dict via Pydantic serialization.
- null_handling: NullHandling[source]#
- class tanat.visualization.sequence.type.distribution.DistributionVizBuilder(settings: Any | None = None, *, allow_large: bool = False)[source]#
Bases: BaseSequenceVizBuilder
Builds state-occupancy distribution charts from a StateSequencePool.
Each time bin on the x-axis counts (or normalises) how many sequences are in each state at that point in time, using occupancy-based binning: a segment contributes to every bin it overlaps, not just the one containing its start.
Only compatible with state sequence types.
Typical usage via SequenceVisualizer:
    SequenceVisualizer.distribution(mode="percentage", bin_size="1d") \
        .title("State distribution over time") \
        .colors("Set2") \
        .draw(pool, entity_feature="status") \
        .show()
- SETTINGS_CLASS[source]#
alias of DistributionSettings
- marker(*, alpha: float | None = None, line_width: float | None = None) DistributionVizBuilder[source]#
Configure fill visual properties. Chainable.
- Parameters:
alpha – Opacity of the filled areas (0-1).
line_width – Width of the area boundary line.
- x_axis(*, show: bool | None = None, label: str | None = None, rotation: int | None = None, limit_min: float | None = None, limit_max: float | None = None, autofmt_xdate: bool | None = None) DistributionVizBuilder[source]#
Configure the time (x) axis. Chainable.
- Parameters:
show – Hide the axis entirely when False.
label – Axis label text.
rotation – Tick label rotation in degrees.
limit_min – Left bound (zooms into a time window).
limit_max – Right bound (zooms into a time window).
autofmt_xdate – Auto-rotate date tick labels.
- y_axis(*, show: bool | None = None, label: str | None = None, rotation: int | None = None, limit_min: float | None = None, limit_max: float | None = None) DistributionVizBuilder[source]#
Configure the value (y) axis. Chainable.
- Parameters:
show – Hide the axis entirely when False.
label – Axis label text.
rotation – Tick label rotation in degrees.
limit_min – Minimum y value.
limit_max – Maximum y value.