tanat.visualization.sequence.base package#

Submodules#

tanat.visualization.sequence.base.builder module#

BaseSequenceVizBuilder: abstract base for all sequence visualization builders.

class tanat.visualization.sequence.base.builder.BaseSequenceVizBuilder(settings: dict | BaseVizSettings | None = None, *, allow_large: bool = False)[source]#

Bases: ABC, CachableSettings, Registrable

Abstract base class for sequence visualization builders.

The draw() method orchestrates the full pipeline: prepare data → create figure → render → apply styling.
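The fixed stage order described above is a template-method pattern. The following is a minimal sketch of that idea only: `PipelineSketch`, `CountBars`, `create_figure`, `render`, `apply_styling`, and the dict standing in for a figure are all hypothetical names chosen for illustration, not the tanat implementation.

```python
from abc import ABC, abstractmethod


class PipelineSketch(ABC):
    """Hypothetical stand-in for a builder; not the tanat implementation."""

    def draw(self, data):
        # Stages run in a fixed order; subclasses only fill in the steps.
        prepared = self.prepare_data(data)
        figure = self.create_figure()
        self.render(figure, prepared)
        self.apply_styling(figure)
        return figure

    @abstractmethod
    def prepare_data(self, data):
        """Aggregate raw input into renderable rows."""

    def create_figure(self):
        # A plain dict stands in for a real figure object.
        return {"artists": [], "title": None}

    @abstractmethod
    def render(self, figure, prepared):
        """Add the prepared rows to the figure."""

    def apply_styling(self, figure):
        figure["title"] = "untitled"


class CountBars(PipelineSketch):
    def prepare_data(self, data):
        # Count occurrences of each label.
        counts = {}
        for label in data:
            counts[label] = counts.get(label, 0) + 1
        return sorted(counts.items())

    def render(self, figure, prepared):
        figure["artists"].extend(prepared)


result = CountBars().draw(["a", "b", "a"])
```

Keeping the orchestration in the base class means a concrete builder cannot reorder or skip a stage by accident.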

MAX_CATEGORY: int = 30[source]#

MAX_FACET: int = 20[source]#

SETTINGS_CLASS: type[BaseVizSettings] | None = None[source]#

__init__(settings: dict | BaseVizSettings | None = None, *, allow_large: bool = False) → None[source]#

colors(spec: str | dict | list) → BaseSequenceVizBuilder[source]#

Set the color specification. Chainable.

Parameters:

spec – A matplotlib colormap name (str), a mapping of {label: color} (dict), or an ordered list of colors (list). None falls back to the matplotlib default cycle.

draw(sequence_or_pool: SequencePool | Sequence, *, entity_feature: str) → VisualizationResult[source]#

Full pipeline: prepare data, create figure, render, return result.

Parameters:

sequence_or_pool – A SequencePool or an individual Sequence (e.g. obtained via pool[42]).
entity_feature – Entity feature column to use as category labels.

Returns:

A VisualizationResult.

facet(by: str, *, is_static: bool = False, cols: int = 3, share_x: bool = True, share_y: bool = True, figsize_per_facet: tuple[float, float] = (5.0, 4.0), title_template: str = '{by} = {value}') → BaseSequenceVizBuilder[source]#

Enable faceted (small-multiples) view. Chainable.

Calling this method updates the settings and clears the data cache, so any previously cached prepare_data() result is discarded.

Parameters:

by – Feature name to split by.
is_static – True → static feature; False (default) → entity feature.
cols – Number of columns in the facet grid (default 3).
share_x – Share the x-axis scale across facets (default True).
share_y – Share the y-axis scale across facets (default True).
figsize_per_facet – Width × height of each cell in inches.
title_template – Format string for each facet title. Placeholders: {by}, {value}, {index}.
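To make the cols, figsize_per_facet, and title_template parameters concrete, here is a rough sketch of how a facet grid could be laid out. `facet_layout` is a hypothetical helper, and two details are assumptions rather than documented behavior: {index} is taken to be zero-based, and the overall figure size is taken to be the per-facet size multiplied by the grid dimensions.

```python
import math


def facet_layout(by, values, *, cols=3, figsize_per_facet=(5.0, 4.0),
                 title_template="{by} = {value}"):
    """Hypothetical helper: derive grid shape and titles for a facet view."""
    # Enough rows to hold every facet value at `cols` cells per row.
    rows = math.ceil(len(values) / cols)
    width, height = figsize_per_facet
    # str.format ignores placeholders the template does not use.
    titles = [
        title_template.format(by=by, value=value, index=index)
        for index, value in enumerate(values)  # assumption: zero-based index
    ]
    return {
        "rows": rows,
        "cols": cols,
        "figsize": (cols * width, rows * height),  # assumption: simple product
        "titles": titles,
    }


layout = facet_layout("ward", ["A", "B", "C", "D"])
```

With four values and the default three columns, this sketch yields a 2 × 3 grid whose last two cells are empty.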

figsize(width: float, height: float) → BaseSequenceVizBuilder[source]#

Set the figure dimensions. Chainable.

Parameters:

width – Figure width in inches.
height – Figure height in inches.

grid(*, show: bool = True, color: str | None = None, linewidth: float | None = None, axis: str | None = None) → BaseSequenceVizBuilder[source]#

Configure the background grid. Chainable.

Grid lines are always rendered behind markers and bars.

Parameters:

show – Display grid lines when True (default). Pass False to explicitly hide a grid that was previously enabled.
color – Line color, any matplotlib color string (default "lightgrey").
linewidth – Line width in points (default 0.8).
axis – Which axis to draw lines for: "both" (default), "x", or "y".

legend(*, show: bool = True, location: str = 'best', title: str | None = None) → BaseSequenceVizBuilder[source]#

Configure legend display. Chainable.

Parameters:

show – Display the legend when True (default).
location – Matplotlib location string, e.g. "upper right".
title – Optional legend title.

legend_off() → BaseSequenceVizBuilder[source]#

Hide the legend. Convenience shortcut for .legend(show=False). Chainable.

prepare_data(sequence_or_pool: SequencePool | Sequence, *, entity_feature: str) → DataFrame[source]#

Prepare the aggregated Polars DataFrame for rendering.

Parameters:

sequence_or_pool – A SequencePool or an individual Sequence (e.g. obtained via pool[42]).
entity_feature – Entity feature column to use as category labels.

Returns:

Polars DataFrame with columns [__LABEL__, __VALUE__], plus __COLOR__ when a color spec has been set via colors().

Raises:

TypeError – If sequence_or_pool is not a SequencePool or a Sequence.
KeyError – If entity_feature is not a declared entity feature.

title(text: str, *, fontsize: int | None = None, fontweight: str | None = None, pad: float | None = None) → BaseSequenceVizBuilder[source]#

Set the figure title. Chainable.

Parameters:

text – Title string displayed above the chart.
fontsize – Font size in points. None uses the matplotlib default.
fontweight – Font weight, e.g. "bold" or "normal".
pad – Spacing between the title and the chart in points.
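The setters above are described as chainable, and facet() is documented as clearing the data cache. A minimal sketch of both behaviors follows, assuming a hypothetical `ChainableSketch` class with a single cache slot; the real builder's settings and cache mechanics are not shown in this reference and may differ.

```python
class ChainableSketch:
    """Hypothetical fluent builder; not the tanat implementation."""

    def __init__(self):
        self.settings = {}
        self._cache = None
        self.prepare_calls = 0

    def title(self, text):
        self.settings["title"] = text
        return self  # returning self is what makes the call chainable

    def figsize(self, width, height):
        self.settings["figsize"] = (width, height)
        return self

    def legend(self, *, show=True):
        self.settings["legend_show"] = show
        return self

    def legend_off(self):
        # Convenience shortcut, as documented for legend_off().
        return self.legend(show=False)

    def facet(self, by):
        self.settings["facet_by"] = by
        self._cache = None  # faceting changes the prepared data, so drop it
        return self

    def prepare_data(self, rows):
        # Simplification: one cache slot, not keyed on the input.
        if self._cache is None:
            self.prepare_calls += 1
            self._cache = sorted(rows)
        return self._cache


builder = ChainableSketch().title("Events").figsize(8, 4).legend_off()
builder.prepare_data([3, 1, 2])
builder.prepare_data([3, 1, 2])  # served from the cache
builder.facet("ward")
builder.prepare_data([3, 1, 2])  # recomputed after facet() cleared the cache
```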

tanat.visualization.sequence.base.exceptions module#

Shared visualization exceptions for sequence builders.

exception tanat.visualization.sequence.base.exceptions.IncompatibleDisplayUnitError(display_unit: str | None, *, is_datetime: bool)[source]#

Bases: TanaTException, ValueError

Raised when display_unit is incompatible with the pool’s time index.

Two incompatible situations:

A datetime pool requires an explicit display_unit to convert raw millisecond durations to a human-readable unit.
A numeric timestep pool does not support display_unit; durations are already in raw timestep values.

__init__(display_unit: str | None, *, is_datetime: bool) → None[source]#
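As an illustration of the two situations listed above, the sketch below raises a ValueError subclass in each case. `DisplayUnitErrorSketch` and `check_display_unit` are hypothetical names, the messages are invented, and the real exception also derives from TanaTException, which is omitted here to keep the example self-contained.

```python
class DisplayUnitErrorSketch(ValueError):
    """Hypothetical stand-in for IncompatibleDisplayUnitError."""

    def __init__(self, display_unit, *, is_datetime):
        if is_datetime:
            message = "datetime pool: an explicit display_unit is required"
        else:
            message = (
                f"timestep pool: display_unit={display_unit!r} is not supported"
            )
        super().__init__(message)
        self.display_unit = display_unit
        self.is_datetime = is_datetime


def check_display_unit(display_unit, *, is_datetime):
    """Hypothetical validator covering the two incompatible situations."""
    if is_datetime and display_unit is None:
        raise DisplayUnitErrorSketch(display_unit, is_datetime=True)
    if not is_datetime and display_unit is not None:
        raise DisplayUnitErrorSketch(display_unit, is_datetime=False)
    return display_unit
```

Because the class derives from ValueError, callers that already catch ValueError handle it without importing the specific type.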

exception tanat.visualization.sequence.base.exceptions.UnsupportedSequenceTypeError(found_type: str, *, compatible_types: frozenset[str] | None = None)[source]#

Bases: TanaTException, ValueError

Raised when the pool type is not compatible with a visualization builder.

Each builder declares its _COMPATIBLE_TYPES class attribute. Passing a pool of a different type raises this exception.

__init__(found_type: str, *, compatible_types: frozenset[str] | None = None) → None[source]#
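The _COMPATIBLE_TYPES check described above might look roughly like the following. Everything here is a hypothetical sketch: `UnsupportedTypeSketch`, `SpanBuilderSketch`, the `check` method, and the type names "interval", "state", and "event" are illustrative and not taken from the library.

```python
class UnsupportedTypeSketch(ValueError):
    """Hypothetical stand-in for UnsupportedSequenceTypeError."""

    def __init__(self, found_type, *, compatible_types=None):
        expected = ", ".join(sorted(compatible_types or ()))
        super().__init__(
            f"unsupported sequence type {found_type!r}; expected one of: {expected}"
        )
        self.found_type = found_type
        self.compatible_types = compatible_types


class SpanBuilderSketch:
    # Each builder declares the pool types it can draw (illustrative names).
    _COMPATIBLE_TYPES = frozenset({"interval", "state"})

    @classmethod
    def check(cls, found_type):
        if found_type not in cls._COMPATIBLE_TYPES:
            raise UnsupportedTypeSketch(
                found_type, compatible_types=cls._COMPATIBLE_TYPES
            )
        return found_type
```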

tanat.visualization.sequence.base.literals module#

Shared Literal type aliases for sequence visualization builders.

tanat.visualization.sequence.base.settings module#

Shared base settings for sequence visualization builders.

class tanat.visualization.sequence.base.settings.NullHandling(*, na_time_index: NaTimeIndex = 'drop', na_label: NaLabel = 'drop')[source]#

Bases: object

How the pipeline handles null values before rendering.

This is a data-preparation concern, not a visual one: the chosen strategies decide which rows survive into the chart, but they do not change any visual property.

na_time_index[source]#

Strategy for null values in time index columns (__TIME__, __START__, __END__). "drop" (default) removes affected rows with a UserWarning; "raise" raises ValueError immediately.

Type: Literal['drop', 'raise']

na_label[source]#

Strategy for null values in the entity feature label. "drop" (default) removes rows with a UserWarning; "raise" raises ValueError immediately; "category" replaces nulls with the string "N/A", making missing values an explicit visible category.

Type: Literal['drop', 'raise', 'category']

__init__(*args: Any, **kwargs: Any) → None[source]#

model_dump(*, mode='python', **dump_kwargs)[source]#

Dump settings to a dict via Pydantic serialization.

na_label: Literal['drop', 'raise', 'category'] = 'drop'[source]#

na_time_index: Literal['drop', 'raise'] = 'drop'[source]#

tanat.visualization.sequence.base.utils module#

Shared pure-Polars utility functions for sequence visualization data preparation.

tanat.visualization.sequence.base.utils.handle_null_labels(lf: LazyFrame, strategy: Literal['drop', 'raise', 'category']) → LazyFrame[source]#

Handle null values in the __LABEL__ column.

Parameters:

lf – Input LazyFrame (must contain __LABEL__).
strategy – "drop" removes null-label rows with a UserWarning; "raise" raises ValueError immediately; "category" replaces nulls with the string "N/A".

Returns:

LazyFrame with null labels handled.

Raises:

ValueError – If strategy is "raise" and nulls are found, or if "drop" would remove every row.
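The three strategies can be illustrated without Polars. The sketch below applies the documented semantics to a plain list of dicts; `handle_null_labels_sketch` is a hypothetical stand-in, and the warning and error messages are invented for the example.

```python
import warnings


def handle_null_labels_sketch(rows, strategy):
    """Plain-Python illustration of the drop / raise / category strategies."""
    null_count = sum(1 for row in rows if row["__LABEL__"] is None)
    if null_count == 0:
        return list(rows)
    if strategy == "raise":
        raise ValueError(f"{null_count} row(s) have a null label")
    if strategy == "category":
        # Make missing values an explicit, visible category.
        return [
            {**row, "__LABEL__": "N/A"} if row["__LABEL__"] is None else row
            for row in rows
        ]
    if strategy == "drop":
        kept = [row for row in rows if row["__LABEL__"] is not None]
        if not kept:
            raise ValueError("dropping null labels would remove every row")
        warnings.warn(
            f"dropped {null_count} row(s) with a null label", UserWarning
        )
        return kept
    raise ValueError(f"unknown strategy: {strategy!r}")


rows = [{"__LABEL__": "a"}, {"__LABEL__": None}]
as_category = handle_null_labels_sketch(rows, "category")
```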

tanat.visualization.sequence.base.utils.handle_null_time_index(lf: LazyFrame, strategy: Literal['drop', 'raise']) → LazyFrame[source]#

Handle null values in time index columns (__TIME__, __START__, __END__).

Inspects the schema to determine which internal time columns are present and applies the chosen strategy uniformly.

Parameters:

lf – Input LazyFrame (columns already renamed to internal names).
strategy – "drop" removes null rows with a UserWarning; "raise" raises ValueError immediately.

Returns:

LazyFrame with null time index rows handled.

Raises:

ValueError – If strategy is "raise" and nulls are found, or if "drop" would remove every row.

tanat.visualization.sequence.base.utils.rename_id_column(lf: LazyFrame, id_col: str) → LazyFrame[source]#

Rename the sequence ID column to __ID__.

Parameters:

lf – Input LazyFrame.
id_col – Name of the column that holds the sequence identifier.

Returns:

LazyFrame with id_col renamed to __ID__.

tanat.visualization.sequence.base.utils.rename_time_index_columns(lf: LazyFrame, time_cols: list[str]) → LazyFrame[source]#

Rename the two time index boundary columns to __START__ / __END__.

Used by builders that operate on interval or state pools (barplot-duration, distribution, spanplot). The timeline builder uses its own variant because events have a single __TIME__ column.

Parameters:

lf – Input LazyFrame.
time_cols – Exactly two time index column names [start_col, end_col].

Returns:

LazyFrame with start_col → __START__ and end_col → __END__.

Raises:

ValueError – If time_cols does not contain exactly two entries.
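The validation and renaming rule above reduces to building a two-entry mapping. A small sketch, assuming a hypothetical helper named `time_index_rename_map`; a mapping of this shape is the kind of argument Polars' `LazyFrame.rename` accepts, although the library's own implementation is not shown in this reference.

```python
def time_index_rename_map(time_cols):
    """Hypothetical helper: map [start_col, end_col] to the internal names."""
    if len(time_cols) != 2:
        raise ValueError(
            f"expected exactly two time index columns, got {len(time_cols)}"
        )
    start_col, end_col = time_cols
    return {start_col: "__START__", end_col: "__END__"}


mapping = time_index_rename_map(["admission", "discharge"])
```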

tanat.visualization.sequence.base.utils.resolve_display_unit(display_unit: Literal['days', 'hours', 'minutes', 'seconds'] | None, *, is_datetime: bool) → Literal['days', 'hours', 'minutes', 'seconds'] | None[source]#

Resolve display_unit for relative time mode.

datetime + None → "days" (silent default)
datetime + explicit → pass through unchanged
timestep + None → None (numeric offsets, no conversion)
timestep + explicit → emits UserWarning, returns None

Parameters:

display_unit – Requested unit, or None for the default.
is_datetime – True when the pool uses Datetime time columns.

Returns:

Resolved DisplayUnit value, or None for timestep pools.
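The four cases listed above translate directly into a small function. This is a sketch of the documented decision table, not the library source; `resolve_display_unit_sketch` and the warning text are illustrative.

```python
import warnings


def resolve_display_unit_sketch(display_unit, *, is_datetime):
    """Illustrative version of the four documented cases."""
    if is_datetime:
        # datetime + None -> "days"; datetime + explicit -> unchanged
        return "days" if display_unit is None else display_unit
    if display_unit is not None:
        # timestep + explicit -> warn and fall through to None
        warnings.warn(
            "display_unit is ignored for timestep pools", UserWarning
        )
    # timestep pools always resolve to None (numeric offsets, no conversion)
    return None
```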

tanat.visualization.sequence.base.utils.resolve_label(lf: LazyFrame, feature: str) → LazyFrame[source]#

Rename feature to the internal __LABEL__ column.

Parameters:

lf – Input LazyFrame.
feature – Name of the entity feature column to use as label.

Returns:

LazyFrame with feature renamed to __LABEL__.

tanat.visualization.sequence.base.utils.shift_time_to_relative(lf: pl.LazyFrame, sequence_or_pool: SequencePool | Sequence, time_col: str, end_col: str | None = None, *, display_unit: DisplayUnit = 'days') → pl.LazyFrame[source]#

Subtract per-ID T0 from temporal columns, converting to relative time.

Joins the pool’s T0 table on __ID__ and replaces time_col (and optionally end_col) with col - _T0_. Rows whose _T0_ is null are dropped with a UserWarning stating the count of dropped IDs. Raises ValueError when all T0 values are null.

For datetime pools the shifted columns are further converted to Float64 in the requested display_unit so that the x-axis shows a human-readable numeric scale. For timestep pools the numeric offset is left unchanged and display_unit is ignored.

Parameters:

lf – LazyFrame with __ID__ and the named time columns already renamed to their internal names (e.g. __TIME__, __START__).
sequence_or_pool – Source pool or sequence (provides _get_t0_df() and settings).
time_col – Internal time column name ("__TIME__" or "__START__").
end_col – Optional end column name ("__END__" for interval/state pools).
display_unit – Target unit for datetime pools ("days" by default). Ignored for timestep pools.

Returns:

LazyFrame with time columns shifted to offsets from T0. Datetime pools yield Float64 in display_unit; timestep pools yield a numeric offset.

Raises:

ValueError – If every sequence has a null T0 value.
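The shift itself can be sketched in plain Python on a list of dicts. `shift_to_relative_sketch`, the `t0_by_id` mapping, and `UNIT_SECONDS` are hypothetical; the documented function instead joins a T0 table onto a LazyFrame, which is not reproduced here. Only a single time column is handled to keep the example short.

```python
import warnings
from datetime import datetime, timedelta

# Seconds per display unit, used to convert timedelta offsets to floats.
UNIT_SECONDS = {"days": 86400.0, "hours": 3600.0, "minutes": 60.0, "seconds": 1.0}


def shift_to_relative_sketch(rows, t0_by_id, *, time_col="__TIME__",
                             display_unit="days"):
    """Subtract each ID's T0 from time_col; drop rows whose T0 is null."""
    if rows and all(t0_by_id.get(row["__ID__"]) is None for row in rows):
        raise ValueError("every sequence has a null T0 value")
    dropped_ids = set()
    shifted = []
    for row in rows:
        t0 = t0_by_id.get(row["__ID__"])
        if t0 is None:
            dropped_ids.add(row["__ID__"])
            continue
        offset = row[time_col] - t0
        if isinstance(offset, timedelta):
            # Datetime input: express the offset as a float in display_unit.
            offset = offset.total_seconds() / UNIT_SECONDS[display_unit]
        # Timestep input: the numeric offset is left as-is.
        shifted.append({**row, time_col: offset})
    if dropped_ids:
        warnings.warn(
            f"dropped {len(dropped_ids)} ID(s) with a null T0", UserWarning
        )
    return shifted


events = [
    {"__ID__": 1, "__TIME__": datetime(2024, 1, 3, 12)},
    {"__ID__": 2, "__TIME__": datetime(2024, 1, 5)},
]
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # ID 2 has no T0 and is dropped
    relative = shift_to_relative_sketch(
        events, {1: datetime(2024, 1, 1), 2: None}
    )
```

In this example the surviving event sits 2.5 days after its T0, since 2 days and 12 hours is 216000 seconds.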

Module contents#

Sequence visualization base.

class tanat.visualization.sequence.base.BaseSequenceVizBuilder(settings: dict | BaseVizSettings | None = None, *, allow_large: bool = False)[source]#

Bases: ABC, CachableSettings, Registrable

Abstract base class for sequence visualization builders.

The draw() method orchestrates the full pipeline: prepare data → create figure → render → apply styling.

MAX_CATEGORY: int = 30[source]#

MAX_FACET: int = 20[source]#

SETTINGS_CLASS: type[BaseVizSettings] | None = None[source]#

__init__(settings: dict | BaseVizSettings | None = None, *, allow_large: bool = False) → None[source]#

allow_large: bool[source]#

colors(spec: str | dict | list) → BaseSequenceVizBuilder[source]#

Set the color specification. Chainable.

Parameters:

spec – A matplotlib colormap name (str), a mapping of {label: color} (dict), or an ordered list of colors (list). None falls back to the matplotlib default cycle.

draw(sequence_or_pool: SequencePool | Sequence, *, entity_feature: str) → VisualizationResult[source]#

Full pipeline: prepare data, create figure, render, return result.

Parameters:

sequence_or_pool – A SequencePool or an individual Sequence (e.g. obtained via pool[42]).
entity_feature – Entity feature column to use as category labels.

Returns:

A VisualizationResult.

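facet(by: str, *, is_static: bool = False, cols: int = 3, share_x: bool = True, share_y: bool = True, figsize_per_facet: tuple[float, float] = (5.0, 4.0), title_template: str = '{by} = {value}') → BaseSequenceVizBuilder[source]#

Enable faceted (small-multiples) view. Chainable.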

Calling this method updates the settings and clears the data cache, so any previously cached prepare_data() result is discarded.

Parameters:

by – Feature name to split by.
is_static – True → static feature; False (default) → entity feature.
cols – Number of columns in the facet grid (default 3).
share_x – Share the x-axis scale across facets (default True).
share_y – Share the y-axis scale across facets (default True).
figsize_per_facet – Width × height of each cell in inches.
title_template – Format string for each facet title. Placeholders: {by}, {value}, {index}.

figsize(width: float, height: float) → BaseSequenceVizBuilder[source]#

Set the figure dimensions. Chainable.

Parameters:

width – Figure width in inches.
height – Figure height in inches.

grid(*, show: bool = True, color: str | None = None, linewidth: float | None = None, axis: str | None = None) → BaseSequenceVizBuilder[source]#

Configure the background grid. Chainable.

Grid lines are always rendered behind markers and bars.

Parameters:

show – Display grid lines when True (default). Pass False to explicitly hide a grid that was previously enabled.
color – Line color, any matplotlib color string (default "lightgrey").
linewidth – Line width in points (default 0.8).
axis – Which axis to draw lines for: "both" (default), "x", or "y".

legend(*, show: bool = True, location: str = 'best', title: str | None = None) → BaseSequenceVizBuilder[source]#

Configure legend display. Chainable.

Parameters:

show – Display the legend when True (default).
location – Matplotlib location string, e.g. "upper right".
title – Optional legend title.

legend_off() → BaseSequenceVizBuilder[source]#

Hide the legend. Convenience shortcut for .legend(show=False). Chainable.

prepare_data(sequence_or_pool: SequencePool | Sequence, *, entity_feature: str) → DataFrame[source]#

Prepare the aggregated Polars DataFrame for rendering.

Parameters:

sequence_or_pool – A SequencePool or an individual Sequence (e.g. obtained via pool[42]).
entity_feature – Entity feature column to use as category labels.

Returns:

Polars DataFrame with columns [__LABEL__, __VALUE__], plus __COLOR__ when a color spec has been set via colors().

Raises:

TypeError – If sequence_or_pool is not a SequencePool or a Sequence.
KeyError – If entity_feature is not a declared entity feature.

title(text: str, *, fontsize: int | None = None, fontweight: str | None = None, pad: float | None = None) → BaseSequenceVizBuilder[source]#

Set the figure title. Chainable.

Parameters:

text – Title string displayed above the chart.
fontsize – Font size in points. None uses the matplotlib default.
fontweight – Font weight, e.g. "bold" or "normal".
pad – Spacing between the title and the chart in points.