tanat.visualization.sequence.base package#

Submodules#

tanat.visualization.sequence.base.builder module#

BaseSequenceVizBuilder: abstract base for all sequence visualization builders.

class tanat.visualization.sequence.base.builder.BaseSequenceVizBuilder(settings: dict | BaseVizSettings | None = None, *, allow_large: bool = False)[source]#

Bases: ABC, CachableSettings, Registrable

Abstract base class for sequence visualization builders.

The draw() method orchestrates the full pipeline: prepare data → create figure → render → apply styling.

MAX_CATEGORY: int = 30[source]#
MAX_FACET: int = 20[source]#
SETTINGS_CLASS: type[BaseVizSettings] | None = None[source]#
__init__(settings: dict | BaseVizSettings | None = None, *, allow_large: bool = False) None[source]#
colors(spec: str | dict | list) BaseSequenceVizBuilder[source]#

Set the color specification. Chainable.

Parameters:

spec – A matplotlib colormap name (str), a mapping of {label: color} (dict), or an ordered list of colors (list). None falls back to the matplotlib default cycle.

draw(sequence_or_pool: SequencePool | Sequence, *, entity_feature: str) VisualizationResult[source]#

Full pipeline: prepare data, create figure, render, return result.

Parameters:
  • sequence_or_pool – A SequencePool or an individual Sequence (e.g. obtained via pool[42]).

  • entity_feature – Entity feature column to use as category labels.

Returns:

A VisualizationResult.

facet(by: str, *, is_static: bool = False, cols: int = 3, share_x: bool = True, share_y: bool = True, figsize_per_facet: tuple[float, float] = (5.0, 4.0), title_template: str = '{by} = {value}') BaseSequenceVizBuilder[source]#

Enable faceted (small-multiples) view. Chainable.

Calling this method updates the settings and clears the data cache, so any previously cached prepare_data() result is discarded.

Parameters:
  • by – Feature name to split by.

  • is_static – True → static feature; False (default) → entity feature.

  • cols – Number of columns in the facet grid (default 3).

  • share_x – Share the x-axis scale across facets (default True).

  • share_y – Share the y-axis scale across facets (default True).

  • figsize_per_facet – Width × height of each cell in inches.

  • title_template – Format string for each facet title. Placeholders: {by}, {value}, {index}.
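
How title_template and cols might translate into facet titles and a grid layout can be sketched in plain Python. This is an illustration only, not the library's implementation: the helper names facet_titles, grid_shape and figure_size are hypothetical, the zero-based {index} is an assumption, and so is the idea that the overall figure size is the per-facet size multiplied by the grid dimensions.

```python
import math


def facet_titles(by, values, template="{by} = {value}"):
    """Format one title per facet value (index is assumed to be zero-based)."""
    return [
        template.format(by=by, value=value, index=index)
        for index, value in enumerate(values)
    ]


def grid_shape(n_facets, cols=3):
    """Return (rows, cols) for a grid that holds n_facets cells."""
    rows = math.ceil(n_facets / cols)
    return rows, cols


def figure_size(n_facets, cols=3, figsize_per_facet=(5.0, 4.0)):
    """Assumed overall size: per-facet width and height times the grid shape."""
    rows, cols = grid_shape(n_facets, cols)
    width, height = figsize_per_facet
    return cols * width, rows * height


titles = facet_titles("region", ["north", "south"])
shape = grid_shape(7, cols=3)
size = figure_size(7, cols=3)
```

Placeholders that a template does not use are simply ignored by str.format, so the default template works even though {index} is supplied.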

figsize(width: float, height: float) BaseSequenceVizBuilder[source]#

Set the figure dimensions. Chainable.

Parameters:
  • width – Figure width in inches.

  • height – Figure height in inches.

grid(*, show: bool = True, color: str | None = None, linewidth: float | None = None, axis: str | None = None) BaseSequenceVizBuilder[source]#

Configure the background grid. Chainable.

Grid lines are always rendered behind markers and bars.

Parameters:
  • show – Display grid lines when True (default). Pass False to explicitly hide a grid that was previously enabled.

  • color – Line color, any matplotlib color string (default "lightgrey").

  • linewidth – Line width in points (default 0.8).

  • axis – Which axis to draw lines for: "both" (default), "x", or "y".

legend(*, show: bool = True, location: str = 'best', title: str | None = None) BaseSequenceVizBuilder[source]#

Configure legend display. Chainable.

Parameters:
  • show – Display the legend when True (default).

  • location – Matplotlib location string, e.g. "upper right".

  • title – Optional legend title.

legend_off() BaseSequenceVizBuilder[source]#

Hide the legend. Convenience shortcut for .legend(show=False). Chainable.

prepare_data(sequence_or_pool: SequencePool | Sequence, *, entity_feature: str) DataFrame[source]#

Prepare the aggregated Polars DataFrame for rendering.

Parameters:
  • sequence_or_pool – A SequencePool or an individual Sequence (e.g. obtained via pool[42]).

  • entity_feature – Entity feature column to use as category labels.

Returns:

Polars DataFrame with columns [__LABEL__, __VALUE__], plus __COLOR__ when a color spec has been set via colors().

Raises:
  • TypeError – If sequence_or_pool is not a SequencePool or a Sequence.

  • KeyError – If entity_feature is not a declared entity feature.

title(text: str, *, fontsize: int | None = None, fontweight: str | None = None, pad: float | None = None) BaseSequenceVizBuilder[source]#

Set the figure title. Chainable.

Parameters:
  • text – Title string displayed above the chart.

  • fontsize – Font size in points. None uses the matplotlib default.

  • fontweight – Font weight, e.g. "bold" or "normal".

  • pad – Spacing between the title and the chart in points.
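
The chainable setters and the draw() pipeline described above follow a common pattern: each setter stores a value and returns the builder, and an abstract base class fixes the order of the pipeline steps while subclasses supply them. The sketch below shows that pattern with toy classes. ToyBuilder, CountBuilder and render() are hypothetical names, and plain dicts stand in for the real settings, figure, and VisualizationResult objects.

```python
from abc import ABC, abstractmethod


class ToyBuilder(ABC):
    """Hypothetical stand-in for a sequence visualization builder."""

    def __init__(self):
        self.settings = {}

    def title(self, text):
        self.settings["title"] = text
        return self  # returning self is what makes the setter chainable

    def legend(self, *, show=True):
        self.settings["legend"] = show
        return self

    def legend_off(self):
        # Convenience shortcut, mirroring .legend(show=False)
        return self.legend(show=False)

    @abstractmethod
    def prepare_data(self, labels):
        """Aggregate raw input into the structure the renderer expects."""

    @abstractmethod
    def render(self, data):
        """Turn prepared data into a result object."""

    def draw(self, labels):
        # The base class fixes the order of the steps; subclasses fill them in.
        data = self.prepare_data(labels)
        return self.render(data)


class CountBuilder(ToyBuilder):
    def prepare_data(self, labels):
        counts = {}
        for label in labels:
            counts[label] = counts.get(label, 0) + 1
        return counts

    def render(self, data):
        return {
            "title": self.settings.get("title"),
            "legend": self.settings.get("legend", True),
            "bars": data,
        }


result = CountBuilder().title("Visits").legend_off().draw(["a", "b", "a"])
```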

tanat.visualization.sequence.base.exceptions module#

Shared visualization exceptions for sequence builders.

exception tanat.visualization.sequence.base.exceptions.IncompatibleDisplayUnitError(display_unit: str | None, *, is_datetime: bool)[source]#

Bases: TanaTException, ValueError

Raised when display_unit is incompatible with the pool’s time index.

Two incompatible situations:

  • A datetime pool requires an explicit display_unit to convert raw millisecond durations to a human-readable unit.

  • A numeric timestep pool does not support display_unit; durations are already in raw timestep values.

__init__(display_unit: str | None, *, is_datetime: bool) None[source]#
exception tanat.visualization.sequence.base.exceptions.UnsupportedSequenceTypeError(found_type: str, *, compatible_types: frozenset[str] | None = None)[source]#

Bases: TanaTException, ValueError

Raised when the pool type is not compatible with a visualization builder.

Each builder declares its _COMPATIBLE_TYPES class attribute. Passing a pool of a different type raises this exception.

__init__(found_type: str, *, compatible_types: frozenset[str] | None = None) None[source]#
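
Because both exceptions also derive from ValueError, callers can catch them either through the library's base exception or through a plain except ValueError. The sketch below reproduces that arrangement with stand-in classes. The TanaTException defined here is a local placeholder (assumed to derive from Exception), and the message wording and the check_pool_type helper are hypothetical.

```python
class TanaTException(Exception):
    """Local placeholder for the library's base exception (assumption)."""


class UnsupportedSequenceTypeError(TanaTException, ValueError):
    def __init__(self, found_type, *, compatible_types=None):
        self.found_type = found_type
        self.compatible_types = compatible_types
        if compatible_types:
            expected = ", ".join(sorted(compatible_types))
        else:
            expected = "unknown"
        # Hypothetical message format, chosen for this sketch only.
        super().__init__(
            f"Unsupported sequence type {found_type!r}; expected one of: {expected}"
        )


def check_pool_type(pool_type, compatible_types):
    """Hypothetical guard a builder could run before preparing data."""
    if pool_type not in compatible_types:
        raise UnsupportedSequenceTypeError(
            pool_type, compatible_types=compatible_types
        )


try:
    check_pool_type("event", frozenset({"interval", "state"}))
except ValueError as exc:  # also catchable as TanaTException
    caught = exc
```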

tanat.visualization.sequence.base.literals module#

Shared Literal type aliases for sequence visualization builders.

tanat.visualization.sequence.base.settings module#

Shared base settings for sequence visualization builders.

class tanat.visualization.sequence.base.settings.NullHandling(*, na_time_index: NaTimeIndex = 'drop', na_label: NaLabel = 'drop')[source]#

Bases: object

How the pipeline handles null values before rendering.

This is a data-preparation concern, not a visual one: the chosen strategies decide which rows survive into the chart, but they do not change any visual property.

na_time_index[source]#

Strategy for null values in time index columns (__TIME__, __START__, __END__). "drop" (default) removes affected rows with a UserWarning; "raise" raises ValueError immediately.

Type:

Literal['drop', 'raise']

na_label[source]#

Strategy for null values in the entity feature label. "drop" (default) removes rows with a UserWarning; "raise" raises ValueError immediately; "category" replaces nulls with the string "N/A", making missing values an explicit visible category.

Type:

Literal['drop', 'raise', 'category']

__init__(*args: Any, **kwargs: Any) None[source]#
model_dump(*, mode='python', **dump_kwargs)[source]#

Dump settings to a dict via Pydantic serialization.

na_label: Literal['drop', 'raise', 'category'] = 'drop'[source]#
na_time_index: Literal['drop', 'raise'] = 'drop'[source]#

tanat.visualization.sequence.base.utils module#

Shared pure-Polars utility functions for sequence visualization data preparation.

tanat.visualization.sequence.base.utils.handle_null_labels(lf: LazyFrame, strategy: Literal['drop', 'raise', 'category']) LazyFrame[source]#

Handle null values in the __LABEL__ column.

Parameters:
  • lf – Input LazyFrame (must contain __LABEL__).

  • strategy – "drop" removes null-label rows with a UserWarning; "raise" raises ValueError immediately; "category" replaces nulls with the string "N/A".

Returns:

LazyFrame with null labels handled.

Raises:

ValueError – If strategy is "raise" and nulls are found, or if "drop" would remove every row.
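
The three strategies can be illustrated without Polars. The following sketch applies the documented rules to a plain list of dicts instead of a LazyFrame. It is a re-implementation for illustration only: handle_null_labels_sketch is a hypothetical name, and the warning and error messages are invented.

```python
import warnings


def handle_null_labels_sketch(rows, strategy):
    """Apply a null-label strategy to a list of dicts with a "__LABEL__" key."""
    n_null = sum(1 for row in rows if row["__LABEL__"] is None)
    if n_null == 0:
        return list(rows)

    if strategy == "raise":
        raise ValueError(f"{n_null} row(s) have a null label.")

    if strategy == "category":
        # Missing labels become an explicit, visible category.
        return [
            {**row, "__LABEL__": "N/A"} if row["__LABEL__"] is None else row
            for row in rows
        ]

    if strategy == "drop":
        kept = [row for row in rows if row["__LABEL__"] is not None]
        if not kept:
            raise ValueError("Dropping null labels would remove every row.")
        warnings.warn(f"Dropped {n_null} row(s) with a null label.", UserWarning)
        return kept

    raise ValueError(f"Unknown strategy: {strategy!r}")


rows = [{"__LABEL__": "A"}, {"__LABEL__": None}]
as_category = handle_null_labels_sketch(rows, "category")
```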

tanat.visualization.sequence.base.utils.handle_null_time_index(lf: LazyFrame, strategy: Literal['drop', 'raise']) LazyFrame[source]#

Handle null values in time index columns (__TIME__, __START__, __END__).

Inspects the schema to determine which internal time columns are present and applies the chosen strategy uniformly.

Parameters:
  • lf – Input LazyFrame (columns already renamed to internal names).

  • strategy – "drop" removes null rows with a UserWarning; "raise" raises ValueError immediately.

Returns:

LazyFrame with null time index rows handled.

Raises:

ValueError – If strategy is "raise" and nulls are found, or if "drop" would remove every row.

tanat.visualization.sequence.base.utils.rename_id_column(lf: LazyFrame, id_col: str) LazyFrame[source]#

Rename the sequence ID column to __ID__.

Parameters:
  • lf – Input LazyFrame.

  • id_col – Name of the column that holds the sequence identifier.

Returns:

LazyFrame with id_col renamed to __ID__.

tanat.visualization.sequence.base.utils.rename_time_index_columns(lf: LazyFrame, time_cols: list[str]) LazyFrame[source]#

Rename the two time index boundary columns to __START__ / __END__.

Used by builders that operate on interval or state pools (barplot-duration, distribution, spanplot). The timeline builder uses its own variant because events have a single __TIME__ column.

Parameters:
  • lf – Input LazyFrame.

  • time_cols – Exactly two time index column names [start_col, end_col].

Returns:

LazyFrame with start_col → __START__ and end_col → __END__.

Raises:

ValueError – If time_cols does not contain exactly two entries.

tanat.visualization.sequence.base.utils.resolve_display_unit(display_unit: Literal['days', 'hours', 'minutes', 'seconds'] | None, *, is_datetime: bool) Literal['days', 'hours', 'minutes', 'seconds'] | None[source]#

Resolve display_unit for relative time mode.

  • datetime + None → "days" (silent default)

  • datetime + explicit → pass through unchanged

  • timestep + None → None (numeric offsets, no conversion)

  • timestep + explicit → emits UserWarning, returns None

Parameters:
  • display_unit – Requested unit, or None for the default.

  • is_datetime – True when the pool uses Datetime time columns.

Returns:

Resolved DisplayUnit value, or None for timestep pools.
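
The four cases listed above amount to a small decision table. A minimal sketch of that table follows. resolve_display_unit_sketch is a hypothetical re-implementation written from the description, and the warning text is invented.

```python
import warnings


def resolve_display_unit_sketch(display_unit, *, is_datetime):
    """Mirror the documented decision table for relative time mode."""
    if is_datetime:
        # datetime + None -> "days"; datetime + explicit -> unchanged
        return "days" if display_unit is None else display_unit
    if display_unit is not None:
        # timestep + explicit -> warn, then fall through to None
        warnings.warn("display_unit is ignored for timestep pools.", UserWarning)
    return None


default_unit = resolve_display_unit_sketch(None, is_datetime=True)
explicit_unit = resolve_display_unit_sketch("hours", is_datetime=True)
timestep_unit = resolve_display_unit_sketch(None, is_datetime=False)
```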

tanat.visualization.sequence.base.utils.resolve_label(lf: LazyFrame, feature: str) LazyFrame[source]#

Rename feature to the internal __LABEL__ column.

Parameters:
  • lf – Input LazyFrame.

  • feature – Name of the entity feature column to use as label.

Returns:

LazyFrame with feature renamed to __LABEL__.

tanat.visualization.sequence.base.utils.shift_time_to_relative(lf: pl.LazyFrame, sequence_or_pool: SequencePool | Sequence, time_col: str, end_col: str | None = None, *, display_unit: DisplayUnit = 'days') pl.LazyFrame[source]#

Subtract per-ID T0 from temporal columns, converting to relative time.

Joins the pool’s T0 table on __ID__ and replaces time_col (and optionally end_col) with col - _T0_. Rows whose _T0_ is null are dropped with a UserWarning stating the count of dropped IDs. Raises ValueError when all T0 values are null.

For datetime pools the shifted columns are further converted to Float64 in the requested display_unit so that the x-axis shows a human-readable numeric scale. For timestep pools the numeric offset is left unchanged and display_unit is ignored.

Parameters:
  • lf – LazyFrame with __ID__ and the named time columns already renamed to their internal names (e.g. __TIME__, __START__).

  • sequence_or_pool – Source pool or sequence (provides _get_t0_df() and settings).

  • time_col – Internal time column name ("__TIME__" or "__START__").

  • end_col – Optional end column name ("__END__" for interval/state pools).

  • display_unit – Target unit for datetime pools ("days" by default). Ignored for timestep pools.

Returns:

LazyFrame with time columns shifted to offsets from T0. Datetime pools yield Float64 in display_unit; timestep pools yield a numeric offset.

Raises:

ValueError – If every sequence has a null T0 value.
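
For a datetime pool, the shift amounts to subtracting each ID's T0 and dividing the resulting duration by the chosen unit. The sketch below shows that arithmetic with the standard library only, on a list of dicts rather than a LazyFrame. It is an illustration under stated assumptions: shift_to_relative_sketch and the t0_by_id mapping are hypothetical, only a single __TIME__ column is handled, and the messages are invented.

```python
import warnings
from datetime import datetime, timedelta

# One unit of each supported display_unit, used as a divisor.
UNIT = {
    "days": timedelta(days=1),
    "hours": timedelta(hours=1),
    "minutes": timedelta(minutes=1),
    "seconds": timedelta(seconds=1),
}


def shift_to_relative_sketch(rows, t0_by_id, display_unit="days"):
    """Replace "__TIME__" with a float offset from each ID's T0."""
    ids = {row["__ID__"] for row in rows}
    missing = {i for i in ids if t0_by_id.get(i) is None}
    if ids and missing == ids:
        raise ValueError("Every sequence has a null T0 value.")
    if missing:
        warnings.warn(f"Dropped {len(missing)} ID(s) with a null T0.", UserWarning)

    unit = UNIT[display_unit]
    return [
        # timedelta / timedelta yields a float number of units
        {**row, "__TIME__": (row["__TIME__"] - t0_by_id[row["__ID__"]]) / unit}
        for row in rows
        if row["__ID__"] not in missing
    ]


rows = [{"__ID__": 1, "__TIME__": datetime(2024, 1, 3, 12)}]
t0_by_id = {1: datetime(2024, 1, 1)}
in_days = shift_to_relative_sketch(rows, t0_by_id)
in_hours = shift_to_relative_sketch(rows, t0_by_id, display_unit="hours")
```

Here the event falls two and a half days after T0, so the offset is 2.5 in days and 60.0 in hours.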

Module contents#

Sequence visualization base.

class tanat.visualization.sequence.base.BaseSequenceVizBuilder(settings: dict | BaseVizSettings | None = None, *, allow_large: bool = False)[source]#

Bases: ABC, CachableSettings, Registrable

Abstract base class for sequence visualization builders.

The draw() method orchestrates the full pipeline: prepare data → create figure → render → apply styling.

MAX_CATEGORY: int = 30[source]#
MAX_FACET: int = 20[source]#
SETTINGS_CLASS: type[BaseVizSettings] | None = None[source]#
__init__(settings: dict | BaseVizSettings | None = None, *, allow_large: bool = False) None[source]#
allow_large: bool[source]#
colors(spec: str | dict | list) BaseSequenceVizBuilder[source]#

Set the color specification. Chainable.

Parameters:

spec – A matplotlib colormap name (str), a mapping of {label: color} (dict), or an ordered list of colors (list). None falls back to the matplotlib default cycle.

draw(sequence_or_pool: SequencePool | Sequence, *, entity_feature: str) VisualizationResult[source]#

Full pipeline: prepare data, create figure, render, return result.

Parameters:
  • sequence_or_pool – A SequencePool or an individual Sequence (e.g. obtained via pool[42]).

  • entity_feature – Entity feature column to use as category labels.

Returns:

A VisualizationResult.

facet(by: str, *, is_static: bool = False, cols: int = 3, share_x: bool = True, share_y: bool = True, figsize_per_facet: tuple[float, float] = (5.0, 4.0), title_template: str = '{by} = {value}') BaseSequenceVizBuilder[source]#

Enable faceted (small-multiples) view. Chainable.

Calling this method updates the settings and clears the data cache, so any previously cached prepare_data() result is discarded.

Parameters:
  • by – Feature name to split by.

  • is_static – True → static feature; False (default) → entity feature.

  • cols – Number of columns in the facet grid (default 3).

  • share_x – Share the x-axis scale across facets (default True).

  • share_y – Share the y-axis scale across facets (default True).

  • figsize_per_facet – Width × height of each cell in inches.

  • title_template – Format string for each facet title. Placeholders: {by}, {value}, {index}.

figsize(width: float, height: float) BaseSequenceVizBuilder[source]#

Set the figure dimensions. Chainable.

Parameters:
  • width – Figure width in inches.

  • height – Figure height in inches.

grid(*, show: bool = True, color: str | None = None, linewidth: float | None = None, axis: str | None = None) BaseSequenceVizBuilder[source]#

Configure the background grid. Chainable.

Grid lines are always rendered behind markers and bars.

Parameters:
  • show – Display grid lines when True (default). Pass False to explicitly hide a grid that was previously enabled.

  • color – Line color, any matplotlib color string (default "lightgrey").

  • linewidth – Line width in points (default 0.8).

  • axis – Which axis to draw lines for: "both" (default), "x", or "y".

legend(*, show: bool = True, location: str = 'best', title: str | None = None) BaseSequenceVizBuilder[source]#

Configure legend display. Chainable.

Parameters:
  • show – Display the legend when True (default).

  • location – Matplotlib location string, e.g. "upper right".

  • title – Optional legend title.

legend_off() BaseSequenceVizBuilder[source]#

Hide the legend. Convenience shortcut for .legend(show=False). Chainable.

prepare_data(sequence_or_pool: SequencePool | Sequence, *, entity_feature: str) DataFrame[source]#

Prepare the aggregated Polars DataFrame for rendering.

Parameters:
  • sequence_or_pool – A SequencePool or an individual Sequence (e.g. obtained via pool[42]).

  • entity_feature – Entity feature column to use as category labels.

Returns:

Polars DataFrame with columns [__LABEL__, __VALUE__], plus __COLOR__ when a color spec has been set via colors().

Raises:
  • TypeError – If sequence_or_pool is not a SequencePool or a Sequence.

  • KeyError – If entity_feature is not a declared entity feature.

title(text: str, *, fontsize: int | None = None, fontweight: str | None = None, pad: float | None = None) BaseSequenceVizBuilder[source]#

Set the figure title. Chainable.

Parameters:
  • text – Title string displayed above the chart.

  • fontsize – Font size in points. None uses the matplotlib default.

  • fontweight – Font weight, e.g. "bold" or "normal".

  • pad – Spacing between the title and the chart in points.