tanat.sequence.type package

Subpackages

Module contents

Sequence subtypes.

class tanat.sequence.type.EventEntity(id_value, store: str | Path | SequenceStore, features: list[str] | None = None, *, rank: int, store_index: int, cast_recipe: SequenceCastRecipe | dict | None = None, virtual_id: str | None = None, parent_metadata: SequenceMetadata | None = None)

Bases: Entity

Entity representing one event row (single timestamp).

class tanat.sequence.type.EventSequence(id_value, store: str | Path | SequenceStore, *, id_column: str = 'id', time_column: str = 'time', entity_features: list[str] | None = None, static_features: list[str] | None = None)

Bases: Sequence

A single event sequence (one timestamp per entity row).

SETTINGS_CLASS

alias of EventSequenceSettings

__init__(id_value, store: str | Path | SequenceStore, *, id_column: str = 'id', time_column: str = 'time', entity_features: list[str] | None = None, static_features: list[str] | None = None) → None

Create an event sequence for id_value.

Parameters:
  • id_value – Sequence identifier.

  • store – Store path, name, or SequenceStore instance.

  • id_column – User-facing name for the sequence ID column.

  • time_column – User-facing name for the event timestamp column.

  • entity_features – Subset of entity feature names to expose. None → all available from the store.

  • static_features – Static feature names to expose. None → all available. [] → none.

class tanat.sequence.type.EventSequencePool(store: str | Path | SequenceStore, *, id_column: str = 'id', time_column: str = 'time', entity_features: list[str] | None = None, static_features: list[str] | None = None, cast_recipe: SequenceCastRecipe | dict | None = None)

Bases: SequencePool

Pool of event sequences (single timestamp per entity row).

SETTINGS_CLASS

alias of EventSequenceSettings

__init__(store: str | Path | SequenceStore, *, id_column: str = 'id', time_column: str = 'time', entity_features: list[str] | None = None, static_features: list[str] | None = None, cast_recipe: SequenceCastRecipe | dict | None = None) → None

Create an event sequence pool backed by store.

Parameters:
  • store – Store path, name, or SequenceStore instance.

  • id_column – User-facing name for the sequence ID column.

  • time_column – User-facing name for the event timestamp column.

  • entity_features – Subset of entity feature names to expose. None → all available from the store.

  • static_features – Static feature names to expose. None → all available. [] → none.

  • cast_recipe – Optional cast recipe (or dict) applied at read time. Normalised via SequenceCastRecipe.coerce() and probed eagerly.

Raises:

TypeError – If cast_recipe is not a SequenceCastRecipe, dict, or None.

as_event() → EventSequencePool

Return this pool unchanged: source and target types are identical.

A warning is emitted to signal the no-op conversion.

Returns:

self (no copy, no I/O).

as_interval(duration: Duration, *, start_column: str = 'start', end_column: str = 'end', destination: str | Path | None = None, overwrite: bool = False) → IntervalSequencePool

Convert this event pool to an interval pool by computing _t_end.

Each event timestamp becomes _t_start; _t_end is computed as _t_start + duration. The resulting time index is stored as a virtual override (ephemeral) or written to a new persistent store.

Parameters:
  • duration – Interval length added to each event timestamp. Can be:

    • A timedelta or numeric scalar: applied uniformly to every event.

    • A str: name of an entity feature column whose values provide per-row durations.

  • start_column – User-facing name for the start column. Defaults to "start".

  • end_column – User-facing name for the end column. Defaults to "end".

  • destination – None → ephemeral result; path → new persistent store.

  • overwrite – Replace destination if it already exists.

Returns:

A new IntervalSequencePool.
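The end-time arithmetic described above (end = start + duration, with either one uniform duration or a per-row column) can be sketched in plain Python. This only illustrates the computation and is not tanat's implementation; the helper name `events_to_intervals`, the dict-based row layout, and the `stay` column are assumptions made for the example.

```python
from datetime import datetime, timedelta


def events_to_intervals(rows, duration):
    """Return (start, end) pairs for a list of event rows.

    rows     -- list of dicts, each with a "time" key (hypothetical layout)
    duration -- a timedelta applied to every row, or a str naming a
                per-row column that holds a timedelta
    """
    intervals = []
    for row in rows:
        # A str selects a per-row duration; anything else is used as-is.
        step = row[duration] if isinstance(duration, str) else duration
        intervals.append((row["time"], row["time"] + step))
    return intervals


events = [
    {"time": datetime(2024, 1, 1), "stay": timedelta(days=2)},
    {"time": datetime(2024, 1, 5), "stay": timedelta(days=1)},
]

uniform = events_to_intervals(events, timedelta(hours=12))  # same length everywhere
per_row = events_to_intervals(events, "stay")               # length read from each row
```

With a uniform duration every interval has the same length; with a column name each row contributes its own length.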

as_state(*, end_value: datetime | int | float | str | None = None, start_column: str = 'start', end_column: str = 'end', destination: str | Path | None = None, overwrite: bool = False) → StateSequencePool

Convert this event pool to a state pool by computing _t_end.

Each event timestamp becomes _t_start; _t_end is taken from the next event in the same sequence (shift(-1).over(_seq_id)).

Parameters:
  • end_value – Sentinel for _t_end of the last event per sequence. None leaves the last row with _t_end = null. A str names a static feature column whose per-sequence value fills the last _t_end.

  • start_column – User-facing name for the start column. Defaults to "start".

  • end_column – User-facing name for the end column. Defaults to "end".

  • destination – None → ephemeral result; path → new persistent store.

  • overwrite – Replace destination if it already exists.

Returns:

A new StateSequencePool.
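The "end comes from the next event in the same sequence" rule can be illustrated without the library. The sketch below is a plain-Python stand-in for the shift(-1)-per-sequence step, not tanat's code: the helper name `events_to_states` and the tuple layout are assumptions, events are sorted by time inside the helper, and only a scalar `end_value` is covered (the static-feature form of `end_value` is left out).

```python
from collections import defaultdict


def events_to_states(events, end_value=None):
    """Turn (seq_id, time) events into (seq_id, start, end) states.

    The end of each state is the start of the next event in the same
    sequence; the last state of every sequence receives end_value.
    """
    by_seq = defaultdict(list)
    for seq_id, time in events:
        by_seq[seq_id].append(time)

    states = []
    for seq_id, times in by_seq.items():
        times.sort()
        # Shift the start times by one position; pad with the sentinel.
        ends = times[1:] + [end_value]
        states.extend((seq_id, start, end) for start, end in zip(times, ends))
    return states


events = [("a", 1), ("a", 4), ("b", 2), ("a", 9)]
open_ended = events_to_states(events)             # last end stays None
closed = events_to_states(events, end_value=10)   # last end set to 10
```

Sequence "b" has a single event, so its only state ends at the sentinel in both calls.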

classmethod builder() → EventSequenceStoreBuilder

Return a fluent builder for constructing an event sequence store.

class tanat.sequence.type.EventSequenceSettings(*, id_column: str, entity_features: list[str], static_features: list[str] = <factory>, time_column: str)

Bases: SequenceSettings

Settings for event sequences (single timestamp column).

__init__(*args: Any, **kwargs: Any) → None

get_time_columns() → list[str]

Return the time index columns for event sequences: [time].

id_column: str

model_dump(*, mode='python', **dump_kwargs)

Dump settings to a dict via Pydantic serialization.

time_column: str

class tanat.sequence.type.IntervalEntity(id_value, store: str | Path | SequenceStore, features: list[str] | None = None, *, rank: int, store_index: int, cast_recipe: SequenceCastRecipe | dict | None = None, virtual_id: str | None = None, parent_metadata: SequenceMetadata | None = None)

Bases: Entity

Entity representing one interval row (start/end timestamps).

Unlike StateEntity, intervals are not required to be contiguous: gaps between intervals are allowed, and two intervals may overlap in time.

class tanat.sequence.type.IntervalSequence(id_value, store: str | Path | SequenceStore, *, id_column: str = 'id', start_column: str = 'start', end_column: str = 'end', entity_features: list[str] | None = None, static_features: list[str] | None = None)

Bases: Sequence

A single interval sequence.

Unlike state sequences, intervals are not required to be contiguous: gaps between intervals are allowed, and two intervals may overlap in time.

SETTINGS_CLASS

alias of IntervalSequenceSettings

__init__(id_value, store: str | Path | SequenceStore, *, id_column: str = 'id', start_column: str = 'start', end_column: str = 'end', entity_features: list[str] | None = None, static_features: list[str] | None = None) → None

Create an interval sequence for id_value.

Parameters:
  • id_value – Sequence identifier.

  • store – Store path, name, or SequenceStore instance.

  • id_column – User-facing name for the sequence ID column.

  • start_column – User-facing name for the interval start column.

  • end_column – User-facing name for the interval end column.

  • entity_features – Subset of entity feature names to expose. None → all available from the store.

  • static_features – Static feature names to expose. None → all available. [] → none.

class tanat.sequence.type.IntervalSequencePool(store: str | Path | SequenceStore, *, id_column: str = 'id', start_column: str = 'start', end_column: str = 'end', entity_features: list[str] | None = None, static_features: list[str] | None = None, cast_recipe: SequenceCastRecipe | dict | None = None)

Bases: SequencePool

Pool of interval sequences.

Unlike state sequences, intervals are not required to be contiguous: gaps between intervals are allowed, and two intervals may overlap in time.

SETTINGS_CLASS

alias of IntervalSequenceSettings

__init__(store: str | Path | SequenceStore, *, id_column: str = 'id', start_column: str = 'start', end_column: str = 'end', entity_features: list[str] | None = None, static_features: list[str] | None = None, cast_recipe: SequenceCastRecipe | dict | None = None) → None

Create an interval sequence pool backed by store.

Parameters:
  • store – Store path, name, or SequenceStore instance.

  • id_column – User-facing name for the sequence ID column.

  • start_column – User-facing name for the interval start column.

  • end_column – User-facing name for the interval end column.

  • entity_features – Subset of entity feature names to expose. None → all available from the store.

  • static_features – Static feature names to expose. None → all available. [] → none.

  • cast_recipe – Optional cast recipe (or dict) applied at read time. Normalised via SequenceCastRecipe.coerce() and probed eagerly.

Raises:

TypeError – If cast_recipe is not a SequenceCastRecipe, dict, or None.

as_event(anchor: Literal['start', 'end', 'middle'], *, time_column: str = 'time', destination: str | Path | None = None, overwrite: bool = False) → EventSequencePool

Convert this interval pool to an event pool by anchoring to one timestamp.

Parameters:
  • anchor – One of "start", "end", or "middle"; selects which timestamp (or the midpoint of the two) becomes the event timestamp.

  • time_column – User-facing name for the event timestamp. Defaults to "time".

  • destination – None → ephemeral result; path → new persistent store.

  • overwrite – Replace destination if it already exists.

Returns:

A new EventSequencePool.
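How the three anchor values pick a single timestamp from a (start, end) pair can be shown with a small helper. This is an illustrative sketch only; `anchor_time` is a hypothetical name and the code says nothing about how tanat evaluates the anchor internally.

```python
from datetime import datetime


def anchor_time(start, end, anchor):
    """Collapse one (start, end) pair to a single timestamp."""
    if anchor == "start":
        return start
    if anchor == "end":
        return end
    if anchor == "middle":
        # Equivalent to (start + end) / 2, but also valid for datetimes,
        # which cannot be added to each other.
        return start + (end - start) / 2
    raise ValueError(f"unknown anchor: {anchor!r}")


start, end = datetime(2024, 1, 1), datetime(2024, 1, 3)
at_start = anchor_time(start, end, "start")
at_end = anchor_time(start, end, "end")
at_middle = anchor_time(start, end, "middle")
```

Writing the midpoint as start plus half the span lets the same expression work for numeric and datetime columns.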

as_interval() → IntervalSequencePool

Return this pool unchanged: source and target types are identical.

A warning is emitted to signal the no-op conversion.

Returns:

self (no copy, no I/O).

as_state() → NoReturn

Not supported: interval → state conversion is ambiguous.

Intervals may overlap or contain gaps; neither property can be resolved into contiguous non-overlapping states without domain-specific merge / fill logic. Apply a manual Polars transformation instead.

Raises:

NotImplementedError – Always.
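To see why no automatic conversion is offered, it can help to list the gaps and overlaps in one sequence before writing a manual merge or fill step. The sketch below uses plain Python rather than Polars; the helper name `find_gaps_and_overlaps` is hypothetical, and it compares adjacent intervals only (after sorting by start), so it is a quick diagnostic rather than a complete overlap detector.

```python
def find_gaps_and_overlaps(intervals):
    """Report gaps and overlaps between adjacent intervals.

    intervals -- list of (start, end) pairs for one sequence
    Returns (gaps, overlaps), each a list of (from, to) spans.
    """
    ordered = sorted(intervals)
    gaps, overlaps = [], []
    for (_, prev_end), (next_start, next_end) in zip(ordered, ordered[1:]):
        if next_start > prev_end:
            # Nothing covers the span between the two intervals.
            gaps.append((prev_end, next_start))
        elif next_start < prev_end:
            # Both intervals cover this span.
            overlaps.append((next_start, min(prev_end, next_end)))
    return gaps, overlaps


intervals = [(0, 5), (3, 8), (10, 12)]
gaps, overlaps = find_gaps_and_overlaps(intervals)
```

Each reported span needs a domain decision (which interval wins, what fills the gap) before the rows can satisfy the state invariant.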

classmethod builder(*, sort_anchor: Literal['start', 'end', 'middle'] = 'start') → IntervalSequenceStoreBuilder

Return a fluent builder for constructing an interval sequence store.

Parameters:

sort_anchor – Intra-sequence sort column - "start" (default), "end" for right-censored datasets, or "middle" to sort by the interval midpoint (T_START + T_END) / 2.
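The effect of the three sort anchors on row order can be illustrated with numeric intervals. This is a sketch of the ordering idea only; `sort_key` is a hypothetical helper and does not reflect how the builder sorts internally.

```python
def sort_key(anchor):
    """Return a key function ordering (start, end) pairs by the anchor."""
    keys = {
        "start": lambda iv: iv[0],
        "end": lambda iv: iv[1],
        "middle": lambda iv: (iv[0] + iv[1]) / 2,  # interval midpoint
    }
    return keys[anchor]


intervals = [(0, 10), (2, 4), (6, 7)]
by_start = sorted(intervals, key=sort_key("start"))
by_end = sorted(intervals, key=sort_key("end"))
by_middle = sorted(intervals, key=sort_key("middle"))
```

The same three rows come out in three different orders, which is why the anchor matters whenever intervals overlap or nest.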

class tanat.sequence.type.IntervalSequenceSettings(*, id_column: str, entity_features: list[str], static_features: list[str] = <factory>, start_column: str, end_column: str)

Bases: SequenceSettings

Settings for interval sequences (start + end timestamp columns).

Unlike state sequences, intervals are not required to be contiguous: gaps between intervals are allowed, and two intervals may overlap in time.

__init__(*args: Any, **kwargs: Any) → None

end_column: str

get_time_columns() → list[str]

Return the time index columns for interval sequences: [start, end].

id_column: str

model_dump(*, mode='python', **dump_kwargs)

Dump settings to a dict via Pydantic serialization.

start_column: str

class tanat.sequence.type.StateEntity(id_value, store: str | Path | SequenceStore, features: list[str] | None = None, *, rank: int, store_index: int, cast_recipe: SequenceCastRecipe | dict | None = None, virtual_id: str | None = None, parent_metadata: SequenceMetadata | None = None)

Bases: Entity

Entity representing one state row (start/end).

States are contiguous and non-overlapping: the end of one state is always the start of the next, with no gaps in between.

class tanat.sequence.type.StateSequence(id_value, store: str | Path | SequenceStore, *, id_column: str = 'id', start_column: str = 'start', end_column: str = 'end', entity_features: list[str] | None = None, static_features: list[str] | None = None)

Bases: Sequence

A single state sequence.

States are contiguous and non-overlapping: the end of one state is always the start of the next, with no gaps in between.

SETTINGS_CLASS

alias of StateSequenceSettings

__init__(id_value, store: str | Path | SequenceStore, *, id_column: str = 'id', start_column: str = 'start', end_column: str = 'end', entity_features: list[str] | None = None, static_features: list[str] | None = None) → None

Create a state sequence for id_value.

Parameters:
  • id_value – Sequence identifier.

  • store – Store path, name, or SequenceStore instance.

  • id_column – User-facing name for the sequence ID column.

  • start_column – User-facing name for the state start column.

  • end_column – User-facing name for the state end column.

  • entity_features – Subset of entity feature names to expose. None → all available from the store.

  • static_features – Static feature names to expose. None → all available. [] → none.

filter_entities(criterion, *, inplace=False, verbose=True)

Not supported on state sequences.

States are contiguous and non-overlapping by definition: removing individual rows would leave temporal gaps and break the invariant T_END[i] == T_START[i+1].

class tanat.sequence.type.StateSequencePool(store: str | Path | SequenceStore, *, id_column: str = 'id', start_column: str = 'start', end_column: str = 'end', entity_features: list[str] | None = None, static_features: list[str] | None = None, cast_recipe: SequenceCastRecipe | dict | None = None)

Bases: SequencePool

Pool of state sequences.

States are contiguous and non-overlapping: the end of one state is always the start of the next, with no gaps in between.

SETTINGS_CLASS

alias of StateSequenceSettings

__init__(store: str | Path | SequenceStore, *, id_column: str = 'id', start_column: str = 'start', end_column: str = 'end', entity_features: list[str] | None = None, static_features: list[str] | None = None, cast_recipe: SequenceCastRecipe | dict | None = None) → None

Create a state sequence pool backed by store.

Parameters:
  • store – Store path, name, or SequenceStore instance.

  • id_column – User-facing name for the sequence ID column.

  • start_column – User-facing name for the state start column.

  • end_column – User-facing name for the state end column.

  • entity_features – Subset of entity feature names to expose. None → all available from the store.

  • static_features – Static feature names to expose. None → all available. [] → none.

  • cast_recipe – Optional cast recipe (or dict) applied at read time. Normalised via SequenceCastRecipe.coerce() and probed eagerly.

Raises:

TypeError – If cast_recipe is not a SequenceCastRecipe, dict, or None.

as_event(anchor: Literal['start', 'end', 'middle'], *, time_column: str = 'time', destination: str | Path | None = None, overwrite: bool = False) → EventSequencePool

Convert this state pool to an event pool by anchoring to one timestamp.

Parameters:
  • anchor – One of "start", "end", or "middle"; selects which timestamp (or the midpoint of the two) becomes the event timestamp.

  • time_column – User-facing name for the event timestamp. Defaults to "time".

  • destination – None → ephemeral result; path → new persistent store.

  • overwrite – Replace destination if it already exists.

Returns:

A new EventSequencePool.

as_interval(*, start_column: str | None = None, end_column: str | None = None, destination: str | Path | None = None, overwrite: bool = False) → IntervalSequencePool

Convert this state pool to an interval pool.

States and intervals share the same (_t_start, _t_end) physical layout, so no temporal recomputation is needed.

Parameters:
  • start_column – User-facing name for the start column. None inherits this pool’s current setting.

  • end_column – User-facing name for the end column. None inherits this pool’s current setting.

  • destination – None → ephemeral result; path → new persistent store.

  • overwrite – Replace destination if it already exists.

Returns:

A new IntervalSequencePool.

as_state() → StateSequencePool

Return this pool unchanged: source and target types are identical.

A warning is emitted to signal the no-op conversion.

Returns:

self (no copy, no I/O).

classmethod builder(*, end_value: datetime | int | float | None = None, validate_continuity: bool = True) → StateSequenceStoreBuilder

Return a fluent builder for constructing a state sequence store.

Parameters:
  • end_value – Sentinel for T_END of the last state in each sequence when end_column is not provided at source registration time. None → leaves the last T_END as null.

  • validate_continuity – When end_column is provided, verify that states are truly contiguous (T_END[i] == T_START[i+1]) before writing. Defaults to True. Set to False on large datasets where the cost of a full collect() is unacceptable.

filter_entities(criterion, *, inplace=False, verbose=True)

Not supported on state pools.

States are contiguous and non-overlapping by definition: removing individual rows would leave temporal gaps and break the invariant T_END[i] == T_START[i+1].
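The invariant T_END[i] == T_START[i+1] is easy to check on a toy sequence, and the same check shows what happens when a single row is removed. This is a minimal plain-Python sketch; `is_contiguous` is a hypothetical helper, not the validation routine the library runs.

```python
def is_contiguous(states):
    """Return True when every end equals the next start.

    states -- list of (start, end) pairs for one sequence, in order
    """
    return all(
        prev_end == next_start
        for (_, prev_end), (next_start, _) in zip(states, states[1:])
    )


states = [(0, 3), (3, 7), (7, None)]  # last end left as null
ok = is_contiguous(states)

# Dropping the middle row leaves a gap between 3 and 7.
filtered = [states[0], states[2]]
broken = is_contiguous(filtered)
```

The last row's end is never compared with anything, which is why a null end on the final state does not violate the invariant.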

class tanat.sequence.type.StateSequenceSettings(*, id_column: str, entity_features: list[str], static_features: list[str] = <factory>, start_column: str, end_column: str)

Bases: SequenceSettings

Settings for state sequences (start + end timestamp columns).

States are contiguous and non-overlapping: the end of one state is always the start of the next, with no gaps in between.

__init__(*args: Any, **kwargs: Any) → None

end_column: str

get_time_columns() → list[str]

Return the time index columns for state sequences: [start, end].

model_dump(*, mode='python', **dump_kwargs)

Dump settings to a dict via Pydantic serialization.

start_column: str