tanat.metric.sequence.type.dtw package#

Submodules#

tanat.metric.sequence.type.dtw.kernels module#

Numba kernels for DTWSequenceMetric.

All functions are @njit (no Python objects). They operate on int32-encoded feature arrays produced by the entity metric’s prepare_batch_data.

Uses a 2-row rolling DP with an optional Sakoe-Chiba band whose half-width is given by window. Pass window = -1 to disable the band constraint (full DTW).

tanat.metric.sequence.type.dtw.kernels.compute_dtw_matrix(result, start, end, arrays_a, lengths_a, arrays_b, lengths_b, dist_kernel, context, window, normalize, symmetric)[source]#

Parallel DTW matrix kernel.

Processes rows [start, end).
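The row-range contract can be illustrated in plain Python. This is only a sketch, not the Numba kernel: fill_rows and mismatch_distance are hypothetical names, nested lists stand in for the encoded arrays, and it is an assumption that symmetric mode computes only the upper triangle and mirrors it.

```python
def mismatch_distance(a, b):
    """Hypothetical stand-in for a pair distance such as DTW."""
    return float(sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b)))


def fill_rows(result, start, end, seqs_a, seqs_b, pair_dist, symmetric=False):
    """Fill rows [start, end) of `result` in place."""
    for i in range(start, end):
        # Assumption: in symmetric mode only cells with j > i are computed,
        # and the diagonal keeps its initial value of 0.0.
        first_col = i + 1 if symmetric else 0
        for j in range(first_col, len(seqs_b)):
            d = pair_dist(seqs_a[i], seqs_b[j])
            result[i][j] = d
            if symmetric:
                result[j][i] = d  # mirror into the lower triangle


seqs = [[1, 2, 3], [1, 2], [3, 3, 3], [1]]
n = len(seqs)
result = [[0.0] * n for _ in range(n)]

# Each (start, end) chunk is independent, so chunks could be given to
# separate workers; here they simply run one after the other.
for start, end in [(0, 2), (2, 4)]:
    fill_rows(result, start, end, seqs, seqs, mismatch_distance, symmetric=True)
```

Because every cell is written by exactly one row chunk, the chunks need no coordination beyond sharing the output buffer.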

tanat.metric.sequence.type.dtw.kernels.compute_dtw_pair(arr_a, arr_b, len_a, len_b, dist_kernel, context, window, normalize)[source]#

Compute DTW distance for a single pair of int32-encoded sequences.

Uses a 2-row rolling DP with optional Sakoe-Chiba band.

Parameters:
  • arr_a – int32-encoded sequence A.

  • arr_b – int32-encoded sequence B.

  • len_a – Length of A.

  • len_b – Length of B.

  • dist_kernel – Numba entity distance kernel.

  • context – Opaque context tuple forwarded to dist_kernel.

  • window – Sakoe-Chiba band half-width. -1 = no constraint.

  • normalize – When True, divide by len_a + len_b.

Returns:

float32 DTW distance, or nan when either sequence is empty.
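A pure-Python sketch of the 2-row rolling DP described above may help when reading the kernel. It is not the library's implementation: dtw_pair and hamming are hypothetical names, it returns a Python float rather than float32, and widening the band to at least |len_a - len_b| (so that the final cell stays reachable) is an assumption about how the band is handled.

```python
import math


def hamming(a, b):
    """Stand-in entity distance: 0.0 if the codes match, else 1.0."""
    return 0.0 if a == b else 1.0


def dtw_pair(seq_a, seq_b, dist=hamming, window=-1, normalize=False):
    """DTW cost between two sequences using two DP rows."""
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        return math.nan  # no alignment possible

    if window >= 0:
        # Assumption: widen the band so cell (n, m) is always inside it.
        w = max(window, abs(n - m))
    else:
        w = max(n, m)  # window = -1: no constraint (full DTW)

    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # cost of aligning two empty prefixes
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        # Only cells within w columns of the diagonal are evaluated.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = dist(seq_a[i - 1], seq_b[j - 1])
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr  # roll the rows: only two are ever kept

    total = prev[m]
    return total / (n + m) if normalize else total


full = dtw_pair([1, 2, 2], [1, 1, 2])             # warping allowed
banded = dtw_pair([1, 2, 2], [1, 1, 2], window=0)  # diagonal only
```

In this example the unconstrained path absorbs the repeated symbols at zero cost, while window=0 forces a position-by-position comparison and pays for the one mismatch.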

tanat.metric.sequence.type.dtw.metric module#

DTWSequenceMetric: Dynamic Time Warping between sequences.

class tanat.metric.sequence.type.dtw.metric.DTWSequenceMetric(entity_metric: EntityMetric | str = 'hamming', window: int | None = None, normalize: bool = False, *, store_path: str | Path | None = None, chunk_size: int = 500, resume: bool = True, dtype: str = 'float32')[source]#

Bases: SequenceMetric

Dynamic Time Warping distance between two sequences.

Uses a space-optimised 2-row DP. The Sakoe-Chiba band is applied when window is set, limiting the warping path to stay within window cells of the diagonal.

Empty-sequence behaviour:

  • Both empty → nan (no alignment possible).

  • One empty → nan (no alignment possible).

When normalize=True, divides the raw DTW cost by len_a + len_b (an approximation that does not require path backtracking).
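The normalisation can be written out as follows. Here n = len_a, m = len_b, and K (a symbol introduced only for this note) is the number of cells on the optimal warping path, assuming the standard diagonal, horizontal and vertical steps:

```latex
d_{\text{norm}}(a, b) = \frac{\mathrm{DTW}(a, b)}{n + m},
\qquad
\max(n, m) \le K \le n + m - 1 < n + m
```

Dividing by K would give the exact average cost per aligned cell, but K is known only after recovering the path. Since n + m always exceeds K, the normalised value never exceeds that per-cell average.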

Example:

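from tanat.metric.sequence.type.dtw import DTWSequenceMetric

# seq_a, seq_b and pool are assumed to be defined elsewhere.
dtw = DTWSequenceMetric(window=3, normalize=True)
d = dtw(seq_a, seq_b)          # distance between two sequences
dm = dtw.compute_matrix(pool)  # distance matrix over a pool

MEMMAP_SUPPORT: bool = True[source]#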

Set to True in subclasses that implement disk-backed (memmap) computation. When False, passing store_path or an instance-level StorageOptions raises NotImplementedError early with a clear message.
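The flag-and-guard pattern described here can be sketched as below. The class names and the error message are hypothetical, and the real base class may perform the check elsewhere; only the behaviour (an early NotImplementedError when the flag is False) comes from the description above.

```python
class BaseMetric:
    """Hypothetical base class: memmap support is opt-in per subclass."""

    MEMMAP_SUPPORT = False

    def __init__(self, *, store_path=None):
        # Fail early, before any computation starts, if disk-backed
        # storage is requested from a metric that cannot provide it.
        if store_path is not None and not self.MEMMAP_SUPPORT:
            raise NotImplementedError(
                f"{type(self).__name__} does not support disk-backed "
                "(memmap) computation; omit store_path."
            )
        self.store_path = store_path


class InMemoryOnlyMetric(BaseMetric):
    pass  # inherits MEMMAP_SUPPORT = False


class DiskBackedMetric(BaseMetric):
    MEMMAP_SUPPORT = True  # subclass declares memmap support


ok = DiskBackedMetric(store_path="distances.dat")

try:
    InMemoryOnlyMetric(store_path="distances.dat")
    rejected = False
except NotImplementedError:
    rejected = True
```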

SETTINGS_CLASS[source]#

alias of DTWSettings

__init__(entity_metric: EntityMetric | str = 'hamming', window: int | None = None, normalize: bool = False, *, store_path: str | Path | None = None, chunk_size: int = 500, resume: bool = True, dtype: str = 'float32') None[source]#
validate_composition(seq_a: Sequence, seq_b: Sequence | None = None) None[source]#

Probe the first entity of each sequence through the entity metric.

class tanat.metric.sequence.type.dtw.metric.DTWSettings(*, entity_metric: EntityMetric = 'hamming', window: int | None = None, normalize: bool = False)[source]#

Bases: object

Settings for DTWSequenceMetric.

Parameters:
  • entity_metric – Entity-level distance metric. Default: "hamming".

  • window – Sakoe-Chiba band width (number of cells off the diagonal). None means no constraint (full DTW). Must be > 0 when set.

  • normalize – When True, divide the DTW cost by len_a + len_b (approximation that avoids O(n×m) backtracking). Default: False.
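The window constraint can be illustrated with a small stand-in built on a standard-library dataclass. BandSettings is a hypothetical name, not the DTWSettings class itself, and translating None to the kernel sentinel -1 is an assumption based on the kernel documentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BandSettings:
    """Hypothetical stand-in mirroring the window and normalize fields."""

    window: Optional[int] = None
    normalize: bool = False

    def __post_init__(self):
        # "Must be > 0 when set": None is allowed, 0 and negatives are not.
        if self.window is not None and self.window <= 0:
            raise ValueError(f"window must be > 0 when set, got {self.window}")

    def kernel_window(self) -> int:
        """Assumed mapping: None (full DTW) becomes the sentinel -1."""
        return -1 if self.window is None else self.window


full_dtw = BandSettings()
banded = BandSettings(window=3, normalize=True)
```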

__init__(*args: Any, **kwargs: Any) None[source]#
entity_metric: EntityMetric = 'hamming'[source]#
model_dump(*, mode='python', **dump_kwargs)[source]#

Dump settings to a dict via Pydantic serialization.

normalize: bool = False[source]#
window: int | None = None[source]#

Module contents#

DTWSequenceMetric package.

class tanat.metric.sequence.type.dtw.DTWSequenceMetric(entity_metric: EntityMetric | str = 'hamming', window: int | None = None, normalize: bool = False, *, store_path: str | Path | None = None, chunk_size: int = 500, resume: bool = True, dtype: str = 'float32')[source]#

Bases: SequenceMetric

Dynamic Time Warping distance between two sequences.

Uses a space-optimised 2-row DP. The Sakoe-Chiba band is applied when window is set, limiting the warping path to stay within window cells of the diagonal.

Empty-sequence behaviour:

  • Both empty → nan (no alignment possible).

  • One empty → nan (no alignment possible).

When normalize=True, divides the raw DTW cost by len_a + len_b (an approximation that does not require path backtracking).

Example:

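from tanat.metric.sequence.type.dtw import DTWSequenceMetric

# seq_a, seq_b and pool are assumed to be defined elsewhere.
dtw = DTWSequenceMetric(window=3, normalize=True)
d = dtw(seq_a, seq_b)          # distance between two sequences
dm = dtw.compute_matrix(pool)  # distance matrix over a pool

MEMMAP_SUPPORT: bool = True[source]#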

Set to True in subclasses that implement disk-backed (memmap) computation. When False, passing store_path or an instance-level StorageOptions raises NotImplementedError early with a clear message.

SETTINGS_CLASS[source]#

alias of DTWSettings

__init__(entity_metric: EntityMetric | str = 'hamming', window: int | None = None, normalize: bool = False, *, store_path: str | Path | None = None, chunk_size: int = 500, resume: bool = True, dtype: str = 'float32') None[source]#
validate_composition(seq_a: Sequence, seq_b: Sequence | None = None) None[source]#

Probe the first entity of each sequence through the entity metric.

class tanat.metric.sequence.type.dtw.DTWSettings(*, entity_metric: EntityMetric = 'hamming', window: int | None = None, normalize: bool = False)[source]#

Bases: object

Settings for DTWSequenceMetric.

Parameters:
  • entity_metric – Entity-level distance metric. Default: "hamming".

  • window – Sakoe-Chiba band width (number of cells off the diagonal). None means no constraint (full DTW). Must be > 0 when set.

  • normalize – When True, divide the DTW cost by len_a + len_b (approximation that avoids O(n×m) backtracking). Default: False.

__init__(*args: Any, **kwargs: Any) None[source]#
entity_metric: EntityMetric = 'hamming'[source]#
model_dump(*, mode='python', **dump_kwargs)[source]#

Dump settings to a dict via Pydantic serialization.

normalize: bool = False[source]#
window: int | None = None[source]#