Dynamic Time Warping (synctoolbox.dtw)

synctoolbox.dtw.core.compute_warping_path(C: ndarray, step_sizes: ndarray = np.array([[1, 0], [0, 1], [1, 1]], np.int64), step_weights: ndarray = np.array([1.0, 1.0, 1.0], np.float64), implementation: str = 'synctoolbox')

Applies DTW on cost matrix C.

Parameters

C (np.ndarray (np.float32 / np.float64) [shape=(N, M)]) – Cost matrix
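step_sizes (np.ndarray (np.int64) [shape=(S, 2)]) – Array of step sizes, one (row, column) step per row (default: np.array([[1, 0], [0, 1], [1, 1]]))
step_weights (np.ndarray (np.float64) [shape=(S,)]) – Array of step weights, one weight per step (default: np.array([1.0, 1.0, 1.0]))
implementation (str) – DTW implementation to use, either synctoolbox or librosa (default: synctoolbox)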

Returns

D (np.ndarray (np.float64) [shape=(N, M)]) – Accumulated cost matrix
E (np.ndarray (np.int64) [shape=(N, M)]) – Step index matrix
wp (np.ndarray (np.int64) [shape=(2, T)]) – Warping path of length T
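
To make the roles of D, E, and wp concrete, the following is a minimal NumPy sketch of step-size-constrained DTW. It is an illustration written for this page, not the synctoolbox implementation: the function name dtw_sketch is hypothetical, and it assumes step_sizes has one (row, column) step per row and that the last cell is reachable from the first with the given steps.

```python
import numpy as np


def dtw_sketch(C, step_sizes, step_weights):
    """Illustrative DTW (hypothetical helper, not the library code).

    Returns the accumulated cost matrix D, the step index matrix E,
    and the warping path wp with shape (2, T).
    """
    N, M = C.shape
    D = np.full((N, M), np.inf)
    E = np.full((N, M), -1, dtype=np.int64)  # -1 marks "no predecessor"
    D[0, 0] = C[0, 0]

    for n in range(N):
        for m in range(M):
            for s, (dn, dm) in enumerate(step_sizes):
                pn, pm = n - int(dn), m - int(dm)
                if pn < 0 or pm < 0:
                    continue  # step would leave the matrix
                cost = D[pn, pm] + step_weights[s] * C[n, m]
                if cost < D[n, m]:
                    D[n, m] = cost
                    E[n, m] = s

    # Backtrack from the last cell to the first using the stored step indices.
    n, m = N - 1, M - 1
    path = [(n, m)]
    while (n, m) != (0, 0):
        dn, dm = step_sizes[E[n, m]]
        n, m = n - int(dn), m - int(dm)
        path.append((n, m))
    wp = np.array(path[::-1], dtype=np.int64).T
    return D, E, wp


# Toy example: absolute-difference cost between two short sequences.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0, 1.0, 2.0])
C = np.abs(x[:, None] - y[None, :])

step_sizes = np.array([[1, 0], [0, 1], [1, 1]], np.int64)
step_weights = np.array([1.0, 1.0, 1.0], np.float64)

D, E, wp = dtw_sketch(C, step_sizes, step_weights)
print(wp)
```

In this toy case the repeated value in y is absorbed by a horizontal step, so the path has T = 4 entries, which need not equal N or M.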

synctoolbox.dtw.mrmsdtw.sync_via_mrmsdtw(f_chroma1: ndarray, f_chroma2: ndarray, f_onset1: Optional[ndarray] = None, f_onset2: Optional[ndarray] = None, input_feature_rate: int = 50, step_sizes: ndarray = np.array([[1, 0], [0, 1], [1, 1]], np.int32), step_weights: ndarray = np.array([1.0, 1.0, 1.0], np.float64), threshold_rec: int = 10000, win_len_smooth: ndarray = np.array([201, 101, 21, 1]), downsamp_smooth: ndarray = np.array([50, 25, 5, 1]), verbose: bool = False, dtw_implementation: str = 'synctoolbox', normalize_chroma: bool = True, chroma_norm_ord: int = 2, chroma_norm_threshold: float = 0.001, visualization_title: str = 'MrMsDTW result', alpha=0.5) → ndarray

Compute memory-restricted multi-scale DTW (MrMsDTW) using chroma and (optionally) onset features. MrMsDTW is performed on multiple levels that get progressively finer, with rectangular constraint regions defined by the alignment found on the previous, coarser level. If onset features are provided, these are used on the finest level in addition to chroma to provide higher synchronization accuracy.
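
As a rough illustration of the "progressively finer" levels, the sketch below derives the feature rate at each level from the default input_feature_rate and downsamp_smooth values. It assumes that each level's rate is the input rate divided by the downsampling factor and that win_len_smooth is measured in frames at the input rate; both are assumptions made for this example rather than statements from the documentation.

```python
import numpy as np

# Default values taken from the function signature.
input_feature_rate = 50
win_len_smooth = np.array([201, 101, 21, 1])
downsamp_smooth = np.array([50, 25, 5, 1])

# Assumed relationship: rate per level = input rate / downsampling factor.
level_rates = input_feature_rate / downsamp_smooth

# Assumed unit: smoothing windows are given in frames at the input rate.
win_seconds = win_len_smooth / input_feature_rate

for level, (rate, win) in enumerate(zip(level_rates, win_seconds)):
    print(f"level {level}: {rate:g} Hz features, {win:g} s smoothing window")
```

Under these assumptions the coarsest level works at 1 Hz and the finest at the full 50 Hz input rate.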

Parameters

f_chroma1 (np.ndarray [shape=(12, N)]) – Chroma feature matrix of the first sequence
f_chroma2 (np.ndarray [shape=(12, M)]) – Chroma feature matrix of the second sequence
f_onset1 (np.ndarray [shape=(L, N)]) – Onset feature matrix of the first sequence (optional, default: None)
f_onset2 (np.ndarray [shape=(L, M)]) – Onset feature matrix of the second sequence (optional, default: None)
input_feature_rate (int) – Input feature rate of the chroma features (default: 50)
step_sizes (np.ndarray) – DTW step sizes (default: np.array([[1, 0], [0, 1], [1, 1]]))
step_weights (np.ndarray) – DTW step weights (default: np.array([1.0, 1.0, 1.0]))
threshold_rec (int) – Defines the maximum area that is spanned by the rectangle of two consecutive elements in the alignment (default: 10000)
win_len_smooth (np.ndarray) – Window lengths for chroma feature smoothing (default: np.array([201, 101, 21, 1]))
downsamp_smooth (np.ndarray) – Downsampling factors (default: np.array([50, 25, 5, 1]))
verbose (bool) – Set True for visualization (default: False)
dtw_implementation (str) – DTW implementation, librosa or synctoolbox (default: synctoolbox)
normalize_chroma (bool) – Set True to normalize input chroma features after each downsampling and smoothing operation (default: True)
chroma_norm_ord (int) – Order of chroma normalization, relevant if normalize_chroma is True. (default: 2)
chroma_norm_threshold (float) – If the norm of a feature vector falls below this threshold, the normalized feature vector is set to the unit vector. Relevant only if normalize_chroma is True (default: 0.001)
visualization_title (str) – Title for the visualization plots. Only relevant if ‘verbose’ is True (default: “MrMsDTW result”)
alpha (float) – Coefficient for the Chroma cost matrix in the finest scale of the MrMsDTW algorithm. C = alpha * C_Chroma + (1 - alpha) * C_act (default: 0.5)

Returns

alignment (np.ndarray [shape=(2, T)]) – Resulting warping path which indicates synchronized indices.
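
The alpha parameter weights the two cost matrices on the finest level according to C = alpha * C_Chroma + (1 - alpha) * C_act. A minimal sketch of that combination follows; the helper name combine_costs and the random placeholder matrices are hypothetical and stand in for whatever cost matrices the library computes internally.

```python
import numpy as np


def combine_costs(C_chroma, C_act, alpha=0.5):
    """Weighted sum of two cost matrices (hypothetical helper)."""
    if C_chroma.shape != C_act.shape:
        raise ValueError("Cost matrices must have the same shape.")
    return alpha * C_chroma + (1.0 - alpha) * C_act


# Placeholder cost matrices, only for demonstration.
rng = np.random.default_rng(0)
C_chroma = rng.random((4, 5))
C_act = rng.random((4, 5))

C_equal = combine_costs(C_chroma, C_act, alpha=0.5)        # equal weighting
C_chroma_only = combine_costs(C_chroma, C_act, alpha=1.0)  # onset term ignored
```

With alpha = 1.0 the combined cost reduces to the chroma cost alone; smaller values give the second term more influence.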

synctoolbox.dtw.mrmsdtw.sync_via_mrmsdtw_with_anchors(f_chroma1: ndarray, f_chroma2: ndarray, f_onset1: Optional[ndarray] = None, f_onset2: Optional[ndarray] = None, input_feature_rate: int = 50, step_sizes: ndarray = np.array([[1, 0], [0, 1], [1, 1]], np.int32), step_weights: ndarray = np.array([1.0, 1.0, 1.0], np.float64), threshold_rec: int = 10000, win_len_smooth: ndarray = np.array([201, 101, 21, 1]), downsamp_smooth: ndarray = np.array([50, 25, 5, 1]), verbose: bool = False, dtw_implementation: str = 'synctoolbox', normalize_chroma: bool = True, chroma_norm_ord: int = 2, chroma_norm_threshold: float = 0.001, visualization_title: str = 'MrMsDTW result', anchor_pairs: Optional[List[Tuple]] = None, linear_inp_idx: List[int] = [], alpha=0.5) → ndarray

Compute memory-restricted multi-scale DTW (MrMsDTW) using chroma and (optionally) onset features. MrMsDTW is performed on multiple levels that get progressively finer, with rectangular constraint regions defined by the alignment found on the previous, coarser level. If onset features are provided, these are used on the finest level in addition to chroma to provide higher synchronization accuracy.

Parameters

f_chroma1 (np.ndarray [shape=(12, N)]) – Chroma feature matrix of the first sequence
f_chroma2 (np.ndarray [shape=(12, M)]) – Chroma feature matrix of the second sequence
f_onset1 (np.ndarray [shape=(L, N)]) – Onset feature matrix of the first sequence (optional, default: None)
f_onset2 (np.ndarray [shape=(L, M)]) – Onset feature matrix of the second sequence (optional, default: None)
input_feature_rate (int) – Input feature rate of the chroma features (default: 50)
step_sizes (np.ndarray) – DTW step sizes (default: np.array([[1, 0], [0, 1], [1, 1]]))
step_weights (np.ndarray) – DTW step weights (default: np.array([1.0, 1.0, 1.0]))
threshold_rec (int) – Defines the maximum area that is spanned by the rectangle of two consecutive elements in the alignment (default: 10000)
win_len_smooth (np.ndarray) – Window lengths for chroma feature smoothing (default: np.array([201, 101, 21, 1]))
downsamp_smooth (np.ndarray) – Downsampling factors (default: np.array([50, 25, 5, 1]))
verbose (bool) – Set True for visualization (default: False)
dtw_implementation (str) – DTW implementation, librosa or synctoolbox (default: synctoolbox)
normalize_chroma (bool) – Set True to normalize input chroma features after each downsampling and smoothing operation (default: True)
chroma_norm_ord (int) – Order of chroma normalization, relevant if normalize_chroma is True. (default: 2)
chroma_norm_threshold (float) – If the norm of a feature vector falls below this threshold, the normalized feature vector is set to the unit vector. Relevant only if normalize_chroma is True (default: 0.001)
visualization_title (str) – Title for the visualization plots. Only relevant if ‘verbose’ is True (default: “MrMsDTW result”)
anchor_pairs (List[Tuple]) – Anchor pairs given in seconds (optional, default: None). The pairs (0, 0) and (<audio-len1>, <audio-len2>) are not allowed, and anchors must be monotonically increasing.
linear_inp_idx (List[int]) – Indices of the intervals created by the anchor pairs for which MrMsDTW should not be run, e.g., because the interval contains only silence (default: []). Intervals are numbered from the start of the recordings:

    0          ap1        ap2        ap3
    |          |          |          |
    |   idx0   |   idx1   |   idx2   |   idx3 OR idx-1
    |          |          |          |

Index -1 corresponds to the last interval, which runs from the last anchor pair to the end of the audio files.
alpha (float) – Coefficient for the Chroma cost matrix in the finest scale of the MrMsDTW algorithm. C = alpha * C_Chroma + (1 - alpha) * C_act (default: 0.5)

Returns

wp (np.ndarray [shape=(2, T)]) – Resulting warping path which indicates synchronized indices.
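
The constraints on anchor_pairs and the interval numbering used by linear_inp_idx can be sketched as follows. The helper names check_anchor_pairs and anchor_intervals are hypothetical, the audio lengths are made-up example values, and "monotonically increasing" is interpreted here as strictly increasing in both recordings, which is an assumption.

```python
def check_anchor_pairs(anchor_pairs, audio_len1, audio_len2):
    """Raise ValueError if the anchors violate the documented constraints."""
    prev1, prev2 = 0.0, 0.0
    for a1, a2 in anchor_pairs:
        # Each anchor must lie strictly after the previous one (this also
        # excludes (0, 0)) and strictly before the end of both recordings.
        if not (prev1 < a1 < audio_len1 and prev2 < a2 < audio_len2):
            raise ValueError(f"Invalid anchor pair: ({a1}, {a2})")
        prev1, prev2 = a1, a2


def anchor_intervals(anchor_pairs, audio_len1, audio_len2):
    """Return the intervals (start_pair, end_pair) created by the anchors."""
    bounds = [(0.0, 0.0)] + list(anchor_pairs) + [(audio_len1, audio_len2)]
    return list(zip(bounds[:-1], bounds[1:]))


# Example values (in seconds), chosen only for illustration.
anchor_pairs = [(10.0, 12.0), (20.0, 25.0), (30.0, 33.0)]
audio_len1, audio_len2 = 40.0, 45.0

check_anchor_pairs(anchor_pairs, audio_len1, audio_len2)
intervals = anchor_intervals(anchor_pairs, audio_len1, audio_len2)

# Three anchors create four intervals; index -1 refers to the last one.
linear_inp_idx = [-1]
resolved_idx = sorted({i % len(intervals) for i in linear_inp_idx})
print(intervals[-1], resolved_idx)
```

Three anchor pairs therefore yield intervals idx0 to idx3, and passing -1 selects the same interval as idx3.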