Pitch Features (synctoolbox.feature.pitch)

synctoolbox.feature.pitch.audio_to_pitch_features(f_audio: ndarray, Fs: float = 22050, feature_rate: int = 50, midi_min: int = 21, midi_max: int = 108, tuning_offset: int = 0, verbose: bool = False, visualization_title: str = 'Pitch features', visualization_log_gamma: float = 100.0) → ndarray

Computes pitch-based features via an IIR filterbank, aggregated as STMSP (short-time mean-square power). The signal is decomposed into subbands that correspond to the MIDI pitches from midi_min to midi_max. In the output array, each row corresponds to one MIDI pitch. By convention, the output has shape 128 x N. Only the rows from midi_min to midi_max are filled; all other rows contain zeros.
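To make the STMSP aggregation concrete, the following is a minimal sketch of a short-time mean-square power computation on a single subband. It is an illustration only, not the library's implementation: the helper name short_time_mean_square_power is hypothetical, the frames here are non-overlapping (the library may use different window sizes and overlap), and a plain sine wave stands in for the output of one IIR subband filter.

```python
import numpy as np


def short_time_mean_square_power(x, fs, feature_rate):
    """Hypothetical helper: mean of squared samples per non-overlapping frame."""
    hop = int(round(fs / feature_rate))   # samples per feature frame
    num_frames = len(x) // hop            # the incomplete tail frame is dropped
    frames = x[:num_frames * hop].reshape(num_frames, hop)
    return np.mean(frames ** 2, axis=1)


fs = 22050
t = np.arange(fs) / fs                    # one second of audio
# Stand-in for one filtered subband (assumption: a pure 440 Hz tone)
subband = np.sin(2 * np.pi * 440.0 * t)

stmsp = short_time_mean_square_power(subband, fs, feature_rate=50)
print(stmsp.shape)  # one value per feature frame
```

For a unit-amplitude sine, each frame's mean-square power is close to 0.5, which is a quick sanity check for this kind of aggregation.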

Parameters

f_audio (np.ndarray) – One-dimensional audio array (mono)
Fs (float) – Sampling rate of f_audio (in Hz)
feature_rate (int) – Features per second
midi_min (int) – Minimum MIDI index (indices below midi_min are filled with zero in the output)
midi_max (int) – Maximum MIDI index (indices above midi_max are filled with zero in the output)
tuning_offset (int) – Tuning offset used to shift the filterbank (in cents)
verbose (bool) – Set True to activate the visualization of features
visualization_title (str) – Title for the visualization plot. Only relevant if verbose is True
visualization_log_gamma (float) – Log compression gamma parameter for visualization. Only relevant if verbose is True
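As a rough guide to what midi_min, midi_max, and tuning_offset mean in terms of frequency, the sketch below converts a MIDI pitch to its equal-tempered center frequency, with an optional shift in cents. The function midi_to_center_frequency is hypothetical and not part of synctoolbox. The 440 Hz reference for MIDI pitch 69 is the standard convention; the assumption that a positive offset shifts the center frequencies upward is ours and may not match the library's sign convention.

```python
def midi_to_center_frequency(midi_pitch, tuning_offset_cents=0.0, a4_hz=440.0):
    """Hypothetical helper: equal-tempered center frequency in Hz.

    One semitone equals 100 cents, so the offset is added as a
    fraction of a semitone before converting to a frequency ratio.
    """
    semitones_from_a4 = midi_pitch - 69 + tuning_offset_cents / 100.0
    return a4_hz * 2.0 ** (semitones_from_a4 / 12.0)


f_low = midi_to_center_frequency(21)     # A0, the default midi_min
f_high = midi_to_center_frequency(108)   # C8, the default midi_max
f_shifted = midi_to_center_frequency(69, tuning_offset_cents=50)  # A4 raised by a quarter tone

print(f_low, f_high, f_shifted)
```

With the defaults, the filterbank range corresponds to the 88 keys of a standard piano.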

Returns

f_pitch (np.ndarray [shape=(128, N)]) – Matrix containing the extracted pitch-based features
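The layout of the returned matrix can be sketched without calling the library. The example below builds a 128 x N array in which only the rows from midi_min to midi_max carry data. It assumes that the row index equals the MIDI pitch number (row 69 holds A4), which the description suggests but does not state outright; the random values are placeholders, not real features.

```python
import numpy as np

midi_min, midi_max = 21, 108
num_frames = 100  # hypothetical number of feature frames N

# Placeholder band-limited features, strictly positive so filled rows are easy to spot
rng = np.random.default_rng(0)
band = rng.random((midi_max - midi_min + 1, num_frames)) + 0.1

# Embed the band into the conventional 128-row layout (assumption: row index == MIDI pitch)
f_pitch = np.zeros((128, num_frames))
f_pitch[midi_min:midi_max + 1, :] = band

# Rows that contain any energy
active_rows = np.flatnonzero(f_pitch.sum(axis=1) > 0)
print(f_pitch.shape, active_rows[0], active_rows[-1])
```

Downstream code that only needs the filled band can slice it back out with f_pitch[midi_min:midi_max + 1, :].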