Pitch Features (synctoolbox.feature.pitch)

synctoolbox.feature.pitch.audio_to_pitch_features(f_audio: ndarray, Fs: float = 22050, feature_rate: int = 50, midi_min: int = 21, midi_max: int = 108, tuning_offset: int = 0, verbose: bool = False, visualization_title: str = 'Pitch features', visualization_log_gamma: float = 100.0) → ndarray

Computes pitch-based features using an IIR filterbank whose subband outputs are aggregated as STMSP (short-time mean-square power). The signal is decomposed into subbands that correspond to the MIDI pitches between midi_min and midi_max. In the output array, each row corresponds to one MIDI pitch. By convention, the output has shape 128 × N, where N is the number of feature frames; only the rows between midi_min and midi_max are filled, and the remaining rows contain zeros.
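As a rough illustration of the STMSP aggregation step only, the following sketch computes the mean-square power of a single signal over short frames. It is not the library's implementation: the helper name short_time_mean_square_power, the non-overlapping framing, and the 440 Hz test tone are assumptions made for this example, and the IIR subband filtering is left out entirely.

```python
import numpy as np


def short_time_mean_square_power(x, Fs, feature_rate):
    """Mean-square power of x over non-overlapping frames (hypothetical helper).

    Assumption: one frame per feature, each Fs / feature_rate samples long.
    The library may use different window lengths or overlapping windows.
    """
    hop = int(round(Fs / feature_rate))           # samples per feature frame
    num_frames = len(x) // hop                    # drop the incomplete tail
    frames = x[: num_frames * hop].reshape(num_frames, hop)
    return np.mean(frames ** 2, axis=1)           # one power value per frame


Fs = 22050
feature_rate = 50
t = np.arange(Fs) / Fs                            # one second of audio
x = np.sin(2 * np.pi * 440.0 * t)                 # unit-amplitude test tone

stmsp = short_time_mean_square_power(x, Fs, feature_rate)
print(stmsp.shape)                                # one value per feature frame
```

For a unit-amplitude sinusoid, every frame's mean-square power lies close to 0.5, which makes the sketch easy to sanity-check.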

Parameters
  • f_audio (np.ndarray) – One-dimensional audio array (mono)

  • Fs (float) – Sampling rate of f_audio (in Hz)

  • feature_rate (int) – Features per second

  • midi_min (int) – Minimum MIDI index (indices below midi_min are filled with zero in the output)

  • midi_max (int) – Maximum MIDI index (indices above midi_max are filled with zero in the output)

  • tuning_offset (int) – Tuning offset used to shift the filterbank (in cents)

  • verbose (bool) – Set True to activate the visualization of features

  • visualization_title (str) – Title for the visualization plot. Only relevant if verbose is True

  • visualization_log_gamma (float) – Log compression gamma parameter for visualization. Only relevant if verbose is True
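To make the midi_min, midi_max, and tuning_offset parameters more concrete, the sketch below maps MIDI pitch numbers to nominal center frequencies in equal temperament (A4 = MIDI 69 = 440 Hz) and applies an offset in cents. The function name midi_to_center_frequency is hypothetical, and the sign convention (a positive offset raises the frequencies) is an assumption; the library's filterbank may shift in the opposite direction.

```python
def midi_to_center_frequency(midi_pitch, tuning_offset_cents=0.0, a4_hz=440.0):
    """Nominal center frequency in Hz for a MIDI pitch (hypothetical helper).

    Assumption: a positive tuning offset (in cents) raises the frequency.
    100 cents equal one semitone, and 12 semitones equal one octave.
    """
    semitones_from_a4 = (midi_pitch - 69) + tuning_offset_cents / 100.0
    return a4_hz * 2.0 ** (semitones_from_a4 / 12.0)


# Default pitch range of the function: MIDI 21 (A0) to MIDI 108 (C8)
low_hz = midi_to_center_frequency(21)
high_hz = midi_to_center_frequency(108)

# An offset of 100 cents moves A4 up to the frequency of MIDI pitch 70
shifted_a4_hz = midi_to_center_frequency(69, tuning_offset_cents=100)

print(round(low_hz, 2), round(high_hz, 2), round(shifted_a4_hz, 2))
```

With the default range, the lowest and highest subbands sit near 27.5 Hz and 4186 Hz, the range of a standard 88-key piano.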

Returns

f_pitch (np.ndarray [shape=(128, N)]) – Matrix containing the extracted pitch-based features
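
A short sketch of how the returned matrix might be post-processed. To stay self-contained it uses a random stand-in for f_pitch instead of calling the library; the stand-in assumes that row index p holds MIDI pitch p, which the description above does not state explicitly. The log compression formula log(1 + gamma * x) is the common form from the music processing literature and is assumed here, not confirmed, to match what the visualization applies.

```python
import numpy as np

# Stand-in for the return value. A real call would look like:
#   f_pitch = audio_to_pitch_features(f_audio=f_audio, Fs=22050, feature_rate=50)
# Assumption: row index p corresponds to MIDI pitch p.
midi_min, midi_max, num_frames = 21, 108, 10
rng = np.random.default_rng(0)
f_pitch = np.zeros((128, num_frames))
f_pitch[midi_min:midi_max + 1] = 0.01 + rng.random((midi_max - midi_min + 1, num_frames))

# Find the filled rows without relying on the row-indexing convention
active_rows = np.flatnonzero(f_pitch.any(axis=1))
f_active = f_pitch[active_rows]

# Assumed log compression for display: log(1 + gamma * x)
gamma = 100.0
f_log = np.log(1.0 + gamma * f_active)

print(active_rows[0], active_rows[-1], f_active.shape)
```

Detecting the non-zero rows, instead of slicing by midi_min and midi_max, keeps the snippet valid even if the actual row offset differs from the assumption above.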