Pitch Onset Features (synctoolbox.feature.pitch_onset)

synctoolbox.feature.pitch_onset.audio_to_pitch_onset_features(f_audio: ndarray, Fs: float = 22050, midi_min: int = 21, midi_max: int = 108, tuning_offset: int = 0, manual_offset: float = -25, verbose: bool = False, visualization_title: str = 'Pitch onset features', visualization_log_gamma: float = 100.0) → dict

Computes pitch onset features based on an IIR filterbank. The signal is decomposed into subbands that correspond to MIDI pitches between midi_min and midi_max. After that, onsets for each MIDI pitch are calculated.
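To make the pitch-to-subband correspondence concrete, the sketch below computes an equal-tempered center frequency for each MIDI pitch in the default range. It uses the standard MIDI convention (pitch 69 = A4 = 440 Hz) and is not the library's filterbank code. The helper name midi_to_center_freq is hypothetical, and treating a positive tuning offset as an upward shift is an assumption.

```python
import math


def midi_to_center_freq(midi_pitch, tuning_offset_cents=0.0, a4_hz=440.0):
    """Equal-tempered center frequency (Hz) of a MIDI pitch.

    Hypothetical helper for illustration only. The sign convention of the
    tuning offset (positive cents = shift upward) is an assumption.
    """
    # 100 cents correspond to one semitone.
    semitones_from_a4 = (midi_pitch - 69) + tuning_offset_cents / 100.0
    return a4_hz * 2.0 ** (semitones_from_a4 / 12.0)


# Default pitch range of the function documented here (A0 to C8).
midi_min, midi_max = 21, 108
center_freqs = {p: midi_to_center_freq(p) for p in range(midi_min, midi_max + 1)}

print(len(center_freqs))   # 88
print(center_freqs[21])    # 27.5
print(center_freqs[69])    # 440.0
```

With the defaults, the 88 subbands cover the range of a standard piano keyboard.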

Parameters
  • f_audio (np.ndarray) – One dimensional audio array (mono)

  • Fs (float) – Sampling rate of f_audio (in Hz)

  • midi_min (int) – Minimum MIDI index (indices below midi_min are filled with zero in the output)

  • midi_max (int) – Maximum MIDI index (indices above midi_max are filled with zero in the output)

  • tuning_offset (int) – Tuning offset used to shift the filterbank (in cents)

  • manual_offset (float) – Offset applied to all onsets (in ms). This function finds onsets by picking peaks, i.e., positions of maximum increase in energy. However, the actual onset usually occurs before such a peak (prior to the maximum increase in energy). Thus, an offset is applied to all onset positions. The default (-25 ms) has been found to work well empirically.

  • verbose (bool) – Set to True to activate the visualization of features

  • visualization_title (str) – Title for the visualization plot. Only relevant if verbose is True

  • visualization_log_gamma (float) – Log compression gamma parameter for visualization. Only relevant if verbose is True
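The effect of manual_offset can be illustrated with a minimal sketch that shifts a list of detected peak positions by a constant number of milliseconds. The helper name apply_manual_offset and the peak positions are invented for illustration. Clamping shifted positions at 0 ms is an assumption, not documented behavior of the library.

```python
def apply_manual_offset(peak_times_ms, manual_offset_ms=-25.0):
    """Shift peak positions (in ms) by a constant offset.

    Hypothetical helper. Clamping at 0 ms is an assumption made here so that
    no onset ends up before the start of the recording.
    """
    return [max(0.0, t + manual_offset_ms) for t in peak_times_ms]


# Invented peak positions in milliseconds.
peak_times_ms = [10.0, 140.0, 510.0]
shifted = apply_manual_offset(peak_times_ms)
print(shifted)  # [0.0, 115.0, 485.0]
```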

Returns

f_peaks (dict) –

A dictionary of onset peaks:
  • Each key corresponds to the MIDI pitch number

  • Each value f_peaks[midi_pitch] is an array of doubles of size 2xN:
    • First row gives the positions of the peaks in milliseconds.

    • Second row contains the corresponding magnitudes of the peaks.
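As a sketch of how the returned dictionary might be consumed, the example below flattens it into a time-sorted list of (onset in seconds, MIDI pitch, magnitude) tuples. The helper name peaks_to_events and the example values are hypothetical. Nested lists stand in for the 2xN arrays so that the snippet runs without synctoolbox or NumPy, and the helper only indexes rows, so it should work with NumPy arrays as well.

```python
def peaks_to_events(f_peaks):
    """Flatten {midi_pitch: 2xN peaks} into sorted (onset_sec, pitch, magnitude) tuples.

    Hypothetical helper. Row 0 holds peak positions in ms, row 1 the magnitudes,
    as described in the Returns section.
    """
    events = []
    for midi_pitch, peaks in f_peaks.items():
        times_ms, magnitudes = peaks[0], peaks[1]
        for t_ms, mag in zip(times_ms, magnitudes):
            events.append((float(t_ms) / 1000.0, int(midi_pitch), float(mag)))
    events.sort()  # tuples sort by onset time first
    return events


# With synctoolbox installed, the dictionary would come from a call such as:
#   from synctoolbox.feature.pitch_onset import audio_to_pitch_onset_features
#   f_peaks = audio_to_pitch_onset_features(f_audio=f_audio, Fs=22050)
# The values below are invented stand-ins for illustration.
example_peaks = {
    60: [[120.0, 980.0], [0.8, 0.5]],
    64: [[125.0], [0.6]],
    67: [[], []],  # a pitch with no detected onsets
}

events = peaks_to_events(example_peaks)
for onset_sec, pitch, mag in events:
    print(f"{onset_sec:.3f} s  pitch {pitch}  magnitude {mag:.2f}")
```

Sorting by onset time gives a single event stream across all pitches, which is often easier to inspect than the per-pitch layout.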