Musically Informed Audio Decomposition (libfmp.c8)

The FMP notebooks provide detailed, textbook-like explanations of the central techniques and algorithms implemented in libfmp. The part of FMP related to this module is available at the following URL:

https://www.audiolabs-erlangen.de/resources/MIR/FMP/C8/C8.html

libfmp.c8.c8s1_hps.convert_l_hertz_to_bins(L_p_Hz, Fs=22050, N=1024, H=512)[source]

Convert filter length parameter from Hertz to frequency bins

Notebook: C8/C8S1_HPS.ipynb

Parameters
  • L_p_Hz (float) – Filter length (in Hertz)

  • Fs (scalar) – Sample rate (Default value = 22050)

  • N (int) – Window size (Default value = 1024)

  • H (int) – Hop size (Default value = 512)

Returns

L_p (int) – Filter length (in frequency bins)

libfmp.c8.c8s1_hps.convert_l_sec_to_frames(L_h_sec, Fs=22050, N=1024, H=512)[source]

Convert filter length parameter from seconds to frame indices

Notebook: C8/C8S1_HPS.ipynb

Parameters
  • L_h_sec (float) – Filter length (in seconds)

  • Fs (scalar) – Sample rate (Default value = 22050)

  • N (int) – Window size (Default value = 1024)

  • H (int) – Hop size (Default value = 512)

Returns

L_h (int) – Filter length (in frames)
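The two unit conversions above can be sketched as follows. This is a minimal illustration, not the library code: it assumes that a duration maps to frames via the frame rate Fs/H, that a bandwidth maps to bins via the STFT bin spacing Fs/N, and that results are rounded up. The function names are hypothetical.

```python
import math

def seconds_to_frames(L_h_sec, Fs=22050, H=512):
    """Hypothetical helper: filter length in seconds -> number of frames."""
    # There are Fs / H frames per second; rounding up is an assumption.
    return int(math.ceil(L_h_sec * Fs / H))

def hertz_to_bins(L_p_Hz, Fs=22050, N=1024):
    """Hypothetical helper: filter length in Hertz -> number of frequency bins."""
    # Adjacent STFT bins are Fs / N Hertz apart; rounding up is an assumption.
    return int(math.ceil(L_p_Hz * N / Fs))

frames = seconds_to_frames(0.5)   # 0.5 s at Fs=22050, H=512
bins = hertz_to_bins(500.0)       # 500 Hz at Fs=22050, N=1024
print(frames, bins)
```

Note that the hop size H only matters for the time conversion and the window size N only for the frequency conversion.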

libfmp.c8.c8s1_hps.experiment_hps_parameter(fn_wav, param_list)[source]

Script for running an HPS experiment over a parameter list, such as [[1024, 256, 0.1, 100], ...]

Notebook: C8/C8S1_HPS.ipynb

Parameters
  • fn_wav (str) – Path to wave file

  • param_list (list) – List of parameters

libfmp.c8.c8s1_hps.experiment_hrps_parameter(fn_wav, param_list)[source]

Script for running an HRPS experiment over a parameter list, such as [[1024, 256, 0.1, 100], ...]

Parameters
  • fn_wav (str) – Path to wave file

  • param_list (list) – List of parameters

libfmp.c8.c8s1_hps.generate_audio_tag_html_list(list_x, Fs, width='150', height='40')[source]

Generates HTML audio tags for display in a table

Notebook: C8/C8S1_HPS.ipynb

Parameters
  • list_x (list) – List of waveforms

  • Fs (scalar) – Sample rate

  • width (str) – Width in px (Default value = ‘150’)

  • height (str) – Height in px (Default value = ‘40’)

Returns

audio_tag_html_list (list) – List of HTML strings with audio tags

libfmp.c8.c8s1_hps.hps(x, Fs, N, H, L_h, L_p, L_unit='physical', mask='binary', eps=0.001, detail=False)[source]

Harmonic-percussive separation (HPS) algorithm

Notebook: C8/C8S1_HPS.ipynb

Parameters
  • x (np.ndarray) – Input signal

  • Fs (scalar) – Sampling rate of x

  • N (int) – Frame length

  • H (int) – Hopsize

  • L_h (float) – Horizontal median filter length given in seconds or frames

  • L_p (float) – Percussive median filter length given in Hertz or bins

  • L_unit (str) – Adjusts unit, either ‘physical’ or ‘indices’ (Default value = ‘physical’)

  • mask (str) – Either ‘binary’ or ‘soft’ (Default value = ‘binary’)

  • eps (float) – Parameter used in soft masking (Default value = 0.001)

  • detail (bool) – Returns detailed information (Default value = False)

Returns
  • x_h (np.ndarray) – Harmonic signal

  • x_p (np.ndarray) – Percussive signal

  • details (dict) – Dictionary containing detailed information; returned if detail=True
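The role of the mask and eps parameters can be illustrated with a small NumPy sketch. It assumes that Y_h and Y_p are the horizontally and vertically median-filtered magnitude spectrograms, that ties in the binary mask go to the harmonic component, and that the soft mask uses the regularized ratio shown below. The function name and the tie handling are assumptions for illustration, not a statement about the library's internals.

```python
import numpy as np

def hps_masks_sketch(Y_h, Y_p, mask="binary", eps=0.001):
    """Hypothetical helper: build harmonic and percussive masks."""
    if mask == "binary":
        # Each time-frequency bin is assigned to exactly one component.
        M_h = (Y_h >= Y_p).astype(float)
        M_p = 1.0 - M_h
    elif mask == "soft":
        # eps keeps the ratio well defined where both inputs are zero.
        denom = Y_h + Y_p + eps
        M_h = (Y_h + eps / 2) / denom
        M_p = (Y_p + eps / 2) / denom
    else:
        raise ValueError("mask must be 'binary' or 'soft'")
    return M_h, M_p

Y_h = np.array([[4.0, 1.0], [0.0, 2.0]])
Y_p = np.array([[1.0, 3.0], [0.0, 2.0]])
M_h_bin, M_p_bin = hps_masks_sketch(Y_h, Y_p, mask="binary")
M_h_soft, M_p_soft = hps_masks_sketch(Y_h, Y_p, mask="soft")
print(M_h_bin)
print(M_h_soft + M_p_soft)  # the two soft masks sum to one in every bin
```

In both cases the masks add up to one, so applying them to the original STFT splits the signal without losing energy.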

libfmp.c8.c8s1_hps.hrps(x, Fs, N, H, L_h, L_p, beta=2.0, L_unit='physical', detail=False)[source]

Harmonic-residual-percussive separation (HRPS) algorithm

Notebook: C8/C8S1_HRPS.ipynb

Parameters
  • x (np.ndarray) – Input signal

  • Fs (scalar) – Sampling rate of x

  • N (int) – Frame length

  • H (int) – Hopsize

  • L_h (float) – Horizontal median filter length given in seconds or frames

  • L_p (float) – Percussive median filter length given in Hertz or bins

  • beta (float) – Separation factor (Default value = 2.0)

  • L_unit (str) – Adjusts unit, either ‘physical’ or ‘indices’ (Default value = ‘physical’)

  • detail (bool) – Returns detailed information (Default value = False)

Returns
  • x_h (np.ndarray) – Harmonic signal

  • x_p (np.ndarray) – Percussive signal

  • x_r (np.ndarray) – Residual signal

  • details (dict) – Dictionary containing detailed information; returned if detail=True
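The effect of the separation factor beta can be sketched with three binary masks. The sketch assumes that a bin counts as harmonic when Y_h is at least beta times Y_p, as percussive when Y_p exceeds beta times Y_h, and as residual otherwise. Y_h and Y_p stand for median-filtered magnitude spectrograms; the function name and the exact inequalities are assumptions.

```python
import numpy as np

def hrps_masks_sketch(Y_h, Y_p, beta=2.0):
    """Hypothetical helper: harmonic, percussive, and residual binary masks."""
    M_h = (Y_h >= beta * Y_p).astype(float)   # clearly harmonic bins
    M_p = (Y_p > beta * Y_h).astype(float)    # clearly percussive bins
    M_r = 1.0 - (M_h + M_p)                   # everything in between
    return M_h, M_p, M_r

Y_h = np.array([[5.0, 1.0, 2.0]])
Y_p = np.array([[1.0, 5.0, 2.0]])
M_h, M_p, M_r = hrps_masks_sketch(Y_h, Y_p, beta=2.0)
print(M_h, M_p, M_r)
```

With beta = 1.0 the residual mask is empty and the sketch reduces to a binary harmonic-percussive split; larger values move more bins into the residual.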

libfmp.c8.c8s1_hps.make_integer_odd(n)[source]

Convert integer into odd integer

Notebook: C8/C8S1_HPS.ipynb

Parameters

n (int) – Integer

Returns

n (int) – Odd integer

libfmp.c8.c8s1_hps.median_filter_horizontal(x, filter_len)[source]

Apply median filter in horizontal direction

Notebook: C8/C8S1_HPS.ipynb

Parameters
  • x (np.ndarray) – Input matrix

  • filter_len (int) – Filter length

Returns

x_h (np.ndarray) – Filtered matrix

libfmp.c8.c8s1_hps.median_filter_vertical(x, filter_len)[source]

Apply median filter in vertical direction

Notebook: C8/C8S1_HPS.ipynb

Parameters
  • x (np.ndarray) – Input matrix

  • filter_len (int) – Filter length

Returns

x_p (np.ndarray) – Filtered matrix
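Why the two filter directions separate harmonic from percussive structures can be seen on a toy matrix. The NumPy sketch below is an illustration under stated assumptions: borders are zero-padded, and an even filter length is increased by one so that the window is centered. The function names are hypothetical and the library may handle borders differently.

```python
import numpy as np

def median_filter_rows(x, filter_len):
    """Hypothetical helper: median-filter every row of x with zero padding."""
    if filter_len % 2 == 0:
        filter_len += 1  # force an odd, centered window (assumed convention)
    half = filter_len // 2
    padded = np.pad(x, ((0, 0), (half, half)), mode="constant")
    # Stack all shifted copies and take the median across the window axis.
    windows = np.stack(
        [padded[:, i:i + x.shape[1]] for i in range(filter_len)], axis=0)
    return np.median(windows, axis=0)

def median_filter_horizontal_sketch(x, filter_len):
    return median_filter_rows(x, filter_len)

def median_filter_vertical_sketch(x, filter_len):
    # Filtering columns equals filtering the rows of the transpose.
    return median_filter_rows(x.T, filter_len).T

# Toy "spectrogram": one horizontal line (sustained tone)
# and one vertical line (click).
x = np.zeros((5, 5))
x[2, :] = 1.0
x[:, 2] = 1.0
x_h = median_filter_horizontal_sketch(x, 3)
x_p = median_filter_vertical_sketch(x, 3)
print(x_h)
print(x_p)
```

The horizontal filter keeps only the horizontal line and the vertical filter keeps only the vertical line, which is the effect the HPS procedure relies on.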

libfmp.c8.c8s2_f0.cents_to_hz(F_cent, F_ref=55.0)[source]

Converts frequency in cents to Hz

Notebook: C8/C8S2_FundFreqTracking.ipynb

Parameters
  • F_cent (float or np.ndarray) – Frequency in cents

  • F_ref (float) – Reference frequency in Hz (Default value = 55.0)

Returns

F (float or np.ndarray) – Frequency in Hz

libfmp.c8.c8s2_f0.compute_traj_from_audio(x, Fs=22050, N=1024, H=128, R=10.0, F_min=55.0, F_max=1760.0, num_harm=10, freq_smooth_len=11, alpha=0.9, gamma=0.0, constraint_region=None, tol=5, score_low=0.01, score_high=1.0)[source]

Compute F0 contour from audio signal

Notebook: C8/C8S2_FundFreqTracking.ipynb

Parameters
  • x (np.ndarray) – Audio signal

  • Fs (scalar) – Sampling frequency (Default value = 22050)

  • N (int) – Window length in samples (Default value = 1024)

  • H (int) – Hopsize in samples (Default value = 128)

  • R (float) – Frequency resolution in cents (Default value = 10.0)

  • F_min (float) – Lower frequency bound (reference frequency) (Default value = 55.0)

  • F_max (float) – Upper frequency bound (Default value = 1760.0)

  • num_harm (int) – Number of harmonics (Default value = 10)

  • freq_smooth_len (int) – Filter length for vertical smoothing (Default value = 11)

  • alpha (float) – Weighting parameter for harmonics (Default value = 0.9)

  • gamma (float) – Logarithmic compression factor (Default value = 0.0)

  • constraint_region (np.ndarray) – Constraint regions, row-format: (t_start_sec, t_end_sec, f_start_hz, f_end_hz) (Default value = None)

  • tol (int) – Tolerance parameter for transition matrix (Default value = 5)

  • score_low (float) – Score (low) for transition matrix (Default value = 0.01)

  • score_high (float) – Score (high) for transition matrix (Default value = 1.0)

Returns
  • traj (np.ndarray) – F0 contour, time in seconds in 1st column, frequency in Hz in 2nd column

  • Z (np.ndarray) – Salience representation

  • T_coef (np.ndarray) – Time axis

  • F_coef_hertz (np.ndarray) – Frequency axis in Hz

  • F_coef_cents (np.ndarray) – Frequency axis in cents

libfmp.c8.c8s2_f0.compute_trajectory_cr(Z, T_coef, F_coef_hertz, constraint_region=None, tol=5, score_low=0.01, score_high=1.0)[source]

Trajectory tracking with constraint regions

Notebook: C8/C8S2_FundFreqTracking.ipynb

Parameters
  • Z (np.ndarray) – Salience representation

  • T_coef (np.ndarray) – Time axis

  • F_coef_hertz (np.ndarray) – Frequency axis in Hz

  • constraint_region (np.ndarray) – Constraint regions, row-format: (t_start_sec, t_end_sec, f_start_hz, f_end_hz) (Default value = None)

  • tol (int) – Tolerance parameter for transition matrix (Default value = 5)

  • score_low (float) – Score (low) for transition matrix (Default value = 0.01)

  • score_high (float) – Score (high) for transition matrix (Default value = 1.0)

Returns

eta (np.ndarray) – Trajectory indices, unvoiced frames are indicated with -1

libfmp.c8.c8s2_f0.compute_trajectory_dp(Z, T)[source]

Trajectory tracking using dynamic programming

Notebook: C8/C8S2_FundFreqTracking.ipynb

Parameters
  • Z – Salience representation

  • T – Transition matrix

Returns

eta_DP (np.ndarray) – Trajectory indices
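The idea of trajectory tracking by dynamic programming can be sketched as a Viterbi-style recursion in the log domain. This is a minimal illustration, assuming Z has shape (bins, frames), T[i, j] scores a move between bins j and i, and a small constant eps guards against log(0). The function name, eps, and the toy data are assumptions.

```python
import numpy as np

def track_trajectory_dp_sketch(Z, T, eps=1e-12):
    """Hypothetical helper: bin index per frame with maximal accumulated score."""
    B, N = Z.shape
    Z_log = np.log(Z + eps)
    T_log = np.log(T + eps)
    D = np.zeros((B, N))                  # accumulated log scores
    E = np.zeros((B, N - 1), dtype=int)   # backpointers
    D[:, 0] = Z_log[:, 0]
    for n in range(1, N):
        # scores[i, j]: best score ending in bin j at frame n-1, then moving to i
        scores = T_log + D[:, n - 1][np.newaxis, :]
        E[:, n - 1] = np.argmax(scores, axis=1)
        D[:, n] = np.max(scores, axis=1) + Z_log[:, n]
    # Backtrack from the best final bin.
    eta = np.zeros(N, dtype=int)
    eta[-1] = np.argmax(D[:, -1])
    for n in range(N - 2, -1, -1):
        eta[n] = E[eta[n + 1], n]
    return eta

# Toy salience: a steady contour in bin 1 plus a strong outlier in bin 4.
Z = np.full((5, 4), 0.01)
Z[1, [0, 2, 3]] = 1.0
Z[1, 1] = 0.2
Z[4, 1] = 1.0
bins = np.arange(5)
T = np.where(np.abs(bins[:, None] - bins[None, :]) <= 1, 1.0, 0.01)

eta_framewise = np.argmax(Z, axis=0)       # jumps to the outlier
eta_dp = track_trajectory_dp_sketch(Z, T)  # stays on the smooth contour
print(eta_framewise, eta_dp)
```

The frame-wise maximum follows the outlier, whereas the transition scores make the large jump more expensive than staying on the weaker but continuous contour.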

libfmp.c8.c8s2_f0.convert_ann_to_constraint_region(ann, tol_freq_cents=300.0)[source]

Convert score annotations to constraint regions

Notebook: C8/C8S2_FundFreqTracking.ipynb

Parameters
  • ann (list) – Score annotations [[start_time, end_time, MIDI_pitch], …]

  • tol_freq_cents (float) – Tolerance in pitch direction specified in cents (Default value = 300.0)

Returns

constraint_region (np.ndarray) – Constraint regions
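How a note annotation turns into a frequency band can be sketched in plain Python. The sketch assumes standard tuning (MIDI pitch 69 corresponds to 440 Hz) and a tolerance applied symmetrically below and above the annotated pitch; the function name is hypothetical and the output is a list of tuples rather than an array.

```python
def annotation_to_constraint_region_sketch(ann, tol_freq_cents=300.0):
    """Hypothetical helper: (start, end, MIDI pitch) -> (start, end, f_low, f_high)."""
    tol_semitones = tol_freq_cents / 100.0  # 100 cents per semitone
    regions = []
    for start, end, pitch in ann:
        # Assumed tuning: MIDI pitch 69 corresponds to 440 Hz.
        f_low = 440.0 * 2.0 ** ((pitch - tol_semitones - 69) / 12)
        f_high = 440.0 * 2.0 ** ((pitch + tol_semitones - 69) / 12)
        regions.append((start, end, f_low, f_high))
    return regions

# One note (A4) from 0.0 s to 1.0 s with a tolerance of one octave.
regions = annotation_to_constraint_region_sketch(
    [[0.0, 1.0, 69]], tol_freq_cents=1200.0)
print(regions)
```

Each resulting row follows the format (t_start_sec, t_end_sec, f_start_hz, f_end_hz) used for constraint regions elsewhere in this module.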

libfmp.c8.c8s2_f0.convert_trajectory_to_mask_bin(traj, F_coef, n_harmonics=1, tol_bin=0)[source]

Computes binary mask from F0 trajectory

Notebook: C8/C8S2_MelodyExtractSep.ipynb

Parameters
  • traj (np.ndarray) – F0 trajectory (time in seconds in 1st column, frequency in Hz in 2nd column)

  • F_coef (np.ndarray) – Frequency axis

  • n_harmonics (int) – Number of harmonics (Default value = 1)

  • tol_bin (int) – Tolerance in frequency bins (Default value = 0)

Returns

mask (np.ndarray) – Binary mask

libfmp.c8.c8s2_f0.convert_trajectory_to_mask_cent(traj, F_coef, n_harmonics=1, tol_cent=0.0)[source]

Computes binary mask from F0 trajectory

Notebook: C8/C8S2_MelodyExtractSep.ipynb

Parameters
  • traj (np.ndarray) – F0 trajectory (time in seconds in 1st column, frequency in Hz in 2nd column)

  • F_coef (np.ndarray) – Frequency axis

  • n_harmonics (int) – Number of harmonics (Default value = 1)

  • tol_cent (float) – Tolerance in cents (Default value = 0.0)

Returns

mask (np.ndarray) – Binary mask

libfmp.c8.c8s2_f0.define_transition_matrix(B, tol=0, score_low=0.01, score_high=1.0)[source]

Generate transition matrix

Notebook: C8/C8S2_FundFreqTracking.ipynb

Parameters
  • B (int) – Number of bins

  • tol (int) – Tolerance parameter for transition matrix (Default value = 0)

  • score_low (float) – Score (low) for transition matrix (Default value = 0.01)

  • score_high (float) – Score (high) for transition matrix (Default value = 1.0)

Returns

T (np.ndarray) – Transition matrix
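A transition matrix of this kind can be sketched as a band matrix: transitions between bins at most tol apart receive score_high, all others score_low. The construction below is an assumption about the intended structure and the function name is hypothetical.

```python
import numpy as np

def transition_matrix_sketch(B, tol=0, score_low=0.01, score_high=1.0):
    """Hypothetical helper: B x B matrix with a high-score band around the diagonal."""
    idx = np.arange(B)
    distance = np.abs(idx[:, None] - idx[None, :])  # bin distance |i - j|
    return np.where(distance <= tol, score_high, score_low)

T_example = transition_matrix_sketch(4, tol=1)
print(T_example)
```

A larger tol allows faster pitch changes between neighboring frames, while a smaller score_low penalizes jumps outside the band more strongly.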

libfmp.c8.c8s2_f0.hz_to_cents(F, F_ref=55.0)[source]

Converts frequency in Hz to cents

Notebook: C8/C8S2_FundFreqTracking.ipynb

Parameters
  • F (float or np.ndarray) – Frequency value in Hz

  • F_ref (float) – Reference frequency in Hz (Default value = 55.0)

Returns

F_cent (float or np.ndarray) – Frequency in cents
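The relation between Hz and cents can be sketched for scalar inputs as follows. It uses the standard definition of 1200 cents per octave relative to a reference frequency; the function names are hypothetical and only floats are handled, not arrays.

```python
import math

def hz_to_cents_sketch(F, F_ref=55.0):
    """Hypothetical helper: frequency in Hz -> cents above F_ref."""
    return 1200.0 * math.log2(F / F_ref)

def cents_to_hz_sketch(F_cent, F_ref=55.0):
    """Hypothetical helper: cents above F_ref -> frequency in Hz."""
    return F_ref * 2.0 ** (F_cent / 1200.0)

one_octave = hz_to_cents_sketch(110.0)   # one octave above 55 Hz
two_octaves = cents_to_hz_sketch(2400.0)  # two octaves above 55 Hz
round_trip = cents_to_hz_sketch(hz_to_cents_sketch(440.0))
print(one_octave, two_octaves, round_trip)
```

The two functions are inverses of each other as long as the same reference frequency is used in both directions.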

libfmp.c8.c8s2_f0.separate_melody_accompaniment(x, Fs, N, H, traj, n_harmonics=10, tol_cent=50.0)[source]

F0-based melody-accompaniment separation

Notebook: C8/C8S2_MelodyExtractSep.ipynb

Parameters
  • x (np.ndarray) – Audio signal

  • Fs (scalar) – Sampling frequency

  • N (int) – Window size in samples

  • H (int) – Hopsize in samples

  • traj (np.ndarray) – F0 traj (time in seconds in 1st column, frequency in Hz in 2nd column)

  • n_harmonics (int) – Number of harmonics (Default value = 10)

  • tol_cent (float) – Tolerance in cents (Default value = 50.0)

Returns
  • x_mel (np.ndarray) – Reconstructed audio signal for melody

  • x_acc (np.ndarray) – Reconstructed audio signal for accompaniment

libfmp.c8.c8s2_f0.sonify_trajectory_with_sinusoid(traj, audio_len, Fs=22050, amplitude=0.3, smooth_len=11)[source]

Sonification of trajectory with a sinusoid

Notebook: C8/C8S2_FundFreqTracking.ipynb

Parameters
  • traj (np.ndarray) – F0 trajectory (time in seconds, frequency in Hz)

  • audio_len (int) – Desired audio length in samples

  • Fs (scalar) – Sampling rate (Default value = 22050)

  • amplitude (float) – Amplitude (Default value = 0.3)

  • smooth_len (int) – Length of amplitude smoothing filter (Default value = 11)

Returns

x_soni (np.ndarray) – Sonification

libfmp.c8.c8s2_f0.visualize_salience_traj_constraints(Z, T_coef, F_coef_cents, F_ref=55.0, colorbar=True, cmap='gray_r', figsize=(7, 4), traj=None, constraint_region=None, ax=None)[source]

Visualize salience representation with optional F0-trajectory and constraint regions

Notebook: C8/C8S2_FundFreqTracking.ipynb

Parameters
  • Z – Salience representation

  • T_coef – Time axis

  • F_coef_cents – Frequency axis in cents

  • F_ref – Reference frequency (Default value = 55.0)

  • colorbar – Show or hide colorbar (Default value = True)

  • cmap – Color map (Default value = ‘gray_r’)

  • figsize – Figure size (Default value = (7, 4))

  • traj – F0 trajectory (time in seconds, frequency in Hz) (Default value = None)

  • constraint_region – Constraint regions, row-format: (t_start_sec, t_end_sec, f_start_hz, f_end_hz) (Default value = None)

  • ax – Handle to existing axis (Default value = None)

Returns
  • fig – Handle to figure

  • ax – Handle to cent axis

  • ax_f – Handle to frequency axis

libfmp.c8.c8s2_salience.compute_if(X, Fs, N, H)[source]

Instantaneous frequency (IF) estimation

Parameters
  • X (np.ndarray) – STFT

  • Fs (scalar) – Sampling rate

  • N (int) – Window size in samples

  • H (int) – Hop size in samples

Returns

F_coef_IF (np.ndarray) – Matrix of IF values

libfmp.c8.c8s2_salience.compute_salience_rep(x, Fs, N, H, R, F_min=55.0, F_max=1760.0, num_harm=10, freq_smooth_len=11, alpha=1.0, gamma=0.0)[source]

Salience representation [FMP, Eq. (8.56)]

Notebook: C8/C8S2_SalienceRepresentation.ipynb

Parameters
  • x (np.ndarray) – Audio signal

  • Fs (scalar) – Sampling frequency

  • N (int) – Window length in samples

  • H (int) – Hopsize in samples

  • R (float) – Frequency resolution in cents

  • F_min (float) – Lower frequency bound (reference frequency) (Default value = 55.0)

  • F_max (float) – Upper frequency bound (Default value = 1760.0)

  • num_harm (int) – Number of harmonics (Default value = 10)

  • freq_smooth_len (int) – Filter length for vertical smoothing (Default value = 11)

  • alpha (float) – Weighting parameter (Default value = 1.0)

  • gamma (float) – Logarithmic compression factor (Default value = 0.0)

Returns
  • Z (np.ndarray) – Salience representation

  • F_coef_hertz (np.ndarray) – Frequency axis in Hz

  • F_coef_cents (np.ndarray) – Frequency axis in cents

libfmp.c8.c8s2_salience.compute_y_lf_bin(Y, Fs, N, R=10.0, F_min=55.0, F_max=1760.0)[source]

Log-frequency Spectrogram with variable frequency resolution using binning

Notebook: C8/C8S2_SalienceRepresentation.ipynb

Parameters
  • Y (np.ndarray) – Magnitude spectrogram

  • Fs (scalar) – Sampling rate in Hz

  • N (int) – Window length in samples

  • R (float) – Frequency resolution in cents (Default value = 10.0)

  • F_min (float) – Lower frequency bound (reference frequency) (Default value = 55.0)

  • F_max (float) – Upper frequency bound (is included) (Default value = 1760.0)

Returns
  • Y_LF_bin (np.ndarray) – Binned log-frequency spectrogram

  • F_coef_hertz (np.ndarray) – Frequency axis in Hz

  • F_coef_cents (np.ndarray) – Frequency axis in cents

libfmp.c8.c8s2_salience.compute_y_lf_if_bin(X, Fs, N, H, R=10, F_min=55.0, F_max=1760.0, gamma=0.0)[source]

Binned Log-frequency Spectrogram with variable frequency resolution based on instantaneous frequency

Notebook: C8/C8S2_SalienceRepresentation.ipynb

Parameters
  • X (np.ndarray) – Complex spectrogram

  • Fs (scalar) – Sampling rate in Hz

  • N (int) – Window length in samples

  • H (int) – Hopsize in samples

  • R (float) – Frequency resolution in cents (Default value = 10)

  • F_min (float) – Lower frequency bound (reference frequency) (Default value = 55.0)

  • F_max (float) – Upper frequency bound (Default value = 1760.0)

  • gamma (float) – Logarithmic compression factor (Default value = 0.0)

Returns
  • Y_LF_IF_bin (np.ndarray) – Binned log-frequency spectrogram using instantaneous frequency

  • F_coef_hertz (np.ndarray) – Frequency axis in Hz

  • F_coef_cents (np.ndarray) – Frequency axis in cents

libfmp.c8.c8s2_salience.f_coef(k, Fs, N)[source]

STFT center frequency

Notebook: C8/C8S2_SalienceRepresentation.ipynb

Parameters
  • k (int) – Coefficient number

  • Fs (scalar) – Sampling rate in Hz

  • N (int) – Window length in samples

Returns

freq (float) – STFT center frequency

libfmp.c8.c8s2_salience.frequency_to_bin_index(F, R=10.0, F_ref=55.0)[source]

Binning function with variable frequency resolution

Note: Indexing starts with 0 (as opposed to [FMP, Eq. (8.49)])

Notebook: C8/C8S2_SalienceRepresentation.ipynb

Parameters
  • F (float) – Frequency in Hz

  • R (float) – Frequency resolution in cents (Default value = 10.0)

  • F_ref (float) – Reference frequency in Hz (Default value = 55.0)

Returns

bin_index (int) – Index for bin (starting with index 0)
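The mapping from an STFT coefficient to a log-frequency bin can be sketched in two steps: compute the center frequency of coefficient k, then assign it to a bin of width R cents. The sketch assumes center frequencies k * Fs / N and zero-based bins centered on multiples of R cents above F_ref; the function names are hypothetical.

```python
import math

def stft_center_frequency_sketch(k, Fs, N):
    """Hypothetical helper: center frequency in Hz of STFT coefficient k."""
    return k * Fs / N

def frequency_to_bin_index_sketch(F, R=10.0, F_ref=55.0):
    """Hypothetical helper: zero-based log-frequency bin for frequency F."""
    # Bin b is assumed to cover the cents interval [(b - 0.5) * R, (b + 0.5) * R).
    return int(math.floor((1200.0 / R) * math.log2(F / F_ref) + 0.5))

nyquist = stft_center_frequency_sketch(512, Fs=22050, N=1024)
bin_ref = frequency_to_bin_index_sketch(55.0)      # reference frequency
bin_octave = frequency_to_bin_index_sketch(110.0)  # one octave above
bin_semitones = frequency_to_bin_index_sketch(110.0, R=100.0)
print(nyquist, bin_ref, bin_octave, bin_semitones)
```

With R = 10 cents one octave spans 120 bins, and with R = 100 cents each bin corresponds to one semitone.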

libfmp.c8.c8s2_salience.harmonic_summation(Y, num_harm=10, alpha=1.0)[source]

Harmonic summation for spectrogram [FMP, Eq. (8.54)]

Notebook: C8/C8S2_SalienceRepresentation.ipynb

Parameters
  • Y (np.ndarray) – Magnitude spectrogram

  • num_harm (int) – Number of harmonics (Default value = 10)

  • alpha (float) – Weighting parameter (Default value = 1.0)

Returns

Y_HS (np.ndarray) – Spectrogram after harmonic summation
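Harmonic summation on a linear frequency axis can be sketched as adding, for every bin k, the values found at its integer multiples. The sketch assumes that the h-th harmonic is weighted by alpha to the power h - 1 and that harmonics beyond the highest bin are ignored; the function name and the toy data are assumptions.

```python
import numpy as np

def harmonic_summation_sketch(Y, num_harm=10, alpha=1.0):
    """Hypothetical helper: sum each bin with its weighted integer multiples."""
    K = Y.shape[0]
    Y_hs = np.zeros(Y.shape, dtype=float)
    k = np.arange(K)
    for h in range(1, num_harm + 1):
        valid = k * h < K  # harmonics above the last bin are dropped
        Y_hs[k[valid]] += alpha ** (h - 1) * Y[k[valid] * h]
    return Y_hs

# One frame with partials at bins 2, 4, and 6 (fundamental at bin 2).
Y = np.zeros((8, 1))
Y[[2, 4, 6], 0] = 1.0
Y_hs = harmonic_summation_sketch(Y, num_harm=3, alpha=0.5)
print(Y_hs[:, 0])
```

The fundamental at bin 2 collects energy from all three partials and ends up with the largest value, which is what makes the summed representation useful for F0 estimation.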

libfmp.c8.c8s2_salience.harmonic_summation_lf(Y_LF_bin, R, num_harm=10, alpha=1.0)[source]

Harmonic summation for log-frequency spectrogram [FMP, Eq. (8.55)]

Notebook: C8/C8S2_SalienceRepresentation.ipynb

Parameters
  • Y_LF_bin (np.ndarray) – Log-frequency spectrogram

  • R (float) – Frequency resolution in cents

  • num_harm (int) – Number of harmonics (Default value = 10)

  • alpha (float) – Weighting parameter (Default value = 1.0)

Returns

Y_LF_bin_HS (np.ndarray) – Log-frequency spectrogram after harmonic summation

libfmp.c8.c8s2_salience.p_bin(b, freq, R=10.0, F_ref=55.0)[source]

Computes binning mask [FMP, Eq. (8.50)]

Notebook: C8/C8S2_SalienceRepresentation.ipynb

Parameters
  • b (int) – Bin index

  • freq (float) – Center frequency

  • R (float) – Frequency resolution in cents (Default value = 10.0)

  • F_ref (float) – Reference frequency in Hz (Default value = 55.0)

Returns

mask (float) – Binning mask

libfmp.c8.c8s2_salience.p_bin_if(b, F_coef_IF, R=10.0, F_ref=55.0)[source]

Computes binning mask for instantaneous frequency binning [FMP, Eq. (8.52)]

Notebook: C8/C8S2_SalienceRepresentation.ipynb

Parameters
  • b (int) – Bin index

  • F_coef_IF (float) – Instantaneous frequencies

  • R (float) – Frequency resolution in cents (Default value = 10.0)

  • F_ref (float) – Reference frequency in Hz (Default value = 55.0)

Returns

mask (np.ndarray) – Binning mask

libfmp.c8.c8s2_salience.principal_argument(v)[source]

Principal argument function

Parameters

v (float or np.ndarray) – Value (or vector of values)

Returns

w (float or np.ndarray) – Principal value of v
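A principal argument function can be sketched for scalar inputs as follows. The sketch assumes that phases are measured in cycles (so that one full turn equals 1) and that the result lies in the interval [-0.5, 0.5); the function name is hypothetical and the library's convention may differ.

```python
def principal_argument_sketch(v):
    """Hypothetical helper: wrap a phase given in cycles into [-0.5, 0.5)."""
    # Shift by half a cycle, wrap into [0, 1), and shift back.
    return (v + 0.5) % 1.0 - 0.5

wrapped_pos = principal_argument_sketch(0.7)    # slightly more than half a turn
wrapped_neg = principal_argument_sketch(-0.6)
unchanged = principal_argument_sketch(0.25)
print(wrapped_pos, wrapped_neg, unchanged)
```

Wrapping phase differences this way picks the smallest rotation that explains the observed phase change, which is the step an instantaneous-frequency estimate depends on.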

libfmp.c8.c8s3_nmf.init_nmf_activation_score(N, annotation, frame_res, tol_note=[0.2, 0.5], pitch_set=None)[source]

Initializes activation matrix for given score annotations

Notebook: C8/C8S3_NMFSpecFac.ipynb

Parameters
  • N (int) – Number of frames

  • annotation (list) – Annotation data

  • frame_res (float) – Time resolution

  • tol_note (list or np.ndarray) – Tolerance (seconds) for beginning and end of a note (Default value = [0.2, 0.5])

  • pitch_set (np.ndarray) – Set of occurring pitches (Default value = None)

Returns
  • H (np.ndarray) – Nonnegative matrix of size R x N

  • pitch_set (np.ndarray) – Set of occurring pitches

libfmp.c8.c8s3_nmf.init_nmf_activation_score_onset(N, annotation, frame_res, tol_note=[0.2, 0.5], tol_onset=[0.3, 0.1], pitch_set=None)[source]

Initializes activation matrix with onsets for given score annotations

Notebook: C8/C8S3_NMFSpecFac.ipynb

Parameters
  • N (int) – Number of frames

  • annotation (list) – Annotation data

  • frame_res (float) – Time resolution

  • tol_note (list or np.ndarray) – Tolerance (seconds) for beginning and end of a note (Default value = [0.2, 0.5])

  • tol_onset (list or np.ndarray) – Tolerance (seconds) for beginning and end of an onset (Default value = [0.3, 0.1])

  • pitch_set (np.ndarray) – Set of occurring pitches (Default value = None)

Returns
  • H (np.ndarray) – Nonnegative matrix of size (2R) x N

  • pitch_set (np.ndarray) – Set of occurring pitches

  • label_pitch (np.ndarray) – Pitch labels for the templates

libfmp.c8.c8s3_nmf.init_nmf_template_pitch(K, pitch_set, freq_res, tol_pitch=0.05)[source]

Initializes template matrix for a given set of pitches

Notebook: C8/C8S3_NMFSpecFac.ipynb

Parameters
  • K (int) – Number of frequency points

  • pitch_set (np.ndarray) – Set of fundamental pitches

  • freq_res (float) – Frequency resolution

  • tol_pitch (float) – Relative frequency tolerance for the harmonics (Default value = 0.05)

Returns

W (np.ndarray) – Nonnegative matrix of size K x R with R = len(pitch_set)

libfmp.c8.c8s3_nmf.init_nmf_template_pitch_onset(K, pitch_set, freq_res, tol_pitch=0.05)[source]

Initializes template matrix with onsets for a given set of pitches

Notebook: C8/C8S3_NMFSpecFac.ipynb

Parameters
  • K (int) – Number of frequency points

  • pitch_set (np.ndarray) – Set of fundamental pitches

  • freq_res (float) – Frequency resolution

  • tol_pitch (float) – Relative frequency tolerance for the harmonics (Default value = 0.05)

Returns

W (np.ndarray) – Nonnegative matrix of size K x (2R) with R = len(pitch_set)

libfmp.c8.c8s3_nmf.nmf(V, R, thresh=0.001, L=1000, W=None, H=None, norm=False, report=False)[source]

NMF algorithm with Euclidean distance

Notebook: C8/C8S3_NMFbasic.ipynb

Parameters
  • V (np.ndarray) – Nonnegative matrix of size K x N

  • R (int) – Rank parameter

  • thresh (float) – Threshold used as stop criterion (Default value = 0.001)

  • L (int) – Maximal number of iterations (Default value = 1000)

  • W (np.ndarray) – Nonnegative matrix of size K x R used for initialization (Default value = None)

  • H (np.ndarray) – Nonnegative matrix of size R x N used for initialization (Default value = None)

  • norm (bool) – Applies max-normalization of columns of final W (Default value = False)

  • report (bool) – Reports errors during runtime (Default value = False)

Returns
  • W (np.ndarray) – Nonnegative matrix of size K x R

  • H (np.ndarray) – Nonnegative matrix of size R x N

  • V_approx (np.ndarray) – Nonnegative matrix W*H of size K x N

  • V_approx_err (float) – Error between V and V_approx

  • H_W_error (np.ndarray) – History of errors of subsequent H and W matrices

libfmp.c8.c8s3_nmf.pitch_from_annotation(annotation)[source]

Extract set of occurring pitches from annotation

Notebook: C8/C8S3_NMFSpecFac.ipynb

Parameters

annotation (list) – Annotation data

Returns

pitch_set (np.ndarray) – Set of occurring pitches

libfmp.c8.c8s3_nmf.plot_nmf_factors(W, H, V, Fs, N_fft, H_fft, freq_max, label_pitch=None, title_W='W', title_H='H', title_V='V', figsize=(13, 3))[source]

Plots the factors of an NMF-based spectral decomposition

Notebook: C8/C8S3_NMFSpecFac.ipynb

Parameters
  • W – Template matrix

  • H – Activation matrix

  • V – Reconstructed input matrix

  • Fs – Sampling frequency

  • N_fft – FFT length

  • H_fft – Hopsize

  • freq_max – Maximum frequency to be plotted

  • label_pitch – Labels for the different pitches (Default value = None)

  • title_W – Title for imshow of matrix W (Default value = ‘W’)

  • title_H – Title for imshow of matrix H (Default value = ‘H’)

  • title_V – Title for imshow of matrix V (Default value = ‘V’)

  • figsize – Size of the figure (Default value = (13, 3))

libfmp.c8.c8s3_nmf.split_annotation_lh_rh(ann)[source]

Splits the annotation data into left hand and right hand

Notebook: C8/C8S3_NMFAudioDecomp.ipynb

Parameters

ann (list) – Annotation data

Returns
  • ann_lh (list) – Annotation data for left hand

  • ann_rh (list) – Annotation data for right hand

libfmp.c8.c8s3_nmf.template_pitch(K, pitch, freq_res, tol_pitch=0.05)[source]

Defines spectral template for a given pitch

Notebook: C8/C8S3_NMFSpecFac.ipynb

Parameters
  • K (int) – Number of frequency points

  • pitch (float) – Fundamental pitch

  • freq_res (float) – Frequency resolution

  • tol_pitch (float) – Relative frequency tolerance for the harmonics (Default value = 0.05)

Returns

template (np.ndarray) – Nonnegative template vector of size K
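A harmonic template for a single pitch can be sketched in plain Python. The sketch assumes that pitch is a MIDI number with 69 corresponding to 440 Hz, that the m-th harmonic occupies all bins within a relative tolerance of m times the fundamental, and that it receives the weight 1/m. The function name, the weighting, and the list output are assumptions for illustration.

```python
import math

def template_pitch_sketch(K, pitch, freq_res, tol_pitch=0.05):
    """Hypothetical helper: spectral template (list of length K) for a MIDI pitch."""
    max_freq = K * freq_res
    pitch_freq = 440.0 * 2.0 ** ((pitch - 69) / 12)  # assumed tuning
    max_order = int(math.ceil(max_freq / ((1 - tol_pitch) * pitch_freq)))
    template = [0.0] * K
    for m in range(1, max_order + 1):
        # Bin range covered by the m-th harmonic, clipped to the valid indices.
        lo = max(0, int((1 - tol_pitch) * m * pitch_freq / freq_res))
        hi = min(K - 1, int((1 + tol_pitch) * m * pitch_freq / freq_res))
        for k in range(lo, hi + 1):
            template[k] = 1.0 / m  # assumed decay of harmonic weights
    return template

# A4 (MIDI pitch 69) on a 100-bin axis with 10 Hz per bin.
template = template_pitch_sketch(K=100, pitch=69, freq_res=10.0)
print(template[44], template[88], template[60])
```

Bins near 440 Hz and 880 Hz are active while the bins in between stay zero. Stacking such vectors for all pitches in pitch_set gives a K x R template matrix of the kind described for init_nmf_template_pitch.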