sample.evaluation.metrics module

Metrics for model evaluation

class sample.evaluation.metrics.CochleagramLoss(fs: float, analytic: Optional[str] = 'ir', method: Optional[str] = None, stride: Optional[int] = None, p: float = 1, **kwargs)

Bases: object

Class for computing losses on cochleagrams (lower is better)

Parameters:

fs (float) – Sample frequency
postprocessing (callable) – If not None, then apply this function to the cochleagram matrix. Default is hwr() if the cochleagram is real, otherwise numpy.abs()
method (str) – Convolution method (either "auto", "fft", "direct", or "overlap-add")
stride (int) – Time-step for output signal. Can’t be used in conjunction with method
analytic (str) –
Compute the analytic signal of the cochleagram:
- if "input", then compute the analytic signal of the input (fast, accurate in the middle, bad boundary conditions)
- if "ir" (suggested), then compute the analytic signal of the IRs (fast, tends to underestimate amplitude, good boundary conditions)
- if "output", then compute the analytic signal of the output (slowest, most accurate)
p (float) – Exponent for the lp-norm
**kwargs – Keyword arguments for sample.psycho.GammatoneFilterbank
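The default postprocessing rule described above can be sketched with NumPy alone. This is a minimal illustration, not the library's code: it assumes that hwr() denotes half-wave rectification, and the names hwr_sketch and default_postprocessing_sketch are hypothetical.

```python
import numpy as np


def hwr_sketch(a):
    """Half-wave rectification (assumed meaning of hwr): clip negatives to zero."""
    return np.maximum(a, 0.0)


def default_postprocessing_sketch(coch):
    """Choose the rectifier from the dtype, mirroring the documented default."""
    coch = np.asarray(coch)
    if np.iscomplexobj(coch):
        # Complex (analytic) cochleagram: take the magnitude
        return np.abs(coch)
    # Real cochleagram: keep only the positive half
    return hwr_sketch(coch)


real_out = default_postprocessing_sketch([[-1.0, 0.5], [2.0, -3.0]])
cplx_out = default_postprocessing_sketch([3 + 4j, -1j])
```

With a real input the negative entries become zero, while a complex input is reduced to its magnitude.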

cochleagram(x: ndarray) → ndarray

Compute cochleagram for one input

Parameters: x (array) – Input signal
Returns: Cochleagram
Return type: array

lp_distance(x: ndarray, y: ndarray)

Compute the distance between two vectors as the lp-norm of their difference

Parameters:

x (array) – First vector
y (array) – Second vector

Returns:

The distance

Return type:

float

class sample.evaluation.metrics.MultiScaleSpectralLoss(spectral_loss: Callable, stfts: Iterable[Dict[str, Any]])

Bases: object

Class for computing multiscale losses on the STFT online and in parallel

Parameters:

spectral_loss (callable) – Base function for computing a spectral loss
stfts (iterable of dict) – Multiple dictionaries of keyword arguments for spectral_loss()

sample.evaluation.metrics.lin_log_spectral_loss(x, y, n: int = 2048, olap: float = 0.75, w: Optional[ndarray] = None, wtype: str = 'hamming', wsize: Optional[int] = None, alpha: Optional[float] = None, norm_p: float = 1.0, floor_db: float = -60, **kwargs)

Compute a sum of linear and log loss on the STFT (lower is better)

Parameters:

x (array) – First audio input
y (array) – Second audio input
w – Analysis window. Defaults to None (if None, the default_window is used)
n (int) – FFT size. Defaults to 2048
olap (float) – Window overlap, as a fraction of the window size
alpha (float) – Weight of the log-difference
norm_p (float) – Exponent parameter for the norm. Default is 1.0
floor_db (float) – Minimum magnitude for STFT in dB
kwargs – Keyword arguments for scipy.signal.stft()

Returns:

loss value

Return type:

float
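To make the idea of a combined linear and log STFT loss concrete, here is a rough NumPy-only sketch. It is not the library's implementation. Several choices are assumptions made for illustration: a simple framed FFT stands in for scipy.signal.stft(), alpha is fixed to 1.0 (the behaviour for alpha=None is not specified here), floor_db is treated as an absolute amplitude floor, and the names stft_mag_sketch and lin_log_loss_sketch are hypothetical.

```python
import numpy as np


def stft_mag_sketch(x, n=256, olap=0.75):
    """Magnitude STFT from Hamming-windowed frames (no padding)."""
    x = np.asarray(x, dtype=float)
    hop = max(1, int(round(n * (1 - olap))))  # hop size from the overlap fraction
    w = np.hamming(n)
    frames = np.stack([x[s:s + n] * w for s in range(0, len(x) - n + 1, hop)])
    return np.abs(np.fft.rfft(frames, axis=-1))


def lin_log_loss_sketch(x, y, n=256, olap=0.75, alpha=1.0,
                        norm_p=1.0, floor_db=-60.0):
    """lp-norm of the magnitude difference plus alpha times that of the log difference."""
    floor = 10.0 ** (floor_db / 20.0)  # assumed: dB floor converted to amplitude
    sx = np.maximum(stft_mag_sketch(x, n, olap), floor)
    sy = np.maximum(stft_mag_sketch(y, n, olap), floor)
    lin = np.sum(np.abs(sx - sy) ** norm_p) ** (1.0 / norm_p)
    log = np.sum(np.abs(np.log(sx) - np.log(sy)) ** norm_p) ** (1.0 / norm_p)
    return float(lin + alpha * log)


t = np.arange(2048) / 8000.0
x = np.sin(2 * np.pi * 440.0 * t)
loss_same = lin_log_loss_sketch(x, x)        # identical signals
loss_diff = lin_log_loss_sketch(x, 0.5 * x)  # same tone at half amplitude
```

The log term keeps quiet spectral regions from being dominated by loud ones, and the floor prevents the logarithm from diverging on near-silent bins.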

sample.evaluation.metrics.lp_distance(x, y, p: float = 1)

Compute the distance between two vectors as the lp-norm of their difference

Parameters:

x (array) – First vector
y (array) – Second vector
p (float) – Exponent for the lp-norm

Returns:

The distance

Return type:

float
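As an illustration of the definition above, the following NumPy sketch computes the lp-norm of the difference of two vectors, (sum_i |x_i - y_i|^p)^(1/p). The helper name lp_distance_sketch is hypothetical, and flattening the inputs is an assumption rather than documented behaviour.

```python
import numpy as np


def lp_distance_sketch(x, y, p=1.0):
    """Hypothetical stand-in: lp-norm of (x - y), with inputs flattened."""
    diff = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float)).ravel()
    # (sum_i |x_i - y_i|^p)^(1/p)
    return float(np.sum(diff ** p) ** (1.0 / p))


a = [1.0, 2.0, 3.0]
b = [1.0, 0.0, 0.0]
d1 = lp_distance_sketch(a, b, p=1)  # |0| + |2| + |3|
d2 = lp_distance_sketch(a, b, p=2)  # sqrt(0 + 4 + 9)
```

With p = 1 this is the sum of absolute differences, and with p = 2 it is the Euclidean distance.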

sample.evaluation.metrics.multiscale_spectral_loss(x, y, *args, spectral_loss: Callable = <function lin_log_spectral_loss>, stfts: Iterable[Dict[str, Any]] = ({'n': 64}, {'n': 128}, {'n': 256}, {'n': 512}, {'n': 1024}, {'n': 2048}), **kwargs) → float

Compute a multiscale spectral loss

Parameters:

x (array) – First audio input
y (array) – Second audio input
args – Additional positional arguments for the base loss function
spectral_loss (callable) – Base function for computing a spectral loss. Default is lin_log_spectral_loss()
stfts (iterable of dict) – Multiple dictionaries of keyword arguments for spectral_loss
pool (multiprocessing.Pool) – If not None, then compute the losses in parallel processes from this pool
njobs (int) – If not None, then compute the losses in parallel using a process pool with njobs workers
kwargs – Additional keyword arguments for the base loss function

Returns:

Sum of loss values

Return type:

float
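The multiscale sum can be sketched as a loop over STFT configurations. This is a minimal sequential illustration under stated assumptions: toy_spectral_loss is a hypothetical base loss (not lin_log_spectral_loss()), the pool and njobs options are left out, and letting per-scale settings override the shared keyword arguments is an assumed precedence.

```python
import numpy as np


def toy_spectral_loss(x, y, n=2048, **kwargs):
    """Hypothetical base loss: L1 distance between n-point FFT magnitudes."""
    mx = np.abs(np.fft.rfft(x, n=n))
    my = np.abs(np.fft.rfft(y, n=n))
    return float(np.sum(np.abs(mx - my)))


def multiscale_loss_sketch(x, y, *args, spectral_loss=toy_spectral_loss,
                           stfts=({"n": 64}, {"n": 128}, {"n": 256}), **kwargs):
    """Sum the base loss over several STFT configurations (sequentially)."""
    total = 0.0
    for cfg in stfts:
        # Assumed precedence: per-scale settings override shared kwargs
        total += spectral_loss(x, y, *args, **{**kwargs, **cfg})
    return total


rng = np.random.default_rng(0)
x = rng.standard_normal(512)
y = x + 0.1 * rng.standard_normal(512)
total = multiscale_loss_sketch(x, y)
per_scale = [toy_spectral_loss(x, y, n=n) for n in (64, 128, 256)]
```

Because the per-scale terms are independent of one another, they are natural candidates for evaluation in parallel workers.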