Spectral features

This section contains the documentation for the dct, fft, filterbank, mfcc, pvoc, specdesc, and tss classes.

class aubio.dct(size=1024)

Compute Discrete Cosine Transforms of Type-II.

Parameters:

size (int) – size of the DCT to compute

Example

>>> import numpy as np
>>> import aubio
>>> d = aubio.dct(16)
>>> d.size
16
>>> x = aubio.fvec(np.ones(d.size))
>>> d(x)
array([4., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
      dtype=float32)
>>> d.rdo(d(x))
array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
      dtype=float32)

References

DCT-II in Discrete Cosine Transform on Wikipedia.
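The output in the example above (a first coefficient of 4 for sixteen ones) is what an orthonormal DCT-II would produce, since sqrt(16) = 4. The following is a minimal pure-Python sketch of that transform and its inverse, assuming orthonormal scaling; the function names dct_ii_ortho and idct_ii_ortho are hypothetical and are not part of aubio.

```python
import math

def dct_ii_ortho(x):
    """Naive O(N^2) DCT-II with orthonormal scaling (assumed, not confirmed)."""
    n = len(x)
    out = []
    for k in range(n):
        acc = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        # The k = 0 term uses a different scale so the transform is orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * acc)
    return out

def idct_ii_ortho(c):
    """Inverse of dct_ii_ortho, i.e. an orthonormal DCT-III."""
    n = len(c)
    out = []
    for i in range(n):
        acc = c[0] * math.sqrt(1.0 / n)
        acc += sum(
            math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(acc)
    return out

coeffs = dct_ii_ortho([1.0] * 16)   # constant input: only the first term is non-zero
restored = idct_ii_ortho(coeffs)    # round trip back to the input
```

With this scaling the round trip needs no extra gain factor, which matches the behaviour of d.rdo(d(x)) shown in the example.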

class aubio.fft(size=1024)

Compute Fast Fourier Transforms.

Parameters:

size (int) – size of the FFT to compute

Example

>>> x = aubio.fvec(512)
>>> f = aubio.fft(512)
>>> c = f(x); c
aubio cvec of 257 elements
>>> x2 = f.rdo(c); x2.shape
(512,)

rdo()

synthesis of spectral grain

win_s

size of the window
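The example above returns a cvec of 257 elements for an input of 512 samples because the spectrum of a real signal is symmetric, so only size // 2 + 1 bins are needed. A minimal sketch of this, using a naive DFT in pure Python; the function name real_dft_polar is hypothetical, and the scaling is an assumption that may differ from aubio's.

```python
import cmath
import math

def real_dft_polar(x):
    """Naive DFT of a real signal, keeping the len(x) // 2 + 1 non-redundant bins.

    Returns two lists (magnitudes and phases), mirroring the norm and phas
    parts of an aubio cvec. Scaling is unnormalized here (an assumption).
    """
    n = len(x)
    norm, phas = [], []
    for k in range(n // 2 + 1):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        norm.append(abs(acc))
        phas.append(cmath.phase(acc))
    return norm, phas

# A cosine with exactly 4 cycles over 32 samples concentrates in bin 4.
signal = [math.cos(2 * math.pi * 4 * i / 32) for i in range(32)]
norm, phas = real_dft_polar(signal)
peak_bin = max(range(len(norm)), key=lambda k: norm[k])
```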

class aubio.filterbank(n_filters=40, win_s=1024)

Create a bank of spectral filters. Each instance is a callable that holds a matrix of coefficients.

See also set_mel_coeffs(), set_mel_coeffs_htk(), set_mel_coeffs_slaney(), set_triangle_bands(), and set_coeffs().

Parameters:
  • n_filters (int) – Number of filters to create.

  • win_s (int) – Size of the input spectrum to process.

Examples

>>> f = aubio.filterbank(128, 1024)
>>> f.set_mel_coeffs(44100, 0, 10000)
>>> c = aubio.cvec(1024)
>>> f(c).shape
(128,)

get_coeffs()

Get coefficients matrix of filterbank.

Returns:

Array of shape (n_filters, win_s/2+1) containing the coefficients.

Return type:

array_like

get_norm()

Get norm parameter of filterbank.

Returns:

Norm parameter.

Return type:

float

get_power()

Get power applied to filterbank.

Returns:

Power parameter.

Return type:

float

set_coeffs(coeffs)

Set coefficients of filterbank.

Parameters:

coeffs (fmat) – Array of shape (n_filters, win_s/2+1) containing the coefficients.

set_mel_coeffs(samplerate, fmin, fmax)

Set coefficients of filterbank to linearly spaced mel scale.

Parameters:
  • samplerate (float) – Sampling-rate of the expected input.

  • fmin (float) – Lower frequency boundary of the first filter.

  • fmax (float) – Upper frequency boundary of the last filter.

See also

hztomel

set_mel_coeffs_htk(samplerate, fmin, fmax)

Set coefficients of the filters to be linearly spaced in the HTK mel scale.

Parameters:
  • samplerate (float) – Sampling-rate of the expected input.

  • fmin (float) – Lower frequency boundary of the first filter.

  • fmax (float) – Upper frequency boundary of the last filter.

See also

hztomel
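To illustrate what "linearly spaced in the HTK mel scale" means, the sketch below computes n_filters + 2 band edges that are evenly spaced in mel and converts them back to Hz. It uses the widely used HTK formula mel = 2595 * log10(1 + f / 700); the helper names are hypothetical, and aubio's own implementation may differ in detail.

```python
import math

def hz_to_mel_htk(hz):
    """HTK-style Hz to mel conversion."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz_htk(mel):
    """Inverse of hz_to_mel_htk."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(n_filters, fmin, fmax):
    """Return n_filters + 2 frequencies in Hz, evenly spaced on the mel scale."""
    lo, hi = hz_to_mel_htk(fmin), hz_to_mel_htk(fmax)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz_htk(lo + i * step) for i in range(n_filters + 2)]

edges = mel_band_edges(40, 0.0, 8000.0)
# Bands are equally wide in mel, so their width in Hz grows with frequency.
widths = [b - a for a, b in zip(edges, edges[1:])]
```

A list of edges like this has the layout that set_triangle_bands() expects for its freqs argument (n_filters + 2 values, in ascending order).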

set_mel_coeffs_slaney(samplerate)

Set coefficients of filterbank to match Slaney’s Auditory Toolbox.

The filter coefficients will be set as in Malcolm Slaney’s implementation. The filterbank should have been created with n_filters = 40.

This is approximately equivalent to using set_mel_coeffs() with fmin = 400./3., fmax = 6853.84.

Parameters:

samplerate (float) – Sampling-rate of the expected input.

References

Malcolm Slaney, Auditory Toolbox Version 2, Technical Report #1998-010

set_norm(norm)

Set norm parameter. If set to 0, the filters will not be normalized. If set to 1, the filters will be normalized to one. Defaults to 1.

This function should be called before set_triangle_bands(), set_mel_coeffs(), set_mel_coeffs_htk(), or set_mel_coeffs_slaney().

Parameters:

norm (int) – 0 to disable, 1 to enable

set_power(power)

Set power applied to input spectrum of filterbank.

Parameters:

power (float) – Power to raise input spectrum to before computing the filters.
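As a rough illustration of how the power parameter and the coefficient matrix interact, the sketch below raises a magnitude spectrum to a power and then takes the weighted sum for each filter row. The function name apply_filterbank and the toy coefficient values are hypothetical; this is a reading of the description above, not aubio's actual code.

```python
def apply_filterbank(coeffs, spectrum_norm, power=1.0):
    """Raise the spectrum to `power`, then compute one weighted sum per filter.

    coeffs has shape (n_filters, win_s // 2 + 1), as returned by get_coeffs().
    """
    powered = [v ** power for v in spectrum_norm]
    return [sum(w * p for w, p in zip(row, powered)) for row in coeffs]

# Toy bank: 2 filters over 5 bins (i.e. win_s = 8).
coeffs = [
    [1.0, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.5, 1.0, 0.5, 0.0],
]
spectrum = [2.0, 2.0, 2.0, 2.0, 2.0]

linear = apply_filterbank(coeffs, spectrum, power=1.0)   # magnitude spectrum
squared = apply_filterbank(coeffs, spectrum, power=2.0)  # power spectrum
```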

set_triangle_bands(freqs, samplerate)

Set triangular bands. The coefficients will be set to triangular overlapping windows using the boundaries specified by freqs.

freqs should contain n_filters + 2 frequencies in Hz, ordered by value, from smallest to largest. The first element should be greater than or equal to zero; the last element should be less than or equal to samplerate / 2.

Parameters:
  • freqs (fvec) – List of frequencies, in Hz.

  • samplerate (float) – Sampling-rate of the expected input.
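The relationship between freqs and the resulting coefficients can be sketched as follows: each filter rises from one boundary to the next and falls to the one after, so n_filters + 2 boundaries yield n_filters overlapping triangles. The function triangle_filters is hypothetical, applies no normalization, and maps bins to frequencies in the simplest possible way; aubio's implementation may differ.

```python
def triangle_filters(freqs, samplerate, win_s):
    """Build len(freqs) - 2 triangular filters over win_s // 2 + 1 bins."""
    n_bins = win_s // 2 + 1
    bin_hz = [k * samplerate / win_s for k in range(n_bins)]
    filters = []
    # Each filter uses three consecutive boundaries: lower, center, upper.
    for lo, mid, hi in zip(freqs, freqs[1:], freqs[2:]):
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        filters.append(row)
    return filters

# 4 boundaries give 2 filters; bins are 500 Hz apart for these settings.
bank = triangle_filters([0.0, 1000.0, 2000.0, 3000.0], samplerate=8000, win_s=16)
```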

Example

>>> import numpy as np
>>> fb = aubio.filterbank(n_filters=100, win_s=2048)
>>> samplerate = 44100; freqs = np.linspace(0, 20200, 102)
>>> fb.set_triangle_bands(aubio.fvec(freqs), samplerate)

n_filters

number of filters

win_s

size of the window

class aubio.mfcc(buf_size=1024, n_filters=40, n_coeffs=13, samplerate=44100)

Compute Mel Frequency Cepstrum Coefficients (MFCC).

mfcc creates a callable which takes a cvec as input.

If n_filters = 40, the filterbank will be initialized with filterbank.set_mel_coeffs_slaney(). Otherwise, if n_filters is greater than 0, it will be initialized with filterbank.set_mel_coeffs() using fmin = 0, fmax = samplerate/.
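For readers unfamiliar with MFCCs, the usual recipe is to take the logarithm of the filterbank energies and apply a DCT, keeping the first n_coeffs values. The sketch below shows that last step only. The log base, the energy floor, and the orthonormal DCT scaling are all assumptions, and mfcc_from_band_energies is a hypothetical name rather than an aubio function.

```python
import math

def mfcc_from_band_energies(band_energies, n_coeffs=13, floor=1e-10):
    """Log-compress filterbank energies, then keep the first DCT-II coefficients."""
    # Clamp to a small floor so that silent bands do not produce log(0).
    logs = [math.log10(max(e, floor)) for e in band_energies]
    n = len(logs)
    out = []
    for k in range(n_coeffs):
        acc = sum(logs[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * acc)
    return out

# 40 bands with identical energy: only the first coefficient is non-zero.
flat = mfcc_from_band_energies([10.0] * 40, n_coeffs=13)
```

The output length depends only on n_coeffs, which is consistent with the (13,) shape shown in the example for this class.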

Example

>>> buf_size = 2048; n_filters = 128; n_coeffs = 13; samplerate = 44100
>>> mf = aubio.mfcc(buf_size, n_filters, n_coeffs, samplerate)
>>> fftgrain = aubio.cvec(buf_size)
>>> mf(fftgrain).shape
(13,)

class aubio.pvoc(win_s=512, hop_s=256)

Phase vocoder.

pvoc creates a callable object that implements a phase vocoder [1], using the tricks detailed in [2].

The call function takes one input of type fvec and of size hop_s, and returns a cvec of length win_s//2+1.

Parameters:
  • win_s (int) – number of channels in the phase-vocoder.

  • hop_s (int) – number of samples expected between each call
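The reason each call takes only hop_s samples while the output describes win_s samples is that the analysis keeps an internal buffer of overlapping input. A minimal sketch of that buffering, assuming a Hann window and zero initial history; the class SlidingAnalysis is hypothetical and omits the FFT step that would turn each frame into win_s // 2 + 1 bins.

```python
import math

class SlidingAnalysis:
    """Keep the last win_s samples and return a windowed frame on each push."""

    def __init__(self, win_s=512, hop_s=256):
        self.win_s, self.hop_s = win_s, hop_s
        self.buffer = [0.0] * win_s  # history starts as silence (assumption)
        # Periodic Hann window; the window choice is an assumption.
        self.window = [
            0.5 - 0.5 * math.cos(2 * math.pi * i / win_s) for i in range(win_s)
        ]

    def push(self, new_samples):
        if len(new_samples) != self.hop_s:
            raise ValueError("expected exactly hop_s samples")
        # Drop the oldest hop_s samples and append the new ones.
        self.buffer = self.buffer[self.hop_s:] + list(new_samples)
        return [s * w for s, w in zip(self.buffer, self.window)]

sa = SlidingAnalysis(win_s=8, hop_s=4)
first = sa.push([1.0] * 4)    # first half of the frame is still silence
second = sa.push([1.0] * 4)   # buffer is now full of ones
```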

Examples

>>> x = aubio.fvec(256)
>>> pv = aubio.pvoc(512, 256)
>>> pv(x)
aubio cvec of 257 elements

Default values for hop_s and win_s are provided:

>>> pv = aubio.pvoc()
>>> pv.win_s, pv.hop_s
(512, 256)

A cvec can be resynthesised using rdo():

>>> pv = aubio.pvoc(512, 256)
>>> y = aubio.cvec(512)
>>> x_reconstructed = pv.rdo(y)
>>> x_reconstructed.shape
(256,)

References

rdo(fftgrain)

Read a new spectral grain and resynthesise the next hop_s output samples.

Parameters:

fftgrain (cvec) – new input cvec to synthesize from, should be of size win_s/2+1

Returns:

re-synthesised output of shape (hop_s,)

Return type:

fvec

Example

>>> pv = aubio.pvoc(2048, 512)
>>> out = pv.rdo(aubio.cvec(2048))
>>> out.shape
(512,)

set_window(window_type)

Set window function

Parameters:

window_type (str) – the window type to use for this phase vocoder

Raises:

ValueError – If an unknown window type was given.

See also

window

create a window.

hop_s

Interval between two consecutive analyses, in samples.

Type:

int

win_s

Size of phase vocoder analysis windows, in samples.

Type:

int

class aubio.specdesc(method='default', buf_size=1024)

Spectral description functions. Creates a callable that takes a cvec as input, typically created by pvoc for overlap and windowing, and returns a single float.

method can be any of the values listed below. If default is used, the hfc function is selected.

Onset novelty functions:

  • energy: local energy,

  • hfc: high frequency content,

  • complex: complex domain,

  • phase: phase-based method,

  • wphase: weighted phase deviation,

  • specdiff: spectral difference,

  • kl: Kullback-Leibler,

  • mkl: modified Kullback-Leibler,

  • specflux: spectral flux.

Spectral shape functions:

  • centroid: spectral centroid (barycenter of the norm vector),

  • spread: variance around centroid,

  • skewness: third order moment,

  • kurtosis: a measure of the flatness of the spectrum,

  • slope: decreasing rate of the amplitude,

  • decrease: perceptual based measurement of the decreasing rate,

  • rolloff: 95th energy percentile.

Parameters:
  • method (str) – Onset novelty or spectral shape function.

  • buf_size (int) – Length of the input frame.
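To make a few of the spectral shape functions listed above concrete, the sketch below computes a centroid, a spread, and a roll-off point from a list of bin magnitudes, following the short descriptions in the list. Results are expressed in bin indices, and the exact definitions and scaling are assumptions that may not match aubio's output; the function names are hypothetical.

```python
def centroid(norm):
    """Barycenter of the magnitude vector, in bins."""
    total = sum(norm)
    if total <= 0:
        return 0.0
    return sum(k * v for k, v in enumerate(norm)) / total

def spread(norm):
    """Magnitude-weighted variance around the centroid, in bins squared."""
    total = sum(norm)
    if total <= 0:
        return 0.0
    c = centroid(norm)
    return sum((k - c) ** 2 * v for k, v in enumerate(norm)) / total

def rolloff(norm, fraction=0.95):
    """First bin at which the cumulative energy reaches `fraction` of the total."""
    energies = [v * v for v in norm]
    target = fraction * sum(energies)
    running = 0.0
    for k, e in enumerate(energies):
        running += e
        if running >= target:
            return k
    return len(norm) - 1

norm = [0.0, 1.0, 2.0, 1.0, 0.0]   # symmetric peak centered on bin 2
c, s, r = centroid(norm), spread(norm), rolloff(norm)
```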

Example

>>> win_s = 1024; hop_s = win_s // 2
>>> pv = aubio.pvoc(win_s, hop_s)
>>> sd = aubio.specdesc("mkl", win_s)
>>> sd(pv(aubio.fvec(hop_s))).shape
(1,)

References

Detailed description in aubio API documentation.

class aubio.tss(buf_size=1024, hop_size=512)

Transient/Steady-state separation.