Yaafe core features
Yaafe core audio features.
Available features
AmplitudeModulation
class yaafelib.yaafe_extensions.yaafefeatures.AmplitudeModulation
Tremolo and grain description, according to [SE2005] and [AE2001].
AmplitudeModulation uses Envelope to describe tremolo and grain. Analyzed frequency ranges are:
- Tremolo: 4 - 8 Hz
- Grain: 10 - 40 Hz
For each of these ranges, it computes:
- Frequency of maximum energy in range
- Difference between the energy at this frequency and the mean energy over all frequencies
- Difference between the energy at this frequency and the mean energy in range
- Product of the first two values
[AE2001] A. Eronen, Automatic musical instrument recognition. Master's Thesis, Tampere University of Technology, 2001.
Parameters:
- EnDecim (default=200): Decimation factor to compute envelope
- blockSize (default=32768): output frame size
- stepSize (default=16384): step between consecutive frames
Declaration example:
AmplitudeModulation EnDecim=200 blockSize=32768 stepSize=16384
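Every declaration example on this page follows the same shape: a feature name followed by space-separated key=value pairs. As a minimal sketch of that syntax, the hypothetical helper below (parse_declaration is not part of yaafelib) splits a declaration into its name and parameters, keeping values as strings.

```python
def parse_declaration(declaration):
    """Split 'Name key=value ...' into (name, {key: value}).

    Hypothetical helper for illustration only; values are kept as strings.
    """
    name, *pairs = declaration.split()
    params = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        params[key] = value
    return name, params


name, params = parse_declaration(
    "AmplitudeModulation EnDecim=200 blockSize=32768 stepSize=16384"
)
```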
AutoCorrelation
class yaafelib.yaafe_extensions.yaafefeatures.AutoCorrelation
Compute autocorrelation coefficients on each frame.
Parameters:
- ACNbCoeffs (default=49): Number of autocorrelation coefficients to keep
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
AutoCorrelation ACNbCoeffs=49 blockSize=1024 stepSize=512
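As a rough illustration of what "autocorrelation coefficients of a frame" means, here is a plain-Python sketch. It assumes the unnormalized definition ac[k] = sum over n of x[n] * x[n + k] and keeps lag 0; whether Yaafe normalizes the result or starts at lag 0 is not stated here, so treat both as assumptions.

```python
def autocorrelation(frame, nb_coeffs):
    """Return ac[k] = sum_n frame[n] * frame[n + k] for k = 0 .. nb_coeffs - 1.

    Unnormalized and including lag 0 (both are assumptions of this sketch).
    """
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(nb_coeffs)
    ]


ac = autocorrelation([1.0, 2.0, 3.0, 4.0], 3)
```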
BeatHistogramSummary
class yaafelib.yaafe_extensions.yaafefeatures.BeatHistogramSummary
Compute the beat histogram according to [GT2002], but using OnsetDetectionFunction as onset detection function.
[GT2002] Georges Tzanetakis, Musical Genre Classification of Audio Signals, IEEE Transactions on Speech and Audio Processing, vol. 10, No. 5, July 2002.
Parameters:
- ACPNbPeaks (default=3): Number of autocorrelation peaks to keep
- BHSBeatFrameSize (default=128): Number of frames over which autocorrelation peaks are computed
- BHSBeatFrameStep (default=64): Number of frames to skip between two consecutive autocorrelation peak computations
- BHSHistogramFrameSize (default=40): Number of beat frames over which the histogram is computed
- BHSHistogramFrameStep (default=40): Number of beat frames to skip between two consecutive histogram computations
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- HInf (default=40): Minimal BPM to take into consideration
- HNbBins (default=80): Number of histogram bins
- HSup (default=200): Maximal BPM to take into consideration
- NMANbFrames (default=5000): Number of frames to normalize together, -1 means all frames
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
BeatHistogramSummary ACPNbPeaks=3 BHSBeatFrameSize=128 BHSBeatFrameStep=64 BHSHistogramFrameSize=40 BHSHistogramFrameStep=40 FFTLength=0 FFTWindow=Hanning HInf=40 HNbBins=80 HSup=200 NMANbFrames=5000 blockSize=1024 stepSize=512
CQT
class yaafelib.yaafe_extensions.yaafefeatures.CQT
Compute the Constant-Q transform according to [CS2010] with improvements from [JPCQT].
[CS2010] C. Schörkhuber and A. Klapuri, Constant-Q Transform Toolbox for Music Processing, 7th Sound and Music Conference (SMC'2010), 2010, Barcelona.
[JPCQT] J. Prado, Calcul rapide de la transformée à Q constant, http://perso.telecom-paristech.fr/~prado/cqt/cqt_modif.pdf
Parameters:
- CQTAlign (default=c): Alignment of CQT kernels on the analysis frame. 'l' to the left, 'c' to the center, 'r' to the right
- CQTBinsPerOctave (default=36): Number of bins per octave to consider
- CQTMinFreq (default=73.42): Minimal frequency. If <0.5, it is taken as a fraction of sampleRate; otherwise it is expressed in Hertz.
- CQTNbOctaves (default=3): Number of octaves to consider for analysis
- stepSize (default=512): step between consecutive frames
Declaration example:
CQT CQTAlign=c CQTBinsPerOctave=36 CQTMinFreq=73.42 CQTNbOctaves=3 stepSize=512
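To see how CQTMinFreq, CQTBinsPerOctave and CQTNbOctaves relate, the sketch below lists bin center frequencies using the usual constant-Q geometric spacing f_k = f_min * 2^(k / B). This spacing is the textbook definition; that Yaafe lays out its bins exactly this way is an assumption, and cqt_center_frequencies is a hypothetical helper name.

```python
def cqt_center_frequencies(min_freq=73.42, bins_per_octave=36, nb_octaves=3):
    """Center frequencies in Hz, geometrically spaced (assumed layout)."""
    nb_bins = bins_per_octave * nb_octaves
    return [min_freq * 2.0 ** (k / bins_per_octave) for k in range(nb_bins)]


freqs = cqt_center_frequencies()
```

With the default parameters this gives 108 bins, and the bin one octave above the first sits at twice CQTMinFreq.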
CQT2
class yaafelib.yaafe_extensions.yaafefeatures.CQT2
Compute the Constant-Q transform according to Blankertz's implementation [BB], with improvements from [JP2010].
[BB] B. Blankertz, The Constant Q Transform, http://wwwmath.uni-muenster.de/logik/Personen/blankertz/constQ/constQ.html
[JP2010] J. Prado, Transformée à Q constant, technical report 2010D004, Institut TELECOM, TELECOM ParisTech, CNRS LTCI, 2010.
Parameters:
- CQTAlign (default=c): Alignment of CQT kernels on the analysis frame. 'l' to the left, 'c' to the center, 'r' to the right
- CQTBinsPerOctave (default=3): Number of bins per octave to consider
- CQTMaxFreq (default=0.5): Maximum frequency. If <=0.5, it is taken as a fraction of sampleRate; otherwise it is expressed in Hertz.
- CQTMinFreq (default=97.999): Minimal frequency. If <0.5, it is taken as a fraction of sampleRate; otherwise it is expressed in Hertz.
- stepSize (default=512): step between consecutive frames
Declaration example:
CQT2 CQTAlign=c CQTBinsPerOctave=3 CQTMaxFreq=0.5 CQTMinFreq=97.999 stepSize=512
Chords
class yaafelib.yaafe_extensions.yaafefeatures.Chords
Chords recognizes chords from chromagrams, according to L. Oudre's algorithm [LO2011].
[LO2011] Oudre, L. and Grenier, Y. and Fevotte, C., Chord recognition by fitting rescaled chroma vectors to chord templates, IEEE Transactions on Audio, Speech and Language Processing, vol. 19, pages 2222 - 2233, Sep. 2011.
Parameters:
- ChordsSmoothing (default=1.5s): Chords smoothing duration
- ChordsUse7 (default=0): If 1, use 7th chords to enrich the chord dictionary; otherwise use only major and minor chords
- stepSize (default=512): step between consecutive frames
Declaration example:
Chords ChordsSmoothing=1.5s ChordsUse7=0 stepSize=512
Chroma
class yaafelib.yaafe_extensions.yaafefeatures.Chroma
Chroma computes a short-term chromagram according to [BP2005].
[BP2005] Bello, J.P. and Pickens, J. A Robust Mid-level Representation for Harmonic Content in Music Signals. In Proceedings of the 6th International Conference on Music Information Retrieval (ISMIR-05), London, UK. September 2005.
Parameters:
- CQTAlign (default=c): Alignment of CQT kernels on the analysis frame. 'l' to the left, 'c' to the center, 'r' to the right
- CQTBinsPerOctave (default=36): Number of bins per octave to consider
- CQTMinFreq (default=73.42): Minimal frequency. If <0.5, it is taken as a fraction of sampleRate; otherwise it is expressed in Hertz.
- CQTNbOctaves (default=3): Number of octaves to consider for analysis
- CTInitDuration (default=15): Duration over which to perform chroma bias initialisation, in seconds.
- ChromaSmoothing (default=0.75s): Chroma smoothing duration
- stepSize (default=512): step between consecutive frames
Declaration example:
Chroma CQTAlign=c CQTBinsPerOctave=36 CQTMinFreq=73.42 CQTNbOctaves=3 CTInitDuration=15 ChromaSmoothing=0.75s stepSize=512
Chroma2
class yaafelib.yaafe_extensions.yaafefeatures.Chroma2
Chroma2 computes a short-term pitch profile according to [ZK2006].
[ZK2006] Y. Zhu and M.S. Kankanhalli. Precise pitch profile feature extraction from musical audio for key detection. IEEE Transactions on Multimedia, 2006.
Parameters:
- CQTAlign (default=c): Alignment of CQT kernels on the analysis frame. 'l' to the left, 'c' to the center, 'r' to the right
- CQTBinsPerOctave (default=48): Number of bins per octave to consider
- CQTMinFreq (default=27.5): Minimal frequency. If <0.5, it is taken as a fraction of sampleRate; otherwise it is expressed in Hertz.
- CQTNbOctaves (default=7): Number of octaves to consider for analysis
- CZBinsPerSemitone (default=1): Number of bins per semitone for the PCP
- CZNbCQTBinsAggregatedToPCPBin (default=-1): Number of CQT bins aggregated for each PCP bin. If -1, use CQTBinsPerOctave / 24
- CZTuning (default=440): Frequency of the A4, in Hz.
- stepSize (default=512): step between consecutive frames
Declaration example:
Chroma2 CQTAlign=c CQTBinsPerOctave=48 CQTMinFreq=27.5 CQTNbOctaves=7 CZBinsPerSemitone=1 CZNbCQTBinsAggregatedToPCPBin=-1 CZTuning=440 stepSize=512
ComplexDomainOnsetDetection
class yaafelib.yaafe_extensions.yaafefeatures.ComplexDomainOnsetDetection
Compute onset detection using a complex domain spectral flux method [CD2003].
[CD2003] C. Duxbury et al., Complex domain onset detection for musical signals, Proc. of the 6th Int. Conference on Digital Audio Effects (DAFx-03), London, UK, September 8-11, 2003.
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
ComplexDomainOnsetDetection FFTLength=0 FFTWindow=Hanning blockSize=1024 stepSize=512
Energy
Envelope
class yaafelib.yaafe_extensions.yaafefeatures.Envelope
Extract the amplitude envelope using Hilbert transform, low-pass filtering and decimation.
Parameters:
- EnDecim (default=200): Decimation factor to compute envelope
- blockSize (default=32768): output frame size
- stepSize (default=16384): step between consecutive frames
Declaration example:
Envelope EnDecim=200 blockSize=32768 stepSize=16384
EnvelopeShapeStatistics
class yaafelib.yaafe_extensions.yaafefeatures.EnvelopeShapeStatistics
Centroid, spread, skewness and kurtosis of each frame's amplitude envelope. For more details about moments, see Shape Statistics.
Parameters:
- EnDecim (default=200): Decimation factor to compute envelope
- blockSize (default=32768): output frame size
- stepSize (default=16384): step between consecutive frames
Declaration example:
EnvelopeShapeStatistics EnDecim=200 blockSize=32768 stepSize=16384
Frames
class yaafelib.yaafe_extensions.yaafefeatures.Frames
Segment the input signal into frames.
The first frame has zeros on its left half so that it is centered on time 0s; consecutive frames are then equally spaced. Consequently, frame i (starting from 0) is centered on sample i * stepSize.
Parameters:
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
Frames blockSize=1024 stepSize=512
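The centering rule described above can be sketched in a few lines of plain Python. The left zero-padding follows the description; how the end of the signal is handled (here: frames are emitted while their center lies inside the signal, and the tail is zero-padded) is an assumption of this sketch, as is the function name.

```python
def frames(signal, block_size=1024, step_size=512):
    """Yield frames such that frame i is centered on sample i * step_size."""
    half = block_size // 2
    # Zeros on the left so that the first frame is centered on sample 0.
    padded = [0.0] * half + list(signal)
    start = 0
    while start < len(signal):  # stop once the center leaves the signal (assumption)
        frame = padded[start:start + block_size]
        frame += [0.0] * (block_size - len(frame))  # zero-pad the tail (assumption)
        yield frame
        start += step_size


signal = [1, 2, 3, 4, 5, 6, 7, 8]
result = list(frames(signal, block_size=4, step_size=2))
```

In each emitted frame, the element at index block_size // 2 is the signal sample i * step_size, which is the centering property stated in the description.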
LPC
class yaafelib.yaafe_extensions.yaafefeatures.LPC
Compute the Linear Predictor Coefficients (LPC) of a signal frame, using autocorrelation and the Levinson-Durbin algorithm. See [JM1975].
[JM1975] Makhoul J., Linear Prediction: A Tutorial Review, Proc. IEEE, Vol. 63, pp. 561-580, 1975.
Parameters:
- LPCNbCoeffs (default=2): Number of Linear Predictor Coefficients to compute
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
LPC LPCNbCoeffs=2 blockSize=1024 stepSize=512
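For readers unfamiliar with the Levinson-Durbin recursion mentioned above, here is a minimal textbook sketch. It takes autocorrelation values ac[0..order] and returns the prediction polynomial a (with a[0] = 1) and the final prediction error. The sign convention and the inclusion of the leading 1 are assumptions; Yaafe's output layout may differ.

```python
def levinson_durbin(ac, order):
    """Solve the autocorrelation normal equations recursively.

    Returns (a, error) where A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    a = [1.0] + [0.0] * order
    error = ac[0]
    for i in range(1, order + 1):
        acc = ac[i] + sum(a[j] * ac[i - j] for j in range(1, i))
        k = -acc / error  # reflection coefficient
        previous = a[:]
        for j in range(1, i):
            a[j] = previous[j] + k * previous[i - j]
        a[i] = k
        error *= 1.0 - k * k
    return a, error


# Autocorrelation of an ideal first-order process with coefficient 0.5.
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

For this input the second coefficient comes out as zero, which is what one expects when a first-order model already explains the autocorrelation.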
LSF
class yaafelib.yaafe_extensions.yaafefeatures.LSF
Compute the Line Spectral Frequency (LSF) coefficients of a signal frame. The algorithm was adapted from [TB2006] and [SH1976].
[TB2006] Tom Backstrom, Carlo Magi, Properties of line spectrum pair polynomials–A review, Signal Processing, Volume 86, Issue 11, Special Section: Distributed Source Coding, November 2006, Pages 3286-3298, ISSN 0165-1684, DOI: 10.1016/j.sigpro.2006.01.010.
[SH1976] Schussler, H., A stability theorem for discrete systems, Acoustics, Speech and Signal Processing, IEEE Transactions on, vol. 24, no. 1, pp. 87-89, Feb 1976.
Parameters:
- LSFDisplacement (default=1): LSF displacement parameter: 1 for classical LSF, 0 for Schussler polynomials, >1 is a generalization
- LSFNbCoeffs (default=10): Number of Line Spectral Frequencies to compute
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
LSF LSFDisplacement=1 LSFNbCoeffs=10 blockSize=1024 stepSize=512
Loudness
class yaafelib.yaafe_extensions.yaafefeatures.Loudness
The loudness coefficients are the energy in each Bark band, normalized by the overall sum. See [GP2004] and [MG1997] for more details.
[MG1997] Moore, Glasberg, et al., A Model for the Prediction of Thresholds, Loudness and Partial Loudness, J. Audio Eng. Soc. 45: 224-240, 1997.
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- LMode (default=Relative): "Specific" computes loudness without normalization, "Relative" normalizes the bands so that they sum to 1, "Total" returns only the sum of loudness over all bands.
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
Loudness FFTLength=0 FFTWindow=Hanning LMode=Relative blockSize=1024 stepSize=512
MFCC
class yaafelib.yaafe_extensions.yaafefeatures.MFCC
Compute the Mel-frequency cepstrum coefficients [DM1980].
The Mel filter bank is built as 40 log-spaced triangular filters on the mel scale. MFCCs are then computed using DCT II.
[DM1980] S.B. Davis and P. Mermelstein, Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Transactions on Acoustics, Speech and Signal Processing, 28:357-366, 1980.
Parameters:
- CepsIgnoreFirstCoeff (default=1): 0 keeps the first cepstral coefficient, 1 ignores it
- CepsNbCoeffs (default=13): Number of cepstral coefficients to keep.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- MelMaxFreq (default=6854.0): Maximum frequency of the mel filter bank
- MelMinFreq (default=130.0): Minimum frequency of the mel filter bank
- MelNbFilters (default=40): Number of mel filters
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
MFCC CepsIgnoreFirstCoeff=1 CepsNbCoeffs=13 FFTWindow=Hanning MelMaxFreq=6854.0 MelMinFreq=130.0 MelNbFilters=40 blockSize=1024 stepSize=512
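To give a feel for how MelMinFreq, MelMaxFreq and MelNbFilters shape the filter bank, the sketch below computes filter edge frequencies equally spaced on a mel scale. It assumes the widely used mapping mel = 1127 * ln(1 + f / 700) and the common layout where each triangular filter spans two neighbouring edges; neither is confirmed here as Yaafe's exact choice, and the function names are hypothetical.

```python
import math


def hz_to_mel(freq_hz):
    """Common mel mapping (assumed, not necessarily Yaafe's)."""
    return 1127.0 * math.log(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (math.exp(mel / 1127.0) - 1.0)


def mel_band_edges(min_freq=130.0, max_freq=6854.0, nb_filters=40):
    """Return nb_filters + 2 edge frequencies in Hz, equally spaced in mel."""
    lo, hi = hz_to_mel(min_freq), hz_to_mel(max_freq)
    step = (hi - lo) / (nb_filters + 1)
    return [mel_to_hz(lo + i * step) for i in range(nb_filters + 2)]


edges = mel_band_edges()
```

Under this layout, filter i would rise from edges[i] to a peak at edges[i + 1] and fall back to zero at edges[i + 2].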
MagnitudeSpectrum
class yaafelib.yaafe_extensions.yaafefeatures.MagnitudeSpectrum
Compute the frame's magnitude spectrum, optionally using an analysis window (Hanning or Hamming).
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
MagnitudeSpectrum FFTLength=0 FFTWindow=Hanning blockSize=1024 stepSize=512
MelSpectrum
class yaafelib.yaafe_extensions.yaafefeatures.MelSpectrum
Compute the Mel-frequency spectrum [DM1980].
The Mel filter bank is built as 40 log-spaced triangular filters on the mel scale.
Parameters:
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- MelMaxFreq (default=6854.0): Maximum frequency of the mel filter bank
- MelMinFreq (default=130.0): Minimum frequency of the mel filter bank
- MelNbFilters (default=40): Number of mel filters
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
MelSpectrum FFTWindow=Hanning MelMaxFreq=6854.0 MelMinFreq=130.0 MelNbFilters=40 blockSize=1024 stepSize=512
OBSI
class yaafelib.yaafe_extensions.yaafefeatures.OBSI
Compute Octave Band Signal Intensity using a triangular octave filter bank ([SE2005]).
[SE2005] S. Essid, Classification automatique des signaux audio-frequences: reconnaissance des instruments de musique. PhD, UPMC, 2005.
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- OBSIMinFreq (default=27.5): Minimum frequency for OBSI filter.
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
OBSI FFTLength=0 FFTWindow=Hanning OBSIMinFreq=27.5 blockSize=1024 stepSize=512
OBSIR
class yaafelib.yaafe_extensions.yaafefeatures.OBSIR
Compute the log of the OBSI ratio between consecutive octaves.
Parameters:
- DiffNbCoeffs (default=0): Maximum number of coefficients to keep. 0 keeps N-1 values (with N the input feature size)
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- OBSIMinFreq (default=27.5): Minimum frequency for OBSI filter.
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
OBSIR DiffNbCoeffs=0 FFTLength=0 FFTWindow=Hanning OBSIMinFreq=27.5 blockSize=1024 stepSize=512
OnsetDetectionFunction
class yaafelib.yaafe_extensions.yaafefeatures.OnsetDetectionFunction
Compute the onset detection function (spectral energy flux) according to the method of [MA2005].
[MA2005] M. Alonso, G. Richard, B. David, Extracting Note Onsets from Musical Recordings, International Conference on Multimedia and Expo (IEEE-ICME'05), Amsterdam, The Netherlands, 2005.
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- NMANbFrames (default=5000): Number of frames to normalize together, -1 means all frames
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
OnsetDetectionFunction FFTLength=0 FFTWindow=Hanning NMANbFrames=5000 blockSize=1024 stepSize=512
PerceptualSharpness
class yaafelib.yaafe_extensions.yaafefeatures.PerceptualSharpness
Compute the sharpness of Loudness coefficients, according to [GP2004].
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
PerceptualSharpness FFTLength=0 FFTWindow=Hanning blockSize=1024 stepSize=512
PerceptualSpread
class yaafelib.yaafe_extensions.yaafefeatures.PerceptualSpread
Compute the spread of Loudness coefficients, according to [GP2004].
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
PerceptualSpread FFTLength=0 FFTWindow=Hanning blockSize=1024 stepSize=512
SpectralCrestFactorPerBand
class yaafelib.yaafe_extensions.yaafefeatures.SpectralCrestFactorPerBand
Compute the spectral crest factor per log-spaced band of 1/4 octave.
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
SpectralCrestFactorPerBand FFTLength=0 FFTWindow=Hanning blockSize=1024 stepSize=512
SpectralDecrease
class yaafelib.yaafe_extensions.yaafefeatures.SpectralDecrease
Compute spectral decrease according to [GP2004].
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
SpectralDecrease FFTLength=0 FFTWindow=Hanning blockSize=1024 stepSize=512
SpectralFlatness
class yaafelib.yaafe_extensions.yaafefeatures.SpectralFlatness
Compute global spectral flatness as the ratio between the geometric mean and the arithmetic mean.
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
SpectralFlatness FFTLength=0 FFTWindow=Hanning blockSize=1024 stepSize=512
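The geometric-to-arithmetic mean ratio is easy to illustrate on a list of spectrum magnitudes. The sketch below assumes the ratio is taken over magnitudes (not squared magnitudes) and returns 0 when any bin is zero; both choices are assumptions of this example, not statements about Yaafe.

```python
import math


def spectral_flatness(magnitudes):
    """Geometric mean divided by arithmetic mean of the magnitudes."""
    n = len(magnitudes)
    arithmetic = sum(magnitudes) / n
    if arithmetic == 0.0 or min(magnitudes) <= 0.0:
        return 0.0  # geometric mean is zero as soon as one bin is zero
    geometric = math.exp(sum(math.log(m) for m in magnitudes) / n)
    return geometric / arithmetic


flat = spectral_flatness([1.0, 1.0, 1.0, 1.0])
peaky = spectral_flatness([1.0, 4.0])
```

A perfectly flat spectrum gives a ratio of 1; the more the energy concentrates in a few bins, the closer the ratio gets to 0.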
SpectralFlatnessPerBand
class yaafelib.yaafe_extensions.yaafefeatures.SpectralFlatnessPerBand
Compute spectral flatness per log-spaced band of 1/4 octave, as proposed in the MPEG7 standard.
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
SpectralFlatnessPerBand FFTLength=0 FFTWindow=Hanning blockSize=1024 stepSize=512
SpectralFlux
class yaafelib.yaafe_extensions.yaafefeatures.SpectralFlux
Compute the flux of the spectrum between consecutive frames.
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- FluxSupport (default=All): Support of flux computation. If 'All', use all bins (default); if 'Increase', use only bins which are increasing
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
SpectralFlux FFTLength=0 FFTWindow=Hanning FluxSupport=All blockSize=1024 stepSize=512
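The effect of FluxSupport can be shown with a toy computation between two magnitude spectra. This sketch assumes the flux is the sum of squared bin-wise differences with no normalization; the exact formula Yaafe uses is not given here, so read it only as an illustration of the 'All' versus 'Increase' distinction.

```python
def spectral_flux(previous, current, support="All"):
    """Sum of squared differences between two magnitude spectra (assumed form)."""
    diffs = [c - p for p, c in zip(previous, current)]
    if support == "Increase":
        # Keep only bins whose magnitude grew from one frame to the next.
        diffs = [d for d in diffs if d > 0.0]
    return sum(d * d for d in diffs)


flux_all = spectral_flux([1.0, 2.0, 3.0], [2.0, 2.0, 1.0], support="All")
flux_inc = spectral_flux([1.0, 2.0, 3.0], [2.0, 2.0, 1.0], support="Increase")
```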
SpectralIrregularity
class yaafelib.yaafe_extensions.yaafefeatures.SpectralIrregularity
Compute the difference between consecutive CQT bins, see [Brown2000].
[Brown2000] J.C. Brown, O. Houix, Stephen McAdams, Feature dependence in the automatic identification of musical woodwind instruments, Journal of the Acoustical Society of America, 109: 1064-1072, 2000.
Parameters:
- CQTAlign (default=c): Alignment of CQT kernels on the analysis frame. 'l' to the left, 'c' to the center, 'r' to the right
- CQTBinsPerOctave (default=36): Number of bins per octave to consider
- CQTMinFreq (default=73.42): Minimal frequency. If <0.5, it is taken as a fraction of sampleRate; otherwise it is expressed in Hertz.
- CQTNbOctaves (default=3): Number of octaves to consider for analysis
- stepSize (default=512): step between consecutive frames
Declaration example:
SpectralIrregularity CQTAlign=c CQTBinsPerOctave=36 CQTMinFreq=73.42 CQTNbOctaves=3 stepSize=512
SpectralRolloff
class yaafelib.yaafe_extensions.yaafefeatures.SpectralRolloff
Spectral roll-off is the frequency below which 99% of the energy is contained. See [SS1997].
[SS1997] E. Scheirer, M. Slaney. Construction and evaluation of a robust multifeature speech/music discriminator. IEEE International Conference on Acoustics, Speech and Signal Processing, p. 1331-1334, 1997.
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
SpectralRolloff FFTLength=0 FFTWindow=Hanning blockSize=1024 stepSize=512
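The 99% rule can be sketched as a cumulative sum over bin energies. The example assumes energy is the squared magnitude and that the bins of a one-sided spectrum map linearly from 0 Hz to half the sample rate; both are assumptions of this illustration rather than documented Yaafe behaviour.

```python
def spectral_rolloff(magnitudes, sample_rate, ratio=0.99):
    """Frequency (Hz) of the first bin where cumulative energy reaches ratio."""
    if len(magnitudes) < 2:
        raise ValueError("need at least two spectrum bins")
    energies = [m * m for m in magnitudes]
    threshold = ratio * sum(energies)
    cumulative = 0.0
    index = len(energies) - 1
    for k, energy in enumerate(energies):
        cumulative += energy
        if cumulative >= threshold:
            index = k
            break
    # Bins 0 .. N-1 are assumed to span 0 Hz .. sample_rate / 2.
    return index * (sample_rate / 2.0) / (len(magnitudes) - 1)


rolloff = spectral_rolloff([1, 1, 1, 1, 0, 0, 0, 0, 0], sample_rate=16000)
```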
SpectralShapeStatistics
class yaafelib.yaafe_extensions.yaafefeatures.SpectralShapeStatistics
Compute shape statistics of MagnitudeSpectrum (see [GR2004]).
Shape statistics are centroid, spread, skewness and kurtosis.
[GR2004] O. Gillet, G. Richard, Automatic transcription of drum loops. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Montreal, Canada, 2004.
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
SpectralShapeStatistics FFTLength=0 FFTWindow=Hanning blockSize=1024 stepSize=512
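As an illustration of the four shape statistics, the sketch below treats the input values as weights over their bin indices and computes the usual moment-based quantities. It uses standardized central moments and does not subtract 3 from the kurtosis; whether Yaafe uses bin indices or frequencies, and raw or excess kurtosis, is an assumption left open here.

```python
def shape_statistics(values):
    """Return (centroid, spread, skewness, kurtosis) of values over bin index."""
    total = sum(values)
    centroid = sum(k * a for k, a in enumerate(values)) / total

    def central_moment(order):
        return sum(((k - centroid) ** order) * a for k, a in enumerate(values)) / total

    spread = central_moment(2) ** 0.5
    skewness = central_moment(3) / spread ** 3
    kurtosis = central_moment(4) / spread ** 4  # excess kurtosis would subtract 3
    return centroid, spread, skewness, kurtosis


centroid, spread, skewness, kurtosis = shape_statistics([1.0, 2.0, 1.0])
```

For a symmetric input such as this one, the centroid falls on the middle bin and the skewness is zero.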
SpectralSlope
class yaafelib.yaafe_extensions.yaafefeatures.SpectralSlope
SpectralSlope is computed by linear regression of the spectral amplitude (see [GP2004]).
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
SpectralSlope FFTLength=0 FFTWindow=Hanning blockSize=1024 stepSize=512
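Linear regression of amplitude against position reduces to a closed-form least-squares slope. The sketch below regresses against the bin index; using frequency in Hz on the x-axis, or normalizing the amplitudes first, would change the scale of the result, so treat the units as an assumption.

```python
def regression_slope(values):
    """Least-squares slope of values against their index 0 .. n-1."""
    n = len(values)
    mean_x = (n - 1) / 2.0
    mean_y = sum(values) / n
    numerator = sum((k - mean_x) * (v - mean_y) for k, v in enumerate(values))
    denominator = sum((k - mean_x) ** 2 for k in range(n))
    return numerator / denominator


slope = regression_slope([5.0, 4.0, 3.0, 2.0, 1.0])
```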
SpectralVariation
class yaafelib.yaafe_extensions.yaafefeatures.SpectralVariation
SpectralVariation is the normalized correlation of the spectrum between consecutive frames (see [GP2004]).
[GP2004] Geoffroy Peeters, A large set of audio features for sound description (similarity and classification) in the CUIDADO project, 2004.
Parameters:
- FFTLength (default=0): Frame length on which to perform the FFT. The original frame is padded with zeros or truncated to reach this size. If 0, the original frame length is used.
- FFTWindow (default=Hanning): Weighting window to apply before the FFT. Hanning|Hamming|None
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
SpectralVariation FFTLength=0 FFTWindow=Hanning blockSize=1024 stepSize=512
TemporalShapeStatistics
class yaafelib.yaafe_extensions.yaafefeatures.TemporalShapeStatistics
Compute shape statistics of signal frames.
Parameters:
- blockSize (default=1024): output frame size
- stepSize (default=512): step between consecutive frames
Declaration example:
TemporalShapeStatistics blockSize=1024 stepSize=512
Available feature transforms
AutoCorrelationPeaksIntegrator
class yaafelib.yaafe_extensions.yaafefeatures.AutoCorrelationPeaksIntegrator
Feature transform that computes peaks of the autocorrelation function and outputs the peaks and their amplitudes.
Parameters:
- ACPInterPeakMinDist (default=5): Minimal distance between consecutive autocorrelation peaks, expressed in lags.
- ACPNbPeaks (default=3): Number of autocorrelation peaks to keep
- ACPNorm (default=No): Can be No|BPM|Hz. Normalize the output to be expressed in lag, BPM or Hz, respectively
- NbFrames (default=60): Number of frames to integrate together
- StepNbFrames (default=30): Number of frames to skip between two integrations
Declaration example:
AutoCorrelationPeaksIntegrator ACPInterPeakMinDist=5 ACPNbPeaks=3 ACPNorm=No NbFrames=60 StepNbFrames=30
Cepstrum
class yaafelib.yaafe_extensions.yaafefeatures.Cepstrum
Feature transform that computes cepstrum coefficients of input feature frames (using DCT II).
Parameters:
- CepsIgnoreFirstCoeff (default=1): 0 keeps the first cepstral coefficient, 1 ignores it
- CepsNbCoeffs (default=13): Number of cepstral coefficients to keep.
Declaration example:
Cepstrum CepsIgnoreFirstCoeff=1 CepsNbCoeffs=13
Derivate
class yaafelib.yaafe_extensions.yaafefeatures.Derivate
Compute the temporal derivative of the input feature. The derivative is approximated by an orthogonal polynomial fit over a finite-length window (see [RR1993] p. 117).
[RR1993] L.R. Rabiner, Fundamentals of Speech Processing. Prentice Hall Signal Processing Series. PTR Prentice-Hall, 1993.
Parameters:
- DO1Len (default=4): Horizon used to compute the order 1 derivative.
- DO2Len (default=1): Horizon used to compute the order 2 derivative. Unused if DOrder=1.
- DOrder (default=1): Order of the derivative to compute.
Declaration example:
Derivate DO1Len=4 DO2Len=1 DOrder=1
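A first-order polynomial fit over a symmetric window leads to the regression "delta" formula commonly used in speech processing. The sketch below applies it to a sequence of scalar feature values, reading the horizon as the number of frames taken on each side and repeating edge values at the boundaries. Both readings are assumptions; Yaafe's Derivate may handle window length and edges differently.

```python
def delta(values, horizon=4):
    """Order 1 derivative estimate by regression over +/- horizon frames."""
    n = len(values)
    norm = 2.0 * sum(h * h for h in range(1, horizon + 1))
    out = []
    for t in range(n):
        acc = 0.0
        for h in range(1, horizon + 1):
            ahead = values[min(t + h, n - 1)]   # repeat last value past the end
            behind = values[max(t - h, 0)]      # repeat first value before the start
            acc += h * (ahead - behind)
        out.append(acc / norm)
    return out


ramp = [float(i) for i in range(20)]
derivative = delta(ramp, horizon=4)
```

Away from the edges, a ramp that grows by one unit per frame yields a derivative of one, as expected.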
HistogramIntegrator
class yaafelib.yaafe_extensions.yaafefeatures.HistogramIntegrator
Feature transform that computes a histogram of input values.
Parameters:
- HInf (default=0): Minimal value to take into consideration
- HNbBins (default=10): Number of histogram bins
- HSup (default=1): Maximal value to take into consideration
- HWeighted (default=0): Set it to 1 if input values are weighted. If 1, the input is considered to be a list of (value, weight) pairs.
- NbFrames (default=60): Number of frames to integrate together
- StepNbFrames (default=30): Number of frames to skip between two integrations
Declaration example:
HistogramIntegrator HInf=0 HNbBins=10 HSup=1 HWeighted=0 NbFrames=60 StepNbFrames=30
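The roles of HInf, HSup and HNbBins can be illustrated with an unweighted histogram over equal-width bins. In this sketch, values outside [HInf, HSup] are dropped and a value equal to HSup falls in the last bin; how Yaafe treats out-of-range values is not stated here, so that behaviour is an assumption.

```python
def histogram(values, h_inf=0.0, h_sup=1.0, nb_bins=10):
    """Count values into nb_bins equal-width bins spanning [h_inf, h_sup]."""
    counts = [0] * nb_bins
    width = (h_sup - h_inf) / nb_bins
    for v in values:
        if h_inf <= v <= h_sup:  # out-of-range values are ignored (assumption)
            index = min(int((v - h_inf) / width), nb_bins - 1)
            counts[index] += 1
    return counts


counts = histogram([0.05, 0.15, 0.15, 0.95, 1.0, 1.5])
```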
SlopeIntegrator
class yaafelib.yaafe_extensions.yaafefeatures.SlopeIntegrator
Feature transform that computes the slope of the input feature over the given number of frames.
Parameters:
- NbFrames (default=60): Number of frames to integrate together
- StepNbFrames (default=30): Number of frames to skip between two integrations
Declaration example:
SlopeIntegrator NbFrames=60 StepNbFrames=30
StatisticalIntegrator
class yaafelib.yaafe_extensions.yaafefeatures.StatisticalIntegrator
Feature transform that computes the temporal mean and variance of the input feature over the given number of frames.
Parameters:
- NbFrames (default=60): Number of frames to integrate together
- SICompute (default=MeanStddev): If 'MeanStddev', compute mean and standard deviation; if 'Mean', compute only the mean; if 'Stddev', compute only the standard deviation.
- StepNbFrames (default=30): Number of frames to skip between two integrations
Declaration example:
StatisticalIntegrator NbFrames=60 SICompute=MeanStddev StepNbFrames=30
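The interplay of NbFrames and StepNbFrames is the same sliding-window pattern for all integrators on this page. The sketch below shows it for the mean and standard deviation of scalar feature values. It uses the population standard deviation and drops an incomplete trailing window; both are assumptions of this example.

```python
import math


def integrate_mean_stddev(values, nb_frames=60, step_nb_frames=30):
    """Return a (mean, stddev) pair for each window of nb_frames values."""
    out = []
    for start in range(0, len(values) - nb_frames + 1, step_nb_frames):
        window = values[start:start + nb_frames]
        mean = sum(window) / nb_frames
        variance = sum((v - mean) ** 2 for v in window) / nb_frames
        out.append((mean, math.sqrt(variance)))
    return out


stats = integrate_mean_stddev([float(i) for i in range(120)])
```

With 120 input frames and the default parameters, windows start at frames 0, 30 and 60, so consecutive windows overlap by half.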