5. The yaafelib Python module

Main Yaafe module, providing classes to extract features with Yaafe.

5.1. FeaturePlan

class yaafelib.FeaturePlan(sample_rate=44100, normalize=None, resample=False, time_start=0.0, time_limit=0.0)

FeaturePlan is a collection of features to extract, configured for a specific sample rate.

Parameters:
  • sample_rate – analysis sample rate
  • normalize – signal maximum normalization, in [0,1], or None to skip normalization.
  • resample – force resampling of the input audio to sample_rate. Default: False.
  • time_start – time offset at which the analysis starts. A negative value (e.g. “-10s”) starts the analysis 10 s before the end of the audio. Default: 0.0 (s)
  • time_limit – maximum duration to analyse; 0 s means no limit. If the given value is longer than the available duration, the excess is ignored. Default: 0.0 (s)
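As one reading of the time_start and time_limit descriptions above, the following sketch computes which part of a file of known duration would be analysed. The analysis_window() helper is hypothetical and not part of yaafelib, it works on plain seconds rather than strings such as “-10s”, and the clamping to the file boundaries is an assumption.

```python
def analysis_window(duration, time_start=0.0, time_limit=0.0):
    """Return (start, end) in seconds of the analysed part of the audio.

    Hypothetical helper illustrating the parameter descriptions;
    this is not Yaafe's own implementation.
    """
    if time_start < 0:
        # a negative offset counts back from the end of the audio
        start = max(0.0, duration + time_start)
    else:
        # assumption: an offset past the end is clamped to the end
        start = min(time_start, duration)
    if time_limit > 0:
        # the excess beyond the available duration is ignored
        end = min(duration, start + time_limit)
    else:
        # 0 means no limit
        end = duration
    return start, end


last_ten_seconds = analysis_window(60.0, time_start=-10.0)
capped = analysis_window(60.0, time_start=5.0, time_limit=100.0)
```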

This collection can be loaded from a file using the loadFeaturePlan() method, or built by adding features with the addFeature() method.

Then, the getDataFlow() method retrieves the corresponding DataFlow object.

>>> fp = FeaturePlan(sample_rate=16000)
>>> fp.addFeature('mfcc: MFCC blockSize=512 stepSize=256')
True
>>> fp.addFeature('mfcc_d1: MFCC blockSize=512 stepSize=256 > '
...               'Derivate DOrder=1')
True
>>> fp.addFeature('mfcc_d2: MFCC blockSize=512 stepSize=256 > '
...               'Derivate DOrder=2')
True
True
>>> df = fp.getDataFlow()
>>> df.display()
...

addFeature(definition)

Add a feature defined according to the feature definition syntax.

Parameters: definition (string) – feature definition
Return type: True on success, False on failure.

getDataFlow()

Get the DataFlow object representing how to extract defined features.

Return type: DataFlow

loadFeaturePlan(filename)

Load a feature extraction plan from a file. The file must be a text file in which each line defines a feature (see feature definition syntax).

Return type: True on success, False on failure.
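To illustrate the file format described above, the following sketch writes a feature plan file with one definition per line. The write_feature_plan() helper and the file name are hypothetical; the definitions are the ones used in the FeaturePlan example.

```python
import os
import tempfile

# Feature definitions taken from the FeaturePlan example, one per line.
DEFINITIONS = [
    'mfcc: MFCC blockSize=512 stepSize=256',
    'mfcc_d1: MFCC blockSize=512 stepSize=256 > Derivate DOrder=1',
    'mfcc_d2: MFCC blockSize=512 stepSize=256 > Derivate DOrder=2',
]


def write_feature_plan(path, definitions):
    """Write one feature definition per line (hypothetical helper)."""
    with open(path, 'w') as f:
        for definition in definitions:
            f.write(definition + '\n')
    return path


plan_path = write_feature_plan(
    os.path.join(tempfile.mkdtemp(), 'featureplan.txt'), DEFINITIONS)

# With Yaafe installed, the file could then be loaded like this:
#   fp = FeaturePlan(sample_rate=16000)
#   fp.loadFeaturePlan(plan_path)
```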

5.2. DataFlow

class yaafelib.DataFlow

A DataFlow object holds a directed acyclic graph of computational steps describing how to compute some audio features.

DataFlow can be loaded directly from a dataflow file using the load() method, or created with a FeaturePlan object. The advanced user may also build a dataflow graph from scratch.

display()

Print the DataFlow to the standard output.

dumpdot(filename)

Write a dot graph corresponding to the DataFlow.

Parameters: filename (string) – file to write

load(filename)

Build DataFlow from a dataflow file.

Parameters: filename (string) – dataflow file name.
Returns: True on success, False on failure.

loads(buf)

Build DataFlow from a string buffer holding the contents of a dataflow file.

Parameters: buf (string) – buffer read from a dataflow file
Returns: True on success, False on failure.

save(filename)

Write the DataFlow into a dataflow file.

Parameters: filename (string) – file to write

5.3. Engine

class yaafelib.Engine

An Engine object is in charge of processing computations defined in a DataFlow object on given inputs.

>>> # Initialization
>>> fp = FeaturePlan(sample_rate=16000)
>>> fp.addFeature('mfcc: MFCC blockSize=512 stepSize=256')
True
>>> fp.addFeature('sr: SpectralRolloff blockSize=512 stepSize=256')
True
>>> fp.addFeature('sf: SpectralFlux blockSize=512 stepSize=256')
True
>>> df = fp.getDataFlow()
>>> df2 = DataFlow()
>>> len(str(df)) > 0
True
>>> df2.loads(str(df))
True
>>> str(df2) == str(df)
True
>>> engine = Engine()
>>> engine.load(df)
True
>>> # get input metadata
>>> eng_out = engine.getInputs()
>>> sorted(eng_out.keys())
['audio']
>>> sorted(eng_out['audio'].items())
[('frameLength', 1), ('parameters', {...}),
 ('sampleRate', 16000.0), ('sampleStep', 1), ('size', 1)]
>>> sorted(eng_out['audio']['parameters'].items())
[('Resample', 'no'), ('SampleRate', '16000'),
 ('TimeLimit', '0.0s'), ('TimeStart', '0.0s')]
>>>
>>>
>>> # get output metadata
>>> eng_out = engine.getOutputs()
>>> sorted(eng_out.items())
[('mfcc', {...}), ('sf', {...}), ('sr', {...})]
>>> sorted(eng_out['sr'].items())
[('frameLength', 512), ('parameters', {...}),
 ('sampleRate', 16000.0), ('sampleStep', 256), ('size', 1)]
>>> sorted(eng_out['sr']['parameters'].items())
[('normalize', '-1'), ('resample', 'no'), ('samplerate', '16000'),
 ('version', '0.7'),
 ('yaafedefinition', 'SpectralRolloff blockSize=512 stepSize=256')]
>>> # extract features from a numpy array
>>> import numpy # needs numpy
>>> audio = numpy.random.randn(1,1000000)
>>> feats = engine.processAudio(audio)
>>> feats['mfcc']
array([[...]])
>>> feats['sf']
array([[...]])
>>> feats['sr']
array([[...]])

It is also possible to extract features block per block:

# continues the example above: engine is a loaded Engine, numpy is imported
# first reset the engine
engine.reset()
for i in range(1, 10):
    # get your audio data
    audio = numpy.random.rand(1, 32000)
    # write the audio array to the 'audio' input
    engine.writeInput('audio', audio)
    # process available data
    engine.process()
    # read available feature data
    feats = engine.readAllOutputs()
    # do what you want with your feature data
# do not forget to flush
engine.flush()
feats = engine.readAllOutputs()  # read the last data
# do what you want with your feature data

When extracting features block per block, you should be aware of Yaafe’s engine internals.
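The block-per-block loop above can be wrapped in a small function that keeps every chunk of output instead of overwriting feats on each iteration. This is a minimal sketch: extract_in_blocks() is a hypothetical helper, not part of yaafelib, and it only relies on the Engine methods documented in this section.

```python
def extract_in_blocks(engine, blocks):
    """Feed blocks to an already loaded engine and collect outputs per feature.

    `engine` is expected to provide the Engine methods documented here;
    `blocks` is any iterable of arrays suitable for the 'audio' input.
    Hypothetical helper, not part of yaafelib.
    """
    collected = {}

    def drain():
        # readAllOutputs() returns a dict mapping output name to an array
        for name, values in engine.readAllOutputs().items():
            collected.setdefault(name, []).append(values)

    engine.reset()  # start a new analysis
    for block in blocks:
        engine.writeInput('audio', block)
        engine.process()
        drain()
    engine.flush()  # make the remaining data available
    drain()
    return collected
```

The per-feature lists could then be joined into one array, for example with numpy.concatenate(chunks, axis=0), assuming frames are stored along the first axis.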

flush()

Process available data and flush all buffers so that all output data is available. This ends the analysis: the reset() method must be called before any further calls to writeInput() and process().

getInputs()

Get input metadata. The result format is the same as for the getOutputs() method, but in the general case there is a single input named ‘audio’ and the only relevant metadata are:

SampleRate: expected audio sample rate
Parameters: attached parameters

Other fields should be set to 1.

getOutputs()

Get output metadata. For each output feature, you get the following metadata:

SampleRate: audio analysis sample rate
SampleStep: number of audio samples between consecutive feature values
FrameLength: analysis frame size in number of audio samples
Size: size of the feature (or number of coefficients)
Parameters: attached parameters.
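The sample step and sample rate can be combined to place feature values on a time axis. The sketch below assumes the dictionary keys shown in the Engine example output ('sampleStep', 'sampleRate') and computes offsets relative to the first frame. frame_offsets() is a hypothetical helper, and where the first frame sits relative to the start of the audio is not covered here.

```python
def frame_offsets(metadata, num_frames):
    """Return the time offset in seconds of each frame relative to the first.

    Hypothetical helper, not part of yaafelib.
    """
    step_seconds = metadata['sampleStep'] / float(metadata['sampleRate'])
    return [i * step_seconds for i in range(num_frames)]


# Metadata values copied from the 'sr' output in the Engine example.
sr_metadata = {'frameLength': 512, 'sampleRate': 16000.0,
               'sampleStep': 256, 'size': 1}

# 256 samples at 16000 Hz is 0.016 s between consecutive feature values.
offsets = frame_offsets(sr_metadata, 4)
```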
load(dataflow)

Configure engine according to the given dataflow.

Parameters: dataflow (DataFlow or string) – dataflow object or filename of a dataflow file.
Returns: True on success, False on failure.

process()

Process available data.

processAudio(data)

Convenience method to extract features from data. It successively calls reset(), writeInput(), process() and flush(), and returns the output of readAllOutputs().

readAllOutputs()

Read all outputs.

Returns: dictionary with output name as key and numpy.array as value.

readOutput(name)

Read a specific output and return its values as a numpy.array.

Parameters: name (string) – output name to read
Return type: numpy.array

reset()

Reset engine. All buffers are cleared, and a new analysis can start.

writeInput(name, data)

Write data on an input.

Parameters:
  • name (string) – input on which to write
  • data (numpy array) – data to write.

5.4. AudioFileProcessor

class yaafelib.AudioFileProcessor

An AudioFileProcessor object extracts features from audio files, and can optionally write the output features to files.

It must be provided with a configured Engine.

Here is how to extract features from audio files and get them as numpy arrays:

>>> # configure your engine
>>> engine = Engine()
>>> engine.load(dataflow_file)
True
>>> # create your AudioFileProcessor
>>> afp = AudioFileProcessor()
>>> # leave output format to None
>>> afp.processFile(engine, audiofile)
0
>>> # retrieve features from engine
>>> feats = engine.readAllOutputs()
>>> # do what you want with your feature data
>>> feats['mfcc']
array([[...]])

To write features directly to output files, just set an output format with the setOutputFormat() method.

processFile(engine, filename)

Extract features from the given file using the given engine.

If an output format has been set, output files are written; otherwise the output feature data can be read using the engine’s Engine.readOutput() or Engine.readAllOutputs() methods.

Parameters:
  • engine (Engine) – engine to use for feature extraction. It must already have been configured.
  • filename (string) – audio file to process
Returns:

0 on success, a negative value on failure
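Because processFile() reports failure through its return value rather than an exception, it can be convenient to check the status in one place. A minimal sketch, assuming no output format has been set so that the features remain readable from the engine; process_or_raise() is a hypothetical helper, not part of yaafelib.

```python
def process_or_raise(afp, engine, filename):
    """Run afp.processFile() and raise if it reports a failure.

    Hypothetical helper. Returns the engine outputs, which is only
    meaningful when no output format has been set on `afp`.
    """
    status = afp.processFile(engine, filename)
    if status < 0:
        # processFile() returns a negative value on failure
        raise RuntimeError(
            'processFile failed for %r (status %d)' % (filename, status))
    return engine.readAllOutputs()
```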

setOutputFormat(format, outDir, params)

Set output format.

Parameters:
  • format (string) – format to set
  • outDir (string) – base output directory for output files
  • params (dict) – format parameters
Returns:

True if ok, False if the format does not exist.
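Since setOutputFormat() returns False for an unknown format, a caller may want to stop before processing any file. The sketch below wraps that check; set_output_or_raise() is a hypothetical helper, and the 'csv' format name and 'Precision' parameter in the usage comment are assumptions that should be checked against the formats your Yaafe build supports.

```python
def set_output_or_raise(afp, fmt, out_dir, params=None):
    """Call afp.setOutputFormat() and raise if the format is rejected.

    Hypothetical helper, not part of yaafelib.
    """
    if not afp.setOutputFormat(fmt, out_dir, params or {}):
        raise ValueError('unknown output format: %r' % (fmt,))
    return afp


# Possible usage with Yaafe installed (format name and parameter are
# assumptions, see above):
#   afp = AudioFileProcessor()
#   set_output_or_raise(afp, 'csv', 'output', {'Precision': '8'})
#   afp.processFile(engine, audiofile)
```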