Audio Analysis Descriptors

This module performs audio analysis and outputs a variety of descriptors characterizing the sound. The simple version computes a set of commonly used descriptors, suitable for general applications. The complex version extracts a more extensive set of features, mainly used in sound analysis contexts.

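For detailed descriptions and mathematical definitions, refer to G. Peeters' 2004 paper, "A large set of audio features for sound description (similarity and classification) in the CUIDADO project".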

Settings

on analysis

Activates or deactivates the module.

in

Audio input to analyze.

on spectral

Switch that controls the activation state of the spectral descriptors.

spectral centroid

Value of the spectral centroid descriptor.
A high spectral centroid indicates a bright or high-pitched sound, while a low spectral centroid indicates a darker or lower-pitched sound.

spectral spread

Value of the spectral spread descriptor.
A high spectral spread indicates a wide band of frequencies present in the signal, while a low spectral spread indicates a concentration of frequencies around the centroid.
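As a rough illustration of the centroid and spread described above, the sketch below computes both from a list of bin frequencies and magnitudes. The function name and inputs are illustrative assumptions; the module's exact weighting (magnitude or power) and normalization may differ.

```python
import math

def centroid_and_spread(freqs, mags):
    """Return (centroid, spread) in Hz, weighting each bin by its magnitude."""
    total = sum(mags)
    if total == 0:
        return 0.0, 0.0  # silent frame: no meaningful centroid
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total
    variance = sum((f - centroid) ** 2 * m for f, m in zip(freqs, mags)) / total
    return centroid, math.sqrt(variance)

freqs = [100.0, 200.0, 300.0, 400.0]
narrow = centroid_and_spread(freqs, [0.0, 1.0, 0.0, 0.0])  # one active bin
wide = centroid_and_spread(freqs, [1.0, 1.0, 1.0, 1.0])    # energy everywhere
```

The single-bin spectrum has zero spread, while the flat spectrum spreads widely around its centroid.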

spectral flatness

Value of the spectral flatness descriptor.
Noisy signals have their spectral flatness close to 1, while tonal signals have a flatness close to 0.
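A common way to obtain such a value is the ratio of the geometric mean to the arithmetic mean of the power spectrum. The sketch below assumes that definition; the function name and the small floor used to avoid log(0) are illustrative choices, not taken from the module.

```python
import math

def spectral_flatness(power, floor=1e-12):
    """Geometric mean divided by arithmetic mean of the power spectrum."""
    clipped = [max(p, floor) for p in power]  # avoid log(0) on empty bins
    arithmetic = sum(clipped) / len(clipped)
    geometric = math.exp(sum(math.log(p) for p in clipped) / len(clipped))
    return geometric / arithmetic

noisy = spectral_flatness([1.0, 1.0, 1.0, 1.0])     # flat, noise-like spectrum
tonal = spectral_flatness([1e-6, 1e-6, 1.0, 1e-6])  # one dominant bin
```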

spectral skewness

Value of the spectral skewness descriptor.
Spectral skewness measures the asymmetry of the frequency distribution around the spectral centroid. A negative skewness indicates more energy in the frequencies above the spectral centroid, a positive skewness indicates more energy below it, and a skewness of 0 indicates a symmetric distribution.

spectral kurtosis

Value of the spectral kurtosis descriptor.
The spectral kurtosis measures the flatness of the frequency distribution. High kurtosis indicates a peaked distribution with heavy tails, while low kurtosis indicates a flatter distribution.
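Skewness and kurtosis can be sketched as the third and fourth standardized moments of the spectrum around its centroid. The code below is a minimal illustration under that assumption; the function name is hypothetical and the module may normalize differently.

```python
def skewness_and_kurtosis(freqs, mags):
    """Third and fourth standardized spectral moments around the centroid."""
    total = sum(mags)
    if total == 0:
        return 0.0, 0.0
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total

    def moment(order):
        # Magnitude-weighted central moment of the given order
        return sum((f - centroid) ** order * m for f, m in zip(freqs, mags)) / total

    variance = moment(2)
    if variance == 0:
        return 0.0, 0.0  # all energy in one bin: moments are undefined
    return moment(3) / variance ** 1.5, moment(4) / variance ** 2

symmetric = skewness_and_kurtosis([100.0, 200.0, 300.0], [1.0, 2.0, 1.0])
flat = skewness_and_kurtosis([100.0, 200.0, 300.0], [1.0, 1.0, 1.0])
low_heavy = skewness_and_kurtosis([100.0, 200.0, 300.0, 400.0], [1.0, 0.5, 0.2, 0.1])
```

The symmetric spectrum has zero skewness, the spectrum weighted toward low frequencies has positive skewness, and the flat spectrum has a lower kurtosis than the peaked one.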

spectral slope

Value of the spectral slope descriptor.
A negative spectral slope indicates a decrease in amplitudes with increasing frequency.
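One way to picture the slope is as the least-squares regression line of magnitude against frequency. The sketch below assumes that plain definition; some formulations additionally divide by the sum of magnitudes, and the function name is illustrative.

```python
def spectral_slope(freqs, mags):
    """Least-squares slope of magnitude versus frequency (per Hz)."""
    n = len(freqs)
    sum_f = sum(freqs)
    sum_a = sum(mags)
    sum_fa = sum(f * a for f, a in zip(freqs, mags))
    sum_ff = sum(f * f for f in freqs)
    denom = n * sum_ff - sum_f ** 2
    if denom == 0:
        return 0.0  # fewer than two distinct frequencies
    return (n * sum_fa - sum_f * sum_a) / denom

freqs = [100.0, 200.0, 300.0, 400.0]
falling = spectral_slope(freqs, [1.0, 0.75, 0.5, 0.25])
rising = spectral_slope(freqs, [0.25, 0.5, 0.75, 1.0])
```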

spectral rolloff

Value of the spectral rolloff descriptor.
The spectral roll-off point is the frequency below which a given percentage of the total signal energy is contained (95% is a common choice). The percentage is set by rolloff threshold.

rolloff threshold

Value controlling the energy percentage threshold for the spectral rolloff calculation.
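The effect of the threshold can be sketched by accumulating bin energies until the chosen fraction of the total is reached. The function name, the use of squared magnitudes as energy, and the threshold expressed as a fraction between 0 and 1 are assumptions made for illustration.

```python
def spectral_rolloff(freqs, mags, threshold=0.95):
    """Lowest bin frequency at which cumulative energy reaches the threshold.

    Assumes a non-empty spectrum with bins sorted by ascending frequency.
    """
    energy = [m * m for m in mags]
    target = threshold * sum(energy)
    cumulative = 0.0
    for f, e in zip(freqs, energy):
        cumulative += e
        if cumulative >= target:
            return f
    return freqs[-1]

freqs = [100.0, 200.0, 300.0, 400.0]
mags = [1.0, 1.0, 1.0, 1.0]
high = spectral_rolloff(freqs, mags, threshold=0.95)
half = spectral_rolloff(freqs, mags, threshold=0.5)
```

Lowering the threshold moves the roll-off point toward lower frequencies.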

spectral crest

Value of the spectral crest descriptor.
A high spectral crest indicates that the spectrum energy is concentrated in strong tonal components.
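A simple reading of the crest is the ratio of the largest spectral magnitude to the mean magnitude. The sketch below assumes that definition; the function name is illustrative and the module may use power instead of magnitude.

```python
def spectral_crest(mags):
    """Peak magnitude divided by mean magnitude."""
    mean = sum(mags) / len(mags)
    if mean == 0:
        return 0.0  # silent frame
    return max(mags) / mean

flat = spectral_crest([1.0, 1.0, 1.0, 1.0])    # no dominant component
peaked = spectral_crest([0.0, 0.0, 4.0, 0.0])  # one strong tonal component
```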

tonality coeff

Value of the tonality coefficient descriptor.
For tonal signals, tonality is close to 1 while for noisy signals it's close to 0.

spectral decrease

Value of the spectral decrease descriptor.
A high spectral decrease indicates a quick reduction in magnitudes as the frequency increases.

spectral entropy

Value of the spectral entropy descriptor.
Low spectral entropy indicates that the signal's energy is concentrated in a few frequencies, which may correspond to a pure or tonal sound.
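Spectral entropy is commonly computed as the Shannon entropy of the power spectrum treated as a probability distribution. The sketch below assumes that approach and normalizes by log(N) so the result lies between 0 and 1; the normalization and the function name are assumptions.

```python
import math

def spectral_entropy(power):
    """Shannon entropy of the normalized power spectrum, scaled to 0..1."""
    total = sum(power)
    n = len(power)
    if total == 0 or n < 2:
        return 0.0
    probs = [p / total for p in power]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(n)

pure = spectral_entropy([0.0, 0.0, 1.0, 0.0])   # all energy in one bin
noise = spectral_entropy([1.0, 1.0, 1.0, 1.0])  # energy spread evenly
```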

peaks amplitudes

Array containing the amplitudes of the prominent frequencies in the sound.

peaks frequencies

Array containing the prominent frequencies in the sound.

on harmonic

Switch that controls the activation state of the harmonic descriptors.

MFCC

Array of the 13 MFCCs (Mel-frequency cepstral coefficients). MFCCs represent the short-term power spectrum of a sound in a compact form. The first few coefficients represent the overall spectral envelope, while higher coefficients represent finer details.

estimated frequency

Provides an estimation of the fundamental frequency.
Best suited for monophonic signals. Using a larger FFT size improves the resolution and enables detection of lower frequencies.
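The link between FFT size and resolution follows from the width of one FFT bin, which is the sample rate divided by the FFT size. The sketch below lists the bin width for each available size, assuming a 48 kHz sample rate (the actual rate depends on the audio setup).

```python
FFT_SIZES = [1024, 2048, 4096, 8192]

def bin_width_hz(sample_rate, fft_size):
    """Frequency spacing between adjacent FFT bins, in Hz."""
    return sample_rate / fft_size

for size in FFT_SIZES:
    # Sample rate of 48000 Hz is an assumption for this example
    print(size, bin_width_hz(48000, size))
```

At 48 kHz, the bin width shrinks from 46.875 Hz at 1024 samples to about 5.86 Hz at 8192 samples, which is why larger sizes resolve low fundamentals better.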

frequency score

Indicates the confidence level (probability) associated with the estimated fundamental frequency. Higher scores reflect a more reliable estimation.

inharmonicity

Value of the inharmonicity factor. Inharmonic signals have an inharmonicity close to 1, which means that the spectral peaks deviate from integer multiples of the fundamental frequency.
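One published formulation measures the energy-weighted deviation of each peak from its ideal harmonic position, scaled by 2/f0. The sketch below assumes that formulation and, as a simplification, that the n-th peak in the list corresponds to the n-th harmonic; the function name is hypothetical.

```python
def inharmonicity(f0, peak_freqs, peak_amps):
    """Energy-weighted deviation of peaks from integer multiples of f0."""
    energy = sum(a * a for a in peak_amps)
    if f0 <= 0 or energy == 0:
        return 0.0
    deviation = sum(
        abs(f - (h + 1) * f0) * a * a  # peak h is compared with harmonic h + 1
        for h, (f, a) in enumerate(zip(peak_freqs, peak_amps))
    )
    return (2.0 / f0) * deviation / energy

harmonic = inharmonicity(100.0, [100.0, 200.0, 300.0], [1.0, 1.0, 1.0])
stretched = inharmonicity(100.0, [100.0, 210.0, 330.0], [1.0, 1.0, 1.0])
```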

odd to even ratio

Value of the odd to even ratio. A positive ratio means that the amplitudes of the odd harmonics are dominant. A null ratio indicates that energy is evenly distributed between odd and even harmonics.

tristimulus

Array of the 3 tristimulus coefficients. The tristimulus values are three energy ratios that give a fine description of how energy is distributed among the first harmonics of the spectrum.
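In the usual definition, the three ratios cover the fundamental, harmonics 2 to 4, and all remaining harmonics, each divided by the sum of all harmonic amplitudes. The sketch below assumes that definition; the function name and input layout are illustrative.

```python
def tristimulus(harmonic_amps):
    """Return (T1, T2, T3) from harmonic amplitudes, fundamental first."""
    total = sum(harmonic_amps)
    if total == 0:
        return 0.0, 0.0, 0.0
    t1 = harmonic_amps[0] / total        # fundamental
    t2 = sum(harmonic_amps[1:4]) / total  # harmonics 2, 3 and 4
    t3 = sum(harmonic_amps[4:]) / total   # harmonics 5 and above
    return t1, t2, t3

t1, t2, t3 = tristimulus([1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
```

With this definition the three coefficients always sum to 1 for a non-silent input.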

on temporal

Switch that controls the activation state of the temporal descriptors.

RMS

Value of the RMS of the signal.
The Root Mean Square (RMS) value is a measure of the average level of the signal. It is expressed in dB.
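As an illustration, the RMS of a block of samples can be converted to decibels with 20·log10. The sketch below assumes samples normalized to the range -1..1 and a reference level of 1.0 (dB relative to full scale); the module's reference level is not specified here.

```python
import math

def rms_db(samples):
    """RMS level of a block of samples in dB, relative to an amplitude of 1.0."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")  # digital silence
    return 20.0 * math.log10(rms)

full_scale = rms_db([1.0, -1.0, 1.0, -1.0])
half_scale = rms_db([0.5, -0.5, 0.5, -0.5])
```

Halving the amplitude lowers the level by about 6 dB.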

on perceptual

Switch that controls the activation state of the perceptual descriptors.

total loudness

Value of the total loudness descriptor. The total loudness is the sum of the loudness values of all bands.

loudness per bands

Values of the loudness for each frequency band, using Bark frequency bands.
It is a measure of the intensity of the sound based on a model of the human auditory system.
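To give a feel for the Bark scale, the sketch below converts a frequency in Hz to a Bark value with the Zwicker and Terhardt approximation. Whether the module uses this approximation or fixed band edges is not stated, so treat the formula and the function name as assumptions.

```python
import math

def hz_to_bark(freq_hz):
    """Approximate Bark value for a frequency in Hz (Zwicker and Terhardt)."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))

low = hz_to_bark(100.0)
mid = hz_to_bark(1000.0)
high = hz_to_bark(15500.0)
```

The scale is compressed at high frequencies: 1000 Hz already sits near Bark 8.5, while 15500 Hz is only around Bark 24.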

sharpness

Value of the current sharpness. The sharpness descriptor characterizes the brightness of a sound using a model of the human ear. A sharpness value close to 0 indicates that most of the spectrum energy is in the lower frequencies.

fft size

Sets the size of the FFT, in samples.
Available sizes: 1024, 2048, 4096, 8192.

overlap

Changes the processing rate. Higher rates increase precision and CPU consumption.
Available rates: 1, 2, 4.
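The trade-off can be sketched numerically under one assumption: that a rate of N means a new analysis is computed every fft size / N samples. This interpretation, the 48 kHz sample rate, and the function name are all assumptions, not confirmed behavior of the module.

```python
def analysis_interval_ms(fft_size, overlap, sample_rate=48000):
    """Time between successive analyses, assuming a hop of fft_size / overlap."""
    hop = fft_size // overlap  # hop size in samples (assumed relationship)
    return 1000.0 * hop / sample_rate

intervals = {rate: analysis_interval_ms(2048, rate) for rate in (1, 2, 4)}
```

Under these assumptions, doubling the rate halves the time between outputs and doubles the number of FFTs computed per second.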

latency

Indicates the overall latency of the system in milliseconds between successive calculation outputs.

Common Settings

info

show manual

Opens the web browser to display information or help about the selected object, if it exists.

For more details about information/help creation, see create-help-file.

description

Description of the module for internal help purposes only. The description is not displayed in the interface.

IDs

Visible only in god mode; see setup-panel-tab-expert.

unique ID

Current private ID for this control used to identify the object.

preset ID

Current private preset ID for this control used for presets.

recreate ID

If you experience difficulties in Polyphonic mode, try recreating the ID(s) with this button.

repair IDs

Each Patch shared on the local network uses its own ID (identification number). If Patches do not send information to the correct target, this button rebuilds all of these IDs.

Object Remote Address

absolute

Absolute remote address. See objects-address.

local

Remote address local to the current patch. See objects-address.

user addr

User-defined remote address. See objects-address.

version 7.0.250121