This module performs audio analysis and outputs a variety of descriptors characterizing the sound. The simple version computes a set of commonly used descriptors, suitable for general applications. The complex version extracts a more extensive set of features, mainly used in sound analysis contexts.
For detailed descriptions and mathematical definitions, refer to G. Peeters' 2004 paper.
Allows the activation/deactivation of the module.
Audio input to analyze.
Switch that controls the activation state of the spectral descriptors.
Value of the spectral centroid descriptor.
A high spectral centroid indicates a bright or high-pitched sound, while a low spectral centroid indicates a darker or lower-pitched sound.
Value of the spectral spread descriptor.
A high spectral spread indicates a wide band of frequencies present in the signal, while a low spectral spread indicates a concentration of frequencies around the centroid.
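As a rough illustration of the centroid and spread described above, the following sketch treats the magnitude spectrum as a distribution over frequency. The function name and its inputs (`freqs`, `mags`) are hypothetical, and the spread is taken here as the standard deviation around the centroid, which may differ from what the module computes internally.

```python
import math


def spectral_centroid_and_spread(freqs, mags):
    """Return (centroid, spread) of a magnitude spectrum.

    freqs: bin center frequencies in Hz (hypothetical input).
    mags:  non-negative magnitudes, one per bin (hypothetical input).
    """
    total = sum(mags)
    if total == 0:
        return 0.0, 0.0  # silent frame: nothing to describe
    # Centroid: magnitude-weighted mean frequency.
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total
    # Spread: magnitude-weighted standard deviation around the centroid
    # (assumption: some definitions report the variance instead).
    variance = sum(((f - centroid) ** 2) * m for f, m in zip(freqs, mags)) / total
    return centroid, math.sqrt(variance)
```

A spectrum with a single non-zero bin yields a spread of 0, while energy spread over many bins yields a larger value.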
Value of the spectral flatness descriptor.
Noisy signals have their spectral flatness close to 1, while tonal signals have a flatness close to 0.
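One common way to obtain a flatness value with this behavior is the ratio of the geometric mean to the arithmetic mean of the spectrum. The sketch below assumes that definition; the function name, the `eps` guard, and the use of a single full-band spectrum are illustrative choices, not a description of the module's implementation.

```python
import math


def spectral_flatness(mags, eps=1e-12):
    """Geometric mean divided by arithmetic mean of the magnitudes.

    eps (assumption) avoids log(0) on empty bins.
    """
    n = len(mags)
    if n == 0:
        return 0.0
    # Geometric mean computed in the log domain for numerical stability.
    log_mean = sum(math.log(m + eps) for m in mags) / n
    geometric = math.exp(log_mean)
    arithmetic = sum(mags) / n
    return geometric / (arithmetic + eps)
```

A perfectly flat spectrum gives a value close to 1, and a spectrum with a single non-zero bin gives a value close to 0.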
Value of the spectral skewness descriptor.
Spectral skewness measures the asymmetry of the frequency distribution around the spectral centroid. A negative skewness indicates more energy at frequencies above the spectral centroid, while a positive skewness indicates more energy at frequencies below it.
Value of the spectral kurtosis descriptor.
The spectral kurtosis measures the flatness of the frequency distribution. High kurtosis indicates a peaked distribution with heavy tails, while low kurtosis indicates a flatter distribution.
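Skewness and kurtosis can be read as the third and fourth standardized moments of the spectrum seen as a distribution. The following sketch assumes that reading; the function name and inputs are hypothetical.

```python
def spectral_skewness_and_kurtosis(freqs, mags):
    """Return (skewness, kurtosis) of a magnitude spectrum."""
    total = sum(mags)
    if total == 0:
        return 0.0, 0.0
    centroid = sum(f * m for f, m in zip(freqs, mags)) / total

    def central_moment(order):
        # Magnitude-weighted central moment of the given order.
        return sum(((f - centroid) ** order) * m
                   for f, m in zip(freqs, mags)) / total

    variance = central_moment(2)
    if variance == 0:
        return 0.0, 0.0  # all energy in one bin: moments are undefined
    skewness = central_moment(3) / variance ** 1.5
    kurtosis = central_moment(4) / variance ** 2
    return skewness, kurtosis
```

With this convention, a spectrum whose energy sits mostly in the upper bins has a long tail toward low frequencies and therefore a negative skewness.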
Value of the spectral slope descriptor.
A negative spectral slope indicates a decrease in amplitudes with increasing frequency.
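The slope can be illustrated as the coefficient of a least-squares line fitted to magnitude against frequency. This is a sketch under that assumption; the function name is hypothetical and no amplitude normalization is applied, so absolute values may not match the module's output.

```python
def spectral_slope(freqs, mags):
    """Least-squares slope of magnitude as a function of frequency."""
    n = len(freqs)
    if n < 2:
        return 0.0
    mean_f = sum(freqs) / n
    mean_m = sum(mags) / n
    covariance = sum((f - mean_f) * (m - mean_m) for f, m in zip(freqs, mags))
    variance = sum((f - mean_f) ** 2 for f in freqs)
    if variance == 0:
        return 0.0
    return covariance / variance
```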
Value of the spectral rolloff descriptor.
The spectral roll-off point is the frequency below which a given percentage of the total signal energy is contained, 95% being the commonly used value.
Value controlling the energy percentage threshold for the spectral rolloff calculation.
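The roll-off and its threshold can be sketched as a cumulative sum over the spectrum. In the example below, the function name is hypothetical, `threshold` stands in for the adjustable energy percentage (expressed as a fraction), and energy is assumed to be the squared magnitude of each bin.

```python
def spectral_rolloff(freqs, mags, threshold=0.95):
    """Frequency below which `threshold` of the total energy lies."""
    energies = [m * m for m in mags]  # assumption: energy = magnitude squared
    total = sum(energies)
    if total == 0:
        return 0.0
    target = threshold * total
    cumulative = 0.0
    for f, e in zip(freqs, energies):
        cumulative += e
        if cumulative >= target:
            return f
    return freqs[-1]
```

Lowering the threshold moves the roll-off point toward lower frequencies.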
Value of the spectral crest descriptor.
A high spectral crest indicates that the spectrum energy is concentrated in strong tonal components.
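A crest value with this behavior can be obtained as the ratio of the largest magnitude to the mean magnitude. The sketch below assumes that definition; the function name is hypothetical.

```python
def spectral_crest(mags):
    """Maximum magnitude divided by the mean magnitude."""
    if not mags:
        return 0.0
    mean = sum(mags) / len(mags)
    if mean == 0:
        return 0.0
    return max(mags) / mean
```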
Value of the tonality coefficient descriptor.
For tonal signals, tonality is close to 1 while for noisy signals it's close to 0.
Value of the spectral decrease descriptor.
A high spectral decrease indicates a quick reduction in magnitudes as the frequency increases.
Value of the spectral entropy descriptor.
Low spectral entropy indicates that the signal's energy is concentrated in a few frequencies, which may correspond to a pure or tonal sound.
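Spectral entropy can be illustrated as the Shannon entropy of the normalized spectrum. The sketch below additionally divides by the maximum possible entropy so that the result lies between 0 and 1; that normalization and the function name are assumptions made for the example.

```python
import math


def spectral_entropy(mags):
    """Normalized Shannon entropy of a magnitude spectrum (0 to 1)."""
    n = len(mags)
    total = sum(mags)
    if n < 2 or total == 0:
        return 0.0
    entropy = 0.0
    for m in mags:
        p = m / total  # share of the spectrum held by this bin
        if p > 0:
            entropy -= p * math.log(p)
    return entropy / math.log(n)  # assumption: normalize by log(bin count)
```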
Array containing the amplitudes of the prominent frequencies in the sound.
Array containing the prominent frequencies in the sound.
Activate the calculation of the harmonic descriptors.
Array of the 13 MFCC coefficients. MFCCs represent the short-term power spectrum of a sound in a compact form. The first few coefficients represent the overall spectral envelope, while higher coefficients represent finer details.
Provides an estimation of the fundamental frequency.
Best suited for monophonic signals. Using a larger FFT size improves the resolution and enables detection of lower frequencies.
Indicates the confidence level (probability) associated with the estimated fundamental frequency. Higher scores reflect a more reliable estimation.
Value of the inharmonicity factor. Inharmonic signals have an inharmonicity close to 1, meaning that the spectral peaks deviate from integer multiples of the fundamental frequency.
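One formulation of inharmonicity weights each peak's deviation from its ideal harmonic position by the peak's energy. The sketch below follows that idea; the function name is hypothetical, and it assumes that the k-th detected peak corresponds to the k-th harmonic, which a real analysis would have to establish by peak matching.

```python
def inharmonicity(peak_freqs, peak_amps, f0):
    """Energy-weighted deviation of peaks from integer multiples of f0."""
    energy = sum(a * a for a in peak_amps)
    if f0 <= 0 or energy == 0:
        return 0.0
    deviation = sum(
        abs(f - h * f0) * a * a
        # Assumption: peaks are ordered so that index h is the harmonic rank.
        for h, (f, a) in enumerate(zip(peak_freqs, peak_amps), start=1)
    )
    return (2.0 / f0) * deviation / energy
```

A perfectly harmonic set of peaks returns 0, and the value grows as peaks drift away from the harmonic series.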
Value of the odd-to-even ratio. A positive ratio means that the amplitudes of the odd harmonics are dominant. A null ratio indicates an even distribution across harmonics.
Array of the 3 tristimulus coefficients. The tristimulus values are three energy ratios that give a fine description of how energy is distributed among the first harmonics of the spectrum.
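In a common formulation, the three tristimulus values are the share of the first harmonic, the share of harmonics 2 to 4, and the share of all remaining harmonics. The sketch below assumes that grouping and works on harmonic amplitudes; the function name and input are hypothetical.

```python
def tristimulus(harmonic_amps):
    """Return (t1, t2, t3) from harmonic amplitudes ordered by rank."""
    total = sum(harmonic_amps)
    if total == 0:
        return 0.0, 0.0, 0.0
    t1 = sum(harmonic_amps[:1]) / total   # harmonic 1
    t2 = sum(harmonic_amps[1:4]) / total  # harmonics 2 to 4
    t3 = sum(harmonic_amps[4:]) / total   # harmonics 5 and above
    return t1, t2, t3
```

The three values always sum to 1 for a non-silent input, so they describe proportions rather than absolute levels.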
Switch that controls the activation state of the temporal descriptors.
Value of the RMS of the signal.
The Root Mean Square (RMS) value is a measure of the average loudness of the signal. It is measured in dB.
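As an illustration, the RMS level of a block of samples can be computed and converted to dB as follows. The function name, the full-scale reference of 1.0, and the `floor_db` value returned for silence are assumptions made for this sketch.

```python
import math


def rms_db(samples, floor_db=-120.0):
    """RMS level in dB relative to a full-scale amplitude of 1.0."""
    if not samples:
        return floor_db
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    if rms <= 0:
        return floor_db  # silence: avoid log10(0)
    return max(20.0 * math.log10(rms), floor_db)
```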
Switch that controls the activation state of the perceptual descriptors.
Value of the total loudness descriptor. The total loudness is the sum of the loudness values of all bands.
Values of the loudness for each frequency band, using the Bark frequency bands.
Loudness is a measure of the intensity of the sound based on a model of the human auditory system.
Value of the current sharpness. The sharpness descriptor characterizes the brightness of a sound using a model of the human ear. A sharpness value close to 0 indicates that most of the spectrum energy is in the lower frequencies.
Change the size in samples of the FFT.
Available sizes: 1024, 2048, 4096, 8192.
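The trade-off behind the FFT size can be made concrete: the width of one frequency bin is the sample rate divided by the FFT size, and the analysis window lasts the FFT size divided by the sample rate. The sketch below prints both for the available sizes, assuming a sample rate of 48000 Hz; the function name and the sample rate are illustrative only.

```python
def fft_resolution(sample_rate, fft_size):
    """Return (bin width in Hz, window duration in ms) for an FFT size."""
    bin_width_hz = sample_rate / fft_size
    window_ms = 1000.0 * fft_size / sample_rate
    return bin_width_hz, window_ms


# Assumption: 48000 Hz sample rate, chosen only for illustration.
for size in (1024, 2048, 4096, 8192):
    bin_hz, win_ms = fft_resolution(48000, size)
    print(f"{size}: {bin_hz:.2f} Hz per bin, {win_ms:.1f} ms window")
```

Larger sizes give narrower bins, which helps resolve low frequencies, at the cost of a longer analysis window.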
Change the processing rate. Higher rates increase precision and CPU consumption.
Available rates: 1, 2, 4.
Indicates the overall latency of the system in milliseconds between successive calculation outputs.
Opens the web browser to display information or help about the selected object, if it exists.
For more details about information/help creation, see create-help-file.
Description of the module for internal help purposes only. The description is not displayed in the interface.
Visible only in god mode, see setup-panel-tab-expert.
Current private ID for this control used to identify the object.
Current private preset ID for this control used for presets.
If you experience difficulties in Polyphonic mode, try to recreate the ID(s) with this button.
Each Patch shared on the local network uses its own ID (identification number). If Patches do not send information to the correct target, this button will rebuild all these IDs.
Absolute remote address. See objects-address.
Remote address local to the current patch. See objects-address.
User-defined remote address. See objects-address.
version 7.0.250121