You can prototype audio processing algorithms in real time, or run custom acoustic measurements, by streaming low-latency audio to and from sound cards. To validate an algorithm, you can turn it into an audio plugin that runs in external host applications such as digital audio workstations (DAWs); conversely, plugin hosting lets you use external audio plugins as regular MATLAB objects.
Use deep learning to carry out complex signal processing tasks and extract audio embeddings with a single line of code. Access established pre-trained networks like YAMNet, VGGish, CREPE, and OpenL3 and apply them with the help of preconfigured feature extraction functions.
Set up randomized data augmentation pipelines using combinations of pitch shifting, time stretching, and other audio processing effects. Create synthetic speech recordings from text using cloud-based text-to-speech services.
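To make the idea of a randomized augmentation pipeline concrete, here is a minimal, library-agnostic sketch in Python/NumPy. The effect implementations are deliberately naive (the time stretch is plain resampling, which also shifts pitch; production pipelines would use a phase vocoder or a dedicated audio library), and all parameter ranges are illustrative assumptions, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(x, snr_db):
    # Add white noise at a target signal-to-noise ratio (in dB).
    sig_power = np.mean(x ** 2)
    noise_power = sig_power / (10 ** (snr_db / 10))
    return x + rng.normal(0.0, np.sqrt(noise_power), x.shape)

def time_stretch(x, rate):
    # Naive stretch by linear-interpolation resampling; rate > 1 shortens.
    # Note: this also shifts pitch, unlike a phase-vocoder stretch.
    n_out = int(len(x) / rate)
    return np.interp(np.linspace(0, len(x) - 1, n_out), np.arange(len(x)), x)

def random_gain(x, low_db=-6.0, high_db=6.0):
    # Scale amplitude by a random gain drawn in dB.
    return x * 10 ** (rng.uniform(low_db, high_db) / 20)

def augment(x):
    # Apply each effect independently with 50% probability,
    # with randomized parameters, to produce a new training variant.
    if rng.random() < 0.5:
        x = add_noise(x, snr_db=rng.uniform(10, 30))
    if rng.random() < 0.5:
        x = time_stretch(x, rate=rng.uniform(0.8, 1.2))
    if rng.random() < 0.5:
        x = random_gain(x)
    return x

# Demo input: a 1 s, 440 Hz tone at 16 kHz standing in for a recording.
tone = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
variants = [augment(tone) for _ in range(4)]
```

Each call to `augment` draws a fresh combination of effects, so repeated calls on the same clip yield distinct training examples.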
Automatically create user interfaces for tunable parameters of audio processing algorithms. Test individual algorithms with the Audio Test Bench app and tune parameters in running programs with auto-generated interactive controls.
Prototype audio processing designs with single-sample inputs and outputs for adaptive noise control, hearing aid validation, or other applications requiring minimum round-trip DSP latency. Automatically target Speedgoat audio machines and ST Discovery boards directly from Simulink models.
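To illustrate the kind of single-sample-in, single-sample-out algorithm this workflow targets, below is a minimal LMS adaptive noise canceller sketched in Python/NumPy. The scenario (a synthetic tone as the desired signal, with a filtered copy of white noise leaking into the primary channel) and all parameter values are invented for the example.

```python
import numpy as np

def lms_cancel(primary, reference, n_taps=16, mu=0.01):
    """Sample-by-sample LMS adaptive noise canceller.

    primary:   desired signal plus correlated noise (e.g., error microphone)
    reference: noise pickup (e.g., reference microphone)
    Returns the error signal, i.e., the noise-reduced output.
    """
    w = np.zeros(n_taps)            # adaptive filter weights
    buf = np.zeros(n_taps)          # recent reference samples, newest first
    out = np.empty(len(primary))
    for n in range(len(primary)):
        buf = np.roll(buf, 1)       # shift in the newest reference sample
        buf[0] = reference[n]
        y = w @ buf                 # current estimate of the noise component
        e = primary[n] - y          # error = cleaned output sample
        w += mu * e * buf           # LMS weight update
        out[n] = e
    return out

rng = np.random.default_rng(1)
noise = rng.normal(size=8000)
signal = np.sin(2 * np.pi * 5 * np.arange(8000) / 8000)
# Primary channel picks up the signal plus a short-FIR-filtered noise copy.
primary = signal + np.convolve(noise, [0.6, 0.3], mode="same")
cleaned = lms_cancel(primary, noise)
```

Because the update happens once per incoming sample, the loop body is exactly the unit of work that must fit inside one sample period in a minimum-latency deployment.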
Automatic respiratory condition screening system development pipeline. A typical system starts with audio data collection, followed by data pre-processing. Either hand-crafted features paired with traditional machine learning classifiers or end-to-end deep learning models can then be constructed. Before deployment to the public, the performance of the developed model needs to be validated on real-world clinical data. (A color version of this figure is available in the online journal.)
Frequently explored respiratory acoustic features include temporal features such as onset, tempo, period, zero-crossing rate (ZCR), and beat loudness, as well as spectral features such as HNR, jitter, shimmer, Mel-frequency cepstral coefficients (MFCCs), spectral centroid, and roll-off frequency.44,45 Many existing libraries can be leveraged to extract these features from raw signals automatically, among which Librosa is a well-known Python-based tool.46 However, the differences in audio signals associated with different respiratory conditions can be complex, subtle, and implicit, so the above-mentioned features alone may be insufficient to distinguish various conditions. To this end, a number of statistical functionals have been proposed to extract large sets of high-order descriptors, such as the mean, delta, peak, and percentiles of these features across all frames of an audio recording, showing favorable performance in many related tasks.36 openSMILE,47 MIRToolbox,48 and other open-source tools support such feature-set extraction, speeding up the processing procedure.
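The two-stage recipe described above, per-frame low-level descriptors followed by statistical functionals, can be sketched in a few lines of NumPy. This is a simplified illustration rather than the openSMILE or Librosa implementation: only two of the listed features (zero-crossing rate and spectral centroid) and a handful of functionals are shown, and the test signal is a synthetic tone standing in for a breath recording.

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    # Slice the signal into overlapping frames (rows of the result).
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def zero_crossing_rate(frames):
    # Fraction of adjacent-sample sign changes within each frame.
    signs = np.sign(frames)
    return np.mean(np.abs(np.diff(signs, axis=1)) > 0, axis=1)

def spectral_centroid(frames, sr):
    # Magnitude-weighted mean frequency of each windowed frame.
    mag = np.abs(np.fft.rfft(frames * np.hanning(frames.shape[1]), axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    return (mag @ freqs) / (mag.sum(axis=1) + 1e-10)

def functionals(track):
    # Collapse a per-frame feature track into high-order descriptors.
    delta = np.diff(track)
    return {
        "mean": float(np.mean(track)),
        "delta_mean": float(np.mean(delta)),
        "peak": float(np.max(track)),
        "p25": float(np.percentile(track, 25)),
        "p75": float(np.percentile(track, 75)),
    }

sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440 * t)       # stand-in for a recorded breath sound
frames = frame_signal(x)
zcr = zero_crossing_rate(frames)
sc = spectral_centroid(frames, sr)
stats = {**{f"zcr_{k}": v for k, v in functionals(zcr).items()},
         **{f"sc_{k}": v for k, v in functionals(sc).items()}}
```

The resulting fixed-length `stats` vector is the kind of clip-level representation that can be fed to a traditional classifier, regardless of the clip's duration.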