Awesome Python Scientific Audio – Massive Collection of Resources
Contents
- Audio Related Packages
- Read/Write
- Transformations – General DSP
- Feature extraction
- Data augmentation
- Speech Processing
- Environmental Sounds
- Perceptual Models – Auditory Models
- Source Separation
- Music Information Retrieval
- Deep Learning
- Symbolic Music – MIDI – Musicology
- Realtime applications
- Web Audio
- Audio related APIs and Datasets
- Wrappers for Audio Plugins
- Tutorials
- Books
- Scientific Papers
- Other Resources
- Related lists
- Contributing
- License
Audio Related Packages
- Total number of packages: 66
Read-Write
- audiolazy
📦 – Expressive Digital Signal Processing (DSP) package for Python.
- audioread
📦 – Cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding.
- mutagen
📦 – Reads and writes all kinds of audio metadata for various formats.
- pyAV
– PyAV is a Pythonic binding for FFmpeg or Libav.
- (Py)Soundfile
📦 – Library based on libsndfile, CFFI, and NumPy.
- pySox
📦 – Wrapper for sox.
- stempeg
📦 – read/write of STEMS multistream audio.
- tinytag
📦 – reading music metadata of MP3, OGG, FLAC and Wave files.
- audiomate
📦 – Loading different types of audio datasets.
Transformations – General DSP
- acoustics
📦 – useful tools for acousticians.
- AudioTK
– DSP filter toolbox (lots of filters).
- AudioTSM
📦 – real-time audio time-scale modification procedures.
- Gammatone
– Gammatone filterbank implementation.
- pyFFTW
📦 – Wrapper for FFTW(3).
- NSGT
📦 – Non-stationary Gabor transform, constant-Q.
- MDCT
📦 – Modified discrete cosine transform (MDCT).
- pydub
📦 – Manipulate audio with a simple and easy high level interface.
- pytftb
– Implementation of the MATLAB Time-Frequency Toolbox.
- pyroomacoustics
📦 – Room Acoustics Simulation (RIR generator)
- PyRubberband
📦 – Wrapper for rubberband to do pitch-shifting and time-stretching.
- PyWavelets
📦 – Discrete Wavelet Transform in Python.
- Resampy
📦 – Sample rate conversion.
- SFS-Python
📦 – Sound Field Synthesis Toolbox.
- STFT
📦 – Standalone package for Short-Time Fourier Transform.
Feature extraction
- aubio
📦 – Feature extractor, written in C, Python interface.
- audiolazy
📦 – Realtime Audio Processing lib, general purpose.
- essentia
– Music related low level and high level feature extractor, C++ based, includes Python bindings.
- python_speech_features
📦 – Common speech features for ASR.
- pyYAAFE
– Python bindings for YAAFE feature extractor.
- speechpy
📦 – Library for Speech Processing and Recognition, mostly feature extraction for now.
Data augmentation
- audiomentations
📦 – Audio Data Augmentation.
- muda
📦 – Musical Data Augmentation.
- pydiogment
📦 – Audio Data Augmentation.
Speech Processing
- aeneas
📦 – Forced aligner, based on MFCC+DTW, 35+ languages.
- deepspeech
📦 – Pretrained automatic speech recognition.
- gentle
– Forced-aligner built on Kaldi.
- Parselmouth
📦 – Python interface to the Praat phonetics and speech analysis, synthesis, and manipulation software.
- persephone
📦 – Automatic phoneme transcription tool.
- pyannote.audio
📦 – Neural building blocks for speaker diarization.
- pyAudioAnalysis
📦 – Feature Extraction, Classification, Diarization.
- py-webrtcvad
📦 – Interface to the WebRTC Voice Activity Detector.
- pypesq
– Wrapper for the PESQ score calculation.
- pystoi
📦 – Short Term Objective Intelligibility measure (STOI).
- PyWorldVocoder
– Wrapper for Morise’s World Vocoder.
- Montreal Forced Aligner
– Forced aligner, based on Kaldi (HMM), English (others can be trained).
- SIDEKIT 📦 – Speaker and Language recognition.
- SpeechRecognition
📦 – Wrapper for several ASR engines and APIs, online and offline.
Environmental Sounds
Perceptual Models – Auditory Models
- cochlea
📦 – Inner ear models.
- Brian2
📦 – Spiking neural networks simulator, includes cochlea model.
- Loudness
– Perceived loudness, includes Zwicker and Moore/Glasberg models.
- pyloudnorm
– Audio loudness meter and normalization, implements ITU-R BS.1770-4.
- Sound Field Synthesis Toolbox
📦 – Sound Field Synthesis Toolbox.
Source Separation
- commonfate
📦 – Common Fate Model and Transform.
- NTFLib
– Sparse Beta-Divergence Tensor Factorization.
- NUSSL
📦 – Holistic source separation framework including DSP methods and deep learning methods.
- NIMFA
📦 – Several flavors of non-negative-matrix factorization.
Music Information Retrieval
- Catchy
– Corpus Analysis Tools for Computational Hook Discovery.
- Madmom
📦 – MIR package with a strong focus on beat detection, onset detection and chord recognition.
- mir_eval
📦 – Common scores for various MIR tasks. Also includes bss_eval implementation.
- msaf
📦 – Music Structure Analysis Framework.
- librosa
📦 – General audio and music analysis.
Deep Learning
- Kapre
📦 – Keras Audio Preprocessors
- TorchAudio
– PyTorch Audio Loaders
Symbolic Music – MIDI – Musicology
- Music21
📦 – Toolkit for Computer-Aided Musicology.
- Mido
📦 – Realtime MIDI wrapper.
- mingus
📦 – Advanced music theory and notation package with MIDI file and playback support.
- Pretty-MIDI
📦 – Utility functions for handling MIDI data in a nice/intuitive way.
Realtime applications
- Jupylet
– Subtractive, additive, FM, and sample-based sound synthesis.
- PYO
– Realtime audio DSP engine.
- python-sounddevice
📦 – PortAudio wrapper providing realtime audio I/O with NumPy.
Web Audio
- TimeSide (Beta)
– high level audio analysis, imaging, transcoding, streaming and labelling.
Audio related APIs and Datasets
- beets
📦 – Music library manager and MusicBrainz tagger.
- dsdtools
📦 – Parse and process the demixing secrets dataset.
- medleydb
– Parse medleydb audio + annotations.
- Soundcloud API
📦 – Wrapper for Soundcloud API.
- Youtube-Downloader
📦 – Download YouTube videos (and the audio).
Wrappers for Audio Plugins
- VamPy Host 📦 – Interface to compiled Vamp plugins.
Tutorials
- Whirlwind Tour Of Python
– fast-paced introduction to Python essentials, aimed at researchers and developers.
- Introduction to Numpy and Scipy
– Highly recommended tutorial, covers large parts of the scientific Python ecosystem.
- Numpy for MATLAB® Users – Short overview of equivalent Python functions for those switching from MATLAB.
- MIR Notebooks
– collection of instructional IPython notebooks for music information retrieval (MIR).
- Selected Topics in Audio Signal Processing – Exercises as IPython notebooks.
Books
- Python Data Science Handbook – Jake VanderPlas, excellent book with accompanying tutorial notebooks.
- Fundamentals of Music Processing – Meinard Müller, comes with Python exercises.
Scientific Papers
- Python for audio signal processing – John C. Glover, Victor Lazzarini and Joseph Timoney, Linux Audio Conference 2011.
- librosa: Audio and Music Signal Analysis in Python, Video – Brian McFee, Colin Raffel, Dawen Liang, Daniel P.W. Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto, Scipy 2015.
- pyannote.audio: neural building blocks for speaker diarization, Video – Hervé Bredin, Ruiqing Yin, Juan Manuel Coria, Gregory Gelly, Pavel Korshunov, Marvin Lavechin, Diego Fustes, Hadrien Titeux, Wassim Bouaziz, Marie-Philippe Gill, ICASSP 2020.
Other Resources
- Coursera Course – Audio Signal Processing, Python based course from UPF of Barcelona and Stanford University.
- Digital Signal Processing Course – Masters Course Material (University of Rostock) with many Python examples.
- Slack Channel – Music Information Retrieval Community.
Related lists
Awesome Python Resources List
Awesome Python Asyncio Resources List