
Singing Voice Quality and Technique Database (SVQTD) is a dataset of classical male singing that describes classical tenor voices from a vocal pedagogy point of view.

Data request instructions are on the project page.

Dataset preparation

  1. Download the YouTube videos with a Python script and convert them to audio using ffmpeg.
  2. Perform music source separation with Spleeter.
  3. Apply energy-based segmentation; reference code can be found in ./split.py.
  4. Extract the feature set using openSMILE (optional, only needed if you are interested in training with a traditional feature set).
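As an illustration of step 3, the following is a minimal sketch of energy-based segmentation, not the code in ./split.py. It assumes a mono signal held as a list of floats in [-1, 1]. The function names, the frame and hop sizes, the RMS threshold, and the minimum segment duration are all hypothetical choices made for this example.

```python
import math


def frame_rms(samples, frame_len, hop_len):
    """Return the RMS energy of each frame of a mono signal."""
    energies = []
    last_start = max(len(samples) - frame_len + 1, 1)
    for start in range(0, last_start, hop_len):
        frame = samples[start:start + frame_len]
        if not frame:
            break
        energies.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return energies


def energy_segments(samples, sample_rate, frame_ms=25, hop_ms=10,
                    threshold=0.02, min_dur_s=0.5):
    """Return (start_s, end_s) pairs for runs of frames above the threshold.

    All default values are illustrative assumptions, not the settings
    used to build SVQTD.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    hop_len = max(1, int(sample_rate * hop_ms / 1000))
    energies = frame_rms(samples, frame_len, hop_len)

    segments = []
    start_frame = None
    # A trailing 0.0 acts as a sentinel so an open segment is always closed.
    for i, energy in enumerate(energies + [0.0]):
        if energy >= threshold and start_frame is None:
            start_frame = i
        elif energy < threshold and start_frame is not None:
            start_s = start_frame * hop_len / sample_rate
            end_s = ((i - 1) * hop_len + frame_len) / sample_rate
            if end_s - start_s >= min_dur_s:  # drop very short bursts
                segments.append((start_s, end_s))
            start_frame = None
    return segments
```

Calling `energy_segments(samples, sample_rate)` on a separated vocal track returns the time spans that could then be cut into clips. In practice the threshold usually needs tuning per recording.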

Training files

  • Pooling methods for the recognition neural networks can be found in ./modules.
  • Model definitions are in ./models.
  • Config files for training the Transformer and ResNet models are in ./config.
  • ./E2E.py trains neural networks from the config files.
  • ./RPSVM.py extracts embeddings and trains an SVM classifier on them.
  • ./FSSVM.py trains an SVM classifier on features from the ComParE feature set.
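For readers unfamiliar with pooling layers, the sketch below shows statistics pooling, a widely used way to turn a variable-length sequence of frame-level features into one fixed-length vector. Whether ./modules implements this particular method is an assumption; the function name and the plain-list representation are hypothetical and chosen only to keep the example self-contained.

```python
import math


def statistics_pooling(frames):
    """Pool a list of equal-length frame vectors into [means..., stds...].

    `frames` is a time-by-feature list of lists. The output length is
    twice the feature dimension, regardless of the number of frames.
    """
    if not frames:
        raise ValueError("need at least one frame")
    n = len(frames)
    dim = len(frames[0])
    # Per-dimension mean over time.
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Per-dimension population standard deviation over time.
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    return means + stds
```

The fixed-length output is what allows a classifier head (or an SVM trained on extracted embeddings) to handle segments of different durations.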

Our code is not user-friendly, so if you have any questions about downloading the dataset or about the training code, please feel free to contact me at yanze.xu@outlook.com. You are also welcome to get in touch if you are interested in timbre phenomena.

MIT License
Created January 31, 2021
Updated September 15, 2025