YinPing-Cho/PCS-FIR-Filter
A time-domain extension to "Perceptual Contrast Stretching on Target Feature for Speech Enhancement"
PCS-FIR-Filter
Based on the spectral perceptual gains from the official PCS repo, this project derives the equivalent linear-time-invariant (LTI) finite-impulse-response (FIR) filter coefficients so that Perceptual Contrast Stretching (PCS) can be performed directly on waveforms.
FIR filtering is a differentiable operation, which makes it well suited to deep learning applications that work directly on waveforms. The FIR filtering example in this project is performed with a PyTorch 1-D convolution layer. The derived filter coefficients (in numpy format) can also be applied easily with other backends.
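As a rough illustration of the idea above, the sketch below applies a set of FIR taps to a batch of waveforms with `torch.nn.functional.conv1d`. It is not taken from this repo: the helper name `fir_filter`, the five example taps, and the tensor shapes are assumptions made for illustration. In practice the taps would be loaded from the generated `.npy` file.

```python
import numpy as np
import torch
import torch.nn.functional as F


def fir_filter(wave: torch.Tensor, coeffs: np.ndarray) -> torch.Tensor:
    """Apply FIR taps to waveforms of shape (batch, samples).

    Hypothetical helper, assuming an odd number of taps so that the
    output can be aligned with the input ("same"-length filtering).
    """
    taps = torch.as_tensor(coeffs, dtype=wave.dtype, device=wave.device)
    # conv1d computes cross-correlation, so the taps are flipped
    # to obtain a true convolution.
    kernel = taps.flip(0).view(1, 1, -1)
    pad = (taps.numel() - 1) // 2
    return F.conv1d(wave.unsqueeze(1), kernel, padding=pad).squeeze(1)


# Illustrative taps only, not coefficients produced by this project.
coeffs = np.array([0.5, 0.3, 0.2, 0.1, -0.1], dtype=np.float32)

wave = torch.randn(2, 16000, requires_grad=True)
out = fir_filter(wave, coeffs)

# The operation is differentiable: gradients reach the input waveform.
out.pow(2).mean().backward()
```

Because the filter is an ordinary convolution, it can sit inside a loss function or a model's forward pass without any special handling.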
Requirements
torch >= 1.8
torchaudio
matplotlib
Soundfile
numpy
scipy
Available in requirements.txt
Usage
- Filter design:
  - `python PCS_coeffs_generate.py --mode='manual'` generates FIR filter coefficients (in `*.npy` format) and an impulse response plot under the directory `generated_freq_response/`, using the default spectral PCS coefficients.
  - Since the original PCS (spectral PCS) works on log-1-p spectrograms, its nonlinearity cannot be reproduced directly with LTI FIR filters; therefore, `python PCS_coeffs_generate.py` provides two additional statistical filter design methods to approximate the behavior of spectral PCS:
    - `python PCS_coeffs_generate.py --mode='statistical' --stat_mode='gaussian'` measures and approximates spectral PCS's equivalent LTI impulse response with Gaussian signals of varying standard deviations.
    - `python PCS_coeffs_generate.py --mode='statistical' --stat_mode='wav' --wav_dir='*'` measures and approximates spectral PCS's equivalent LTI impulse response with the .wav files you placed in `wav_dir`.
- FIR Filtering with wave-PCS:
  - `python test_PCS_wave.py` performs wave-PCS with the FIR filter coefficients derived by `PCS_coeffs_generate.py` and outputs filtered audio.
- Quick comparison to spectral PCS:
  - `python test_PCS_spectral.py` performs spectral PCS with the official repo's PCS functions. This script is meant for comparing the FIR wave-PCS result with the original spectral PCS.
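The filter-design step can be pictured as turning a per-frequency gain curve into linear-phase FIR taps. The sketch below does this with `scipy.signal.firwin2`, which may differ from what `PCS_coeffs_generate.py` does internally. The gain curve, the 257-bin resolution, and the tap count are placeholder assumptions, not the official PCS values.

```python
import numpy as np
from scipy.signal import firwin2, freqz

# Assumption: one gain per frequency bin, from DC up to Nyquist.
n_bins = 257
# Placeholder gain curve for illustration only (NOT the official PCS gains).
gains = np.linspace(1.0, 1.4, n_bins)
# Normalized frequencies expected by firwin2: 0.0 is DC, 1.0 is Nyquist.
freqs = np.linspace(0.0, 1.0, n_bins)

# An odd tap count gives a Type I linear-phase filter, which permits
# a non-zero gain at Nyquist.
num_taps = 511
coeffs = firwin2(num_taps, freqs, gains)

# Check how closely the designed filter follows the target gains.
_, response = freqz(coeffs, worN=np.pi * freqs)
max_err = np.max(np.abs(np.abs(response) - gains))
```

A linear-phase design adds a constant delay of `(num_taps - 1) / 2` samples, which "same"-length filtering compensates for when the taps are applied.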
Example Results
- Frequency response of the FIR filter coefficients derived from the default PCS settings with `GAIN_SMOOTHING = 0.2`:
- Frequency response of the FIR filter coefficients derived with the wav-based statistical method, using the Mpop600 Mandarin singing voice dataset:
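To inspect a set of derived taps in the same way, the magnitude response can be computed from a zero-padded FFT. This is a minimal sketch: the function name `magnitude_response_db`, the 16 kHz sample rate, and the two example taps are assumptions for illustration, and the taps would normally come from the generated `.npy` file.

```python
import numpy as np


def magnitude_response_db(coeffs, n_fft=2048, sample_rate=16000):
    """Return (frequencies in Hz, magnitude in dB) of an FIR filter.

    Hypothetical helper; the sample rate is an assumed value.
    """
    spectrum = np.fft.rfft(coeffs, n=n_fft)  # zero-pads the taps to n_fft
    freqs_hz = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    # Floor the magnitude to avoid log10(0) at exact spectral nulls.
    mag_db = 20.0 * np.log10(np.maximum(np.abs(spectrum), 1e-12))
    return freqs_hz, mag_db


# Illustrative two-tap filter with mild high-frequency emphasis.
coeffs = np.array([1.0, -0.2])
freqs_hz, mag_db = magnitude_response_db(coeffs)
```

The returned arrays can be passed straight to `matplotlib.pyplot.plot(freqs_hz, mag_db)` to draw the response curve.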
Reference
- The official repo of PCS (https://github.com/RoyChao19477/PCS).
- The original PCS paper: Rong Chao, Cheng Yu, Szu-Wei Fu, Xugang Lu, Yu Tsao, "Perceptual Contrast Stretching on Target Feature for Speech Enhancement" (http://arxiv.org/abs/2203.17152).
- Mpop600 Mandarin singing voice dataset: C.-C. Chu, F.-R. Yang, Y.-J. Lee, Y.-W. Liu and S.-H. Wu, "MPop600: A Mandarin Popular Song Database with Aligned Audio, Lyrics, and Musical Scores for Singing Voice Synthesis," 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2020, pp. 1647-1652.
Latest Release: v1.0.0-alpha, May 5, 2022
Created April 22, 2022
Updated September 5, 2025



