Reflection

Frequency Detection:

Peak detection for phase vocoding worked well. Unfortunately, we could not find a good way to use the peak information to shift the pitch of the sound while preserving its quality.
In the future, we need to look into less computationally intensive methods of converting to the frequency domain, since the computational cost of some of our approaches significantly limited what we could try.
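
As a rough illustration of the peak-picking step, the sketch below finds local maxima in the magnitude spectrum of a single FFT frame and refines each one with parabolic interpolation. It is a minimal Python sketch rather than the project code; the function name `find_peaks`, the threshold, and the example values are assumptions made for illustration.

```python
def find_peaks(mag, sample_rate, n_fft, threshold):
    """Return estimated peak frequencies (Hz) from one magnitude spectrum.

    mag holds the magnitudes of bins 0..n_fft/2 of a single FFT frame.
    """
    bin_hz = sample_rate / n_fft  # frequency spacing between bins
    peaks = []
    for k in range(1, len(mag) - 1):
        a, b, c = mag[k - 1], mag[k], mag[k + 1]
        if b > threshold and b > a and b >= c:
            # Parabolic interpolation: fit a parabola through the three
            # points and use its vertex as the refined bin position.
            denom = a - 2 * b + c  # strictly negative at a local maximum
            offset = 0.5 * (a - c) / denom
            peaks.append((k + offset) * bin_hz)
    return peaks


# Hypothetical spectrum with two clear peaks (n_fft = 16, so 9 bins).
example_mag = [0, 1, 5, 1, 0, 2, 8, 2, 0]
example_peaks = find_peaks(example_mag, sample_rate=8000, n_fft=16, threshold=3)
```

The interpolation step matters for pitch work because the raw bin spacing (`sample_rate / n_fft`) is usually much coarser than a semitone at low frequencies.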

Denoising:

Denoising worked well: we successfully increased the SNR from 15 dB to 30 dB.
However, there were some problems: while we preserved the signal and suppressed the noise, some important features of the signal were lost. Although we tried our best to keep the instrumental features of the signal, some of them, mainly the high-frequency, low-intensity parts, did not survive.
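
For reference, an SNR figure expresses the ratio of signal power to noise power on a decibel scale. The sketch below is a minimal Python illustration that assumes a clean reference signal is available for comparison; `snr_db` and the sample values are hypothetical and are not taken from the project code.

```python
import math


def snr_db(clean, processed):
    """SNR in dB, treating (processed - clean) as the residual noise."""
    signal_power = sum(x * x for x in clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, processed))
    if noise_power == 0:
        return float("inf")  # identical signals: no residual noise
    return 10 * math.log10(signal_power / noise_power)


# Hypothetical example: a unit-amplitude signal with a 0.1 offset error.
clean = [1.0, -1.0, 1.0, -1.0]
noisy = [1.1, -0.9, 1.1, -0.9]
example_snr = snr_db(clean, noisy)  # about 20 dB
```

On this scale, a 15 dB improvement corresponds to reducing the noise power relative to the signal by a factor of roughly 32.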

Pitch Correction:

We found PSOLA to be the most effective algorithm for pitch correction.
Simplistic algorithms were sometimes effective at pitch correction, but only at the cost of severely damaging the quality of the sound.
More advanced techniques may improve sound quality in the future. For example, precisely manipulating peaks with a phase-vocoder algorithm could help to reduce unpleasant side effects of modifying an entire signal.
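
When no ground truth is available, a common target for pitch correction is the nearest note of the equal-tempered scale. The sketch below illustrates that mapping under the assumption that A4 = 440 Hz; the names `nearest_note_hz` and `shift_ratio` are hypothetical, and the code only computes the target, not the PSOLA or phase-vocoder shift itself.

```python
import math

A4_HZ = 440.0  # assumed tuning reference


def nearest_note_hz(freq_hz):
    """Snap a detected frequency to the closest equal-tempered note."""
    # Distance from A4 in semitones, rounded to the nearest whole note.
    semitones = round(12 * math.log2(freq_hz / A4_HZ))
    return A4_HZ * 2 ** (semitones / 12)


def shift_ratio(freq_hz):
    """Factor by which the pitch must be scaled to land on that note."""
    return nearest_note_hz(freq_hz) / freq_hz


# A slightly flat C4 (about 261.63 Hz) needs a small upward shift.
example_ratio = shift_ratio(260.0)
```

The resulting ratio is what a pitch-shifting stage would consume for each voiced segment.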

Smoothing:

We used a naive smoothing technique, and it proved to work well. We did not test other smoothing algorithms due to time constraints.
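
One simple form of smoothing is to convolve a sequence (for example, a per-frame pitch track) with a normalized Gaussian kernel. The sketch below is an assumed illustration of that idea in Python; the kernel width, the edge handling, and the function names are illustrative choices, not the settings used in the project.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalized Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-0.5 * (i / sigma) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(values, sigma=1.0, radius=3):
    """Smooth a sequence by weighted averaging over its neighbours."""
    kernel = gaussian_kernel(sigma, radius)
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Clamp indices so the edges reuse the first/last sample.
            idx = min(max(i + j - radius, 0), n - 1)
            acc += w * values[idx]
        out.append(acc)
    return out


# A single outlier is spread out and reduced in height.
example_smoothed = smooth([0.0, 0.0, 10.0, 0.0, 0.0])
```

A larger `sigma` removes more jitter but also blurs intentional pitch changes, so the width has to be tuned by ear.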

In-Class and Out-of-Class Techniques

Some of the in-class techniques we used were:

Windowing a signal to separate it into small segments
Converting to the frequency domain with the STFT (essentially a sequence of FFTs over successive windows)
Processing signals in other bases (e.g. Wavelet Transform)
Analyzing the trade-off between computation time and the rate at which the signal is windowed and transformed with the STFT (a higher rate improves signal quality)
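
The windowing and STFT steps listed above can be sketched as follows. This is a minimal NumPy illustration under assumed parameters (a Hann window, a window size of 256 samples, and a hop of 128 samples); the function `stft` here is written for this example and is not the project's implementation.

```python
import numpy as np


def stft(signal, window_size=256, hop=128):
    """Return a (frames, window_size // 2 + 1) array of complex spectra."""
    window = np.hanning(window_size)
    n_frames = 1 + (len(signal) - window_size) // hop
    # Slice the signal into overlapping segments and taper each one.
    frames = np.stack([
        signal[i * hop: i * hop + window_size] * window
        for i in range(n_frames)
    ])
    # One real-input FFT per windowed segment.
    return np.fft.rfft(frames, axis=1)


# One second of a 1 kHz sine sampled at 8 kHz (hypothetical test signal).
fs = 8000
t = np.arange(fs) / fs
spectrum = stft(np.sin(2 * np.pi * 1000 * t))
```

The trade-off shows up directly in the parameters: each bin spans `fs / window_size` Hz, so a longer window gives finer frequency resolution but coarser time resolution, and a smaller hop produces more frames and therefore more FFTs to compute.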

Some of the out-of-class techniques we used were:

Phase vocoding for pitch identification and shifting
PSOLA for pitch shifting
Gaussian smoothing
Block thresholding for denoising

What We've Learned

Here is a quick summary of the techniques and skills we have learned:

Using Short-Time Fourier Transforms and Wavelet Transforms to transform and analyze signals.
Applying noise identification and thresholding techniques for denoising.
Using phase-vocoder and threshold-based algorithms for pitch shifting without a ground truth.
Using column-based, segment-based, and pitch-based algorithms for matching a ground truth.
Applying PSOLA to match an audio signal to a ground truth.
Balancing audio quality and desired effects with compute time.

Code and Data

eecs351_project_group2.zip (zip archive, 12608 kb)