EE107c | Naive Reverb Deconvolution

Overview

The purpose of this project is to implement a naive deconvolution algorithm and explore its limitations when applied to the impulse response of a real acoustic space, such as one that might be encountered in a concert recording situation. Deconvolution refers here to a process intended to reverse the acoustic "response" of a room. Every room has a unique acoustic signature that imparts a characteristic sound to anything generated and heard (or recorded) in that room. For example, many people like to sing in the shower because the abundance of strong and fast (i.e. short) reflections gives a strong and confident sound to the human voice.

In controlled situations, the sound characteristic of the room, the impulse response, can be carefully measured using a known input signal. With a simple signal processing technique called deconvolution we can remove the impulse response of the room from recordings made with the same controlled set of parameters used to record the impulse response. Theoretically it should be possible to recover any 'reasonable' input signal from a recording of the room response to that signal, given the impulse response of the room.

Deconvolution and many other signal processing schemes rely on a mathematical property of a system called linearity. Here's a link to a brief discussion of the meaning of linearity, and why an acoustic space can be modeled as a linear system.


Naive Deconvolution Algorithm

If the system is causal and LTI, then it is in principle possible to find an inverse filter h^-1(t) that undoes the system response. Here is a brief outline of the deconvolution operator.
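As a rough illustration of the idea (not the MATLAB code used for this project), the sketch below divides the spectrum of a recorded signal by the spectrum of a known impulse response. It assumes a noise-free recording y = x * h and an impulse response whose spectrum has no zeros. The function names fft, ifft, convolve and naive_deconvolve and the toy signals are hypothetical, chosen only for this example.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft(X):
    """Inverse FFT, including the 1/n scaling."""
    n = len(X)
    return [v / n for v in fft(X, inverse=True)]

def convolve(x, h):
    """Direct linear convolution, length len(x) + len(h) - 1."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def naive_deconvolve(y, h, n_out):
    """Estimate x from y = x * h by dividing spectra: X = Y / H."""
    n = 1
    while n < len(y):
        n *= 2  # zero-pad to a power of two at least as long as y
    Y = fft(list(y) + [0.0] * (n - len(y)))
    H = fft(list(h) + [0.0] * (n - len(h)))
    X = [Yk / Hk for Yk, Hk in zip(Y, H)]  # breaks down if any Hk is 0
    return [v.real for v in ifft(X)][:n_out]

# Toy example: a short "dry" signal and a made-up room response
# consisting of the direct sound plus two decaying echoes.
x = [1.0, -0.5, 0.25, 0.0, 0.75, -1.0]
h = [1.0, 0.0, 0.5, 0.0, 0.25]
y = convolve(x, h)
x_hat = naive_deconvolve(y, h, len(x))
```

In this idealized case x_hat matches x to within floating-point error. The division is the "naive" part: nothing protects it from frequencies where H is very small.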


Nonlinear Effects

There are limits to the effectiveness of the deconvolution method. Below is a link to a discussion of nonlinearities that can affect the results of this technique.

Also, since these signals have finite length, we need to avoid circular convolution effects in the output signal. This is accomplished by using test signals that are long compared to the effective impulse response, and by time-windowing the edges of the audio vectors before they are fed to the system. In practice this means the music signal is "faded in" and "faded out."
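A minimal sketch of the fade-in and fade-out step follows. The raised-cosine shape, the fade length, and the helper names fade_edges and pad_for_linear_convolution are assumptions made for illustration; the original does not say which window was used. The zero-padding helper is an additional, commonly used precaution against wrap-around and is not described in the original.

```python
import math

def fade_edges(signal, fade_len):
    """Apply a raised-cosine fade-in and fade-out of fade_len samples."""
    out = list(signal)
    n = len(out)
    for i in range(min(fade_len, n // 2)):
        # Gain rises smoothly from 0 at the edge toward 1 in the interior.
        g = 0.5 * (1.0 - math.cos(math.pi * i / fade_len))
        out[i] *= g
        out[n - 1 - i] *= g
    return out

def pad_for_linear_convolution(signal, ir_len):
    """Append ir_len - 1 zeros so a reverb tail of ir_len samples fits."""
    return list(signal) + [0.0] * (ir_len - 1)

# Hypothetical example: a constant 16-sample signal, 4-sample fades,
# and room for a 5-sample impulse response.
faded = fade_edges([1.0] * 16, 4)
padded = pad_for_linear_convolution(faded, 5)
```

The faded signal starts and ends at zero, so there is no abrupt edge for the FFT to treat as a discontinuity.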


Results

Summary of Results

The basic deconvolution process works very well on simple test signals, with very little noise (as we might expect). On real music the results are good, but with many more noise artifacts present. These artifacts result from spurious contributions of noise in the FFT of the recorded signal. This is an essential difficulty with modeling a real recording situation as an LTI system. Check out the results for yourself to hear what I'm talking about.
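One way to see where these artifacts come from is to assume the recording contains additive noise. The symbols below are introduced here for illustration and are not from the original: X(f) is the spectrum of the source, H(f) the room's frequency response, N(f) the noise spectrum, Y(f) the recorded spectrum, and \hat{X}(f) the deconvolved estimate.

```latex
\begin{align}
Y(f) &= H(f)\,X(f) + N(f) \\
\hat{X}(f) &= \frac{Y(f)}{H(f)} = X(f) + \frac{N(f)}{H(f)}
\end{align}
```

Under this assumption the error term N(f)/H(f) is largest at frequencies where |H(f)| is small, so even a modest noise floor can be amplified into audible artifacts there.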

Audio Files

I chose the music from Autechre's Peel Sessions EP as a test sample because the source signal has very little reverberation and contains a wide range of frequencies, allowing a good basis of judging the results of the algorithm.

The following two files are stereo WAV files (44.1kHz). The right channel is the system input, the left channel is the recorded system response:

The following two files are mono WAV files (44.1kHz):


Source Code

All signal processing was coded with MATLAB 6.1, an extremely versatile and useful data manipulation package. Here are the source files I used to process and manipulate my data:


Other Equipment

System responses were recorded with an AKG C-414 ULS in omni mode through a Symetrix SX202 mic preamp. Source signals were played back on a Fostex CR300 Compact Disc Recorder. Amplification for system input was supplied by an Alesis RA100 through a single Yamaha NS-10 passive studio monitor. Source and response signals were simultaneously recorded on a Panasonic SV-800 DAT recorder, and transferred later via S/PDIF to WAV format through an Echo Audio Gina 24/96 PCI audio interface.


Further Work

I would like to explore the effectiveness of various noise reduction algorithms in improving the deconvolved result. I will implement basic spectral subtraction and wavelet thresholding code to attempt to improve upon the results presented here.

It should also be possible to obtain a much better signal to noise ratio on the system function by using a Maximum Length Sequence (MLS) method. This could help reduce some of the spectral artifacts annoyingly audible in the deconvolved result.
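For reference, an MLS can be generated with a linear feedback shift register. The sketch below is only an illustration of the sequence itself, not a full measurement procedure. It assumes a 4-bit register with feedback taps at positions 4 and 3 (the primitive polynomial x^4 + x^3 + 1); the names mls and circular_autocorrelation are hypothetical. A practical measurement would use a much longer register.

```python
def mls(taps, nbits):
    """Return one period (2**nbits - 1 samples) of a +/-1 MLS from an LFSR."""
    state = [1] * nbits  # any nonzero starting state works
    seq = []
    for _ in range(2 ** nbits - 1):
        seq.append(1.0 if state[-1] else -1.0)  # output the last bit
        fb = 0
        for t in taps:
            fb ^= state[t - 1]  # XOR of the tapped bits (1-based positions)
        state = [fb] + state[:-1]  # shift and insert the feedback bit
    return seq

def circular_autocorrelation(s):
    """Circular autocorrelation of s at every lag from 0 to len(s) - 1."""
    n = len(s)
    return [sum(s[i] * s[(i + lag) % n] for i in range(n)) for lag in range(n)]

seq = mls(taps=(4, 3), nbits=4)
r = circular_autocorrelation(seq)
```

The circular autocorrelation of an MLS is a single peak at lag 0 with a constant value of -1 at every other lag, which is close to that of an ideal impulse. That property is what makes it attractive for measuring impulse responses with good noise rejection.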

The motivation behind this project is my pursuit of "blind" deconvolution of a reverberant impulse response from a recorded signal. Clearly having some prior knowledge of the class of impulse response one is dealing with is very helpful. In the case of a real reverberant impulse response, we can assume that the impulse response is linear time invariant, causal, and roughly describable as a tapped delay line.

I am currently working on algorithms to estimate a reverberant impulse response applied to a single sound source without any prior knowledge of the impulse response. Needless to say this is a big project, so here I am tackling the simplest possible case, where the impulse response is known.


References

  1. Oppenheim, Willsky, and Nawab. Signals and Systems, Prentice Hall 1997.

6/2/2002 Matthew Ashman
Senior
Engineering & Applied Science
California Institute of Technology
ashman@ugcs.caltech.edu

last modified 25 September 2003