Block Frequency Domain Adaptive Filter Algorithm |
1. Abstract
This document describes BFDAFAEC.EXE, the command line Acoustic Echo Canceller (AEC) based on the Block Frequency Domain Adaptive Filter (BFDAF) for 32-bit Windows operating systems. BFDAFAEC lets you change the adaptive filter parameters, run the AEC on two audio input files, namely the Far End Speech (FES), and Near End Speech (microphone) signals, and writes the echo free signal to an output file. BFDAFAEC will also print the processing time, and you can instruct it to save the filter coefficients for later examination. BFDAFAEC is meant to serve as a demonstrator for the performance of the BFDAF algorithm but can also be used as a design or a teaching tool.
2. Copyright
The software described in this document is protected by copyright laws and international treaty provisions. DSP ALGORITHMS grants you the right to use this software under the terms and conditions stated in the accompanying file "license.txt". Please make sure that you carefully read and fully understand the license agreement before installing or using this software.
3. Limitations
This demo version of the acoustic echo canceller described here does not include many functions available in DSP ALGORITHMS' commercial echo cancellers such as double-talk detector, far end speech detector, near end speech detector, control unit, automatic gain control, non-linear processor, or noise canceller. It implements the adaptive filter block only. Future versions may include more features as appropriate for the purpose of demonstrating the capabilities of the full version.
The demo version is NOT the fastest code. It has been compiled to run on any Intel 80386 compatible processor, and therefore does not include code that support MMX and SIMD instructions as in the commercial version. The demo version is also limited to processing MONO audio files. For processing real-time stereo and for evaluating the fast version of this software please email to support@dspalgorithms.com
BFDAFAEC prints the time it takes to process the complete input files. This processing time is estimated using the system time and therefore will be accurate enough for long input files. You can compare the processing time in seconds with the duration of the input files in seconds to see whether your machine will meet the real-time requirements for the specific AEC parameters you have chosen.
4. Audience
Guavec, NLMSAEC, BFDAFAEC, and PBFDAF were all originally intended as design tools to help DSP ALGORITHMS' customers in understanding and integrating the AEC software into their products. The software however proved very useful in teaching engineers the basics of adaptive filters in general, the BFDAF algorithm in particular, and the application of adaptive filters in acoustic echo cancellers.
5. Acoustic echo Cancellers

Acoustic echo is inevitable whenever a loudspeaker is placed near a microphone in a full-duplex communication application. This is the case in speaker phones, audio and video conferencing, desktop communication, and many other communication scenarios. Especially hands-free mobile communication kits for cars are becoming increasingly important due to safety regulations introduced in more and more countries. In all those and similar communication scenarios, the voice from the loudspeaker (Far End Speech, FES) is inevitably picked by the microphone (NES) and transmitted back to the remote speaker as shown in Fig. 1. This makes the remote speaker hear her own voice distorted and delayed by the communication channel, which is known as echo. The longer the channel delay, the more annoying the echo becomes until it makes natural conversation impossible and decreases the perceived quality of the communication service. It is therefore absolutely necessary to avoid transmitting back the echo picked by the microphone.
Modern full-duplex communication systems make use of an acoustic echo canceller (AEC) to prevent the echo from being transmitted back to the channel. The AEC is employed in each terminal, and has completely different requirements than the Network Echo Canceller employed by the telephone network provider to eliminate the electric echo. The AEC basically estimates the echo and subtracts the estimated echo from the microphone signal as shown in Fig.1. The resulting signal (RESidual) is transmitted to the far end speaker through the communication channel.
6. BFDAFAEC.EXE, The Command Line Version
BFDAFAEC.EXE accepts command line parameters to allow experimenting with different audio/speech signals at different sampling frequencies, as well as different parameters for the AEC itself. This section describes those parameters and gives some guidelines on how to choose their values for optimum performance. The command line arguments correspond to internal variables with the same names. This makes real-time operation and integration into other applications as simple as directly assigning values to those variables in the source file and recompiling. The calling syntax is as following:
bfdafaec.exe [/ARG <value>] [ /ARG <value>] [à]
All command line arguments have default values, so no need to list them all if you are satisfied with the default. The order of the arguments in the command line is not important. As an example, the following command
bfdafaec.exe /inFile aecfes.wav /micFile aecnes.wav /outFile aecre.wav /filterLength 512 /blockLen 128 /stepSize 0.04 /pref accuracy /delay 0 /print 0
will read the FES signal from the
aecfes.wav audio file, the microphone signal from aecnes.wav, and writes the echo-free speech (residual) to the aecres.wav wave file. The adaptive filter is set to have 512 coefficients with 128 samples processed per block. This means that each 128 samples, the adaptive filter will update all its 512 coefficients and calculate 128 samples of the echo-free signal. The stepSize parameter that controls the convergence speed is set to 0.04. The rest of the parameters are explained in detail below. It is recommended however to use a batch file similar to the one supplied with the software for running the AEC to avoid typing long commands.6.1 Input and output files
The AEC uses two signals as input and produces one output signal. The inputs are the FES and the microphone (NES) signals as shown in Fig. 1. You can specify those two signals to BFDAFAEC by giving wave file names as values for the command line arguments
inFile, and micFile, respectively. Note that the AEC will work correctly only if the audio signal in micFile is correctly aligned in time with the signal in inFile. The parameter delay is provided for that reason as mentioned below. In real-time operation this synchronization is automatically performed. BFDAFAEC also supports many other audio file formats such as AIFF and AU.The output from the AEC is the echo-free signal RES. Use the command line argument
outFile to choose the output file name. When the near-end speech is zero (the local speaker is listening), this file should contain the micFile signal decaying in amplitude with time, until silence is heard. The performance of the AEC can be measured by inspecting this output file.6.2 filterLength
The success of the AEC in removing the echo from the Mic signal depends on many (inter-dependent) factors. To start with, for the adaptive filter to make a good estimation of the echo, the filter should have sufficient number of coefficients. The number of coefficients has direct relation with the amount of echo reduction (dB reduction). Correctly choosing the number of filter coefficients is, therefore, very important. Shorter filters result in low echo reduction and longer filters result in slow convergence.
As an example, assume that we need to obtain, say 20 dB of acoustic echo reduction in a certain room. Suppose that the room has a reverberation time of 600 ms. The reverberation time is the time it takes the sound to decay by 60 dB from its initial value when a sound source (already playing for long in the room) is suddenly stopped. This means that it takes 200 ms for the sound to decay 20 dB from its initial value. For the purpose of designing our AEC, we can then assume that the room impulse response (IR) between the loudspeaker and the microphone will have negligible values beyond the 200 ms. For the AEC to produce the 20 dB echo reduction, the filter impulse response has to be at least 200 ms long. Assuming the speech is sampled at 8kHz, this calculated in samples (or number of coefficients) becomes 0.2s * 8000 samples/s = 1600. Choosing
filterLength less than 1600 will NOT give the required 20 dB echo reduction. It is not guaranteed however that a filter of 1600 coefficients will result in the required 20 dB reduction.
6.3 stepSize
The second parameter affecting the AEC performance is the convergence time. This is the time needed for the filter to adjust itself to reach a reasonable approximation of the room impulse response between Spkr and Mic, and consequently, resulting in a reasonable echo reduction. The convergence time depends on factors such as the statistics of the FES input signal, the number of filter coefficients, and the algorithm used to adapt the filter.
The BFDAF algorithm convergence time is controlled by one variable, the
stepSize, and is theoretically independent of the input signal statistics, since normalization per frequency bin is performed internally in the frequency domain. The stepSize variable should always be chosen between zero and one. A stepSize = 0 will result in a filter that never converges (convergence time = infinity). A stepSize = 1, will probably result in an unstable filter (one that changes very fast and therefore diverges instead of converging). Smaller stepSize result in more accurate estimation of the echo, and therefore produces higher echo reduction. The best way to find the optimal stepSize in a specific application is by experimenting with different values and choosing the one resulting in the fastest convergence time and provides the required echo reduction.6.4 blockLen
The third factor of importance in an AEC is the delay the algorithm introduces during processing. Sample per sample time-domain algorithms such as the (N)LMS have zero processing delay, but show poor convergence behavior and need a huge computation effort for long filters, such as those used in acoustic echo cancellers. You can examine this by comparing the processing time of NLMSAEC and one of BFDAFAEC or PBFDAFAEC. Frequency-domain adaptive algorithms such as the Block Frequency Domain Adaptive Filter (BFDAF) introduce latency
blockLen samples, but provide superior convergence, excellent echo reduction performance, and can be implemented very efficiently.The algorithm processing delay in BFDAFAEC can be controlled by using the command line argument
blockLen. This means that blockLen samples are first collected before processing can start, which generates a processing delay equals to the time needed to collect the block. In embedded real-time systems that read from and write to the outside world on sample per sample basis, this delay is noticeable and in most cases is objectionable. This problem is solved in the Partitioned BFDAF algorithm (PBFDAF) by dividing the filter into smaller partitions, and therefore reducing the delay to the first partition only, which is a fraction of the delay in the BFDAF.In systems where blocks of samples are read and written to the outside world, the algorithm delay is integrated in the input/output process. This is the case for instance when a decoder is used before the AEC, which works on blocks of, samples (G.723.1 for instance) or when a non real-time operating system such as Windows (or Linux) is used. In the latter case, the audio driver captures a buffer of samples and returns it to the application when the buffer is full. Playing the sound is also done whenever the application is ready with filling a buffer. In those cases, the algorithm delay is buried into the buffering process.
In the case of BFDAF, a value of
blockLen = filterLength will result in the maximum computational efficiency. If this computationally optimum case results in an unacceptable delay, blockLen can be chosen less than filterLength on the cost more computations being performed per sample. When processing delay is an issue and computational efficiency is crucial, PBFDAF, also available from DSP ALGORITHMS, is recommended. PBFDAF is as efficient as the BFDAF, but introduces less delay.6.5 Preference
To give the system designer further control on the AEC performance, the
pref parameter has been introduced. This parameter can take a value either speed or accuracy. If speed is chosen, the AEC will use the unconstrained BFDAF algorithm, which skips performing three Fast Fourier Transformation (FFT) computations every blockLen samples. This however is traded against a slight inaccuracy, which can be adjusted by reducing the stepSize value, if necessary. If accuracy is chosen, the constrained BFDAF algorithm is used, which is an exact implementation of the BFDAF, but consumes more computations. To give an idea of the computational saving obtained by choosing for speed, suppose that blockLen = filterLength = 256. This will lead the algorithm to choose the internal parameter fftLength to be 512. This results in saving about 3 * 512 * log2(512/8) = 9216 multiplications and additions every 256 samples. Or 36 multiplications and 36 additions each sample.6.6 delay
This parameter is provided to adjust the FES and NES input signals to be aligned in time and to make the filter skip estimating the delay it takes the sound to travel from the loudspeaker (Spkr) to the microphone (Mic), to save calculations. As an example, suppose that Spkr is placed 68 cm from Mic. Since the sound travels in air with a speed of about 340 m/s, the sound needs (68 cm / 34000 cm/s = 2 ms) to travel from the Spkr to Mic. Assuming the speech is sampled at 8kHz, this can be expressed in samples as (0.002 s * 8000 samples/s) = 16 samples. This means that the direct sound from Spkr to Mic will appear in the filter after 16 coefficients, while the first 16 coefficients should be zeros. To save computations, the filter is shifted 16 samples using the
delay parameter so that the filter will not attempt to estimate those zeros.6.7 print
This parameter is a flag specifying whether to print the filter coefficients to a text file or not. A value of 0 will cause BFDAFAEC not to save the coefficients. A value of 1 will cause the coefficients to be saved into a file called aeccoef.txt. The file will be overwritten if already existed.
7. Performance

The performance of the BFDAF AEC is shown in Fig.2. This result has been obtained by running BFDAFAEC with FES = aecfes.wav, NES = aecnes.wav, and RES=aecres.wav, filterLength = 512, blockLen = 256, and stepSize = 0.2. You will find the above three wave files in the wav directory where you installed the software package. The top graph in Fig.2 is the estimated and actual impulse responses between the loudspeaker and the microphone. The middle graph in Fig. 2 shows two speech signals, the NES, (echo or microphone signal) in blue
and the echo canceller output RES, (residual) in green. The bottom graph is the energy of RES signals divided by the energy of NES and the result expressed in dB, which indicates how fast the filter converges. Note however that this quantity is meaningful only when the NES is non-zero, in other words, when there is speech to cancel.
It is clear from those graphs that the filter could establish a good estimate of the echo very quickly. The echo estimate improves with the time, which makes the residual smaller and smaller. After less than one second (8000 samples on the graph) 20 dB reduction has bee achieved. The convergence however can be controlled by the
stepSize parameter as mentioned before.

Commercial acoustic echo cancellers provided by DSP ALGORITHMS are much more complex than the one implemented in this simple version. A commercial AEC includes the following components besides the adaptive filter as shown in Fig. 3;
9. For more information
If you are interested in test driving the full version of the acoustic echo canceller, or you have any questions or comments on this document or the software described herein, please email to support@dspalgorithms.com. We will be glad to help you.