Partitioned Block Frequency Domain Adaptive Filter Algorithm |
1. Abstract
This document describes PBFDAFAEC.EXE, the command line Acoustic Echo Canceller (AEC) based on the Partitioned Block Frequency Domain Adaptive Filter (PBFDAF) for 32-bit Windows operating systems. Like BFDAFAEC, PBFDAFAEC lets you change the adaptive filter parameters, run the AEC on two audio input files, namely the Far End Speech (FES), and Near End Speech (microphone) signals, and writes the echo free signal to an output file. PBFDAFAEC will also print the processing time, and you can instruct it to save the filter coefficients for later examination. PBFDAFAEC is meant to serve as a demonstrator for the performance of the PBFDAF algorithm and but can also be used as a design or a teaching tool.
2. Copyright
The software described in this document is protected by copyright laws and international treaty provisions. DSP ALGORITHMS grants you the right to use this software under the terms and conditions stated in the accompanying file "license.txt". Please make sure that you carefully read and fully understand the license agreement before installing or using this software.
3. Limitations
This demo version of the acoustic echo canceller described here does not include many functions available in DSP ALGORITHMS' commercial echo cancellers such as double-talk detector, far end speech detector, near end speech detector, control unit, automatic gain control, non-linear processor, or noise canceller. It implements the adaptive filter block only. Future versions may include more features as appropriate for the purpose of demonstrating the capabilities of the full version.
The demo version is NOT the fastest code. It has been compiled to run on any Intel 80386 compatible processor, and therefore does not include code that support MMX and SIMD instructions as in the commercial version. The demo version is also limited to processing MONO audio files. For processing real-time stereo and for evaluating the fast version of this software please email to support@dspalgorithms.com
PBFDAFAEC prints the time it takes to process the complete input files. This processing time is estimated using the system time and therefore will be accurate enough for long input files. You can compare the processing time in seconds with the duration of the input files in seconds to see whether your machine will meet the real-time requirements for the specific AEC parameters you have chosen.
4. Audience
Guavec, NLMSAEC, BFDAFAEC, and PBFDAF were all originally intended as design tools to help DSP ALGORITHMS' customers in understanding and integrating the AEC software into their products. The software however proved very useful in teaching engineers the basics of adaptive filters in general, the PBFDAF algorithm in particular, and the application of adaptive filters in acoustic echo cancellers.
5. Acoustic echo Cancellers

Acoustic echo is inevitable whenever a loudspeaker is placed near a microphone in a full-duplex communication application. This is the case in speaker-phones, audio and video conferencing, desktop communication, and many other communication scenarios. Especially hands-free mobile communication kits for cars are becoming increasingly important due to safety regulations introduced in more and more countries. In all those and similar communication scenarios, the voice from the loudspeaker (Far End Speech, FES) is inevitably picked by the microphone (NES) and transmitted back to the remote speaker as shown in Fig. 1. This makes the remote speaker hear her own voice distorted and delayed by the communication channel, which is known as echo. The longer the channel delay, the more annoying the echo becomes until it makes natural conversation impossible and decreases the perceived quality of the communication service. It is therefore absolutely necessary to avoid transmitting back the echo picked by the microphone.
Modern full-duplex communication systems make use of an acoustic echo canceller (AEC) to prevent the echo from being transmitted back to the channel. The AEC is employed in each terminal, and has completely different requirements than the Network Echo Canceller employed by the telephone network provider to eliminate the electric echo. The AEC basically estimates the echo and subtracts the estimated echo from the microphone signal as shown in Fig.1. The resulting signal (RESidual) is transmitted to the far end speaker through the communication channel.
6. PBFDAFAEC.EXE, The Command Line Version
PBFDAFAEC.EXE accepts command line parameters to allow experimenting with different audio/speech signals at different sampling frequencies, as well as different parameters for the AEC itself. This section describes those parameters and gives some guidelines on how to choose their values for optimum performance. The command line arguments correspond to internal variables with the same names. This makes real-time operation and integration into other applications as simple as directly assigning values to those variables in the source file and recompiling. The calling syntax is as following:
pbfdafaec.exe [/ARG <value>] [ /ARG <value>] [à]
All command line arguments have default values, so no need to list them all if you are satisfied with the default. The order of the arguments in the command line is not important. As an example, the following command
pbfdafaec.exe
/inFile aecfes.wav /micFile aecnes.wav /outFile aecres.wav /filterLength 64 /blockLen 64 /partitions 8 /stepSize 0.1 /pref accuracy /print 1 /delay 0will read the FES signal from the
aecfes.wav audio file, the microphone signal from aecnes.wav, and writes the echo-free speech (residual) to the aecres.wav wave file. The only difference between the PBFDAF and the BFDAF is that the filter in the former is partitioned into smaller filters allowing the block length to be chosen smaller, therefore decreasing the processing delay, without increasing the computation load. This is reflected in the command line arguments filterLength, blockLen, and partitions. In the above example, the adaptive filter is set to have 8 partitions each of 64 coefficients, which makes up a filter of length 512 coefficients in total. The blockLen is chosen 64 samples for maximum efficiency. This means that each 64 samples, the adaptive filter will update all its 512 coefficients and calculate 64 samples of the echo-free signal. The stepSize parameter that controls the convergence speed is set to 0.1. The rest of the parameters are explained in detail below. It is recommended however to use a batch file similar to the one supplied with the software for running the AEC to avoid typing long commands.6.1 Input and output files
The AEC uses two signals as input and produces one output signal. The inputs are the FES and the microphone (NES) signals as shown in Fig. 1. You can specify those two signals to PBFDAFAEC by giving wave file names as values for the command line arguments
inFile, and micFile, respectively. Note that the AEC will work correctly only if the audio signal in micFile is correctly aligned in time with the signal in inFile. The parameter delay is provided for that reason as mentioned below. In real-time operation this synchronization is automatically performed. PBFDAFAEC also supports many other audio file formats such as AIFF and AU.The output from the AEC is the echo-free signal RES. Use the command line argument
outFile to choose the output file name. When the near-end speech is zero (the local speaker is listening), this file should contain the micFile signal decaying in amplitude with time, until silence is heard. The performance of the AEC can be measured by inspecting this output file.6.2 filterLength
The success of the AEC in removing the echo from the Mic signal depends on many (inter-dependent) factors. To start with, for the adaptive filter to make a good estimation of the echo, the filter should have sufficient number of coefficients. The number of coefficients has direct relation with the amount of echo reduction (dB reduction). Correctly choosing the number of filter coefficients is, therefore, very important. Shorter filters result in low echo reduction and longer filters result in slow convergence.
As an example, assume that we need to obtain, say 20 dB of acoustic echo reduction in a certain room. Suppose that the room has a reverberation time of 600 ms. The reverberation time is the time it takes the sound to decay by 60 dB from its initial value when a sound source (already playing for long in the room) is suddenly stopped. This means that it takes 200 ms for the sound to decay 20 dB from its initial value. For the purpose of designing our AEC, we can then assume that the room impulse response (IR) between the loudspeaker and the microphone will have negligible values beyond the 200 ms. For the AEC to produce the 20 dB echo reduction, the filter impulse response has to be at least 200 ms long. Assuming the speech is sampled at 8kHz, this calculated in samples (or number of coefficients) becomes 0.2s * 8000 samples/s = 1600. Choosing
filterLength less than 1600 will NOT give the required 20 dB echo reduction. It is not guaranteed however that a filter of 1600 coefficients will result in the required 20 dB reduction.
6.3 stepSize
The second parameter affecting the AEC performance is the convergence time. This is the time needed for the filter to adjust itself to reach a reasonable approximation of the room impulse response between Spkr and Mic, and consequently, resulting in a reasonable echo reduction. The convergence time depends on factors such as the statistics of the FES input signal, the number of filter coefficients, and the algorithm used to adapt the filter.
The PBFDAF algorithm, like the BFDAF, convergence time is controlled by one variable, the
stepSize, and is theoretically independent of the input signal statistics, since normalization per frequency bin is performed internally in the frequency domain. The stepSize variable should always be chosen between zero and one. A stepSize = 0 will result in a filter that never converges (convergence time = infinity). A stepSize = 1, will probably result in an unstable filter (one that changes very fast and therefore diverges instead of converging). Smaller stepSize result in more accurate estimation of the echo, and therefore produces higher echo reduction. The best way to find the optimal stepSize in a specific application is by experimenting with different values and choosing the one resulting in the fastest convergence time and provides the required echo reduction.6.4 blockLen
The third factor of importance in an AEC is the delay the algorithm introduces during processing. Sample per sample time-domain algorithms such as the (N)LMS have zero processing delay, but show poor convergence behavior and need a huge computation effort for long filters, such as those used in acoustic echo cancellers. You can examine this by comparing the processing time of NLMSAEC and one of BFDAFAEC or PBFDAFAEC. Frequency-domain adaptive filters introduce latency
blockLen samples, but provide superior convergence, excellent echo reduction performance, and can be implemented very efficiently.The algorithm processing delay in PBFDAFAEC can be controlled by using the command line argument
blockLen. This means that blockLen samples are first collected before processing can start, which generates a processing delay equals to the time needed to collect the block. In embedded real-time systems that read from and write to the outside world on sample per sample basis, this delay is noticeable and in most cases is objectionable. This problem is solved in the Partitioned BFDAF algorithm (PBFDAF) by dividing the filter into smaller partitions, and therefore reducing the delay to the first partition only, which is a fraction of the delay in the BFDAF. Usually, this blockLen is chosen to be equal to the block length of the codec used before the AEC. For instance if the speech is decoded using G.723.1 decoder which has a frame length of 60 samples, the blockLen of the AEC would be chosen 60 and the filterLen 128-60=68 for maximum computational efficiency.In systems where blocks of samples are read and written to the outside world, the algorithm delay is integrated in the input/output process. This is the case for instance when a decoder is used before the AEC, which works on blocks of, samples or when a non real-time operating system such as Windows (or Linux) is used. In the latter case, the audio driver captures a buffer of samples and returns it to the application when the buffer is full. Playing the sound is also done whenever the application is ready with filling a buffer. In those cases, the algorithm delay is buried into the buffering process.
In the case of PBFDAF, a value of
blockLen = filterLength will result in the maximum computational efficiency. If this computationally optimum case results in an unacceptable delay, decrease both the blockLen and the filterLength and increase the number of partitions.
6.5 Preference
To give the system designer further control on the AEC performance, the
pref parameter has been introduced. This parameter can take a value either speed or accuracy. If speed is chosen, the AEC will use the unconstrained PBFDAF algorithm, which skips performing three Fast Fourier Transformation (FFT) computations on every partition every blockLen samples. This however is traded against a slight inaccuracy, which can be adjusted by reducing the stepSize value, if necessary. If accuracy is chosen, the constrained PBFDAF algorithm is used, which is an exact implementation of the PBFDAF, but consumes more computations.6.6 delay
This parameter is provided to adjust the FES and NES input signals to be aligned in time and to make the filter skip estimating the delay it takes the sound to travel from the loudspeaker (Spkr) to the microphone (Mic), to save calculations. As an example, suppose that Spkr is placed 68 cm from Mic. Since the sound travels in air with a speed of about 340 m/s, the sound needs (68 cm / 34000 cm/s = 2 ms) to travel from the Spkr to Mic. Assuming the speech is sampled at 8kHz, this can be expressed in samples as (0.002 s * 8000 samples/s) = 16 samples. This means that the direct sound from Spkr to Mic will appear in the filter after 16 coefficients, while the first 16 coefficients should be zeros. To save computations, the filter is shifted 16 samples using the
delay parameter so that the filter will not attempt to estimate those zeros.
6.7 print
This parameter is a flag specifying whether to print the filter coefficients to a text file or not. A value of 0 will cause PBFDAFAEC not to save the coefficients. A value of 1 will cause the coefficients to be saved into a file called aeccoef.txt. The file will be overwritten if already existed.
7. Performance

The performance of the PBFDAF AEC is shown in Fig.2. This result has been obtained by running PBFDAFAEC with FES = aecfes.wav, NES = aecnes.wav, and RES=aecres.wav, filterLength = 64, blockLen = 64, partitions = 8, and stepSize = 0.1. You will find the above three wave files in the wav directory where you installed the software package. The top graph in Fig.2 is the estimated and actual impulse responses between the loudspeaker and the microphone. The middle graph in Fig. 2 shows two speech signals, the NES, (echo or microphone signal) in blue and the echo canceller output RES, (residual) in green. The bottom graph is the energy of RES signals divided by the energy of NES and the result expressed in dB, which indicates how fast the filter converges. Note however that this quantity is meaningful only when the NES is non-zero, in other words, when there is speech to cancel.
It is clear from those graphs that the filter could establish a good estimate of the echo very quickly. The echo estimate improves with the time, which makes the residual smaller and smaller. After less than one second (8000 samples on the graph) 20 dB reduction has bee achieved. The convergence however can be controlled by the
stepSize parameter as mentioned before. 8. Commercial version

Commercial acoustic echo cancellers provided by DSP ALGORITHMS are much more complex than the one implemented in this simple version. A commercial AEC includes the following components besides the adaptive filter as shown in Fig. 3;
9. For more information
If you are interested in test driving the full version of the acoustic echo canceller, or you have any questions or comments on this document or the software described herein, please email to support@dspalgorithms.com. We will be glad to help you.