Random Quote Board
Creating a FFT Filter with Gnu Radio Companion
Acknowledgement: This post would not have been possible without Steven W. Smith's book, "The Scientist and Engineer's Guide to Digital Signal Processing. While Dr. Smith has made his book available for download for free, please support such great work by buying a hardcopy from your favorite book vendor.
There are two ways to filter a signal. One is in the time domain; the other is in the frequency domain. Either way, you're convolving the input signal with the filter impulse response. The time domain requires a lot multiplies and adds. Every sample has to be multiplied with every tap of the filter. Thus, if we have a 201 tap filter, then passing 201 samples through the filter would require 2012 multiplies, since each sample is multiplied by each tap in the filter. (NOTE: There's also the adds requiring 201 adds for each sample, thus the filter has to do 2012 multiplies and 2012 adds for each 201 samples.) Let's just say that that is a lot.
That's in the time domain. In the frequency domain, it's different. Convolution in the time domain is equivalent to multiplication in the frequency domain.
The reason that this equivalency is a really good thing can be summed up in three words: fast Fourier transform. The FFT is the super-efficient version of the discrete Fourier transform (DFT). The FFT requires many fewer calculations than would otherwise be the case. The equivalency of "convolution in time domain = multiplication in the frequency domain" plus the efficiency of the FFT combined means we can create digital filters that require much less processing than would otherwise be the case.
Speed Test of Time vs FFT Filter in GRC
From a user standpoint in Gnu Radio Companion, there's no difference between the time domain and frequency domain filters. They're straightforward blocks. For either case, we add the block to the flowgraph, adjust the parameters, and away we go.
But, from a processing standpoint, there's a big difference. I'm going to make a straightforward flowgraph to show the difference.
Running this flowgraph, we're not really interested in the display. Rather, we're interested in the terminal in the lower, lefthand corner. The messages for the "Low Pass Filter" (time domain convolution) appeared as follows:
******* MESSAGE DEBUG PRINT ********
(((rate_now . 2.55393e+07) (rate_avg . 2.55992e+07)))
************************************
The maximum sample rate on my system for this time domain filter and parameters (2.4 MHz sample rate, 20 kHz transition width, Blackman window) is roughly 25.6 MSam/sec. Switching over to the "FFT Low Pass Filter" (frequency domain convolution), I get the following message:
******* MESSAGE DEBUG PRINT ********
(((rate_now . 3.25155e+08) (rate_avg . 3.24794e+08)))
************************************
With the frequency domain convolution, the speed has increased to roughly 325 MSam/sec. On my primary desktop computer, the frequency domain convolution is over 12 times faster than the time domain convolution. This speed test is dependent on the number of taps in the filter (fewer taps may mean the time domain is faster), the computer system (CPU, motherboard, RAM, and because I'm reading from the hard drive, drive read speed), operating system, and the specific version of Gnu Radio I'm using. But the general idea is valid.
Running this same flowgraph on a slower machine (i5-5200U vs the i9-13900 of my main computer), the time domain convolution is 7.45 MSam/sec, while the frequency domain convolution is 59.3 MSam/sec. This is only roughly 8 times faster. But its still significantly faster.
FFT Filter in GRC
How does this work? What are the steps for creating such a filter? Let's start with a general block diagram.
Where did the zeropadding come from? Let's continue the discussion with that.
FFT Convolution and Zeropadding
The FFT is a block-oriented algorithm. It does not work on a sample-by-sample basis; it works on a block of samples all at the same time. The algorithm takes in a block of samples (which, depending on which description you're reading may be called an array, a vector, or a time record) representing the time domain and outputs a block of samples representing the frequency domain. The important point here is that the size of the two blocks (input time domain and output frequency domain) are the same size.
The question becomes, "How big does this block of input points need to be?" The one part that has very little adjustment is the number of filter taps. Gnu Radio uses the harris approximation to calculate the number of taps. While this is an approximation, it gives us a good idea of the number of taps required based on the sample rate, transition width, and attenuation of the window used.
The other question is, "How many signal samples do I need?" The answer is, "How many do you want?" That's right. Put in as many as you want. You can go with fewer samples per block (means more FFT blocks), or more samples per block (means fewer blocks, but more calculations per block). Just for the sake of moving things along, let's assume that we're going to use the same number of signal samples as filter taps. And let's say that our "harris approximation" says we need 101 taps. This means we're also going to use 101 signal samples.
Next question: How many samples do we need to feed into the FFT? How large will the block be? We said 101 taps and 101 samples. The output will be the response of the signal convolved with the filter. This effectively spreads out the signal in time. The length of the signal will be spread out in time by the length of the filter. That's a total of 202 points going into the FFT, right? Not quite. The maximum length of the two waveforms (filter and signal) occurs just as they come together. But the response doesn't start until they overlap at one point. When the filter starts, the first tap of the filter covers the first sample of the signal. The maximum size is (filter taps) + (signal samples) - (the one point they have in common). Given these parameters (101 filter taps, 101 signal samples), the maximum size is a convolution output of 201 points. This is the maximum size we need to worry about.
Except the two blocks, filter taps and signal samples, aren't put together and fed into a single FFT. They're separate, and each has to feed into its own FFT. Once the two are multiplied together in the frequency domain, the inverse FFT will be the convolution of the two blocks (filter taps and signal samples). The output block has to account for the spreading of the signal in time. Thus, the overall block size has to be at least equal to the (taps)+(signal)-1. How do we make up the difference for each block? Zeros. We add zeros so that each block (filter and signal) is equal in length to taps + samples -1. We're going to start with the filter taps or signal samples, then we're going to add enough zeros to each so that our output is at least equal to taps+signal-1.
The practical aspect is that we don't just add enough zeros to get us to taps+signal-1. We add this plus enough zeros to get us to a block size that is a power of 2. This makes it easy to feed into the FFT. The next power-of-2 value from 75 is 128.
For our example, we're going to take our filter taps and signal samples and zeropad each to get each block to 128 samples. Thus:
- Signal 1 will be zeropadded with 128 - 50 = 78 zeros.
- Signal 2 will be zeropadded with 128 - 26 = 102 zeros.
A Basic GRC Implementation of FFT Convolution
Let's take a look at how this would be implemented. I've created a basic Gnu Radio Companion flowgraph that will filter a set of complex samples (2.4 MHz of the FM broadcast band, centered on a FM station). I'm specifically going to filter out the center station. I'm using the following specifications for the filter:
- Cutoff frequency: 100 kHz
- Transition width: 20 kHz
- Window type: Blackman-harris (attenuation of 92 dB)
- Filter taps based on harris approximation: 501
- Signal samples: 501 (same as the filter taps)
- Size of FFT: 1001 points (501+501-1) -> 1024 points (next power-of-2)
- Zeropadding required: 1024 - 501 = 523 zeros
Here's the nice thing about the FFT convolution. Once you have the filter taps, you can pre-calculate the FFT of those taps and store those in memory. You do not have to continually calculate the FFT of the filter taps the same way you do with the signal samples. This reduces the amount of computations required for this convolution implementation. That's what I did with this flowgraph. Rather than using the "FFT" block to calculate the FFT for the filter taps, I'm going to use "Variable" blocks to create the filter taps, zeropad the filter taps, and calculate the FFT of the zeropadded taps using a "numpy" statement.
Speaking of FFT, there's one, key point I should point out. Do not window the data before calculating the FFT. Windowing is used when you're performing spectral analysis, not spectral processing. I set the window in the FFT of the flowgraph to window.rectangular(1024) (or whatever number of points you're sending to the FFT if not 1024). The rectangular window is equivalent to no window.
The Signal Gap Problem with the FFT Convolution
Here's the problem with this simple implementation. Yes, the signal is filtered. But notice that there are gaps in the signal. Look at the center graph above. Yes, it's filtered, but there are gaps of 523 samples between the filtered samples. Let's try a simple way to remove the gaps. I'll modify the flowgraph so that it extracts those filtered samples, and deletes the rest.
Yes, we've removed the zeropadding, but the signal is still not correct. Look at the frequency domain (bottom graph above). The red shows how the signal should look (it's from the standard "FFT Filter" block), but the blue is how it does look. The noise floor is much higher than it should be. If you look at the time domain graph of the filtered waveform (center graph above), you can see why. Let's zoom in on that waveform to get a better look.
Overlap-and-Add Implementation
How do we get rid of those discontinuities? The answer is overlap and add. In other words, do not extract the samples from the filtered blocks. Instead, align the samples so that the end of the samples from one block line up with the beginning samples of the next block. Through the "magic of math" (No, it's not actually magic.), those impulses go away and you wind up with a perfectly continuous waveform. Gnu Radio does not have a way to do this directly, but there's a way to do it indirectly. The answer is in two parts. The first is that, rather than try to process in one flow, split the processing into two branches. The second is that, rather than process 501 samples at a time, process 512 samples. This is one half of the block size (1024). By processing half of the samples in each branch, it should be possible to add them together at the end to create the one, continuous sample stream. Here's how the flowgraph I created looks.
The end result is a continuous, perfectly filtered (or filtered, perfectly continuous) sample stream. Note that this is not exactly how the "FFT Filter" block operates. I'm certain it does not process two streams in parallel. Since it's specifically programmed for GRC, it most likely keeps a running stream which new samples are simply added to before going to the output. (Yes, the grammar of that last sentence sucks. But... whatever.)
Summary
The point of this post was to demonstrate how FFT convolution works. The general flow is:
- Calculate the filter taps, zeropad them, then calculate the FFT of these zeropadded taps.
- Pull out a certain number of signal samples, depending on how large the FFT will be.
- Zeropad the signal samples such that it is the same number of points as the zeropadded filter taps.
- Calculate the FFT of the zeropadded signal samples.
- Multiply the two FFTs (zeropadded filter taps and zeropadded signal samples) together.
- Calculate the inverse FFT to convert back to the time domain.
- Add output of inverse FFT to previous output to create continuous stream.
- Repeat steps 2 - 7.
A couple of other points:
- To repeat, do not window the data before calculating the FFT. Windowing is used when you're performing spectral analysis and you need to get rid of the issues of spectral leakage. When you're performing spectral processing, you don't use any windowing. In GRC, the easy way to do this is to set the window to rectangular (e.g. "window.rectangular(512)"). This is the equivalent of no windowing.
- The number of signal samples is totally arbitrary. For example, even though I used 512 samples in my flowgraph above, I could have used 1024 or 2048 while using an equal number of zeros and double the number of points (so a 4096 point FFT for 2048 samples, or 8192 point FFT for 4096 samples). If I was actually programming this, I could set the number of samples to pretty much whatever I want. I could increase the the size to, say, 8192 samples. This would require adding a lot more zeros to the filter taps, but since that calculation only has to be done once, that probably is a net savings and allows the FFT to really provide much more efficiency.
That's it for now. Til next time!