An Introduction to Digital Signal Filtering for Programmers

This article will present an introduction to digital signal filtering techniques, aimed at programmers with no prior experience in signal processing. It will cover various filtering algorithms in common use, how to analyse them, how to select appropriate parameters for them, and how to implement them.

Digital signal processing (DSP) has traditionally been the domain of electronics engineers. However, with today’s high prevalence of mobile devices, as well as the increasing popularity of platforms such as Arduino and Raspberry Pi, programmers are more and more often finding themselves wanting to interpret signals obtained from sensors such as a mobile device’s built-in accelerometer or magnetometer, or any number of digital or analog sensors that might be connected to an Arduino or Raspberry Pi. DSP techniques, most notably signal filtering, are very often crucial in accurately interpreting these signals.

Some DSP Fundamentals

For those completely new to the concept of signal processing, a signal is any measurable value that can change through time or space. For example, an analog audio signal could exist as an electrical voltage which changes with time to represent sound pressure, and the movement of the pen of a seismometer is a signal representing earthquake magnitude.

A digital signal is a signal that is discretised, meaning that its values are not continuous but rather exist only at discrete points in time, and quantised, meaning that its amplitude must have a finite level of precision. Put simply, a digital signal is a signal that could be stored as an ordered sequence of numbers in a computer. (An unordered list of numbers, however, such as the heights of a sample of people in a room, is not a signal, as its values do not occur in a sequence.) Examples of digital signals could include a digital audio signal, where the value represents the amplitude of a sound wave at an instant in time, sampled as rapidly as 44,100 measurements per second; the sale price of a security listed on a stock exchange, with one value reported every 10 minutes; or the maximum air temperature measured in a city on each day of the year. A common signal that varies through space rather than time is a digital image, where the whole image can be thought of as a signal whose values are the intensities of each pixel, with those values varying with their position within the image. The field of DSP is concerned with the analysis and manipulation of digital signals.

Figure 1: An example of a signal, represented as a plot. Note that while there appears to be a continuous underlying pattern, the signal actually only exists at the discrete sample points indicated by the vertical lines and circled points.

The preceding examples all describe signals represented in what is called the time domain, where the signal is stored as a sequence of values that change through time or space. (Strictly speaking, signals that vary through space would be described as being represented in the spatial domain, but for simplicity, this can be thought of as equivalent to the time domain.) In the time domain, a digital signal ‘x’ can be written in terms of its individual values, e.g. x[0] = 0, x[1] = 0.7, x[2] = 1, x[3] = 0.7, x[4] = 0, etc. Or, if the signal is defined by a function of time, it may be written as a formula using the sample index as an independent variable, e.g. x[n] = sin(π/4 * n), where n is an integer indicating the index of the sample. It is customary to use lowercase letters for time domain representations of signals.

A signal can, however, also be represented as a combination of wave functions of differing frequencies and phases, most commonly as the sum of a series of sinusoids. Such representations are described as instead belonging to the frequency domain, and one of the most fundamental techniques used in DSP is the conversion of signal representations between these two domains. Even when such conversions are not being performed, it is important to understand that any signal can be thought of as a sum of a series of sinusoids of differing frequencies, as this means that we can meaningfully envisage manipulations of a signal that would be aimed at making specific, straightforward changes to its frequency domain representation, even when it would be difficult or impossible for a human to imagine how these changes would affect the time domain representation. Common examples of such frequency-domain manipulations would involve amplifying, filtering, or phase-shifting frequency domain components of the signal, within specific frequency ranges.

One further important concept in digital frequency domain analysis is that the frequencies it uses are normalised frequencies, not real-world frequencies, and are typically measured in radians per sample, rather than hertz (cycles per second) or radians per second. The reason for this is that DSP algorithms operate on the signal as a series of values alone, without any knowledge of the sampling rate at which those values were collected. As such, a periodic signal that repeats every ten samples could have had a real world frequency of 0.1 Hz, if the signal was recorded at a sampling rate of one sample per second, or 4.41 kHz if the signal was recorded at a sampling rate of 44,100 samples per second. This makes no difference to the algorithm. These normalised frequencies are usually expressed in values from zero to π, since zero radians per sample implies a constant value, the lowest frequency that can be represented, and π radians per sample is the highest frequency that can be represented, having a period of only two samples. Higher frequencies cannot be represented digitally; e.g. a signal with a period of one sample would not change from sample to sample, and so would be indistinguishable from a constant (zero frequency) signal.
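When the sampling rate is known, converting between real-world and normalised frequencies is straightforward: ω = 2πf/fs. A minimal sketch of this conversion follows; the class and method names here are illustrative, not from any particular library.

```java
// Sketch: converting between real-world frequency (hertz) and the
// normalised frequency (radians per sample) used in DSP analysis.
public class NormalisedFrequency {
    // omega = 2 * pi * f / fs, meaningful for 0 <= f <= fs / 2
    public static double toRadiansPerSample(double hertz, double samplingRateHz) {
        return 2.0 * Math.PI * hertz / samplingRateHz;
    }

    public static double toHertz(double radiansPerSample, double samplingRateHz) {
        return radiansPerSample * samplingRateHz / (2.0 * Math.PI);
    }

    public static void main(String[] args) {
        // A signal repeating every ten samples has normalised frequency
        // 2*pi/10 ~= 0.63 rad/sample, regardless of the sampling rate.
        double omega = toRadiansPerSample(4410.0, 44100.0);
        System.out.println(omega);
        // The same ten-sample period recorded at 1 sample/s is 0.1 Hz.
        System.out.println(toHertz(omega, 1.0));
    }
}
```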

Frequency   Example time domain representation
0           1, 1, 1, 1, 1, 1, 1, 1
π/4         1.0, 0.7, 0.0, -0.7, -1.0, -0.7, 0.0, 0.7
π/2         1, 0, -1, 0, 1, 0, -1, 0
π           1, -1, 1, -1, 1, -1, 1, -1

Table 1: Time domain and frequency domain representations of signal components.

Why use DSP?

The information contained within signals is very often not straightforward to extract. The signal may contain noise, which could be random or could follow some pattern, such as mains electric hum, which occurs as a 50 or 60 Hz signal added as noise to the actual signal of interest, as a result of magnetic induction. The signal may be provided at a rate that is faster than desired, and require conversion to a slower rate. Or one might be looking for complicated patterns within the signal, such as when pinpointing and classifying heart rhythms within an electrocardiogram signal. Alternatively, the goal may be to manipulate the signal, rather than extract information from it, as is the case with audio equalisation, and with many different image processing applications. 

DSP gives us the tools to solve these problems, as well as many others. This article focuses on a particular class of tool - digital filters.

Signal Filtering

A process which modifies a signal is called a “filter”. Filters are commonly used to remove unwanted components of a signal, to adjust the relative magnitudes of its different components, or to separate multiple desired components that exist within the same signal.

Filters may be described as acting either in the time domain or the frequency domain, depending on which domain makes more sense to use to describe the information that the filter is intended to extract or remove. Looking for short-term trends in stock prices calls for a time-domain smoothing filter, whereas removing a 60 Hz mains power hum from an electrical signal calls for a frequency-domain band-reject filter. Filters need not fall strictly into one of these categories, but the stronger the filter’s performance in one domain, the less useful it will be in the other. As the time and frequency domain representations are merely different representations of the same signal, time domain filters will still affect the frequency domain representation, and vice-versa. The differences lie in how intuitive and effective their behaviour is in each domain. Time domain filters are good at making changes that are easily visualised in the time domain, such as smoothing or noise reduction, but tend to be very poor at separating frequency components, and their effects may be difficult for a human to visualise in the frequency domain. Conversely, frequency domain filters are very effective at separating frequency components, but not so useful for random noise reduction, and their effects may be difficult for a human to visualise in the time domain.

Filters intended to make frequency domain changes to a signal are described in terms of how they amplify or attenuate the different frequency components that make up the signal. A filter is said to “cut”, “stop”, or “reject” the frequencies which it attenuates, and “pass” the frequencies which it does not. As such, a “low-pass” filter is one which ideally removes high frequency components while not attenuating low frequency components, and a “high-pass” filter does the opposite. Filters may also pass or cut only a middle range of frequencies, in which case they are called “band-pass” or “band-reject”.

Frequency-domain filters are not usually capable of the ideal behaviour of completely stopping all frequencies on one side of a specified cutoff frequency, while leaving those on the other side unmodified. In reality, they will typically have a transition range between the frequencies being passed and stopped, where the signal is only partially attenuated. They will also not be capable of completely stopping all undesired frequency components, and will instead merely attenuate them to a satisfactory level, and may also cause some undesired attenuation in the frequencies which they are intended to pass. Filters may also modify the phase of the different frequency components of the signal, though much more often as a side-effect than as a deliberate action.

Figure 2: Frequency-response plots of the basic frequency-domain filter types.

The preceding charts are frequency-response plots - a tool for the analysis or visualisation of frequency domain filters, which shows the gain or attenuation that the filter causes to the different frequency components of a signal. Other useful graphical tools for filter analysis include the impulse response and step response plots, which are time-domain plots that show, respectively, the effect that a filter has on an impulse input - a single non-zero value surrounded by all zero values - and a step input - a series of zero values followed by a sudden change to constant non-zero values. The impulse and step response plots are useful for visualising how the filter will affect sudden changes in the input, and are still useful for frequency-domain filters, as it is often desirable to know, but very difficult to intuitively predict, how changes to the frequency components would affect a sudden, non-periodic event like an impulse or step, as might occur if a loud bang were present in an audio signal, or when a change of orientation suddenly shifted the gravity component of an accelerometer signal.

An example application that we will consider is the case of an accelerometer which produces a signal measuring the acceleration of a device, but which is also affected by the constant force of gravity. It would usually be desirable to remove the (constant, piecewise constant, or low frequency) gravity component, leaving only the movements of the device, or to cut out all movements, leaving only the gravity component of the original signal, in order to determine the orientation of the device. These actions could be best described in either the time domain or the frequency domain, depending on whether the direction of gravity is fixed, subject to sudden changes, or subject to periodic movement. A non-changing gravity component, or one that changes with a constant frequency, is easily described and filtered in the frequency domain, where it can simply be described as a zero-frequency component. However, this is not the case if the gravity component is piecewise constant - remaining constant only for a period of time before changing to a different constant value, as would occur if the device were held in one orientation, and then changed to a different orientation. In the latter case, a frequency-domain filter might produce unexpected behaviour when the device’s orientation suddenly changes, something which could be assessed by looking at the step-response of the filter. The terms “DC pass” and “DC removal”, names which originate from analog electronic signal filtering, will be used to describe the passing or removal of a constant or piecewise constant component of a signal.

The One Pole Low Pass Filter

This brings us back to the initial motivation for this article, which was a lecture from The University of Maryland’s Programming Mobile Applications for Android Handheld Systems course, on Coursera, which, while describing how to utilise a device’s sensors, provided a very simplistic approach to filtering data from an accelerometer, with the aim of separating the raw signals into a constant force component (gravity) and transient force components (movement). While this simple filter was adequate for demonstrating the concepts of low-pass and high-pass filtering, in practice, much better results can be achieved with some additional effort.

    // Filter parameter, selected from values 0<mAlpha<1. A larger value gives
    // stronger attenuation, but slower reaction to changes.
    private final float mAlpha = 0.8f;

    /** Low-pass filter: Deemphasize transient forces
     *
     *  @param current_input The most recent input of the signal to be filtered.
     *  @param prev_output The previous output value from this filter.
     *  @return A single filter output value.
     */
    private float lowPass(float current_input, float prev_output)
    {
        return prev_output * mAlpha + current_input * (1 - mAlpha);
    }

    /** High-pass filter: Deemphasize constant forces. The low-pass filter
     *  ({@link #lowPass(float, float)}) must be called first for each input
     *  value.
     *
     *  @param current_input The most recent input of the signal to be filtered.
     *  @param prev_lp_output The previous output value from function lowPass.
     *  @return A single filter output value.
     */
    private float highPass(float current_input, float prev_lp_output)
    {
        return current_input - prev_lp_output;
    }

Java code provided in the course, modified for clarity. The original code is available on GitHub.

This filter works by calculating each value in the output as a sum of a proportion of the current input value and the previous output value, using a parameter α to determine the proportion of each, with a value of 0.8 used for α in the code provided. So the filtering algorithm can be written as follows, writing our raw input signal as x[n], and the filter output as yLP[n].

$$\begin{align}
y_{LP}[n] = α \times y_{LP}[n-1] + (1-α) \times x[n] && \text{(eq 1)}
\end{align}$$

Because each output value is dependent on both the current input and the previous output value, and that previous output value was dependent on the previous input value and the output value before it, and so on, the output of the filter is therefore dependent on every single previous input value. Such filters are called “feedback”, “recursive”, or “infinite impulse response” (IIR) filters, with that last name referring to the fact that an impulse input, a single non-zero value with all other values zero, will cause the output to become and remain non-zero (although still decaying towards zero) indefinitely. The output is then a weighted average of all previous input values, where the contribution of each input value diminishes as an exponential decay as that value gets older. This specific filter is called a one (or single) pole low-pass filter, or 1P-LP for short. (The reason for this name is beyond the scope of this article.)
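To make the exponential weighting concrete, the following sketch (class and method names are my own, not from the course code) shows that running eq 1 over a signal produces exactly the same result as explicitly summing every past input with a weight of (1-α)α^k, where k is the age of the input in samples:

```java
// Sketch: eq 1 unrolled. Each past input x[n-k] contributes with weight
// (1 - alpha) * alpha^k - an exponentially decaying weighted average.
public class OnePoleWeights {
    public static double filterDirect(double[] x, double alpha) {
        double y = 0.0; // previous output, taken as zero before the signal starts
        for (double sample : x) {
            y = alpha * y + (1 - alpha) * sample; // eq 1
        }
        return y; // final output value
    }

    public static double filterAsWeightedSum(double[] x, double alpha) {
        double sum = 0.0;
        for (int k = 0; k < x.length; k++) {
            // k = 0 is the newest sample; older samples decay as alpha^k
            sum += (1 - alpha) * Math.pow(alpha, k) * x[x.length - 1 - k];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5};
        // Both formulations give the same output value.
        System.out.println(filterDirect(x, 0.8));
        System.out.println(filterAsWeightedSum(x, 0.8));
    }
}
```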

So how effective is this filter, and how should the parameter α be chosen? We will begin by looking at the filter’s response in the time domain, since that does not require the introduction of any new concepts, and can be performed simply by selecting input signals to be fed through the filter, and observing how they come out. An obvious initial assessment of any low-pass filter is to determine the extent to which it removes high frequency components of a signal. As a simple, time-domain test of this, we can feed into the filter the highest frequency that can be represented, i.e. a signal that repeats every two samples, such as [1,-1,1,-1,1,-1…], and compare the magnitude of the output, once it has reached a near-steady state, to that of the input. The results of this input can be seen in Figure 3. (Time is needed for the system to approach steady-state, because the output of the filter depends on previous output values, and those previous values are non-existent when the input signal starts, and so will be taken as zero. As the filter runs, those past zero values still affect the new output values, but they become less and less significant, and as long as the same, periodic input signal is maintained, the filter’s output gradually approaches a steady state periodic signal.)

Figure 3: Output of 1P-LP filter for high frequency input. Note that the continuous sinusoids pictured are not part of either the input or output signals, but rather have been added in order to assist in visualising how a digital signal like [1,-1,1,-1,...] could have been sampled from a sinusoid. As far as the digital filter is concerned, the individual samples, represented as circles in the figure, need not have actually originated from any continuous signal.

Two things that are apparent from Figure 3 are that the filter’s response to this high frequency signal is to attenuate it, but not completely, with the degree of attenuation increasing as α increases; and that it takes some time for the filter to adjust to a new input once that input has begun, and to settle into a steady state, though it is difficult to accurately determine from this image how long it will take to settle. Considering first the degree of attenuation, this filter appears to only attenuate the highest possible frequency input to around 10% of its input magnitude for the default value of α = 0.8. That seems quite inadequate for a low-pass filter, which one might reasonably expect to remove almost the entirety of the highest possible frequency component of a signal. How does this attenuation vary with the choice of α? While frequency domain analysis of the filter could tell us the exact attenuation provided at any frequency, it is fairly straightforward to analyse the particular case of the highest possible frequency input in the time domain, without needing to introduce any new concepts. This can be done by solving simultaneous equations for different samples of Equation 1, using an input signal x = [1,-1,1,-1...] and finding ouput values that satisfy yLP[n+2] = yLP[n] and yLP[n+3] = yLP[n+1]. (I.e. Finding output values that repeat every two samples without otherwise changing, implying that the system is in steady state when these values are reached.)

$$\begin{align}
&\text{From eq 1:} \\
y[2] &= α y[1] + (1-α) x[2] & \text{(eq 2.0)} \\
y[3] &= α y[2] + (1-α) x[3] & \text{(eq 2.1)} \\
&\text{Inserting eq 2.0 into eq 2.1:} \\
y[3] &= α (α y[1] + (1-α) x[2]) + (1-α) x[3] \\
&\text{Applying }y[3] = y[1]\text{, }x[3] = x[1]\text{, and }x[2] = -x[1]\text{:} \\
y[1] &= α (α y[1] + (1-α) (-x[1])) + (1-α) x[1] \\
&= α^2 y[1] + (1-α) (1-α) x[1] \\
&\text{Solving for } y[1] \text{:} \\
y[1] &= \frac{(1-α) (1-α)}{1-α^2} x[1] \\
&= \frac{(1-α) (1-α)}{(1-α) (1+α)} x[1] \\
&= \frac{1-α}{1+α} x[1] \\
\frac{y[1]}{x[1]} &= \frac{1-α}{1+α} \\
&\text{Since } y[n] = -y[n+1] \text{ and } x[n] = -x[n+1] \text{, } \frac{y[1]}{x[1]} = \frac{y[n]}{x[n]} \\
\therefore \frac{y[n]}{x[n]} &= \frac{1-α}{1+α} & \text{(eq 2.2)}
\end{align}$$
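Equation 2.2 can also be checked numerically, by running eq 1 on the alternating input until it settles and measuring the output magnitude. A quick sketch (names are illustrative):

```java
// Sketch: numerically verifying eq 2.2 by feeding the 1P-LP filter the
// highest-frequency input [1, -1, 1, -1, ...] until it reaches steady state.
public class HighFreqAttenuation {
    public static double steadyStateGain(double alpha, int iterations) {
        double y = 0.0;
        double x = 1.0;
        for (int n = 0; n < iterations; n++) {
            y = alpha * y + (1 - alpha) * x; // eq 1
            x = -x;                          // alternate the input each sample
        }
        return Math.abs(y); // output magnitude once settled
    }

    public static void main(String[] args) {
        double alpha = 0.8;
        double measured = steadyStateGain(alpha, 1000);
        double predicted = (1 - alpha) / (1 + alpha); // eq 2.2
        // Both values are ~0.111: only ~89% of the worst-case input is removed.
        System.out.println(measured);
        System.out.println(predicted);
    }
}
```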

Figure 4: Maximum attenuation of 1P-LP filter, for varying alpha.

So it appears that we can achieve any amount of high frequency attenuation we desire with this filter, by choosing values of α that are close to 1. However, in order to obtain good high frequency attenuation, α would need to be very close to 1. For example, in order to obtain attenuation down to 0.1%, α would need to be 0.998.
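Since eq 2.2 gives the worst-case attenuation as A = (1-α)/(1+α), it can be inverted to choose α for a desired attenuation: α = (1-A)/(1+A). A small sketch of this (class and method names are my own):

```java
// Sketch: inverting eq 2.2 to choose alpha for a desired attenuation A
// of the highest representable frequency.
public class AlphaForAttenuation {
    public static double alphaFor(double attenuation) {
        return (1 - attenuation) / (1 + attenuation);
    }

    public static void main(String[] args) {
        System.out.println(alphaFor(0.1));   // ~0.818: attenuate to 10%
        System.out.println(alphaFor(0.001)); // ~0.998: attenuate to 0.1%
    }
}
```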

Now, what about the time needed for the filter to adjust to a sudden change in input, which we also mentioned earlier? In order to analyse that, a very useful tool is the step response of the filter. This is simply the output of the filter when given a step input - a signal that is initially zero, and then suddenly changes to a constant non-zero value and remains at that value indefinitely. For the intended application - a DC pass filter - the ideal step response would be for the step to be passed without modification. The step response of this filter is shown in Figure 5.

Figure 5: 1P-LP filter step response

Immediately obvious is that the choice of α has a huge effect on how quickly the filter responds to sudden changes in input, with the response becoming slower and slower as α approaches 1. So we have a dilemma where choosing a low value of α results in terrible high frequency attenuation, and choosing a high value gives terribly slow responses to sudden changes in the input. Such trade-offs are very common in filter design: There is never any one filter that does everything well.

Similarly, the choice of α also affects the impulse response - the response to a single non-zero input surrounded by all zeros. For this application, the impulse would ideally be completely blocked. The response, seen in Figure 6, shows that greater values of α result in less of the impulse being passed, but the output is perturbed for longer, whereas for small values of α, a larger proportion of the impulse is initially passed, but then the output returns to zero more quickly.

Figure 6: 1P-LP filter impulse response

It’s difficult to quantify how long the filter takes to settle after being perturbed by a step or impulse input, as, being an infinite impulse response filter, it will take an infinite amount of time to completely settle, regardless of the value of α. The only way to measure it is to pick a threshold beyond which it can be deemed to have settled sufficiently. Then, having chosen a threshold, one of α or the number of samples required for the filter to settle can be calculated from the other using the following equations, where p is the proportion by which the filter has settled (i.e. p=0.95 means the filter has settled by 95%) and k is the number of samples taken for the filter to settle.

$$\begin{align}
\alpha^k &= 1-p \\
\alpha &= (1-p)^\frac{1}{k} \\
k &= \log_\alpha(1-p) \\
\end{align}$$
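These relationships are simple to evaluate in code. A sketch (names are illustrative):

```java
// Sketch: computing how many samples the filter needs to settle to a
// proportion p, or what alpha achieves settling within k samples.
public class SettlingTime {
    public static double samplesToSettle(double alpha, double p) {
        // k = log_alpha(1 - p) = ln(1 - p) / ln(alpha)
        return Math.log(1 - p) / Math.log(alpha);
    }

    public static double alphaToSettleIn(int k, double p) {
        // alpha = (1 - p)^(1/k)
        return Math.pow(1 - p, 1.0 / k);
    }

    public static void main(String[] args) {
        // With the default alpha = 0.8, settling to 95% takes ~13.4 samples.
        System.out.println(samplesToSettle(0.8, 0.95));
        // To settle to 95% within 50 samples, alpha must be at most ~0.942.
        System.out.println(alphaToSettleIn(50, 0.95));
    }
}
```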

One final analysis tool that we’ll apply to this filter is the frequency magnitude response, which was mentioned earlier in the context of frequency-domain filters. This plot shows the amplification or attenuation that the filter causes to the magnitude of a frequency component of a signal at any possible frequency, and thereby offers a very useful visualisation of how the filter affects different frequency inputs or different frequency components of an input. The frequency magnitude plot shows the output magnitude (as an amplification or attenuation relative to the input magnitude) on the Y-axis, and the frequency on the X-axis. The frequency response can be derived using a mathematical technique called the “Z-transform”, which converts the time domain representation of a filtering algorithm into a frequency domain representation in the form of a complex mathematical function. As that derivation involves a lot of new mathematical concepts that would need to be introduced, for this article, I’ll provide the frequency responses of the filters described without explaining how to derive them, presenting them as a means of comparing the behaviour of specific filters, rather than as a general tool for filter analysis. For our 1P-LP filter, the frequency magnitude response is shown in Figure 7.

Figure 7: 1P-LP filter frequency-magnitude response

We can see from the 1P-LP filter’s frequency response that the filter attenuates, to some degree, all frequencies above the zero frequency, constant component of the signal, with increasing values of α resulting in more and more of the lower frequencies being cut, as well as the high frequency attenuation also increasing. This reveals yet another problem with this filter - that the single parameter α is also the only control we have over what range of frequencies is filtered out, and it only offers very imprecise control. We cannot set a precise cut-off frequency at which the attenuation begins, and because the degree of high frequency attenuation varies with α, it is not possible to create a filter that passes a wider range of low frequencies but still has good high frequency attenuation.

So this filter is not really a true frequency domain filter, despite what its name suggests, as it is not able to separate frequency components with any real precision, and has very poor high-frequency attenuation. It does have some time-domain filtering abilities, in that it can act as a DC-pass filter (as seen from its step response) and can suppress an impulse input, though it does not do either of these things particularly well. It does, however, have several benefits: It is simple to implement, fast to execute (requiring only two multiplication operations), and barely delays the signal from input to output.
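Incidentally, eq 1 can be rearranged into an algebraically identical form that needs only one multiplication per sample: y[n] = y[n-1] + (1-α)(x[n] - y[n-1]). A sketch of a stateful implementation (the class is my own, not from the course code):

```java
// Sketch: the 1P-LP filter of eq 1, rearranged to use a single
// multiplication per sample. Algebraically identical to the
// two-multiply form in the course code.
public class OnePoleLowPass {
    private final double oneMinusAlpha;
    private double prevOutput = 0.0;

    public OnePoleLowPass(double alpha) {
        this.oneMinusAlpha = 1 - alpha;
    }

    public double filter(double input) {
        // Equivalent to: prevOutput = alpha * prevOutput + (1 - alpha) * input
        prevOutput += oneMinusAlpha * (input - prevOutput);
        return prevOutput;
    }

    public static void main(String[] args) {
        OnePoleLowPass lp = new OnePoleLowPass(0.8);
        double y = 0.0;
        for (int i = 0; i < 4; i++) {
            y = lp.filter(1.0); // feed in a step input
        }
        // Step response after 4 samples: 1 - 0.8^4 ~= 0.5904
        System.out.println(y);
    }
}
```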

Before considering alternative low-pass filters, we will look at the behaviour of the high pass filter used in the previous code example.

The One Pole, One Zero High Pass Filter

The code we previously introduced contained a high pass algorithm in addition to the 1P low pass algorithm that we just investigated. The implementation of the high pass filtering was extremely simple - it merely took the output of the low pass filter, and subtracted it from the input. The algorithm is shown in equation 3.

$$\begin{align}
y_{HP}[n] = x[n] - y_{LP}[n] && \text{(eq 3)}
\end{align}$$

The reasoning behind the design of this filter is that if the input contains all frequency components, and the low pass filter output contains only the low frequency components, then the difference between the two should be the high frequency components. So does this work? Well, no, not always. The problem is that filters don’t just attenuate different frequency components of the input signal, they usually also delay them, and by differing amounts. This delay is often thought of as a phase shift, in that a frequency which has a period of 12 samples, if it were delayed by 4 samples, could be thought of as having been phase shifted by 120°. This is illustrated in Figure 8.

Figure 8: An example of a filter causing phase shift. Notice how the peaks of the output signal do not occur at the same time as those of the input signal. This is because the output signal has been delayed, causing a phase shift of 120°.

Subtracting a phase-shifted frequency from the original signal is not likely to have the desired effect. In fact, if the absolute phase shift is greater than 60°, subtracting the phase-shifted frequency component from the original signal will actually result in amplifying that frequency component, rather than attenuating it. So one could conceivably create a low-pass filter which had the side-effect of phase shifting low frequencies, and then attempt to modify it into a high-pass filter by subtracting the low-pass output from the input, and actually end up with a filter that amplified low frequencies rather than cutting them out! This can be seen in Figure 8, where the difference between the input and output signals actually has a greater magnitude than the original input signal.
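This effect is easy to demonstrate numerically: the residual sin(t) - sin(t - φ) has amplitude 2·sin(φ/2), which equals 1 at exactly φ = 60°, and √3 ≈ 1.73 at φ = 120°. A sketch (names are illustrative):

```java
// Sketch: subtracting a phase-shifted copy of a sinusoid from the original.
// The residual has magnitude 2 * sin(phi / 2), which exceeds 1 (i.e. the
// subtraction amplifies) once the phase shift phi passes 60 degrees.
public class PhaseShiftSubtraction {
    public static double residualMagnitude(double phaseShiftRadians) {
        // Peak of sin(t) - sin(t - phi), found by scanning one full period
        double max = 0.0;
        int steps = 100000;
        for (int i = 0; i < steps; i++) {
            double t = 2 * Math.PI * i / steps;
            double diff = Math.sin(t) - Math.sin(t - phaseShiftRadians);
            max = Math.max(max, Math.abs(diff));
        }
        return max;
    }

    public static void main(String[] args) {
        // 60 degrees: subtraction neither amplifies nor attenuates (magnitude 1)
        System.out.println(residualMagnitude(Math.PI / 3));
        // 120 degrees: the "filtered" signal is ~1.73x larger than the input
        System.out.println(residualMagnitude(2 * Math.PI / 3));
    }
}
```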

Now, despite the existence of phase shift, it is still possible to use this technique, provided one condition is met for it to work perfectly: The original filter must either introduce no phase shift, or must introduce a constant delay that does not vary between the different frequency components of the signal (a property known as “linear phase”), with the input signal delayed by the same amount before the original filter’s output is subtracted.

IIR filters like our 1P-LP filter can reasonably be assumed to introduce non-linear phase shifts, meaning that the delays that these filters introduce are not constant, but rather vary with frequency. Fortunately, however, the result of breaking this condition is not necessarily catastrophic; it merely means that the resulting filter will not have a frequency magnitude response that is a perfect inversion of the original filter. In this case, it does actually still yield a high pass filter, just not one whose filtering is the exact opposite of the low pass filter from which it was derived. Specifically, the sum of the response magnitudes of the mid-range frequencies of the low-pass filter and those of the high-pass filter would be a little greater than one, but the lowest and highest frequencies still behave as desired. This is because the lowest frequency, being constant rather than periodic, cannot be phase shifted, and the highest frequency, having a period of only two samples, could only possibly be phase shifted by zero or 180°; in the case of the 1P-LP filter, the phase shift at the highest frequency happens to be zero. (This can be seen in Figure 3.)

The filter that is formed by this operation is called a one (or single) pole, one (or single) zero high-pass filter, or 1P1Z-HP for short. (Again, the reason for this name is beyond the scope of this article.) Merely subtracting the 1P-LP filter output from the input signal does however give a somewhat naive implementation of this filter, due to the inadequate high frequency attenuation of the 1P-LP filter (as seen in Figures 3, 4 and 7) being passed on to this new filter, which results in the high frequencies being partially attenuated, rather than maintaining the same magnitude, as one would expect from a high-pass filter. Luckily, this inadequacy can be compensated for by amplifying the output of the high-pass filter to ensure no gain or attenuation at the highest frequency. The high frequency attenuation of the 1P-LP filter was given in Equation 2.2, and if this attenuated magnitude is subtracted from the input, then the resulting attenuation in the high-pass filter will be one minus the high frequency attenuation of the low-pass filter. We can compensate for this by dividing the high-pass output by this attenuation factor, as long as we do not use the resulting output as a feedback input into the high-pass filter. (If the naive high-pass filter were implemented by feeding its own output back into itself, rather than using the output of the 1P-LP filter, then that implementation would expect to receive the naive output through feedback, rather than the amplified output. Supplying the amplified output instead would cause subsequent outputs to be incorrectly calculated.) That gives the following equation for an improved high-pass filter.

$$\begin{align}
y_{HP}[n] &= (x[n] - y_{LP}[n])(1 - \frac{1-α}{1+α})^{-1} \\
&= (x[n] - y_{LP}[n])(\frac{1+\alpha}{1+\alpha} - \frac{1-\alpha}{1+\alpha})^{-1} \\
&= (x[n] - y_{LP}[n])(\frac{2α}{1+\alpha})^{-1} \\
y_{HP}[n] &= (x[n] - y_{LP}[n]) \frac{1+\alpha}{2\alpha} & \text{(eq 4.0)}
\end{align}$$

We can also modify this algorithm, if desired, to have it directly calculate the high-pass output, rather than having to calculate the low-pass output first.

$$\begin{align}
& \text{Inserting Eq 1 into Eq 4.0:} \\
y_{HP}[n] &= (x[n] - (\alpha y_{LP}[n-1] + (1-\alpha)x[n])) \frac{1+\alpha}{2\alpha} \\
&= (\alpha x[n] - \alpha y_{LP}[n-1]) \frac{1+\alpha}{2 \alpha} \\
&= (x[n] - y_{LP}[n-1]) \frac{1+\alpha}{2} & \text{(eq 4.1)} \\
& \text{Time delaying eq 4.0 by one sample:} \\
y_{HP}[n-1] &= (x[n-1] - y_{LP}[n-1]) \frac{1+\alpha}{2 \alpha} \\
& \text{Rearranging to get in terms of } y_{LP}[n-1] \text{:} \\
y_{LP}[n-1] &= x[n-1] - y_{HP}[n-1] \frac{2 \alpha}{1+\alpha} & \text{(eq 4.2)} \\
& \text{Inserting eq 4.2 into eq 4.1:} \\
y_{HP}[n] &= (x[n] - (x[n-1] - y_{HP}[n-1] \frac{2\alpha}{1+\alpha})) \frac{1+\alpha}{2} \\
y_{HP}[n] &= \frac{1+\alpha}{2}(x[n] - x[n-1]) + \alpha y_{HP}[n-1] & \text{(eq 4.3)}
\end{align}$$

Equation 4.3 describes an implementation of this filter similar to that of the 1P-LP filter, in that it can now be calculated from the input and its own previous output values. One notable difference from the 1P-LP algorithm, however, is that the 1P1Z-HP filter uses the previous input value in addition to the current input value and the previous output value.
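Equation 4.3 translates directly into code. The following is a minimal sketch in Python (the function name is my own, and the signal is assumed to be zero before the first sample):

```python
def highpass_1p1z(x, alpha):
    """One-pole, one-zero high-pass filter, per eq 4.3:

        y[n] = (1 + alpha)/2 * (x[n] - x[n-1]) + alpha * y[n-1]
    """
    y = []
    x_prev = 0.0  # assume the signal was zero before the first sample
    y_prev = 0.0
    for x_n in x:
        y_n = (1 + alpha) / 2 * (x_n - x_prev) + alpha * y_prev
        y.append(y_n)
        x_prev, y_prev = x_n, y_n
    return y
```

For a step input with α = 0.5, the output jumps to (1 + α)/2 = 0.75 at the step and then decays towards zero by a factor of α each sample, matching the step response described below.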

The frequency magnitude responses of the naive and improved high-pass filters are shown in Figure 8.

Figure 8: Frequency response of naive and improved implementations of the 1P1Z-HP filter for varying alpha.

The frequency response clearly shows that the improved algorithm is far better at passing high frequencies than the naive implementation, especially at the lower values of α. Interestingly, the responses at the lowest and highest frequencies are now the same for all values of α, with only the mid ranges varying. While this behaviour is slightly better than what we saw with the 1P-LP filter, we are still not able to set a precise cut-off frequency above which there is no attenuation, and so this filter is not capable of separating arbitrary frequencies, as one might expect of a true frequency-domain high-pass filter.


Next, we look at the step and impulse responses of the high-pass filter. The ideal behaviour, if our aim is DC-removal, would be for the step to be blocked entirely, and the impulse to be passed. The filter’s actual responses (using the improved algorithm) are shown in Figures 9 and 10.

Figure 9: Step response of the 1P1Z filter for varying alpha.
Figure 10: Impulse response of the 1P1Z filter for varying alpha.

So it seems that greater values of α give better performance for the impulse response, allowing more of the impulse to be passed and less perturbation to remain after the impulse, but cause a greater proportion of the step to also be passed, and result in a longer period of perturbation following the step. This should not be surprising, given that the low-pass filter took longer to adapt to the step change for high values of α, and the high-pass filter’s output depends on the difference between the input and the low-pass output. It’s also worth noting that this sort of impulse response - a reduced impulse surrounded by or followed by a series of negative values - is common for DC-removal filters. The negative values result from the filter interpreting a small portion of the impulse as being a DC (constant or piecewise constant) component, and attempting to filter it out by subtracting that amount from the signal, even when the input has returned to zero.

So in summary, while the 1P-LP and 1P1Z-HP filters perform opposite functions, they have very similar strengths and weaknesses, in that they both have some usefulness in time-domain filtering, but little ability to separate frequencies, and are simple to implement and do not significantly delay the signal. Next, we will look at a completely unrelated filter, one which has a finite, rather than infinite impulse response.

The Moving Average Low Pass Filter

Enter the moving average (MA) filter, another common and simple time-domain filter for noise reduction and DC-passing. The MA filter takes the current input value, as well as a fixed number of previous input values, and averages them. The number of input values used in the averaging is termed the “length” of the filter, denoted “M” in the equation below.

$$y_{MA}[n] = \frac{1}{M} \sum\limits_{m=0}^{M-1} x[n-m]$$

The moving average filter falls into a different family than the previously discussed filters, in that it is a finite impulse response (FIR) filter, rather than an infinite impulse response (IIR) filter. This means that an impulse input will only cause the filter’s output to become non-zero for a finite period of time, as opposed to IIR filters, where an impulse input will cause the filter’s output to become non-zero and then decay towards zero forever. It also implies that the output of the filter depends only on a finite number of previous input values, and not on previous output values, so its implementation does not rely on feedback or recursion. In general, FIR filters offer superior filtering, but at the cost of greater computational expense and of introducing long delays between input and output. (These delays are unavoidable and independent of processing speed: the filter must wait for a number of samples following an event such as a step before the effects of that event are seen in the output.)

Despite the description of this filter as “non-recursive”, its most efficient implementation is actually a recursive one. This is unusual for FIR filters, but in the case of the moving average, the fact that all inputs used are weighted equally makes it possible to maintain a running sum, without needing to add all M previous inputs for each sample. The recursive implementation can be described as follows.

$$\begin{align}
y_{SUM}[n] &= y_{SUM}[n-1] + x[n] - x[n-M] \\
y_{MA}[n] &= \frac{y_{SUM}[n]}{M}
\end{align}$$
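As a sketch of this recursive implementation in Python (the function name is my own, and inputs before the start of the signal are taken as zero):

```python
from collections import deque

def moving_average(x, M):
    """Recursive moving-average filter: maintain a running sum rather than
    re-adding all M inputs at every sample. Inputs before x[0] are zero."""
    window = deque([0.0] * M, maxlen=M)  # the last M inputs, oldest first
    running_sum = 0.0
    y = []
    for x_n in x:
        running_sum += x_n - window[0]  # add the new input, drop x[n-M]
        window.append(x_n)              # oldest value is evicted automatically
        y.append(running_sum / M)
    return y
```

Each sample costs one addition and one subtraction regardless of M, instead of the M - 1 additions the direct sum would need.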

So how does the filter perform? We’ll look at its frequency-domain behaviour first.

Figure 11: Frequency response of moving average filter for varying filter lengths.

The frequency response plot for this filter shows that it has very poor high-frequency attenuation, with huge amounts of ripple (i.e. its attenuation repeatedly decreases and increases with changing frequency). This shows that it should definitely not be regarded as a frequency-domain filter - it would be near useless for separating different frequency components.

As a curiosity, the source of the ripple is easy to visualise: any frequency whose period divides evenly into the filter length will be completely removed, as the output of the filter will be the sum of a whole number of cycles of that signal component, and the sum of a whole cycle of any sinusoid is zero. For example, if the signal [0, 1, 0, -1] is input to a MA filter of length four, the output must be the sum of those four values, which is zero. Similarly, the signal [1, -1] would be added twice and would also cancel to zero, as (1 + (-1) + 1 + (-1)) = 0.
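This cancellation is easy to check numerically. A small sketch (variable names are my own):

```python
# One cycle of a period-4 component; its samples sum to zero.
cycle = [0.0, 1.0, 0.0, -1.0]
signal = cycle * 5  # several cycles of this component

# Every length-4 window of the signal is a cyclic shift of one whole
# cycle, so a length-4 moving average removes the component entirely.
outputs = [sum(signal[i:i + 4]) / 4 for i in range(len(signal) - 3)]
print(all(v == 0.0 for v in outputs))  # True
```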

As for the filter’s time-domain behaviour, being a DC-passing filter, we would ideally like it to completely block an impulse, but pass a step unmodified. The filter’s impulse and step responses are shown in Figures 12 and 13.

Figure 12: Impulse response of moving average filter for varying filter lengths. Note: Lines interpolating the signal values have been added for clarity.

The filter’s impulse response is to reduce the amplitude of the impulse to 1/M, while outputting this same value for M samples. This is similar to the behaviour seen in the 1P-LP filter, except that the perturbation following the impulse has a consistently low magnitude for a finite number of samples, rather than starting at a more moderate magnitude and then decaying toward zero over an infinite number of samples. As such, the effect of the impulse on the output of the filter is generally less noticeable with the MA filter.

Figure 13: Step response of moving average filter for varying filter lengths. Note: Lines interpolating the signal values have been added for clarity.

The effect on the step response is to slow the step from an instantaneous jump to a linear increase, with a longer filter resulting in a longer rise. Again, this parallels the behaviour of the 1P-LP filter, where greater values of α caused the filter to take longer to catch up with a step change in the input. The major difference between the two is that the MA filter rises linearly in response to a step input, rather than as an exponential decay. The rise time of the MA filter is also easily predictable - it is simply equal to the length of the filter, as for a filter of length M, once M samples have occurred at the new input value, the old input value no longer affects the output of the filter in any way. This makes the choice of the length of the filter an easier task than the choice of the parameter α in the 1P-LP filter, as one can select a suitable rise time based on how much time is permitted to pass between the occurrence of an event and the detection of that event in the filter’s output.

Aside from the moving average filter’s step and impulse response behaviour, it is also worth mentioning that it is extremely effective at noise-reduction, also known as smoothing, causing the magnitude of noise in a signal to be attenuated by a factor equal to the square root of the filter length. The level of noise attenuation it can achieve is therefore limited by the maximum filter length that could be chosen without causing unacceptable signal delay or compromising the intended signal features. An example of the effects of this filter on both noise and intended signal features is illustrated in figure 14.

Figure 14: Example application of moving average filter for noise reduction on a noisy, rectangular pulse input. As the filter length increases, noise decreases, but step rise time increases and the rectangular pulse becomes more triangular.
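The square-root relationship is easy to check numerically. A sketch (names are my own; the measured ratio will vary slightly with the particular noise realisation):

```python
import math
import random

random.seed(1)  # make the demo reproducible

M = 25
noise = [random.gauss(0.0, 1.0) for _ in range(20000)]

# Brute-force moving average of the pure-noise signal.
smoothed = [sum(noise[i:i + M]) / M for i in range(len(noise) - M + 1)]

def std(values):
    """Standard deviation of a list of samples."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

# The noise magnitude should shrink by roughly sqrt(M) = 5.
ratio = std(noise) / std(smoothed)
print(ratio)
```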

This brings us to the choice of filter length. Firstly, moving average filter lengths are usually chosen to be odd numbers, as the symmetry this creates - the filter sums a single central value with an equal number of surrounding values on either side - makes the delay introduced by the filter more manageable. The delay introduced by a filter is equal to the time taken for an input sample to reach the filter’s midpoint, which, for a filter of length M, is (M-1)/2 samples. This can be used to determine a maximum filter length: for example, if a delay of 10 samples were acceptable, then the maximum filter length would be 21.

The effect of the filter on intended signal features must then be considered. In the case of DC extraction, the aim is to remove these features and preserve only the DC component, whereas in the case of smoothing, the aim is to remove noise while preserving these features. So a MA DC-extraction filter needs to have a length significantly greater than the longest transient features, in order to minimise the effect of those features on the extracted DC component, whereas a MA smoothing filter, at a minimum, should be kept shorter than the smallest simple feature that must be detected, or even shorter if the shape of the feature must be preserved. For example, a MA smoothing filter of length 21 would turn a rectangular pulse of length 21 into a triangular pulse of the same height, but this wouldn’t be a concern if we were only trying to identify the rectangular pulse by detecting when the signal rises above some threshold value. However, if the filter were of length 101, then the rectangular pulse would be reduced to approximately 1/5 of its original height, which might not exceed the threshold that was used to detect pulses, and so would cause the pulse to go undetected. In the case of more complicated features, such as a cardiac rhythm, a noise-reduction filter would need to be kept much shorter than the overall length of the feature, instead treating smaller components of the feature as individual features to be preserved.

The Moving Average DC-Removal Filter

The preceding comments discussed the use of the moving average filter for DC-extraction and smoothing only. But what if your goal is instead to remove the DC component of a signal? In this case, the same techniques used to create the 1P1Z-HP filter can be applied, in which the output of the low-pass filter was subtracted from the unfiltered signal. In fact, for a symmetric, odd length FIR filter, this technique works perfectly, producing a filter that is the exact opposite of the initial low-pass filter, rather than merely an approximation. While the resulting filter could possibly be called a moving average high-pass filter, we will instead refer to it as a DC-removal filter, as this name better reflects the fact that the filter is best viewed in the time domain rather than the frequency domain.

When inverting the previous MA filter, the only catch is that before subtracting the low-pass output, the unfiltered signal must be subjected to the same delay that the original low-pass filter caused to the signal, which for an MA filter of odd-numbered length M, is equal to (M-1)/2. Hence, re-using the previous recursive implementation of the MA filter, a DC-removal filter can be created as follows:

$$\begin{align}
y_{SUM}[n] &= y_{SUM}[n-1] + x[n] - x[n-M] \\
y_{DC\_REM}[n] &= x[n-\tfrac{M-1}{2}] - \frac{y_{SUM}[n]}{M}
\end{align}$$
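A sketch of this DC-removal filter in Python, reusing the running-sum idea from the recursive MA implementation (the function name is my own, and inputs before the start of the signal are taken as zero):

```python
from collections import deque

def ma_dc_removal(x, M):
    """Moving-average DC-removal filter for odd M: subtract the moving
    average from the input delayed by (M-1)/2 samples."""
    assert M % 2 == 1, "filter length must be odd"
    window = deque([0.0] * M, maxlen=M)  # the last M inputs, oldest first
    running_sum = 0.0
    y = []
    for x_n in x:
        running_sum += x_n - window[0]   # add the new input, drop x[n-M]
        window.append(x_n)
        delayed = window[(M - 1) // 2]   # x[n - (M-1)/2]
        y.append(delayed - running_sum / M)
    return y
```

With M = 3, an impulse at index 0 re-emerges at index 1 (delayed by (M-1)/2 = 1 sample) with height 2/3, surrounded by dips of -1/3, matching the behaviour described for Figure 15 below.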

We won’t bother looking at the frequency-magnitude response of this filter, since it is just a perfect inversion of the MA filter’s frequency response, which merely showed that it was near useless for separating frequency components. Instead, the impulse and step responses of this filter are shown in Figures 15 and 16.

Figure 15: Impulse response of moving average DC-removal filter for varying filter lengths.

The first thing worth mentioning is how clearly the impulse response of a DC-removal filter shows the delay that the filter introduces to the signal it is processing, in this instance delaying the impulse by (M-1)/2 samples. Surrounding the delayed impulse is a slight negative dip, where a small proportion of the impulse has been counted towards the DC component of the signal, and as a result subtracted over the length of the filter. In contrast, the 1P1Z-HP filter did not delay the impulse, but had a sudden and comparatively large negative dip following the impulse which then decayed down towards zero indefinitely.

Figure 16: Step response of moving average DC-removal filter for varying filter lengths.

As for the step response, following the step in the input, the filter outputs a ramp that grows increasingly negative and which, at the midpoint of the filter’s length, steps up to a positive value and then ramps back down to zero. Ideally, we would have liked the step to have been blocked completely by the filter, so this output does appear unusual; however, there is an intuitive explanation. During the M samples following the step, an increasing proportion of the step is included in the moving average, and subsequently subtracted from the delayed input signal to form the DC-removed output. So over these M samples, the value of the moving average, which is interpreted as being the DC component, linearly increases from zero to one, causing a linear decrease in the DC-removed output. After (M-1)/2 samples, the delayed step occurs and the delayed input value, from which the average is being subtracted, jumps from zero to one, causing the output to jump from negative to positive values. The average then continues to increase until the entirety of the moving average has passed the step, at which point both the average and the delayed input will be one, and they cancel to give the eventual zero output. The most notable differences from the 1P1Z-HP filter’s step response are the existence of the delay; the perturbation around the step consisting of negative values prior to the step and positive values following it, rather than only positive values following it; and the perturbation increasing and decreasing linearly rather than decaying exponentially. The maximum magnitude of the perturbation seen in the MA DC-removal filter was also less than that of the 1P1Z-HP filter, which had an initial jump to (1+α)/2 in response to the step, while the MA DC-removal filter’s output values all had magnitude less than 0.5.

Selecting the filter length for a MA DC-removal filter is identical to that for the DC-extraction filter described above - the length of the filter should be significantly greater than the longest transient features that the filter is intended to pass, but it will also be limited by the maximum delay that can be tolerated. Finally, if both smoothing and DC-removal are needed, then it is possible to apply both a MA smoothing and a MA DC-removal filter to the signal in sequence, where the output of one filter is fed into the input of the other. The length of the DC-removal filter would need to be much greater than that of the smoothing filter, and it would make no difference which order they were applied in.
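A brute-force sketch of such a cascade (function and variable names are my own; inputs before the start of the signal are taken as zero), which also checks the claim that the order of the two filters makes no difference:

```python
def moving_average(x, M):
    """Brute-force length-M moving average; inputs before x[0] are zero."""
    padded = [0.0] * (M - 1) + list(x)
    return [sum(padded[i:i + M]) / M for i in range(len(x))]

def dc_removal(x, M):
    """Moving-average DC-removal for odd M: delayed input minus moving average."""
    d = (M - 1) // 2
    delayed = [0.0] * d + list(x)[:len(x) - d]  # x delayed by (M-1)/2 samples
    return [a - b for a, b in zip(delayed, moving_average(x, M))]

# A constant (DC) offset plus a single transient spike.
signal = [5.0 + (1.0 if i == 40 else 0.0) for i in range(80)]

# Cascade a short smoothing filter with a much longer DC-removal filter.
# Both are linear time-invariant filters, so the order does not matter.
out_a = dc_removal(moving_average(signal, 5), 21)
out_b = moving_average(dc_removal(signal, 21), 5)
```

Once the transients from the spike have passed through both filters, the output settles back to zero, with the DC offset of 5 removed.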

Conclusions

We have examined low-pass (or DC-extraction) and high-pass (or DC-removal) filters of two very different designs - single pole recursive filters, which are infinite impulse response (IIR) filters, and moving average filters, which are finite impulse response (FIR) filters. Three tools were used to assess the filters - the frequency-magnitude response, which indicates how the filter amplifies or attenuates different frequency components, the impulse response, which shows how the filter responds to a sudden, transient event, and the step response, which shows how the filter responds to a sudden and permanent change in input value.

The frequency-magnitude response for all four filters showed that none of them were suitable for separating frequency components, as they did not form distinct frequency bands where frequency components would be either almost entirely stopped or entirely passed depending on which band they belonged to. This meant that these filters were better considered in terms of their time-domain behaviour. (As an aside, true frequency-domain filters are more complicated than those presented here, and have not been described as they are less likely than time-domain filters to be useful for the intended audience of this article.)

The time-domain behaviour of the filters showed that the undesirable perturbations around step and impulse responses exhibited by the moving average filters were less problematic than those of the single pole filters, with the former showing perturbations of lower maximum magnitude whose duration was not only finite, but easily predictable. However, the drawback of the moving average filters was that they introduced a delay, whereby events would not be seen in the output until a number of samples had passed since those events occurred at the input.

Overall, the more favourable output of the moving average filters suggests that they are a better choice when delaying the signal is acceptable. For filters which must operate in real-time, however, IIR filters such as the 1P-LP and 1P1Z-HP may be a better option, if a moving average filter of sufficient length to achieve the desired filtering would necessarily introduce an unacceptable delay.

Hopefully this article has given you a reasonable grounding from which you can implement and apply some basic digital signal filtering techniques. It should be noted, however, that this article has really only scratched the surface of what is possible in signal filtering, intending only to provide enough information to apply these common filters, without delving into the complicated mathematical foundations of digital signal processing. The capabilities of digital signal filtering extend well beyond what has been presented in this introduction, and the field of DSP is far broader still, with filtering being only one aspect of it.

For those who are not afraid of a little math and are interested in learning more about digital signal processing, Rice University’s online course, Discrete Time Signals and Systems, is an excellent starting point. It is available on edX in two parts, though is currently archived, so while all lecture videos are available, assessment material may not be.

Discrete Time Signals and Systems, Part 1: Time Domain
Discrete Time Signals and Systems, Part 2: Frequency Domain

Following from that course, another excellent resource is Steven W. Smith's The Scientist and Engineer's Guide to Digital Signal Processing. It is a very practical guide to DSP, available to purchase in hardcover, or free to browse online. For those seeking a more mathematical treatment, MIT's online course 6.341x Discrete-Time Signal Processing, previously offered on edX, and EPFL's Digital Signal Processing, previously offered on Coursera, are both thoroughly worthwhile, though not currently available at the time of writing.