In spite of the sometimes abstract mathematical derivations that we've carried out so far, we should remember that digital signal processing is a very practical discipline, and we want to be able to implement everything that we have derived so far in very concrete terms on a computing platform. So today we will look at some algorithms for the implementation of constant coefficient difference equations, we will study some block diagrams, and we will look at the caveats of performing real-time processing. So here's an old friend in a slightly different form: a C-code implementation of the leaky integrator. The way we implement the leaky integrator is by defining a function that we call each time a new input sample arrives, and that produces an output sample. So here's a function that takes one argument, a double, x. Inside the function we have some initialization, and then we have the computation part. When it comes to the initialization, we first define the value for lambda; this is a constant. Then we define a local variable y that we will use to hold the previous value of the output. Remember, the CCDE for the leaky integrator is simply y[n] = lambda * y[n-1] + (1 - lambda) * x[n], so we need to remember the previous value of the output between calls to the function. However, we also need to initialize this memory cell first, because on the first call we will assume that the filter was at rest, and therefore that the output was zero from the first call back to minus infinity. This is a very important point when we deal with practical implementations. The computation part is then just a straightforward translation of the CCDE into C code: we multiply the previous output by lambda and add the scaled input. Then we store the output value in the memory cell before we return it. We can test whether this works or not by writing a little main function that calls the function repeatedly.
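The C listing itself is not reproduced in the transcript, but the per-sample logic it describes can be sketched as follows. This is a Python sketch rather than C: a function attribute plays the role of the C static local that persists between calls, and the value 0.95 for lambda is an arbitrary illustrative choice, not one fixed by the lecture.

```python
LAMBDA = 0.95  # leakage factor; arbitrary illustrative value

def leaky(x):
    """Called once per input sample; returns one output sample."""
    # CCDE: y[n] = lambda * y[n-1] + (1 - lambda) * x[n]
    leaky.y = LAMBDA * leaky.y + (1 - LAMBDA) * x
    return leaky.y

# initialize the memory cell: filter at rest, all past outputs zero
leaky.y = 0.0
```

Feeding a delta (one, then zeros) produces the expected exponentially decaying impulse response, approximately 0.05, 0.0475, 0.045125, and so on.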
As the input, we pass the delta function; this is a cheap way to write the delta function: when n is equal to zero, we pass one to the function, otherwise we pass zero. And when we run this program, we get this output here, which is really the exponentially decaying impulse response that we expect. The key points that we learn from implementing the leaky integrator in practice are: one, we need a memory cell to store the previous output; two, we need to initialize this memory cell before we use the filter for the first time; and, in terms of computation, we see that we need two multiplications and one addition per output sample. Now, we implemented this in C, but the language is really irrelevant when it comes to implementing a digital filter, and this is really the power of the DSP paradigm. So here is the same filter implemented in Python. Python is an object-oriented programming language, so we can do something a little bit more elegant than what we did before: we can define a class that we can parametrize with different values of lambda, and therefore we can instantiate different leaky integrators with different values of the leakage factor. So here the initialization procedure gets its own lines of code: we initialize lambda to the value that we pass in, and we initialize the memory cell that we need to have as well. And then finally, we can compute the output of the filter. Since we're in Python, it's very easy to pass an array here rather than a single value, and so we iterate over the array and compute the output for each element; what we produce in the end is another array that contains the output values. So again, we can test this: we instantiate a leaky integrator with a value of 0.95 for lambda, and then we run the leaky integrator on a signal which is again the delta signal. And to show that we can place our origin wherever we want, we actually provide a few zeros before the time n equal to zero.
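The Python class just described is not shown in the transcript either; a minimal reconstruction might look like this (the class and method names here are my own choices, not necessarily those used in the lecture):

```python
class LeakyIntegrator:
    def __init__(self, lam=0.9):
        self.lam = lam  # leakage factor, set at instantiation
        self.y = 0.0    # memory cell: previous output, filter at rest

    def filter(self, x):
        """Filter an iterable of input samples, returning a list of outputs."""
        out = []
        for s in x:
            # y[n] = lambda * y[n-1] + (1 - lambda) * x[n]
            self.y = self.lam * self.y + (1 - self.lam) * s
            out.append(self.y)
        return out
```

For instance, `LeakyIntegrator(0.95).filter([0, 0, 1, 0, 0])` returns two zeros followed by the decaying impulse response, matching the experiment described next.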
And the output gives us exactly what we expect: some zero samples before we hit one, and then the exponentially decaying impulse response. We can try to do the same with another old friend, the moving average, whose CCDE is this one. If we implement the moving average in C using the same paradigm as before, we define a function that we call each time a new input sample arrives, and then we do some initialization here. First we have to decide on the length of the moving average; this will be a constant M. Once we have decided on the length, we have to allocate an array of memory cells to hold the input samples that we're going to use to compute the average, and we also have to define an index into this array of memory cells. We initialize this index to minus one; this is a trick that will become clear in a second. This is standard C bookkeeping, and these last lines are still part of the initialization procedure. So remember, ix is initialized to minus one, so the first time, and the first time only, that we call the function, we will enter this part of the code, which simply sets the memory cells to zero for first use. After that, ix is set to zero and will never be minus one again. This part here simply stores the current input value into the memory buffer and updates the index of the buffer, modulo capital M; we will see how that works in the next slide. And finally, this part computes the moving average simply by summing all the elements in the array and normalizing by capital M. The modulo-M operation we saw before implements what is called a circular buffer, which is a neat trick used in most DSP applications. Suppose I need to store five values of the input. I define five memory cells like so, which I initialize to zero, and then as the samples start coming in, I start putting them into the memory cells, using an index ix to keep track of the next free cell.
So I initialize the index at 0, and the first sample goes in here. At the next step, the sample goes into the next cell and I update the index to the next free cell, and so on and so forth. When I get to the last cell and fill the buffer, I update ix modulo M, so that it points back to the first cell of the array. And as you can see, now my array is full with the first five samples that have come in. At the next step, I will still need five samples, but I can forget the oldest one, and so I put x5 into position number 0 and start filling the array once again. The implementation of the moving average teaches us a set of lessons very close to what we learned with the leaky integrator. In this case, we need M memory cells to implement the filter, so the memory requirements are higher for the moving average. We need to initialize the cells, once again, before we first use them, and for each output sample we now need M - 1 additions and one multiplication. So when M grows, you can see that both the memory and the computational requirements grow as well. As I said, the details of the programming language that we choose to implement a filter in are completely irrelevant, and we can talk about algorithms, and even derive some optimizations, just by reasoning on the abstract blocks that are used to implement digital filters. Remember, these blocks are the addition of samples, the multiplication of a sample by a scalar, and the delay operator. With these three building blocks, which we've already seen briefly in module two, we can describe any constant coefficient difference equation. So here is, for instance, the leaky integrator once again, and we can draw this block diagram, where we can see that the input is scaled by 1 minus lambda and the output is fed back into the input via a delay and a factor of lambda. We can do the same with the moving average.
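Before moving on, the circular-buffer mechanics just described can be collected into a working sketch. Again this is Python rather than the lecture's C: state lives in a class instead of static locals, so the one-time zeroing trick with ix = -1 is replaced by ordinary initialization.

```python
class MovingAverage:
    def __init__(self, M=5):
        self.M = M
        self.buf = [0.0] * M  # M memory cells, zeroed before first use
        self.ix = 0           # index of the next free cell

    def filter(self, x):
        """Called once per input sample; returns one output sample."""
        # store the newest sample, overwriting the oldest one
        self.buf[self.ix] = x
        self.ix = (self.ix + 1) % self.M  # circular buffer: wrap around
        # output is the average of the last M input samples
        return sum(self.buf) / self.M
```

Feeding a delta, the output is 1/M for exactly M samples and zero afterwards, as expected for the moving average's impulse response.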
So here we have a series of delays that store past values of the input, up to x[n - M + 1]. Here you have the input; here you have one delay, so the output of this delay is x[n - 1]; the output of the second delay is x[n - 2]; and so on and so forth until the last delay, whose output is x[n - M + 1]. All these delayed samples are summed together and finally scaled by 1/M. Another interesting structure that you're likely to encounter a lot in signal processing is the second-order section. This is a transfer function where both the numerator and the denominator are polynomials of degree two in z^-1, so we have two poles and two zeros. We can implement the second-order section in a very straightforward manner just by translating the structure of the transfer function into block-diagram language. Here you have the numerator part: delays that delay the input, which is scaled by the three coefficients of the numerator. And here you have a feedback loop that takes the output, delays it, scales it by the coefficients of the denominator, and then feeds it back into the system. So here's the interesting part: because of the commutativity of convolution, we can actually invert the order of the two sections. We can put the denominator first, followed by the numerator; so we first filter by 1/A(z) and then filter by B(z). Nothing has changed in terms of the output. But now, if you look at the structure, you see that the contents of these delay cells are going to be exactly the same at all times: whatever goes in here also goes in here, so this is equal to this, and this is equal to this. So we can lump the delay cells together and obtain the second-order section in what is called direct form II, where the delays have been merged, giving a net saving of two memory cells for this type of implementation.
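As a concrete sketch of direct form II, assume the transfer function H(z) = (b0 + b1 z^-1 + b2 z^-2) / (1 + a1 z^-1 + a2 z^-2), with the usual normalization a0 = 1 (the coefficient names and sign convention are the common ones, not taken from the lecture slide). The two shared delay cells hold an intermediate signal w[n]:

```python
class SecondOrderSection:
    def __init__(self, b, a):
        self.b0, self.b1, self.b2 = b  # numerator coefficients
        _, self.a1, self.a2 = a        # denominator (a0 assumed to be 1)
        self.w1 = 0.0                  # the two shared delay cells,
        self.w2 = 0.0                  # initialized with the filter at rest

    def filter(self, x):
        # feedback half first: w[n] = x[n] - a1*w[n-1] - a2*w[n-2]
        w = x - self.a1 * self.w1 - self.a2 * self.w2
        # feedforward half: y[n] = b0*w[n] + b1*w[n-1] + b2*w[n-2]
        y = self.b0 * w + self.b1 * self.w1 + self.b2 * self.w2
        # shift the shared delay line
        self.w2, self.w1 = self.w1, w
        return y
```

An easy sanity check: with a = (1, 0, 0) the feedback disappears, and the impulse response is just b0, b1, b2 followed by zeros.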
So you can see that we have managed to optimize the implementation of the second-order section without having to write code in any specific programming language. Finally, a few words on real-time signal processing. If the input samples arrive every T seconds, where T can be very, very small (for instance, in audio, T is typically 1/44,100 of a second), then each output sample must be computed in at most T seconds. So the number of operations becomes very important, and your choice of filter will depend on that; for instance, using an IIR versus an FIR. One common trick in applied signal processing is to use circular buffers of size 2 to the power of K, for a suitable value of K, so that the modulo operation becomes a simple bit mask. Or you can exploit the parallelism of some processors, which allow you to fetch data and compute a multiplication and an addition in parallel, in order to speed up the pipeline. Another problem with real systems is accuracy. Processing units will always have finite precision, even if they use floating point, so overflow and underflow are a real problem. You can use more complex filter structures to combat some of the accuracy problems. Other tricks involve using double precision inside the filtering loop: for instance, if your architecture is 32 bits, inside the filtering loop you will use 64-bit arithmetic. You can split the filter into lower-order sections, or you can use floating point, which, although slower than fixed point, can help combat some of the overflow issues.
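The power-of-two circular buffer trick mentioned above can be illustrated in a couple of lines (shown in Python for consistency with the earlier sketches; in C the same `&` would be applied to the index of a static array):

```python
K = 3
SIZE = 1 << K    # buffer length 2**K = 8
MASK = SIZE - 1  # 0b0111: ANDing with this is the same as % SIZE

def advance(ix):
    """Step a circular-buffer index without a modulo operation."""
    return (ix + 1) & MASK
```

Here `advance(6)` gives 7 and `advance(7)` wraps back to 0; the equivalence (i & MASK) == (i % SIZE) holds precisely because SIZE is a power of two, which is why the buffer size is chosen that way.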