Welcome back to the course on audio signal processing for music applications. In the previous two theory lectures, we presented the sinusoidal model, specifically the analysis part. Now, in this third and final lecture, we will finish the topic by talking about the synthesis part of the model, that is, generating a sound from the analysis that we did. We'll first review the model and the concepts of spectral peaks and spectral tracks. Then we will focus on sinusoidal synthesis, which is also called additive synthesis. And we will finish by talking about the complete sinusoidal model system, which does both analysis and synthesis.
As we showed in the last lecture, the sinusoidal model considers that a sound is a sum of time-varying sinusoids. It is expressed by this equation, in which the output signal is a sum of time-varying sinusoids. A good way to show the results of the analysis is to plot the frequencies of the time-varying sinusoids on top of the magnitude spectrogram of the sound. Of course, apart from its frequency, each sinusoid also has a magnitude and a phase, which are not shown. For example, in this plot we are showing the spectrogram of a flute sound that we can listen to. [SOUND] On top of that, we are showing these lines, which are the frequency tracks of the sinusoids that have been identified. Of course, again, there is much more information than that, but this is quite a good compromise. It gives us quite an intuitive visualization of the sinusoidal model.
But now, from these values, the ones we have analyzed, we want to synthesize the sound. The standard way to synthesize sinusoids is to use additive synthesis. This is the standard block diagram of an additive synthesizer, in which we have a series of oscillators. Each one takes a given amplitude and frequency as input and generates a time-varying sinusoid, and then we sum all of them together to generate the output.
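The oscillator bank described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the course's implementation: the function name `additive_synth`, the per-sample control arrays, and the choice to obtain each oscillator's phase by accumulating its frequency are all assumptions made for this example.

```python
import numpy as np

def additive_synth(amps, freqs, fs):
    """Sum a bank of time-varying sinusoidal oscillators.

    amps, freqs: arrays of shape (R, n_samples) holding, for each of the R
    oscillators, its amplitude and its frequency in Hz at every sample.
    (Hypothetical interface, chosen only for this illustration.)
    """
    amps = np.atleast_2d(np.asarray(amps, dtype=float))
    freqs = np.atleast_2d(np.asarray(freqs, dtype=float))
    # Accumulate frequency into phase so that a changing frequency does not
    # cause phase jumps; subtracting the first term makes the phase start at 0.
    phases = 2 * np.pi * (np.cumsum(freqs, axis=1) - freqs[:, :1]) / fs
    # Each row is one oscillator; the output is the sum over oscillators.
    return np.sum(amps * np.cos(phases), axis=0)

fs = 44100
n = np.arange(1000)
# Two steady oscillators: 440 Hz at amplitude 0.5 and 880 Hz at amplitude 0.25.
amps = np.vstack([np.full(n.size, 0.5), np.full(n.size, 0.25)])
freqs = np.vstack([np.full(n.size, 440.0), np.full(n.size, 880.0)])
y = additive_synth(amps, freqs, fs)
```

With constant controls this reduces to a plain sum of cosines. In practice the amplitude and frequency rows would come from the analyzed tracks, interpolated to the sample rate.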
But how do we generate each individual sine wave? The most straightforward way to generate a sinusoid is to use the sinusoidal function directly. The function that you see here can be easily implemented in a programming language like Python, as you see here, with a specific amplitude and frequency to control it. But it is quite expensive, especially when we deal with complex signals, in which we might have to call this function hundreds of times and add the results at every sample. So let's propose another way.
We can use the DFT to do synthesis: if we start from the spectrum of a sinusoid and take the inverse DFT, we get the sinusoid back. This equation shows how to do it. Basically, we start from the magnitude spectrum and the phase spectrum of the sinusoid, and we just take the inverse DFT of that. The plots here show a very special case, in which we have the spectrum of a sinusoid whose frequency is one of the discrete frequencies of the DFT, or of the FFT. So we just have one value, we take the inverse DFT of the whole array, and we get this nice-looking sinusoid of length 64.
Too bad that things are not that simple. This only works for sinusoids whose frequency is exactly one of the discrete DFT frequencies. But we also know how to represent sinusoids whose frequencies are different from these. Such a sinusoid does not have a single spectrum value. It has values at all frequencies, values that depend on the window that we choose. So, to generate a sinusoid in the frequency domain, we have to generate the transform of the window and place it at the right frequency and phase. This equation expresses this idea: the magnitude spectrum of the sinusoid is in fact the spectrum of the window shifted to the right frequency and multiplied by the right amplitude, and the phase spectrum is likewise the phase spectrum of the window shifted to that particular location.
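The two simplest options mentioned above, calling the cosine function directly and inverting the spectrum of a sinusoid that sits exactly on a DFT bin, can be compared with a short sketch. The function names and parameter values are assumptions for illustration, and the example only covers the bin-centered special case, not the general windowed case.

```python
import numpy as np

def sine_direct(A, f, phi, fs, n_samples):
    """Evaluate the cosine at every sample (simple, but costly per oscillator)."""
    n = np.arange(n_samples)
    return A * np.cos(2 * np.pi * f * n / fs + phi)

def sine_from_dft_bin(A, k0, phi, N):
    """Build the spectrum of a sinusoid lying exactly on DFT bin k0
    (0 < k0 < N/2) and take its inverse DFT."""
    mX = np.zeros(N)                    # magnitude spectrum
    pX = np.zeros(N)                    # phase spectrum
    mX[k0] = mX[N - k0] = A * N / 2     # positive and negative frequency bins
    pX[k0], pX[N - k0] = phi, -phi      # conjugate-symmetric phase
    X = mX * np.exp(1j * pX)
    return np.fft.ifft(X).real          # imaginary part is zero up to rounding

N, k0, A, phi = 64, 4, 0.8, 0.3
fs = 44100
y_dft = sine_from_dft_bin(A, k0, phi, N)
y_direct = sine_direct(A, k0 * fs / N, phi, fs, N)  # same frequency, in Hz
```

Both calls should produce the same 64 samples. The inverse-DFT route only pays off once many sinusoids share one transform.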
The plots show two examples: the same sinusoid with two different windows applied to it. Of course, each one is expressed with different magnitude and phase spectra, and when we take the inverse DFT we get nice-looking sinusoids, of course with the windows applied to them. In the time domain, all samples have the same weight. One great advantage of the frequency domain is that not all samples have the same weight, so we can take advantage of that to make this whole process a little bit more efficient. This emphasizes the idea of the main lobe of the window: the samples of the main lobe are the ones that carry the most weight, the most amplitude.
So, for the previous examples with the Hamming and the Blackman-Harris windows, we can now plot just the main lobe samples of the spectrum. Here we see the main lobe samples in dark red and the rest of the samples in light red. If we take the inverse DFT of just the main lobe samples, we get these blue shapes, these sinusoids. At first glance they look okay. But if we actually measure the signal-to-noise ratio, that is, how much each one differs from a perfect synthetic sinusoid, we see that the Hamming window has a lower signal-to-noise ratio than the Blackman-Harris window, because the samples of the main lobe of a Hamming window carry less of the weight than in the case of the Blackman-Harris. For the Hamming window the signal-to-noise ratio is 63 dB, and for the Blackman-Harris it is very good, 102 dB. So basically, the distortion or noise is insignificant for audio applications.
So it's clear that the Blackman-Harris window is a good choice for generating sinusoids in the frequency domain. We just need eight samples, because the main lobe of the Blackman-Harris window has eight samples.
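The main-lobe comparison above can be reproduced in spirit with a small experiment: window a sinusoid, keep only the spectrum bins inside the main lobe, invert, and measure how far the result is from the full windowed sinusoid. This is a sketch under assumptions (FFT size 512, a sinusoid at the fractional bin 40.3, main-lobe widths of 4 bins for Hamming and 8 bins for Blackman-Harris, and this particular SNR definition), so its numbers need not match the 63 dB and 102 dB quoted in the lecture.

```python
import numpy as np

def hamming(N):
    n = np.arange(N)
    return 0.54 - 0.46 * np.cos(2 * np.pi * n / N)

def blackman_harris(N):
    # Standard 4-term Blackman-Harris coefficients.
    n = np.arange(N)
    a0, a1, a2, a3 = 0.35875, 0.48829, 0.14128, 0.01168
    return (a0 - a1 * np.cos(2 * np.pi * n / N)
               + a2 * np.cos(4 * np.pi * n / N)
               - a3 * np.cos(6 * np.pi * n / N))

def main_lobe_snr(w, k0, half_width):
    """SNR (dB) of a windowed sinusoid rebuilt from its main-lobe bins only.

    k0 is the sinusoid frequency in (fractional) bins; half_width is half the
    main-lobe width in bins.
    """
    N = w.size
    n = np.arange(N)
    x = w * np.cos(2 * np.pi * k0 * n / N)      # reference windowed sinusoid
    X = np.fft.rfft(x)
    k = np.arange(X.size)
    keep = np.abs(k - k0) < half_width          # bins inside the main lobe
    y = np.fft.irfft(np.where(keep, X, 0), N)   # discard everything else
    noise = x - y
    return 10 * np.log10(np.sum(x ** 2) / np.sum(noise ** 2))

N, k0 = 512, 40.3
snr_hamming = main_lobe_snr(hamming(N), k0, half_width=2)
snr_bh = main_lobe_snr(blackman_harris(N), k0, half_width=4)
print(f"Hamming: {snr_hamming:.1f} dB, Blackman-Harris: {snr_bh:.1f} dB")
```

The point to look for is the ordering of the two values rather than their exact size: the Blackman-Harris main lobe should leave much less residual than the Hamming one.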
And if we just take those eight samples and do the inverse DFT, we can generate a sine wave as long as the FFT size we have. Then, if we want to generate several sinusoids in the spectrum, we can just generate several main lobes. From this equation, we can see that a sum of sinusoids in the frequency domain is a sum of main lobes of the window, in this case the Blackman-Harris window because, as we will see, it is the one that we'll be choosing. In this example, we are generating three sine waves: one at 1000 Hz, another at 4000 Hz, and another at 8000 Hz. So we place the main lobes of the window at those locations, each one with a different amplitude. We also generate the phases for those main lobes. Then we just take the inverse DFT of this combined spectrum, and we get a signal which is in fact the sum of these three sine waves, of course multiplied by a Blackman-Harris window.
Okay, and then we can put together an analysis/synthesis approach using this idea. We start from a fragment of a sound, in this case an oboe sound, and compute its spectrum. We find the peak locations, and for this we can use any window and any FFT size, whatever is appropriate for the analysis. Then we basically do additive synthesis in the frequency domain by generating main lobes from these peak values, at these peak locations. The FFT can be a different size, and the window, in this case, is a different one because we are using a Blackman-Harris window. Then we take the inverse DFT of that, and hopefully this resynthesized sound is identical, or at least similar, to the original one, of course now with a Blackman-Harris window applied to it.
But we have one problem, and that's a problem of overlap in the synthesis process. The Blackman-Harris window requires a big overlap. If we want to generate a time-varying signal, the overlap of the Blackman-Harris windows has to be very high, which means a hop size of one eighth of the window length or even smaller.
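The three-sinusoid example above (1000, 4000, and 8000 Hz) can be sketched by writing only the Blackman-Harris main-lobe samples into an otherwise empty spectrum and taking one inverse FFT. This is an illustrative sketch, not the course's code: the helper names, the amplitudes and phases, the FFT size of 512, and the brute-force evaluation of the window transform at fractional bin offsets are all assumptions chosen for clarity rather than speed.

```python
import numpy as np

A_BH = (0.35875, 0.48829, 0.14128, 0.01168)  # 4-term Blackman-Harris coefficients

def bh_zero_phase(N):
    """Blackman-Harris window centered on n = 0, for n = -N/2 ... N/2 - 1."""
    n = np.arange(-(N // 2), N // 2)
    w = sum(a * np.cos(2 * np.pi * m * n / N) for m, a in enumerate(A_BH))
    return n, w

def bh_transform(x, N):
    """Transform of the zero-phase window at (possibly fractional) bin offsets x."""
    n, w = bh_zero_phase(N)
    return np.exp(-2j * np.pi * np.outer(x, n) / N) @ w

def synth_frame(freqs, amps, phases, N, fs):
    """One frame of sinusoids built from main lobes only.

    phases are the phases at the center of the frame. Lobes are assumed to
    lie away from 0 Hz and from the Nyquist frequency.
    """
    X = np.zeros(N // 2 + 1, dtype=complex)       # positive-frequency half
    for f, A, ph in zip(freqs, amps, phases):
        k0 = f * N / fs                           # peak location in bins
        k = np.arange(int(np.ceil(k0 - 4)), int(np.floor(k0 + 4)) + 1)
        k = k[(k >= 0) & (k <= N // 2)]
        # Shifted window transform, scaled by amplitude and rotated by phase.
        X[k] += 0.5 * A * np.exp(1j * ph) * bh_transform(k - k0, N)
    # irfft returns the frame in zero-phase order; fftshift centers it.
    return np.fft.fftshift(np.fft.irfft(X, N))

N, fs = 512, 44100
freqs, amps, phases = [1000.0, 4000.0, 8000.0], [1.0, 0.5, 0.25], [0.0, 1.0, 2.0]
y = synth_frame(freqs, amps, phases, N, fs)

# Time-domain reference: the same three sinusoids times the window.
n, w = bh_zero_phase(N)
ref = w * sum(A * np.cos(2 * np.pi * f * n / fs + ph)
              for f, A, ph in zip(freqs, amps, phases))
```

Each sinusoid costs only eight or nine spectrum samples here, whatever the frame length. A practical implementation would look the lobe values up from a precomputed table instead of evaluating the sum each time.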
So the solution is to undo the Blackman-Harris window and apply a window whose overlap factor is a little bit better. This is a commonly used approach. In the top plot we see our synthesized signal, which has the Blackman-Harris window applied to it. What we're going to do is divide by the Blackman-Harris window and multiply by a triangular function. But we're going to do it not on the whole size, only on half of the size. So in fact, we will divide by the Blackman-Harris window and multiply by the triangle, or, equivalently, multiply by this third shape that we see here. The result is a function similar to the previous one, but now this function, which is half as long, has been multiplied by a triangular function that can be overlapped by 50%, so that's pretty good. So the hop would be like 25% of the initial size. This is the approach that we will be using, with which we can overlap-add with a hop of 25%. That's what is required with this process. So we start with the spectrum of the main lobes of the window, then we do the inverse with the [INAUDIBLE], undo the window, and apply the triangular function, and that's it. We are done: we can do analysis/synthesis using a sinusoidal approach.
Like we did with the STFT, we can put it all together into an analysis/synthesis system. We start from x[n], our complete signal. Then we multiply by our analysis window, compute the FFT, and obtain the magnitude and phase spectra. And now we start with the sinusoidal analysis. We detect the peaks in the magnitude spectrum: the location, the amplitude, and the phase of each peak. Then we can track those peaks in time by applying some constraints of continuity on the frequency values, so that we construct sinusoidal tracks. We have one track for every time-varying sinusoid, and this is our analysis result.
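The window swap described at the start of this passage (divide by the Blackman-Harris window, multiply by a triangle over the central half of the frame, then overlap-add with a hop of one quarter of the frame) can be checked numerically. The sketch below makes several assumptions: a frame size of 512, a hop of 128, a particular discrete triangle, and frames produced directly in the time domain as a stand-in for the inverse-FFT output.

```python
import numpy as np

def blackman_harris(N):
    n = np.arange(N)
    a0, a1, a2, a3 = 0.35875, 0.48829, 0.14128, 0.01168
    return (a0 - a1 * np.cos(2 * np.pi * n / N)
               + a2 * np.cos(4 * np.pi * n / N)
               - a3 * np.cos(6 * np.pi * n / N))

def synthesis_window(Ns, H):
    """Window that undoes Blackman-Harris and applies a triangle of length
    2*H over the center of an Ns-sample frame (zero elsewhere)."""
    bh = blackman_harris(Ns)
    # Symmetric triangle whose copies, shifted by H, add up to exactly 1.
    tri = 1 - np.abs(np.arange(2 * H) - (H - 0.5)) / H
    sw = np.zeros(Ns)
    c = Ns // 2
    sw[c - H:c + H] = tri / bh[c - H:c + H]   # the "third shape" in the plot
    return sw

Ns, H, fs, f = 512, 128, 44100, 440.0
n_frames = 20
bh = blackman_harris(Ns)
sw = synthesis_window(Ns, H)

out = np.zeros((n_frames - 1) * H + Ns)
for m in range(n_frames):
    t = m * H + np.arange(Ns)
    # Stand-in for one inverse-FFT output frame: a Blackman-Harris-windowed sinusoid.
    frame = bh * np.cos(2 * np.pi * f * t / fs)
    out[m * H:m * H + Ns] += sw * frame       # overlap-add with hop H = Ns/4

ref = np.cos(2 * np.pi * f * np.arange(out.size) / fs)
# Region where two consecutive triangles fully overlap.
start, end = Ns // 2, (n_frames - 1) * H + Ns // 2
```

Inside the fully overlapped region the output should match the unwindowed sinusoid, which is the property that makes the 25% hop workable. Only the central half of the Blackman-Harris window is divided out, so the division never involves values close to zero.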
We have these tracks of sinusoidal values, and we can synthesize from them with what we just explained in this lecture. We can do it in the spectral domain by generating the main lobes of the window, which is quite efficient: we just need to generate eight or nine samples per sinusoid in the frequency domain. Then we compute the inverse transform and obtain the synthesized sound, on which we undo the Blackman-Harris window and apply the triangular function, so that we have an overlap-add process that works with a hop of 25%. And it reproduces the original sound quite well, or at least a part of it.
Let's see an example of analysis/synthesis using the sinusoidal model. This is the sound of a bendir, a Turkish drum, just one stroke of it. Let's hear that. Okay, and then the analysis shows just the sinusoidal tracks that we have obtained. We see only their frequencies, plotted in the time-frequency space. And then from those we can synthesize the sinusoids and hopefully recover the sound. Let's hear that. [SOUND] If you pay a little bit of attention, well, it's not exactly the same. Especially in the attack you can hear some difference. And that's one of the things that we'll be trying to work on in the next few lectures.
As for references on these topics, as I said, there is not much, although for additive synthesis there is quite a bit. You can look on Wikipedia yourself, and again in Julius' online book you can find some relevant material.
And this is all for the sinusoidal topic, at least from a theoretical point of view. We will also demonstrate it and take a more programming-oriented approach to it. In these three lectures we have started from the short-time Fourier transform, and we have seen how to analyze sinusoids on top of that. This is a way to simplify the spectrum, so in fact it can be used for compression.
But at the same time, it's a very useful representation for many other things, as we will see in future classes. We'll continue in this direction, towards more useful and higher-level representations that capture some of the prominent essence of a given sound. In the next class we will take advantage of some of the characteristics of the sound to apply some more constraints to the model, so that hopefully we'll get other aspects of this analysis/synthesis that can be of interest for many applications. So thank you for your attention, and I'll see you next class.