Welcome again to the course on Audio Signal Processing for Music Applications. This week we're talking about the sinusoidal model. In the previous theory lecture we presented the actual model and a little bit about how these sinusoids can be visualized in the spectral domain. So let's continue with that. We'll first review the model a little bit, and then focus on the idea of detecting peaks in the spectral domain, which are going to be our sinusoids, and we are also going to talk about how to track them in time. These sine waves vary in time in a particular sound, so how can we track these time-varying sine waves in a spectrogram?
The sinusoidal model is simple: it's the sum of time-varying sinusoids. In the previous class we put emphasis on the idea that we need to detect the sinusoids in the frequency domain. Therefore we need to choose a window size that is large enough to isolate those sinusoids, or at least to have visually separate peaks for them. In this particular case, with these two sinusoids at 440 and 490 Hz, we need a window size of 5,291 samples in order to have the two main lobes separate.
But how do we measure the values of the sinusoids in the spectrum? Well, if we choose the window size correctly, as we showed, we will have a peak for every sinusoid in the signal. Then, to measure the frequency and amplitude of each sinusoid, we just have to find the tip of the peak and its height. So we define a peak as a location in the magnitude spectrum. The peak is going to be at location k sub 0 of the magnitude spectrum when the value at the previous location, k sub 0 - 1, is smaller, and the value at the next location, k sub 0 + 1, is also smaller. That's what we call a local maximum. In this plot we see the local maxima for the different peaks: the blue crosses are the local maxima.
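The local-maximum test described above can be sketched in a few lines of Python. This is only an illustration, not the course's own code: the function name `detect_peaks`, the dB threshold, the Blackman window, and the choice of FFT size equal to the window size (no zero padding) are all assumptions made here.

```python
import numpy as np

def detect_peaks(mX, threshold_db=-80.0):
    """Return the bins k0 where mX[k0-1] < mX[k0] > mX[k0+1] and mX[k0] > threshold_db.

    mX is a magnitude spectrum in dB. The first and last bins are never
    reported because they lack one of the two neighbours.
    """
    mX = np.asarray(mX, dtype=float)
    center = mX[1:-1]
    is_peak = (center > mX[:-2]) & (center > mX[2:]) & (center > threshold_db)
    return np.nonzero(is_peak)[0] + 1  # +1 undoes the offset of the slice

# Two sinusoids at 440 and 490 Hz, analysed with the window size quoted above.
fs = 44100            # sampling rate in Hz (assumed)
M = 5291              # window size in samples
N = M                 # FFT size equal to the window size: no zero padding
n = np.arange(M)
x = np.cos(2 * np.pi * 440 * n / fs) + np.cos(2 * np.pi * 490 * n / fs)

w = np.blackman(M)                      # assumed window type
X = np.fft.rfft(x * w / w.sum(), N)     # window normalised by its sum
mX = 20 * np.log10(np.abs(X) + 1e-12)   # magnitude spectrum in dB
pX = np.angle(X)                        # phase spectrum in radians

k0 = detect_peaks(mX, threshold_db=-40.0)
print("peak bins:", k0)
print("peak frequencies (Hz):", k0 * fs / N)
print("peak magnitudes (dB):", mX[k0])
print("peak phases (rad):", pX[k0])
```

The threshold is there only to ignore the window's side lobes, which are local maxima too. The printed frequencies are quantised to the bin spacing fs / N, which is the resolution problem taken up next.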
In the phase spectrum, what we basically do is just read the phase at the locations of these local maxima. So with this procedure we have the location of the peak, which is going to be k sub 0; then we have the magnitude, which is going to be mX of k sub 0; and we have the phase, which is going to be pX of k sub 0.
But there's a little problem. In fact, you can identify it visually: there is not enough resolution. One solution to this resolution problem is to do zero padding. We use an FFT size much larger than the window size, and we have been talking about that. For example, in this particular example, we have increased the FFT size of the previous example by two. The window size is the same, but we added zeros at the end, up to the next power of two. This results in a much smoother spectrum. And now the local maxima, computed in the same way, so these blue crosses, are much better. Visually you can see that they are located in a better place. They are located in a place that looks more like the center of the peak, and therefore the magnitude and the phase will be better.
But if we want a good enough resolution we will need to increase the FFT size a lot, and that would be quite computationally expensive. So is there any other solution? Well, yes: we can do a cheaper interpolation by using, for example, parabolic interpolation. The parabola is a function with a shape quite similar to the tip of the main lobe of the spectrum of the analysis window in the dB scale. So this is the equation of a parabola, where p is the center of the parabola, a is the concavity measure, and b is the offset. And this is a sampled version of the parabola, with the different values of n, which is very similar to what we would see as a sampled function. So, to perform a parabolic interpolation on a spectral peak, we can just use the three highest values of the magnitude spectrum around the peak.
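Written out with the symbols named above (p the center, a the concavity, b the offset), the parabola and its sampled version take the following standard form. The choice of n = -1, 0, 1 for the three samples around the local maximum is notation assumed here for illustration.

```latex
y(x) = a\,(x - p)^2 + b,
\qquad
y[n] = a\,(n - p)^2 + b, \quad n = -1,\, 0,\, 1
```

With three samples and three unknowns (a, p, b), the parabola through the three points is unique, which is why three magnitude values are enough.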
For example, in this case the k - 1, k, and k + 1 locations of the mX spectrum can be considered to be the three highest values of a parabola: x[-1], x[0], and x[1]. Then we can apply the equation of the parabola to find where the center of this parabola would be. The center is defined by this equation, in which k sub p, the location of the center, and in that sense its frequency, is equal to k, the location of the highest point, plus an offset computed in this way from the magnitudes around the peak. Then it's very easy to read what the amplitude would be: we can basically plug the interpolated k sub p into the parabolic equation and obtain the amplitude, the height of this peak.
Okay, so with this equation we can basically refine the peak positions. And we can combine it with zero padding to get a much better result. This example is exactly that: there is some parabolic interpolation together with some zero padding. If you zoomed into this, you would see that it is quite a bit better than the two previous examples, the one without any zero padding and the one with just zero padding. So this gives us quite a good measure of the frequency, amplitude, and phase of all the sinusoids present in a spectrum.
So we get a value for the frequency location of a peak, k sub p, computed with this equation. Then we can convert this k sub p into hertz by multiplying by the sampling rate and dividing by N, the FFT size. Then we can compute the amplitude using this parabolic idea. And finally, the phase is very straightforward: we can just read the phase at that location, and it doesn't even need any interpolation; just reading the closest value should be enough.
But these are the values of a single frame, and the sinusoidal model is about time-varying sounds.
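As a rough sketch of the single-frame measurement just described, the Python below fits a parabola through the three dB magnitudes around a local maximum, using the standard closed-form solution for a parabola through three equally spaced points, and then converts the refined bin to hertz. The function name `interpolate_peak`, the test signal, the Blackman window, and the window and FFT sizes are assumptions for illustration. The phase is simply read at the closest bin, as suggested above.

```python
import numpy as np

def interpolate_peak(mX, pX, k0):
    """Refine a local maximum at bin k0 of the dB magnitude spectrum mX.

    Returns (kp, mag, phase): the interpolated bin location, the interpolated
    magnitude in dB, and the phase read at the closest bin k0.
    """
    left, mid, right = mX[k0 - 1], mX[k0], mX[k0 + 1]
    # Center of the parabola through the three points. The denominator is
    # negative at a strict local maximum, so the offset stays within +/- 0.5 bin.
    kp = k0 + 0.5 * (left - right) / (left - 2.0 * mid + right)
    # Height of the parabola at its center.
    mag = mid - 0.25 * (left - right) * (kp - k0)
    phase = pX[k0]
    return kp, mag, phase

fs = 44100             # sampling rate in Hz (assumed)
M = 2001               # window size (assumed)
N = 4096               # zero-padded FFT size, a power of two larger than M
A, f0 = 0.8, 440.0     # amplitude and frequency of the test sinusoid
n = np.arange(M)
x = A * np.cos(2 * np.pi * f0 * n / fs)

w = np.blackman(M)     # assumed window type
# Scaling by 2 / sum(w) makes a real sinusoid of amplitude A peak near 20*log10(A) dB.
X = np.fft.rfft(x * w * 2.0 / w.sum(), N)   # rfft zero-pads the frame up to N
mX = 20 * np.log10(np.abs(X) + 1e-12)
pX = np.angle(X)

k0 = int(np.argmax(mX))                     # single sinusoid: the global maximum
kp, mag_db, phase = interpolate_peak(mX, pX, k0)
freq_hz = kp * fs / N                       # bin location to hertz

print("bin spacing (Hz):", fs / N)
print("uninterpolated estimate (Hz):", k0 * fs / N)
print("interpolated estimate (Hz):", freq_hz)
print("interpolated magnitude (dB):", mag_db)
```

Comparing the two printed frequency estimates against the true 440 Hz shows how much the parabolic step recovers beyond what zero padding alone gives.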
So, time-varying sinusoids: how do we go about that? We have to deal with the spectrogram and find time-varying sinusoids. We will define a time-varying sinusoid as a stable peak track in the spectrogram. So this variability will be constrained: we will define a sort of peaks that evolve in time, that move in time, but not so much. And this stability will be measured from the frequency and amplitude of successive frames. We could also look at the phase derivative in time and frequency for that. In practice we're going to focus just on the frequency and amplitude variability, but the phase is also a very interesting value to look at.
So the condition for a peak of a frame to be part of a track is defined by this equation. A particular peak frequency, f sub p, of a given frame l will be part of a track if the absolute value of the difference between that frequency and the frequency of the track coming from the previous frame is smaller than a threshold, and also if the track exists for a certain amount of time. So we are imposing one constraint on the frequency, that it doesn't change that much, and at the same time a constraint on how long this track has been in existence. If it's too short, it means it's like a short burst, and that's not really a sinusoid.
So this is an example of tracking these sine waves in a sound. It is a bendir, a drum used in Turkey and several other places. This is a non-harmonic sound, so it's a sound that is quite complex. But it has partials, it has sinusoids, and we can track them with this algorithm; these black lines correspond to the tracking. So let's hear the bendir first. [MUSIC] Okay, so it's an instrument that doesn't have a clear pitch, and therefore the tracks are a little bit unstable: they keep changing, appearing and disappearing. So this is what we find.
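The two constraints above, a bounded frequency change between successive frames and a minimum track duration, can be illustrated with a small greedy tracker. This is a simplified sketch and not necessarily the algorithm used to produce the plot: the function name `track_sinusoids`, its parameters, and the nearest-peak matching rule are assumptions, and amplitude is ignored.

```python
def track_sinusoids(frames, max_freq_dev=20.0, min_frames=3):
    """Group per-frame peak frequencies into tracks.

    frames: list with one list of peak frequencies (Hz) per frame.
    A peak continues a track if |f_peak - f_track_previous_frame| < max_freq_dev.
    Tracks with fewer than min_frames points are discarded as short bursts.
    Returns a list of tracks, each a list of (frame_index, frequency) pairs.
    """
    finished = []
    active = []
    for l, peaks in enumerate(frames):
        unclaimed = list(peaks)
        still_active = []
        for track in active:
            prev_freq = track[-1][1]
            if unclaimed:
                # Candidate: the unclaimed peak closest to the track's last frequency.
                closest = min(unclaimed, key=lambda f: abs(f - prev_freq))
                if abs(closest - prev_freq) < max_freq_dev:
                    track.append((l, closest))
                    unclaimed.remove(closest)
                    still_active.append(track)
                    continue
            finished.append(track)          # no peak close enough: the track ends
        for f in unclaimed:
            still_active.append([(l, f)])   # every leftover peak starts a new track
        active = still_active
    finished.extend(active)
    return [t for t in finished if len(t) >= min_frames]

# Toy input: one slowly drifting partial plus isolated peaks that never repeat.
frames = [[440.0, 1000.0], [442.0, 1500.0], [445.0], [447.0, 2000.0]]
tracks = track_sinusoids(frames, max_freq_dev=10.0, min_frames=3)
print(tracks)  # → [[(0, 440.0), (1, 442.0), (2, 445.0), (3, 447.0)]]
```

The isolated peaks at 1000, 1500, and 2000 Hz each form a one-frame track and are removed by the duration constraint, which is the "short burst" case mentioned above.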
We can also look at the same tracks on the phase spectrogram. If we take the phase spectrogram, which is the derivative of the phase, and we plot those black lines, the frequency tracks, on top of it, we get this result. And it's quite interesting, because we clearly see that the black tracks are in the places where the color is lighter, and these are the locations where the phase changes less, where there is more phase continuity. So this indicates that the phase information will also be quite relevant in tracking these sinusoids, because the phase information allows us to distinguish this stability or instability aspect of a given partial.
So, sorry, there are not that many references on these topics. Apart from the Wikipedia entry for the sinusoidal model and, of course, Julius Smith's discussion of spectrum analysis in his online book, that's basically it.
So this is all for this lecture. We have presented the sinusoidal model, a sound representation built on top of the short-time Fourier transform that reduces the amount of information and that gets rid of non-relevant spectral components. However, its use for the analysis of sounds is not as easy as the use of the STFT. We have to understand a bit more about spectra and about windows in order to make good use of this model. Hopefully by now you're starting to get a grasp of that. Now we have to complete the explanation of the sinusoidal model by describing the synthesis part, and this is what we will be doing in the next lecture. So I hope to see you then. Bye-bye.