You remember that we tried to whet your appetite for Fourier analysis by showing you a mystery signal, and by showing you that, by changing the basis in which the signal is represented, we could all of a sudden see some structure in the data. So now we can go back to that example and analyze it, applying what we know about the DFT. We take the DFT of the mystery signal, we plot the real and imaginary parts of the DFT, and we see that the structure that appears is a couple of spikes at symmetric positions in the spectrum. Now, we already know that this indicates the presence of a strong sinusoidal component and, more importantly, a sinusoidal component whose frequency is a multiple of the fundamental frequency for the space that the signal lives in. If we look in detail at what happens in the spectrum, we see that the peaks are at k = 64 and at k = 960, that is, at k = 64 and at its mirror position 1024 - 64. We also see that the peaks appear only in the real part of the spectrum, and we remember that, when this happens, the underlying sinusoid, the sinusoid represented by these peaks, is a cosine. So from this simple visual inspection we can already write our signal as the sum of two parts. There will be a cosine component, for which we will have to determine both the frequency and the initial phase. And there will be another part that we can probably call the noise component, in the sense that it doesn't have any structure. Since the imaginary part of the Fourier transform doesn't exhibit any particular peak, we can assume that the phase of the cosine is zero. As far as the frequency is concerned, we know that the peak occurs at k = 64. We are in a space of 1024 points, and therefore omega will be equal to 2 pi over 1024, the fundamental frequency for the space, times 64. If we plot the two components separately at this point, we see that we have, indeed, a cosine that oscillates 64 times in 1024 points, and an additional noise component that doesn't really seem to have any structure.
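The reasoning above can be tried out on a synthetic stand-in for the mystery signal. The sketch below is only an illustration under stated assumptions: the original data is not available here, so the noise is made up (uniform, seeded), the cosine has unit amplitude, and `dft` is a hypothetical helper that computes the transform naively rather than with a fast library routine.

```python
import cmath
import math
import random

N = 1024   # length of the signal, as in the lecture
K0 = 64    # frequency index of the cosine, as read off the spectrum

def dft(x):
    """Naive O(N^2) DFT using a precomputed table of complex exponentials."""
    n_pts = len(x)
    w = [cmath.exp(-2j * math.pi * m / n_pts) for m in range(n_pts)]
    return [sum(x[n] * w[(k * n) % n_pts] for n in range(n_pts))
            for k in range(n_pts)]

# Assumption: made-up uniform noise stands in for the unstructured component.
random.seed(0)
omega = 2 * math.pi / N * K0
cosine = [math.cos(omega * n) for n in range(N)]
noise = [random.uniform(-0.5, 0.5) for _ in range(N)]
signal = [c + e for c, e in zip(cosine, noise)]

X = dft(signal)

# Indices of the two largest coefficients in magnitude.
peaks = sorted(sorted(range(N), key=lambda k: abs(X[k]))[-2:])
print(peaks)                      # expect [64, 960]
print(X[K0].real, X[K0].imag)     # real part near N/2 = 512, imaginary part small
```

With a zero-phase cosine the two peaks show up in the real part only; the small imaginary value at k = 64 comes entirely from the noise.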
Let's now look at another signal, one that we showed in the introduction to this class: a time series that maps solar activity since the 1700s. Astronomers have noticed that solar activity can be related to what they call the sunspot number. The details are not really important, but fundamentally it is a measure of how many sunspots are on the face of the sun at a given point in time. So we have a data set that goes from 1749 to 2003, which is equivalent to about 2900 months, since the sunspot number is computed every month. As we plot this data, we see that there is an oscillatory behavior. We're interested in finding out whether there is any fundamental periodicity in solar activity, so we can take the Fourier transform. The sunspot time series is a real signal, so if we're interested just in the magnitude of the DFT coefficients, you remember, we need only show the first half of them. Even so, if you look at the first 1500 coefficients, you see that after the 100th coefficient or so their magnitude becomes too small to be relevant. So in this plot, for clarity, we show just the magnitude of the first 100 DFT coefficients. And indeed, we can see that there is not just one peak, but a series of peaks. Now, what is the fundamental period? What is the most dominant mode of this time series? The main peak happens at k = 22. This tells us that there are 22 cycles over the 2904 data points that we showed in the previous picture. Therefore the main period of the signal is 2904 over 22, that is, 132 months, which corresponds to approximately 11 years. This means that sunspot activity has an inherent periodicity of 11 years.
We can perform the same analysis with a data set that records the daily temperature over a total of 2920 days. If we take the DFT of this time series, we see that there is actually a very pronounced periodicity in the spectrum. There is basically just one peak, and if we plot the normalized DFT coefficients, that is, if we divide the DFT coefficients by the length of the temperature vector, we can actually extract some information about the temperature values in this time series. Remember, for instance, that the DFT coefficient for k = 0 is the non-normalized average of all the data points, because X[0] is simply the sum of all the points in the time series, for n going from 0 to N - 1. So if you plot the normalized coefficients, the coefficient for k = 0 will be the average temperature for the time series, which in this case happens to be 12.3 degrees Celsius. And then we remark that there is a peak in the DFT at k = 8. So if we sum up what we learn from the DFT about the temperature signal, we know that the average value of the temperature is the normalized 0-th DFT coefficient, so 12.3 degrees Celsius. The main peak is at k = 8, with a value of 6.4 degrees. What that means is that there are 8 cycles of the temperature signal over the entire duration of our data set, which is 2920 days. And so the period is 2920 divided by 8, which happens to be 365 days. So indeed the temperature has a yearly periodicity, as we all know very well.
Now, the value of the DFT main peak is 6.4 degrees Celsius. If we had a sinusoid of the form A cos(omega n) and we took the normalized DFT of that signal, we would have a peak in magnitude, at some index k, with a value of A over 2. Remember the definition of the DFT of a cosine function: the amplitude is split between two symmetric peaks. So from this we can say that the temperature excursion around the average is twice the value of the DFT main peak. And so we can say that the temperature at the point of measurement, over the year, was an average of 12.3 degrees, plus or minus 12.8. So this is a second example in which, in order to find the real-world period of a certain DFT component, we take the total length of the signal and we divide it by the location of the peak.
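The three readings taken from the normalized DFT (the average, the period, and the excursion) can be checked numerically. The sketch below is a hedged illustration, not the actual analysis: the real temperature record is not reproduced here, so the data is synthetic, built from the numbers quoted above, and `normalized_dft` is a hypothetical helper name. The last lines apply the same "length divided by peak location" rule to the sunspot figures.

```python
import cmath
import math

def normalized_dft(x, num_bins):
    """First num_bins DFT coefficients of x, each divided by len(x)."""
    n_pts = len(x)
    w = [cmath.exp(-2j * math.pi * m / n_pts) for m in range(n_pts)]
    return [sum(x[n] * w[(k * n) % n_pts] for n in range(n_pts)) / n_pts
            for k in range(num_bins)]

# Assumption: synthetic data built from the values quoted in the lecture.
N = 2920                              # days in the record
MEAN, SWING, CYCLES = 12.3, 12.8, 8
temps = [MEAN + SWING * math.cos(2 * math.pi * CYCLES * n / N)
         for n in range(N)]

X = normalized_dft(temps, 20)

average = X[0].real                                   # normalized X[0] is the mean
peak_k = max(range(1, 20), key=lambda k: abs(X[k]))   # main peak, skipping k = 0
period_days = N / peak_k                              # signal length / peak location
excursion = 2 * abs(X[peak_k])                        # peak magnitude is A / 2

print(average, peak_k, period_days, excursion)

# Same rule for the sunspot record: 2904 monthly values, main peak at k = 22.
sunspot_period_years = 2904 / 22 / 12
print(sunspot_period_years)
```

Only the first 20 coefficients are computed, which keeps the naive transform cheap; for a real record one would normally compute the full spectrum with an FFT routine.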
This is actually a general method to label the frequency axis of a DFT plot, and it will be very useful in the future when we start analyzing generic signals that have been sampled in a variety of contexts. If you remember, a few lessons ago we informally introduced the notion of a clock for a digital signal processing system, and this is really equivalent to associating a certain time interval Ts, measured in actual physical seconds, with the gap between successive samples in a signal. So, for instance, for the sunspot signal we had a Ts of one month, whereas for the temperature signal Ts was equal to one day. We will see that for audio signals Ts becomes very small, because we will need to take at least 8,000 samples per second. Now, if we have a value for Ts, which is determined by the experimental setup, we can reason like so. The fastest positive frequency in a digital signal is omega equal to pi. A sinusoid at that frequency needs two samples to do a full revolution; remember the unit circle. Something that moves at the speed of pi will be here at instant n, here at n plus 1, and here again at n plus 2. So two samples to complete the full revolution. Now, the clock Ts can also be expressed as 1 over Fs, where Fs is the sampling frequency of the system. This is the standard relationship between period and frequency for any system. So if the real-world period of the fastest sinusoid in a digital system is 2 times Ts, measured in seconds, the real-world frequency of the fastest sinusoid is Fs over 2. So the maximum frequency, once we've fixed the period between samples, is Fs over 2 or, equivalently, 1 over 2Ts.
So let's take a concrete example. Suppose I give you an audio file that records a train whistle, and I ask you to find out which notes make up the whistle. [SOUND] Of course, your idea is to take the DFT of this file and find the peaks in the Fourier transform, which will probably correspond to the fundamental frequencies of the notes that are being played. In order to do so, you need some extra information: namely, you need to know the sampling frequency of this file or, equivalently, the time between successive samples. The sampling frequency is Fs equal to 8,000 Hz, which means that Ts is 1 over 8000 seconds. This is the time between samples. Also, the file contains 32,768 samples. We will see later why we often choose signal lengths that are a power of two. If you take the DFT of this file and plot its magnitude, you see there are three peaks, which probably correspond to the three notes being played. Now, in order to label the axis, remember what we said: the highest frequency for a digital system corresponds to half the sampling frequency. On the DFT vector this point will correspond to the midpoint of the vector, namely the point for k = N/2. So here indeed we have the midpoint of the DFT vector, and therefore this will correspond to a frequency of 4 kilohertz. We also have three peaks that take place at different values of k. To find the real frequency there, we just apply a linear mapping from 0 to 4 kilohertz. The frequency of any intermediate point will be 4 kilohertz divided by N/2, the number of points up to the midpoint, multiplied by the index k; equivalently, k times Fs over N. With this rule, we can find the value in hertz corresponding to the three peaks in the DFT, and the values are shown here in the figure. And if you look up the closest notes in Western tuning that these frequencies correspond to, you find that the train whistle is nothing but a simple B minor chord.
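The labeling rule for the frequency axis can be written as a one-line mapping. The following is a minimal sketch: `bin_to_hz` and `hz_to_bin` are hypothetical helper names, and the 440 Hz tone is only an illustrative value, not one of the peaks read from the train whistle figure.

```python
def bin_to_hz(k, fs, n):
    """Real-world frequency in Hz of DFT index k, valid for 0 <= k <= n // 2."""
    return k * fs / n

def hz_to_bin(f, fs, n):
    """Nearest DFT index for a frequency f in Hz (inverse of bin_to_hz)."""
    return round(f * n / fs)

FS = 8000      # sampling frequency of the file, in Hz
N = 32768      # number of samples in the file

print(bin_to_hz(N // 2, FS, N))   # midpoint of the DFT vector: 4000.0 Hz
print(bin_to_hz(1, FS, N))        # spacing between adjacent indices, in Hz

# Illustrative only: where a 440 Hz tone would land, and the label of that index.
k_tone = hz_to_bin(440, FS, N)
print(k_tone, bin_to_hz(k_tone, FS, N))
```

Because the spacing between indices is Fs over N, a longer recording at the same sampling frequency gives a finer frequency grid, while the top of the axis stays at Fs over 2.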