Welcome to our demo notebook on stationarity. In this notebook, we want to dig deeper into what stationarity is. We'll do this by again building out our own samples, so we can see how to construct a stationary series and how to construct non-stationary series in a few different ways. The learning outcomes are that you have a practical understanding of what it means for a time series to be stationary, a better understanding of how to identify whether a series is stationary, and then, at the end, we'll discuss several common ways to transform non-stationary time series, which we'll dive deeper into later in the lecture as well. We start by importing our necessary libraries. Then, just to recap what it means for a time series to be stationary: we want constant mean, constant variance, constant autocorrelation structure, and no periodic component. We're going to start off with a random series, where each value is an independent draw from a normal distribution, which ensures that we're at least starting from a stationary series. So we run this. Then we create a run sequence plot, and all this is really doing is plotting our x and y. Our x is just the sequence np.arange(100) that we have here, so the values 0-99. The actual time series is our y-axis: the value at each of those time steps. We can plot our x and y and provide a title, an x label, and a y label, and those are just passed through to the same plt functions we've used throughout. We call this run sequence plot on our stationary series, which is essentially just random noise. Notice that there's no clear trend or seasonality: it has that constant mean and constant variance, and, while it's a little difficult to tell right away, a constant autocorrelation structure as well.
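The steps described above can be sketched roughly as follows. The exact signature of the notebook's plotting helper isn't shown in the transcript, so the `run_sequence_plot` function here is a minimal reconstruction under that assumption:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

def run_sequence_plot(x, y, title, xlabel="time", ylabel="series"):
    # Plot the series against its time index, passing labels through to plt
    plt.plot(x, y)
    plt.title(title)
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)

np.random.seed(42)
time = np.arange(100)  # time steps 0-99
# White noise: independent normal draws -> a stationary series
stationary = np.random.normal(loc=0, scale=1.0, size=len(time))
run_sequence_plot(time, stationary, title="Stationary series")
```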
What we really worry about is when that lag-one autocorrelation is one or greater. That means each value is too highly correlated with past values, and in practice the mean of such a series will drift over time. We're going to see that here as we create a toy time series with a strong autocorrelation structure. We set a seed value, and then we create this lagged data, which is just going to be very highly correlated with its own past. We start off with an empty array that's the same shape as our time data, and that time data is what we had up here, np.arange(100). Then, for t in time, so for every single value in our new array, we set that value equal to the current seed, which starts at 3.14, plus some random noise with a mean of zero and standard deviation of 2.5. Say the noise comes out as one; then we'd have 4.14 as our first value. The seed is then set equal to that value, so at the next position in the for loop we take 4.14 plus some new random noise, and so on. The lag-one autocorrelation should be essentially one on average here, with random noise added on top. The idea is that, given the last lag, our prediction error will be very low: the best prediction is exactly the last value, plus random noise that we can't model perfectly. If you look at what this turns out to be, given the high autocorrelation with past values, you'll see that these values tend to trend. Once the series moves up to 4.14, the next value is anchored near 4.14 rather than 3.14, so it can continue to trend up, or continue to trend down if it picks up a couple of negative shocks, and so on and so forth.
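As a sketch of that loop (the starting seed of 3.14 and the noise scale of 2.5 come from the transcript; the variable names are assumptions), each value is the previous value plus a random shock:

```python
import numpy as np

np.random.seed(42)
time = np.arange(100)

seed = 3.14
lagged = np.empty(len(time))
for t in time:
    # Current value = previous value + Gaussian noise (mean 0, std 2.5)
    lagged[t] = seed + np.random.normal(loc=0, scale=2.5)
    seed = lagged[t]  # next step builds on this value -> lag-1 coefficient of 1
```

Because each value carries the full history of past shocks, this is a random walk: its mean and variance drift over time, so it is not stationary even though each individual shock is well behaved.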
At this point, you're probably wondering how to check whether a time series is in fact stationary, that is, whether it meets our four conditions of constant mean, constant variance, constant autocorrelation structure, and no seasonal component. To figure out whether a series is actually stationary, you'd start your analysis with a run sequence plot. That's going to be an effective way to eyeball whether your values look stationary. To drive that point home, we're going to discuss some ways to generate non-stationary time series: those with trend, those with changing variance (heteroscedastic series), those with a periodic component or seasonality, and then a combination of trend and seasonality. To get our trend, as we did in our last notebook, we just multiply each value in our time array by 2.75, and then we have that straight line up. If we ask ourselves why this isn't stationary, we should be able to intuit it immediately from our definition of stationarity: the mean changes over time. We're now going to look at heteroscedasticity. This time we append two different random samples together, each of size 50 (recall that each of our earlier arrays had size 100). The first is drawn from a normal distribution with a standard deviation of one, so the variance is one, and the second has a standard deviation of 10, giving a much larger variance. If we append these two together, we see that at first we have very low variance, and then the variance jumps up in the last 50 values. Why is this data not stationary? The variance changes over time, and therefore it's not stationary. Then we have seasonality, where we use the sine function to come up with a periodic component.
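A minimal sketch of the three non-stationary generators just described (the trend slope of 2.75 and the two standard deviations are from the transcript; the exact sine construction isn't spelled out, so the amplitude here is an assumption):

```python
import numpy as np

np.random.seed(42)
time = np.arange(100)

# Trend: the mean rises linearly over time, violating constant mean
trend = time * 2.75

# Heteroscedasticity: std 1 for the first 50 values, std 10 for the last 50,
# violating constant variance
level_1 = np.random.normal(loc=0, scale=1.0, size=50)
level_2 = np.random.normal(loc=0, scale=10.0, size=50)
heteroscedastic = np.append(level_1, level_2)

# Seasonality: a regular periodic component built from the sine function
seasonality = np.sin(time) * 10
```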
We plot this out and we see that periodic component, which is very regular. Why is this data not stationary? Because there's a periodic component. Then we do trend and seasonality. To do that, we add trend plus seasonality, and we also add on the random noise from our stationary series, just to add a bit of noise. But that noise is at a much smaller scale than either our seasonality or our trend, so we just see a regular periodic component that trends up fairly linearly. Why isn't this stationary? Because the mean changes over time and there's a periodic component. That closes out our video here. In the next video, we're going to pick up with the exercises: create a time variable called MyTime, plot run sequences for datasets that aren't toy datasets, and start thinking about whether each one is stationary. I'll see you there.
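Putting the pieces together, the combined series can be sketched like this (the component definitions mirror the earlier sketches; scales not stated in the transcript are assumptions):

```python
import numpy as np

np.random.seed(42)
time = np.arange(100)

trend = time * 2.75
seasonality = np.sin(time) * 10
noise = np.random.normal(loc=0, scale=1.0, size=len(time))  # small-scale noise

# Trend + seasonality + noise: the mean drifts upward AND there is a periodic
# component, so the series fails stationarity on two counts
trend_seasonality = trend + seasonality + noise
```

Because the noise has standard deviation 1 while the trend climbs to roughly 272 and the seasonal swing is about 10, the plot is dominated by the regular oscillation riding on a straight line up.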