In the previous lesson you learned how to derive the solution to the ordinary weighted least squares problem. First, we collect a set of data points x, y, where the x component of every data pair is a difference in state of charge over some time interval, and the y component is the accumulated ampere hours of current flowing through the battery cell during that same interval of time. Then we compute a summation that I have called c1, which is equal to the weighted summation of the x squared points, and we compute a variable, c2, equal to the weighted summation of the product of the x and y data points from every data pair. When we're done, we simply compute Q hat equal to c2 divided by c1. It's possible to make this operation very efficient computationally. Suppose we have already collected many data points and have computed an estimate of total capacity from them, and then we make one new measurement: we get one new data pair of x and y. We don't need to recompute both summations from scratch when this happens. Instead, notice that the value of c1 computed for n data points is equal to the value of c1 computed for n minus 1 data points, plus an additional term that depends only on the most recently collected data pair. Likewise, c2 computed for n data points is equal to c2 computed for n minus 1 data points, plus an additional term that depends only on the most recently collected data pair. So we don't need to store an ever-increasing vector of data points that grows without bound over time as we make more and more measurements. Instead, we simply store the most recent values of the variables c1 and c2, which is just two numbers, and we update those quantities every time a new data point becomes available. Then, once we have updated c1 and c2, we update the total capacity estimate, Q hat at time n, equal to c2 at time n divided by c1 at time n.
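The recursive update described above can be sketched in just a few lines of code. This is a minimal illustration under my own naming assumptions, not a reference implementation: `update_wls`, `c1`, `c2`, and `sigma2` (the per-measurement variance, whose reciprocal is the weight) are labels I have chosen for the quantities in the lesson.

```python
def update_wls(c1, c2, x, y, sigma2):
    """Fold one new data pair (x, y), with measurement variance sigma2,
    into the running weighted sums, then return the updated estimate."""
    c1 = c1 + (x * x) / sigma2   # c1[n] = c1[n-1] + x[n]^2 / sigma2[n]
    c2 = c2 + (x * y) / sigma2   # c2[n] = c2[n-1] + x[n]*y[n] / sigma2[n]
    return c1, c2, c2 / c1       # Qhat[n] = c2[n] / c1[n]

# Demonstration with noiseless synthetic data whose true capacity is 5 Ah:
c1, c2 = 0.0, 0.0
for x in (1.0, 0.5, 0.25):
    y = 5.0 * x                  # ampere hours accumulated over each interval
    c1, c2, Qhat = update_wls(c1, c2, x, y, sigma2=0.01)
```

Note that each call costs the same fixed number of operations no matter how many data points have already been absorbed, which is exactly the property that makes the recursion attractive for an embedded battery management system.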
This approach is called a recursive approach because the present computation depends only on the previous result, plus whatever new data become available. This approach minimizes storage requirements, as we've already talked about, and it evens out the amount of computation that's required every time a new data point becomes available. If we recomputed both summations from scratch every time a new data point became available, then we would have to do an ever-increasing number of operations inside of each summation. We would be summing up more and more data every time, which requires more and more floating point operations, and the computation slows down. Instead, if we use this recursive approach, there's a constant number of operations every time we get a new data point. We simply update the old c1 using a constant number of operations, we update the old c2 using a constant number of operations, and then we do the division. So this recursive approach is much more conducive to being implemented in a battery management system having finite resources. One challenge is introduced, though, when we use the recursive solution, and that is that we require initial estimates of c1 and c2 before we can begin the recursion. We might simply set both variables equal to 0 and hope that the filtering nature of the recursive updates is somehow going to correct for the error in the initialization. In fact, over time this does tend to happen, but it's not the best approach to initializing these variables. Instead, what I recommend is that we create an artificial or synthetic zeroth measurement. In other words, we're not making an actual measurement of a data point at time zero, but we pretend that we have made such a measurement by performing an imaginary experiment on the cell.
We pretend that we have taken a battery cell that starts at 100% state of charge and have fully discharged it down to 0%, recording the ampere hour difference over that process, which, of course, is the entire capacity of the cell. We approximate that with the nominal capacity of the cell, assuming that the manufacturer is fairly accurate at predicting what the capacity of any particular cell is going to be. So then we compute the values for c1 and c2 by setting x0 equal to the difference in state of charge between these two points, which is the difference between 100% and 0%, which is 1. And we set y0 equal to the nominal capacity, which represents the amount of capacity discharged over this range. So we have an idea of what x and y are for the imaginary or synthetic data point, and we still need to select a value for the variance of the uncertainty of the measurement. Ideally, this value is going to be the manufacturing tolerance of total capacity over all battery cells of a particular design. So overall, we would set c1 at time 0 equal to 1 divided by this variance, and we would set c2 at time 0 equal to the nominal capacity divided by this variance. That gives us a method to implement the ordinary weighted least squares method efficiently using a recursion. But one consequence of the cost function that we've used is that it assumes the total capacity never changes. It assumes that total capacity is a constant value at all points in time. And in fact, you know this is not true, because you know that the battery cell is going to degrade, and that when it degrades, its total capacity will tend to decrease over time. In order to account for this, we would need to change the cost function so that it places more emphasis on recent data points and less emphasis on earlier data points, so that the estimator has more flexibility to track capacity as it changes over time as the cell ages.
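The synthetic zeroth measurement above translates directly into an initialization routine. Again this is only a sketch with names of my own choosing: `Qnom` stands for the manufacturer's nominal capacity in ampere hours, and `sigma2_Q` for the variance of the manufacturing tolerance on capacity.

```python
def init_wls(Qnom, sigma2_Q):
    """Initialize the recursive sums from the synthetic data pair
    x0 = 1 (a full 100%-to-0% discharge) and y0 = Qnom."""
    c1 = 1.0 / sigma2_Q      # x0^2 / sigma2 with x0 = 1
    c2 = Qnom / sigma2_Q     # x0 * y0 / sigma2 with y0 = Qnom
    return c1, c2

c1, c2 = init_wls(Qnom=5.0, sigma2_Q=0.01)
Qhat0 = c2 / c1              # the initial estimate equals Qnom, as expected
```

A useful sanity check on this initialization is that the very first capacity estimate, before any real data arrive, is exactly the nominal capacity.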
And this is called the fading memory approach. There are different ways of implementing a fading memory approach, but I'm going to present one here that is computationally very efficient. Here I share a revised cost function that I call the fading memory weighted least squares cost function. The only difference between this cost function and the previous weighted least squares cost function is the addition of a multiplicative term in every element inside of the summation. This is the term that places more emphasis on recent data points and less emphasis on previous data points. You can see that this weighting is equal to a variable named gamma, raised to some power. If you look at this carefully, you will notice that gamma has an exponent of 0 for the most recent data point, and it has higher exponents for previous data points. So if we use a gamma that has a value between 0 and 1, then the weighting of different data points is going to look like the figure drawn to the right. The most recent data point has a weighting factor of gamma to the power of 0, which is 1. The prior data point has a weighting factor of gamma raised to the power 1, which is simply gamma. The data point before that has a weighting factor of gamma squared, which is less than gamma, and so forth. So overall, because of this weighting function, we're forgetting the past, or at least we're putting less emphasis on the past inside of the cost function. And that's why I call it fading memory. And you can see in the description of the cost function in the subscript, I used the terminology FM for fading memory to point out that this is what we are doing. I do not go through all of the steps of the derivation here, but to optimize the fading memory cost function we're going to follow basically the same steps that I showed you for the standard cost function in the prior lesson.
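Based on the description above, the fading memory cost function can be written out as follows. This is a reconstruction from the verbal description rather than a copy of the lesson's slide, with sigma_i squared denoting the variance of the i-th measurement:

```latex
J_{\mathrm{FM}} = \sum_{i=1}^{n} \frac{\gamma^{\,n-i}}{\sigma_i^2}
\left( y_i - \hat{Q}\, x_i \right)^2, \qquad 0 < \gamma \le 1
```

Note that the exponent n minus i is 0 for the most recent data point (i equal to n) and grows for older data points, which is exactly the behavior described in the lesson.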
When we follow all of these steps, we find that Q hat is equal to the quotient of two summations, just as it was before, but now you can see that these summations have a forgetting factor inside of them, whereas, of course, they did not in the previous lesson. The good news when choosing this particular method for incorporating fading memory, or putting less emphasis on the past, is that it can be computed in a recursive manner. We define a new variable, c1 tilde, that is very similar to the previous variable that we called c1. In this notation the tilde tells you that the variable is one that's being used for a fading memory cost function instead of for a standard cost function. The recursive relationship says that the new value of this variable is equal to the forgetting factor gamma multiplying the previous value, plus the new input. Similarly, when we consider c2 tilde, we find that we compute its new value as the forgetting factor gamma multiplying the previous value, plus the new input. So overall this revised fading memory solution has essentially the same computational requirements as the standard solution, two more multiplies perhaps, but it gives a lot more flexibility in an implementation. In summary, we've now derived the weighted least squares and the fading memory weighted least squares methods of estimating the total capacity of a battery cell. Both of these methods have some nice properties. They both give closed-form solutions to estimate Q hat. They both require only simple operations, like multiplication, addition, and division. There's no complicated floating point mathematics or linear algebra or optimization libraries needed to implement either of these methods. Both solutions can also be computed in a recursive manner, which means that we don't require storing the entire history of measured data points, and we can even out over time the computation that is required when we process any particular measurement.
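The fading memory recursion described above differs from the standard one only in the factor of gamma that discounts the old sums. Here is a minimal sketch under the same naming assumptions as before, with `c1t` and `c2t` standing in for the tilde-marked variables:

```python
def update_fmwls(c1t, c2t, x, y, sigma2, gamma):
    """Discount the old sums by the forgetting factor gamma
    (0 < gamma <= 1), then fold in the new data pair."""
    c1t = gamma * c1t + (x * x) / sigma2
    c2t = gamma * c2t + (x * y) / sigma2
    return c1t, c2t, c2t / c1t

# Demonstration: the true capacity drops from 5 Ah to 4 Ah midway through
# the data, and the fading memory estimate tracks toward the new value.
c1t, c2t = 0.0, 0.0
for x, y in [(1.0, 5.0), (1.0, 5.0), (1.0, 4.0), (1.0, 4.0)]:
    c1t, c2t, Qhat = update_fmwls(c1t, c2t, x, y, sigma2=0.01, gamma=0.5)
```

Setting gamma equal to 1 recovers the standard recursive weighted least squares update exactly, so one routine can serve both cases.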
And finally, we can easily add fading memory to either one of these methods so that the adaptation of Q hat places more emphasis on recent measurements, and thus on the recent total capacity of the cell, and less emphasis on measurements from the very distant past. This allows the estimate to adjust more easily when the total capacity of the cell actually changes. But remember, as I've shared with you already, that all of the weighted least squares methods are biased. The reason for deriving them is so we can use them as a baseline against which to compare the more accurate solutions. Also, the derivation method that we've used here is going to be very similar to the one that we'll use to derive these more advanced methods as well. And so I wanted to start out with something that was a little bit simpler, where we could go through the method, and you could see how the method was used, before we apply it in the more complicated setting. So now that we've derived the weighted least squares methods, it's time to move on to derive the weighted total least squares solution, and that's what we're going to do next.