Now, let's talk in more detail about
our first class of dynamic latent variable models, namely state-space models.
I remind you that state-space models are dynamic latent variable models
where both the hidden and observed states are continuous.
So they are appropriate when your data is continuous.
The problem of modeling continuous data with
a dynamic latent state is formulated as follows.
Assume that we have T observations of the observed signal y,
which can be either univariate or multivariate depending on your settings.
We want to build a dynamic latent variable model with a hidden state
x_t that would capture the dynamics of the system and filter the noise out.
If we can do this, the hidden state
becomes a sufficient statistic for predicting the future.
As in the general case that we introduced earlier, in state-space models we
assume that the dynamics of the continuous hidden state is first-order Markov.
The observed state depends only on the hidden state and nothing else.
In particular, it does not directly depend on its own previous value,
as would be the case for a regular Markov model.
This means that the probability of the total time
series path of X and Y is given by the product over
all time steps t of
the transition probability for x times
the emission probability for y given the value of x.
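Written out, this factorization reads as follows (a sketch in my own shorthand, where x_{1:T} denotes the whole path and the factor for t = 1 is understood as the prior over the initial state):

$$
p(x_{1:T}, y_{1:T}) = \prod_{t=1}^{T} p(x_t \mid x_{t-1}) \, p(y_t \mid x_t).
$$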
We can understand this expression better if we take a special case.
Let's say T = 1, so we have only one term in the product.
In this case the likelihood of complete data is just the product of two probabilities:
one for the transition of the hidden state,
and another for the emission of the observed signal Y.
Okay, this is the likelihood of complete data,
but we do not have complete data as state X is hidden.
So to compute the likelihood of the actual data, we have to marginalize over x,
that is, to integrate over all possible values of x.
This gives the second formula here.
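For T = 1, this marginalization is just the following integral (a sketch in the same notation, with theta denoting the model parameters):

$$
p(y_1 \mid \theta) = \int p(x_1 \mid \theta) \, p(y_1 \mid x_1, \theta) \, dx_1.
$$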
Now, if we rename the variable x_1 to omega,
for example, this formula is the same as a continuous mixture model,
where we take a continuum of probability distributions p(y),
conditional on omega and model parameters theta, and
integrate over all values of omega with some weights for omega.
If you discretize this integral, you obtain a
finite mixture model; for example, a Gaussian mixture model can be obtained in this way.
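As a sketch of this correspondence (here omega and its weight p(omega) are just a relabeling of x_1 and its prior):

$$
p(y \mid \theta) = \int p(y \mid \omega, \theta)\, p(\omega)\, d\omega
\;\approx\; \sum_{k=1}^{K} \pi_k \, p(y \mid \omega_k, \theta),
$$

where the second expression is the finite mixture obtained by discretizing the integral; with Gaussian components it is a Gaussian mixture model.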
This means that if we are only interested in
the marginal distribution at some fixed future time, as seen from
today, we do not need any state-space model:
a regular non-dynamic mixture model, for
example a Gaussian mixture model, would work for this task.
But if you need to model distributions of
future random variables simultaneously for a few or many different time horizons,
this is where state-space models become useful.
Okay, let's now come back to the general case when we have more than one time step.
So we are back to our original formula with a product over
all time steps for the probability of the total path of X and Y.
As we just saw, it involves the history of the hidden state that we do not really have.
So what would be the steps needed to do time series forecasting in this framework?
If we look at a one-step forecast, then we need to compute
the probability of the next value of Y conditional on
the previous values of Y and the current value of X.
Such a probability can be expressed in terms of
some model that is parameterized by some vector of parameters theta.
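As a sketch, using the factorization above, a one-step forecast averages the emission model over the distribution of the next hidden state given the observations so far (the notation is mine):

$$
p(y_{t+1} \mid y_{1:t}, \theta) = \int p(y_{t+1} \mid x_{t+1}, \theta)\, p(x_{t+1} \mid y_{1:t}, \theta)\, dx_{t+1}.
$$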
So the model will have two sets of unknowns: the hidden state x and the model parameters theta.
Therefore, the problem of forecasting for dynamic latent variable models, including
both state-space models and
hidden Markov models, amounts to two tasks: inference and learning.
In the task of inference we learn about the hidden state;
the model parameters are assumed to be fixed in this task.
In the task of learning we learn the model parameters
while keeping our beliefs about the hidden state fixed.
In the specific case of the EM algorithm as a way of training such a machine learning
model, the inference task corresponds
to the E-step and the learning task corresponds to the M-step.
A special and highly tractable case
of state-space models is called linear-Gaussian state-space models.
In this specification, the next state X is linear in the previous state, plus
a Gaussian noise w. Also, the observation Y is linear in the current value of X,
plus another independent Gaussian noise v.
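In equations, a common way to write this specification is the following (A, C, Q, and R are my notation for the transition matrix, the emission matrix, and the two noise covariances):

$$
x_t = A\, x_{t-1} + w_t, \qquad w_t \sim \mathcal{N}(0, Q),
$$
$$
y_t = C\, x_t + v_t, \qquad v_t \sim \mathcal{N}(0, R).
$$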
The second equation for the observable y here generalizes factor analysis.
In factor analysis we had a random variable X that did not depend on time.
Now we have a random process x_t that changes with time.
Inference and learning in such a model can again be done using the EM algorithm.
The E-step of the algorithm learns the posterior over
the hidden variables X given the observed values y and the model parameters theta.
For this specific case of
linear-Gaussian models, this procedure is called Kalman smoothing.
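As a minimal sketch of what Kalman smoothing computes for the linear-Gaussian model above, here is a plain NumPy forward filtering pass followed by a Rauch-Tung-Striebel backward pass. The matrix names A, C, Q, R and the function names are my own choices, and a production implementation would handle numerical stability more carefully:

```python
import numpy as np

def kalman_filter(y, A, C, Q, R, m0, P0):
    """Forward (filtering) pass: p(x_t | y_{1:t}) for each t.
    y: (T, dy) observations; A, Q: state transition matrix and noise covariance;
    C, R: emission matrix and noise covariance; m0, P0: prior mean/covariance of x_1."""
    T, dx = y.shape[0], A.shape[0]
    means = np.zeros((T, dx))
    covs = np.zeros((T, dx, dx))
    m_pred, P_pred = m0, P0
    for t in range(T):
        # Update step: condition the predicted state on observation y[t]
        S = C @ P_pred @ C.T + R                 # innovation covariance
        K = P_pred @ C.T @ np.linalg.inv(S)      # Kalman gain
        means[t] = m_pred + K @ (y[t] - C @ m_pred)
        covs[t] = P_pred - K @ C @ P_pred
        # Predict step: propagate one step through the state dynamics
        m_pred = A @ means[t]
        P_pred = A @ covs[t] @ A.T + Q
    return means, covs

def rts_smoother(means, covs, A, Q):
    """Backward (Rauch-Tung-Striebel) pass: smoothed posteriors p(x_t | y_{1:T})."""
    T = means.shape[0]
    sm_means, sm_covs = means.copy(), covs.copy()
    for t in range(T - 2, -1, -1):
        P_pred = A @ covs[t] @ A.T + Q               # predicted covariance of x_{t+1}
        G = covs[t] @ A.T @ np.linalg.inv(P_pred)    # smoother gain
        sm_means[t] = means[t] + G @ (sm_means[t + 1] - A @ means[t])
        sm_covs[t] = covs[t] + G @ (sm_covs[t + 1] - P_pred) @ G.T
    return sm_means, sm_covs
```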
The M-step then estimates all model parameters given the fixed posterior distribution of the hidden variables.
It turns out that, due to the particularly simple structure of
linear-Gaussian state-space models, the M-step in these models can be done analytically.
We will not go into the details of that,
but you can always do it on your own once you know the name of the method
and understand what it does at a high level.
Instead I would like to mention here a couple of examples
of using state-space methods in finance.
One of them deals with the analysis of a firm's leverage,
which is the ratio of debt to the total firm value, given by the sum of debt and equity.
A popular assumption regarding the dynamics of leverage is that
firms try to adjust their leverage ratios to some optimal value,
sometimes referred to as the target leverage.
Such an optimal target leverage is not directly observable and can also change with time,
so it can be modeled using a dynamic hidden variable.
The paper by Roberts constructed such a latent variable model by
modeling factors describing the true unobserved values of the marginal tax rate,
probability of bankruptcy, firm size,
investment opportunities, and average industry leverage.
The observed values corresponding to these factors, as well
as the leverage itself, are obtained as these hidden values plus noise.
This was formulated and estimated as a
state-space model in the paper by Roberts in 2002.
Other research, for example
by Loeffler and Maurer in 2009, has found that
dynamic models of firm leverage help
improve default prediction models for counterparty defaults.
And finally, I want to mention that
various stochastic volatility models,
popular in equities, FX, commodities, and so on,
can also be formulated and estimated as state-space models,
where volatility plays the role of a hidden variable.
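As a sketch of one standard specification of this kind (a log-volatility model; the parameter names are mine), the hidden state is the log-variance h_t following an AR(1) process, and the observed return y_t is emitted with that variance:

$$
h_t = \mu + \phi\,(h_{t-1} - \mu) + \eta_t, \qquad \eta_t \sim \mathcal{N}(0, \sigma_\eta^2),
$$
$$
y_t = e^{h_t / 2}\, \varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}(0, 1).
$$

Note that the emission here is nonlinear in the hidden state, so this is a state-space model, but not a linear-Gaussian one.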
Okay, so let's take a break at this point and
continue with hidden Markov models in our next video.