Welcome to course three. We're going to build upon everything we learned in course one and course two about Bayesian statistics and Monte Carlo simulations, and we're going to be introduced to PyMC3, which is a relatively scalable probabilistic programming framework in Python. We'll start with an overview of probabilistic programming: what it is and how it can be used to solve many statistical problems. Then we will introduce the PyMC3 programming API and work through various examples of how to use PyMC3 in real-life applications.

A probabilistic programming framework is essentially a tool that allows you to work with data and distributions. It allows you to write the relationships between input data and output data through models and equations. It also allows you to estimate the parameters of those models, and then use those parameters to make predictions for the future.

An overview of popular probabilistic programming frameworks is given in this post by George Ho, who's one of the developers of PyMC3. He outlines the components needed for a probabilistic programming framework in this figure here. Among those components, you need a way to represent distributions, and transformations of data and distributions. Almost all of these frameworks have a set of inference algorithms and some sort of optimizer that's either custom made or based off a tool like PyMC3. When you're trying to choose a probabilistic programming framework for your project, it's probably a good idea to select one that has a robust set of diagnostic measures, either through numeric metrics or through visualization techniques.

So what is PyMC3? It is a probabilistic programming framework for performing Bayesian modeling and visualization. It uses Theano, which is a symbolic computation engine that can run on CPUs or GPUs; it's kind of similar to TensorFlow or PyTorch in that regard.
PyMC3 has algorithms for performing Monte Carlo simulations as well as variational inference algorithms. It also has a very powerful diagnostic and visualization tool called ArviZ, which we'll be using extensively throughout this course.

PyMC3 can be used to infer the values of model parameters that we're unsure about by utilizing the observed data. We can look at an example from the website shown here. The first example that we're looking at is that of an ODE, or ordinary differential equation, that represents freefall. In this example we're trying to estimate the air resistance parameter from the ODE represented here, in which dy by dt is the derivative of position with respect to time. We're relating that to the position through an air resistance term, and you have m, which is the mass of the object, and g, which is the gravitational constant. So we have some understanding of the physics behind freefall as represented by this ODE. We also have some observed or measured variables, such as the mass, the position, and the velocity, but we don't know what the air resistance parameter is. So we can use PyMC3 to perform inference and get an estimate of the distribution of potential values of air resistance.

A key point to note here is that the more information we have regarding the other variables, the more certainty we have in our value of interest, which in this case is air resistance. Suppose we're unsure about the gravitational constant g. The way we would implement that is by applying a prior distribution to g, as opposed to using a constant value. In that case we get more uncertainty in the air resistance variable that we're trying to infer as well.

So what does the general structure of a PyMC3 program look like? It consists of phenomena that are represented by equations made up of random variables and deterministic variables.
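
Before going further into that structure, the point about an uncertain g in the freefall example can be checked with a small sketch. This is not PyMC3 code and not the example from the website: it swaps in a plain grid approximation of the posterior using only the Python standard library. It assumes the ODE has the form dy/dt = m*g - gamma*y with y(0) = 0, which is a guess at the equation on the slide, and every number in it (the true gamma, the noise level, the measurement times, the Normal(9.81, 0.5) prior on g) is hypothetical.

```python
import math

# Assumed closed-form solution of dy/dt = m*g - gamma*y with y(0) = 0.
def solution(t, gamma, g, m=1.0):
    return (m * g / gamma) * (1.0 - math.exp(-gamma * t))

# Hypothetical "true" values used only to create synthetic measurements.
TRUE_GAMMA, TRUE_G, NOISE_SD = 0.4, 9.81, 1.0
times = [1.0, 2.0, 3.0, 4.0, 5.0]
# Noise-free synthetic measurements keep the example deterministic.
observed = [solution(t, TRUE_GAMMA, TRUE_G) for t in times]

def log_likelihood(gamma, g):
    # Gaussian measurement model with known noise standard deviation.
    return sum(-0.5 * ((y - solution(t, gamma, g)) / NOISE_SD) ** 2
               for t, y in zip(times, observed))

def log_prior_g(g):
    # Hypothetical Normal(9.81, 0.5) prior on g (unnormalized).
    return -0.5 * ((g - 9.81) / 0.5) ** 2

def mean_and_sd(values, weights):
    # Weighted mean and standard deviation of a gridded posterior.
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, math.sqrt(var)

# Flat prior on gamma over the grid [0.2, 0.8].
gamma_grid = [0.2 + 0.002 * i for i in range(301)]

# Case 1: g is treated as a known constant.
weights_fixed = [math.exp(log_likelihood(gm, TRUE_G)) for gm in gamma_grid]
mean_fixed, sd_fixed = mean_and_sd(gamma_grid, weights_fixed)

# Case 2: g gets a prior and is summed out over its own grid.
g_grid = [7.81 + 0.05 * j for j in range(81)]
weights_uncertain = [
    sum(math.exp(log_likelihood(gm, g) + log_prior_g(g)) for g in g_grid)
    for gm in gamma_grid
]
mean_uncertain, sd_uncertain = mean_and_sd(gamma_grid, weights_uncertain)

print(f"g fixed:     gamma = {mean_fixed:.3f} +/- {sd_fixed:.3f}")
print(f"g uncertain: gamma = {mean_uncertain:.3f} +/- {sd_uncertain:.3f}")
```

Comparing the two printed standard deviations shows the effect described above: replacing the constant g with a prior widens the posterior for the air resistance parameter. A grid like this only works for one or two parameters, which is part of why sampling-based tools such as PyMC3 are used for larger models.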
The random variables can be divided into observed variables and unobserved variables. The observed variables are those for which we have data, and the unobserved variables are those for which we have to specify a prior distribution.

Now let's look at how to represent both kinds of variables within PyMC3. The first thing you'll notice here is that we have this context manager, pm.Model(). Here pm stands for PyMC3, and Model is the class that we're calling to set up that context manager. Any statements regarding model building have to happen within this context manager.

This line is how we would set up an observed variable. It's fairly simple: all we're saying is that we have a variable named x that is drawn from a normal distribution with a mean of 0 and a standard deviation of 1. Finally, we provide the observed values as the argument to the observed keyword. The unobserved variables can be represented in a similar fashion. You'll notice the only difference is that we pass no value for the observed parameter, because we have none to provide. So all we're saying is that we have an unobserved variable named x that is drawn from a normal distribution with a mean of 0 and a standard deviation of 1.

So now we'll look at an example of linear regression to understand some of the fundamental features of PyMC3.