So, imagine you have a corpus of data. If you can assign a label, or a flag, or a value to each record in your data, then you can do supervised machine learning. So, let's consider this table here. That's a table from a hypothetical production machine, where we measure max temperature, min temperature and max vibration. Note that those are already aggregations. And then there is asperity. Asperity is our label. So, given temperature and vibration data, we want to predict asperity. This is a labeled dataset, and that is something we can use for supervised machine learning, because using our dataset plus the label, we can train an algorithm to predict the label for future, unlabeled data.

So, let's consider new, unlabeled data. You feed that into our algorithm, and the algorithm will give us back the label or the value. So imagine we're giving the algorithm the values 35, 35 and 12; then the algorithm will predict a value for asperity of 0.32. This is called regression.

So, now imagine our label is not a continuous value; it's a binary value or at least a discrete value. In this case, the algorithm won't predict a number, it will predict a class: zero or one, broken yes or no. This is called classification. If you have only two classes, it is called binary classification. And if you have more than two classes, you have multi-class classification. That's basically all. So supervised learning is either regression, where you predict a continuous value, or classification, and there you have two different ways of doing it. You have binary classification if you have only two labels or two target values, or it's multi-class classification if you have more than two.

But now, let's imagine you don't have a label for your data, and that's unfortunately true in most cases. So, I guess around 80-95% of all data is unlabeled, maybe even more. So, what can we do there? So, let's have a look at our three-dimensional dataset.
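
Before turning to that unlabeled dataset, here is a minimal sketch of the supervised case described above. The talk does not name an algorithm, so a k-nearest-neighbour rule is used here purely as an illustrative stand-in. The training rows, the `broken` flag values, and the helper names (`nearest`, `knn_regress`, `knn_classify`) are all made up for this example and are not the table from the talk. The numbers were picked so that the record 35, 35, 12 lands close to the asperity value mentioned above.

```python
import math
from collections import Counter

# Hypothetical readings: (max_temp, min_temp, max_vibration).
# All values are invented for illustration, not taken from the lecture's table.
X_train = [
    (35.0, 33.0, 12.0),
    (36.0, 35.0, 13.0),
    (46.0, 35.0, 21.0),
    (48.0, 36.0, 22.0),
    (60.0, 38.0, 40.0),
    (62.0, 39.0, 42.0),
]
asperity = [0.30, 0.34, 0.55, 0.57, 0.90, 0.94]  # continuous label -> regression
broken = [0, 0, 0, 0, 1, 1]                      # discrete label -> binary classification


def nearest(X, query, k):
    """Return the indices of the k training rows closest to query (Euclidean distance)."""
    order = sorted(range(len(X)), key=lambda i: math.dist(X[i], query))
    return order[:k]


def knn_regress(X, y, query, k=2):
    """Regression: predict the mean label of the k nearest labeled rows."""
    idx = nearest(X, query, k)
    return sum(y[i] for i in idx) / k


def knn_classify(X, y, query, k=3):
    """Classification: predict the most common class among the k nearest labeled rows."""
    idx = nearest(X, query, k)
    return Counter(y[i] for i in idx).most_common(1)[0][0]


# A new, unlabeled record, as in the example from the talk.
new_record = (35.0, 35.0, 12.0)
predicted_asperity = knn_regress(X_train, asperity, new_record)
predicted_broken = knn_classify(X_train, broken, new_record)
print(predicted_asperity, predicted_broken)
```

The same features feed both predictions; only the type of label changes. A continuous label gives a regression problem, a zero-or-one label gives a binary classification problem, and a label with more than two distinct classes would make `knn_classify` a multi-class classifier without any code change.
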
If you plot those three dimensions, we can obviously see, without a label, that there are three clusters. That means that the machine has learned, we have learned, something. So, we learned from the data, just by plotting it, that the data is centered around three clusters. Obviously, for high-dimensional data, we cannot simply plot it, but there are [inaudible] which take care of this. So, no worries here. This process is called clustering, in contrast to drawing lines, which is called classification.

Imagine we have more than three dimensions, five in this case. Then we cannot simply plot the data. So, there are two ways of handling that. We can use a clustering algorithm, which we will cover later, or we can use a process called dimension reduction, which we will also cover later. So, dimension reduction basically compresses our dataset to a smaller number of dimensions without losing too much information. That's a very cool topic.

So in summary, just to give you an overview: machine learning contains supervised machine learning and unsupervised machine learning. Supervised machine learning contains regression, where you predict a continuous value, and classification, where you predict a discrete value. In classification, you either predict one of two values, which is called binary classification, or one of more than two values, which is called multi-class classification. On the other hand, in unsupervised machine learning, you have clustering and dimension reduction. But we will cover all of those later in the course.
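
To make the clustering idea above a bit more concrete, here is a minimal sketch of grouping unlabeled records without plotting them. It assumes plain k-means with a simple farthest-first choice of starting centers, which is one common approach and not necessarily the algorithm the course covers later. The points, the choice of `k=3`, and the helper names (`mean`, `farthest_first`, `assign`, `kmeans`) are invented for illustration.

```python
import math

# Made-up, unlabeled readings (max_temp, min_temp, max_vibration); illustrative only.
points = [
    (35.0, 33.0, 12.0), (36.0, 34.0, 13.0), (34.0, 33.0, 11.0),
    (46.0, 35.0, 21.0), (47.0, 36.0, 22.0), (45.0, 35.0, 20.0),
    (60.0, 38.0, 40.0), (61.0, 39.0, 41.0), (59.0, 38.0, 39.0),
]


def mean(group):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(axis) / len(group) for axis in zip(*group))


def farthest_first(data, k):
    """Pick k starting centers: the first point, then repeatedly the point
    farthest from all centers chosen so far."""
    centers = [data[0]]
    while len(centers) < k:
        centers.append(max(data, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def assign(data, centers):
    """For every point, return the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c])) for p in data]


def kmeans(data, k, iterations=10):
    """Plain k-means: alternate between assigning points and recomputing centers."""
    centers = farthest_first(data, k)
    for _ in range(iterations):
        labels = assign(data, centers)
        for c in range(k):
            members = [p for p, a in zip(data, labels) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = mean(members)
    return assign(data, centers), centers


# k=3 is chosen by hand here, mirroring the three clusters visible in the plot.
labels, centers = kmeans(points, k=3)
print(labels)
```

No label column is used anywhere: the cluster indices come only from distances between records. The same code works unchanged for five or more dimensions, which is the situation where plotting is no longer an option.
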