Another index that is useful for selecting interesting concepts in large lattices is concept probability. One problem with stability is that small intents are usually more stable than large intents, simply because small attribute sets are more likely to be closed. The idea behind concept probability is to compute how likely it is that a given attribute set is indeed an intent, that is, closed.

So let's define what we call the naive probability of an attribute set being closed. We have a formal context (G, M, I), and for each attribute m, we denote by p_m the probability of an arbitrary object having the attribute m. Where can we get this probability from? Maybe it's given a priori: perhaps we know from some background domain knowledge that the probability of this attribute is 0.3. If this is not the case, we can estimate it by simply looking at the frequencies of attributes, and we say that p_m is the frequency of attribute m in our context: the number of objects that have attribute m divided by the number of all objects, that is, by the size of G.

And here is where the naivety comes in. By p_B we denote the probability of an object having all attributes from the set B, and we compute it simply by multiplying the individual probabilities of the attributes from B. This is naive because we assume that attributes are independent of each other; only if they are independent is this product a correct definition of the probability for a set B.

So we assume that attributes are independent, and we compute the probability of an attribute set being closed. If this probability turns out to be low, but the set is still closed, it means that it is not by chance: the attributes from this set do indeed have something in common, and they are not independent of each other. That's the intuition behind the notion we're going to define next. By n we denote the size of G.
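The definitions above can be sketched in a few lines of Python. This is a minimal illustration on a small made-up context (not the Zoo data set from the lecture): p_m is estimated as the frequency of attribute m, and the naive p_B is the product of the individual p_m.

```python
# Toy formal context: object -> set of attributes (hypothetical data,
# not the Zoo data set mentioned in the lecture).
context = {
    "wasp":     {"hair", "breathes", "venomous"},
    "honeybee": {"hair", "breathes", "venomous", "domestic"},
    "chicken":  {"feathers", "breathes", "domestic"},
    "dove":     {"feathers", "breathes", "domestic"},
}

n = len(context)  # |G|, the number of objects

def p_attr(m):
    """Estimate p_m as the frequency of attribute m in the context."""
    return sum(m in attrs for attrs in context.values()) / n

def p_set(B):
    """Naive probability of an object having all attributes of B:
    the product of the p_m, assuming attributes are independent."""
    result = 1.0
    for m in B:
        result *= p_attr(m)
    return result

print(p_attr("breathes"))           # 1.0  (all 4 objects breathe)
print(p_set({"hair", "venomous"}))  # 0.5 * 0.5 = 0.25
```

Note that p_set deliberately ignores correlations: "hair" and "venomous" co-occur in exactly the same two objects here, which is precisely the kind of dependence the probability index is designed to expose later.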
And this is how we compute the probability of a set B being closed, that is, the probability that B = B''. It is the sum of the probabilities of the events "B = B'' and the extent B' has size k", where k goes from 0 to n, the size of the object set:

P(B = B'') = Σ_{k=0}^{n} C(n, k) · p_B^k · (1 − p_B)^(n−k) · Π_{m ∉ B} (1 − p_m^k)

So how do we compute the formula under the sum sign? There are C(n, k), "n choose k", ways to choose k objects from n objects. For each such choice, we need the probability that these are precisely the objects that have all attributes from B. First of all, each of the k objects must have all attributes from B, so we have p_B raised to the power k. It is also important that every object outside these k objects lacks at least one attribute from B: 1 − p_B is the probability of an object not having all attributes from B, and we raise it to the power n − k, the number of remaining objects. Finally, for B to be closed, the k objects must not share any attribute outside B. Since p_m^k is the probability of all k objects having attribute m, 1 − p_m^k is the probability that at least one of them does not have m; we multiply these factors over all attributes m outside B. Multiplying all this together and summing over the different values of k gives the probability of B being closed.

The idea here is that if we see a concept with low probability, then it shouldn't be there; but we still observe it. That means there is something about its attributes that really connects them together.
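The formula can be implemented directly. Below is a self-contained sketch on a tiny hypothetical context of two objects, with the attribute probabilities estimated as frequencies; `math.comb` supplies the binomial coefficient C(n, k).

```python
from math import comb

# Toy formal context: object -> set of attributes (hypothetical data).
context = {"g1": {"a"}, "g2": {"a", "b"}}

def p_closed(B, context):
    """Probability that the attribute set B is closed (B = B'') under
    the naive independence assumption:
      P(B = B'') = sum_{k=0}^{n} C(n,k) * p_B^k * (1 - p_B)^(n-k)
                                * prod_{m not in B} (1 - p_m^k)
    with p_m estimated as the frequency of attribute m."""
    objects = list(context)
    M = set().union(*context.values())
    n = len(objects)
    p = {m: sum(m in context[g] for g in objects) / n for m in M}
    pB = 1.0
    for m in B:
        pB *= p[m]
    total = 0.0
    for k in range(n + 1):
        # a fixed choice of k objects all have every attribute of B,
        # while each of the other n - k objects misses some attribute of B
        term = comb(n, k) * pB**k * (1 - pB)**(n - k)
        # ... and the k objects share no attribute outside B
        for m in M - set(B):
            term *= 1 - p[m]**k
        total += term
    return total

print(p_closed({"a"}, context))       # 0.75
print(p_closed({"a", "b"}, context))  # 1.0
```

As a sanity check, the full attribute set {a, b} gets probability 1, which is as expected: the set of all attributes is always closed. For {a}, only the k = 2 term survives (every object has a, so p_B = 1), and it is weighted by the chance 1 − p_b² = 0.75 that the two objects do not also share b.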
So the concept has a low probability under the assumption that attributes are independent, but we observe it anyway. This means that our assumption is probably incorrect: the attributes of this concept are not independent; there is something that makes them connected to each other.

This notion of probability alone is not enough to filter out noisy concepts, but it can be used to spot some interesting concepts, and it can be combined with other criteria to remove noise and to find concepts that are interesting in some sense.

So let's look at our Zoo data set, and at the least probable concepts whose extents have size at least two. Here are the extents of these concepts. Wasp and honeybee: it really makes sense to group them together. Chicken, dove, parakeet: these are birds, which are in a way similar to each other. Sea lion and seal: well, that's almost the same thing. And so on. We can also raise the threshold on the extent size to, let's say, five. Then the least probable concept is going to be this one: pussycat, pony, reindeer, girl, goat, and calf. It's a little bit strange why "girl" is part of the zoo, but for some reason it is in this data set. The intent is hair, toothed, catsize, backbone, domestic, milk, and breathes. This concept is interesting in probably a different sense: it is really strange. First of all, we can see that "girl" is here, and it shouldn't be; so maybe there's some error in our database. And why is reindeer classified as domestic? Well, maybe that's another error. So this concept is certainly interesting, and probability lets us see it.
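The selection procedure just described, keeping concepts whose extent size meets a threshold and ranking them by increasing probability, can be sketched as follows. The concept list and its probability values here are made up for illustration; in practice they would come from enumerating the concepts of a context and scoring each one with the closure probability.

```python
# Hypothetical concepts: (extent, intent, probability of being closed).
concepts = [
    ({"wasp", "honeybee"}, {"venomous", "breathes"}, 0.004),
    ({"chicken", "dove", "parakeet"}, {"feathers", "domestic"}, 0.020),
    ({"sealion", "seal"}, {"aquatic", "milk"}, 0.012),
    ({"pussycat", "pony"}, {"hair", "domestic"}, 0.150),
]

def least_probable(concepts, min_extent=2, top=3):
    """Concepts with extent size >= min_extent, least probable first:
    low probability despite observed closure signals an interesting
    (or suspicious) grouping of attributes."""
    eligible = [c for c in concepts if len(c[0]) >= min_extent]
    return sorted(eligible, key=lambda c: c[2])[:top]

for extent, intent, prob in least_probable(concepts):
    print(sorted(extent), prob)
```

Raising `min_extent`, as done in the lecture with the threshold of five, simply narrows the eligible list before ranking.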