In this lesson, we continue our discussion of descriptive analysis. More specifically, we look at measures of central tendency and dispersion for ordinal data. For ordinal data, the measure of central tendency is the median, and dispersion can be described visually. The median is the value separating the higher half of the data from the lower half.

A perfect example is income level, because this question is relevant not only in marketing but also in the news media. A common question in news media surveys asks you to indicate your level of income. People can choose less than 20,000, between 20,000 and 30,000, and so on. Once you run the survey, you might obtain the following table. You can see that 10% of the respondents say that they earn $20,000 or less, 10% earn $50,000 or more, and the rest are in between. The shares, 10 + 27 + 14 + 39 + 10, sum to 100%.

Now the question is, what is the median income in this data? If we sum the first three shares, 10 + 27 + 14, we obtain 51%. What we can infer from this table is that the median income is somewhere between 30,000 and 40,000, because the cumulative share of respondents first passes 50% in that bracket. And so the median gives you a sense of central tendency.

As with nominal data, you can describe dispersion visually for such data. Here again is a bar chart representing the responses in the data. As you can see, there seems to be some mild dispersion in the responses.

The last type of variable that we introduced earlier is interval data. I'm going to spend more time on interval data throughout this module because it is very amenable to statistical analysis. The measure of central tendency for interval data is the mean. So what is the mean? The mean is the arithmetic average of a set of numbers.
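Going back to the income table for a moment, here is a minimal Python sketch of how you could locate the median bracket from a table of percentages. The shares are the ones quoted in the lesson; the label of the fourth bracket (40,000 to 50,000) and the helper name `median_bracket` are assumptions made for illustration.

```python
# Shares (in percent) per income bracket, as quoted in the lesson.
# The "40,000 to 50,000" label is assumed; the lesson does not name it.
brackets = [
    ("less than 20,000", 10),
    ("20,000 to 30,000", 27),
    ("30,000 to 40,000", 14),
    ("40,000 to 50,000", 39),
    ("50,000 or more", 10),
]


def median_bracket(table):
    """Return (label, cumulative share) of the first bracket whose
    cumulative share reaches half of the total."""
    total = sum(share for _, share in table)
    cumulative = 0
    for label, share in table:
        cumulative += share
        if cumulative * 2 >= total:
            return label, cumulative
    raise ValueError("table is empty")


label, cumulative = median_bracket(brackets)
print(label, cumulative)  # 30,000 to 40,000 51
```

Because income is recorded in brackets, this only tells you which bracket contains the median, not an exact income figure.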
Mathematically, the expression of the mean is as follows: X-bar = (X1 + X2 + ... + Xn) / n. In this formula, X-bar is the mean, Xi is the response given by respondent i, and n is the total number of respondents. So what I do here is sum all the responses from the respondents and then divide this number by the sample size.

The measure of dispersion for interval data is the standard deviation. The standard deviation informs you about how far, on average, individual respondents are from the mean, hence the term Xi - X-bar in its formula. If people are very similar to each other, their Xi's are going to be very close to the average X-bar. If respondents are very different from each other, their Xi's are going to be very different from the mean. That's why it's a measure of dispersion: there's a notion of distance from the mean. If the responses are very close to the mean, it means that people are very homogeneous, and therefore the standard deviation should be low. On the contrary, if respondents are very different from one another, their Xi's are going to be very different from the mean of the sample, and therefore the standard deviation is going to be high.

Let's look at an example. Imagine that you work for Burger King, and you are interested in knowing how people rate you on the variety of food items that you have. This question could be important, for example, if you're thinking about redesigning Burger King's menu. You could ask the same question about Burger King's competitors, let's say McDonald's, and then use it to compare how respondents rate McDonald's versus Burger King, which we'll do later. We have this question: on a scale of one to five, how does Burger King rate on the variety of food items they offer? And you have the data for 10 respondents: 2, 3, 3, 4, 2, 5, 1, 4, 3, 3.
So, to compute the mean, what you need to do is sum all these numbers, which gives 30, and divide by 10, which is the total number of responses. That will give you 3. If you apply the formula of the standard deviation to this data, you'll get about 1.15 (dividing by n - 1; dividing by n gives about 1.10). These formulas are very useful for comparing similar or competing brands, because they give you a sense of how different the brands are in the minds of customers.

Don't worry: more often than not, these formulas, the mean and the standard deviation, are already included in software such as Excel, MATLAB, SPSS, SAS, Stata, Qualtrics, and so on. So you don't have to remember these formulas; what is more important is to understand what they tell you about the data. Again, the mean is a measure of central tendency, and the standard deviation gives you a sense of how much the responses differ from one another.
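If you'd like to check the Burger King numbers yourself, here is a minimal Python sketch using the ten ratings from the example. The lesson does not say whether the standard deviation divides by n or by n - 1, so the sketch computes both; treating the n - 1 (sample) version as the default is an assumption.

```python
import math
import statistics

# Ratings from the 10 respondents in the Burger King example.
ratings = [2, 3, 3, 4, 2, 5, 1, 4, 3, 3]

n = len(ratings)
mean = sum(ratings) / n  # X-bar: sum of responses divided by sample size

# Squared distance of each response from the mean, (Xi - X-bar)^2.
squared_deviations = [(x - mean) ** 2 for x in ratings]

# Sample standard deviation (divides by n - 1); assumed default here.
sample_sd = math.sqrt(sum(squared_deviations) / (n - 1))

# Population standard deviation (divides by n), shown for comparison.
population_sd = math.sqrt(sum(squared_deviations) / n)

print(mean)                     # 3.0
print(round(sample_sd, 2))      # 1.15
print(round(population_sd, 2))  # 1.1

# The standard library gives the same results.
assert math.isclose(sample_sd, statistics.stdev(ratings))
assert math.isclose(population_sd, statistics.pstdev(ratings))
```

Either way, the reading is the same: most ratings sit about one point away from the average rating of 3.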