The next step in data abstraction is to identify what we call attribute types.

So we said, we have a collection of items and collections of attributes.

Now, we have to be able to define what type of attributes we have.

That's very important. So, we define three main types

of attributes: categorical, ordinal, and quantitative.

Let me describe each one.

So, a categorical attribute is an attribute

that contains values that describe categories.

And these categories don't have any particular order.

For example, in a dataset describing people,

hair color could be blonde, brown,

or brunette, and gender could be male or female.

These are categories and there is no inherent order among them.
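
As a rough illustration of what you can do with a categorical attribute, here is a minimal Python sketch. The sample values are made up for this example and are not from the lecture's dataset. It only tests values for equality and counts how often each category occurs, since no ordering or arithmetic is meaningful here.

```python
from collections import Counter

# Hypothetical sample values for a categorical attribute (hair color).
hair_colors = ["blonde", "brown", "brunette", "brown", "blonde", "brown"]

# Meaningful operation 1: counting how often each category occurs.
counts = Counter(hair_colors)
most_common_color, most_common_count = counts.most_common(1)[0]

# Meaningful operation 2: testing whether two values are the same category.
same_color = hair_colors[0] == hair_colors[4]

print(counts)
print(most_common_color, most_common_count, same_color)
```

Note that the sketch never sorts or averages the values: asking whether "blonde" is less than "brown" has no meaning for this attribute.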

The second one is ordinal.

An ordinal attribute is very similar to a categorical one in

the sense that the values of an ordinal attribute are also categories,

but the main difference here is that these categories can be ordered.

Ordering them is meaningful.

For instance, economic status could be low, medium, and high,

or education level can be elementary,

high school, undergraduate, and graduate.

One thing to notice here, though,

is that even though these values can be ordered,

we don't know what the distance between these categories is.

So, it's not really meaningful to perform

arithmetic operations on these categories.

So, we have a collection of categories and the only thing that

we know in addition to categorical data is that,

in this case, these categories can be ordered.

The order is meaningful.
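
One possible way to sketch an ordinal attribute in Python is to keep the categories in an explicit ordered list and compare values by their position. The names `EDUCATION_ORDER`, `RANK`, and `is_higher`, as well as the sample data, are assumptions made for this example. The positions are used only for comparing and sorting, not for arithmetic, because the distance between levels is unknown.

```python
# Education levels listed from lowest to highest (order taken from the lecture).
EDUCATION_ORDER = ["elementary", "high school", "undergraduate", "graduate"]

# Position of each level in the ordering; used only for comparisons.
RANK = {level: position for position, level in enumerate(EDUCATION_ORDER)}


def is_higher(a: str, b: str) -> bool:
    """Return True if level a comes after level b in the ordering."""
    return RANK[a] > RANK[b]


# Hypothetical sample values for five people.
people = ["graduate", "elementary", "undergraduate", "high school", "undergraduate"]

# Sorting and taking the middle value are meaningful for ordinal data.
sorted_levels = sorted(people, key=RANK.__getitem__)
median_level = sorted_levels[len(sorted_levels) // 2]

print(sorted_levels)
print(median_level, is_higher("graduate", "high school"))
```

The sketch deliberately avoids something like an average of the ranks, since that would assume the steps between levels are equally sized.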

The last one is quantitative.

In a quantitative attribute,

the values represent some measured quantity.

For instance, height of a person or weight of a person.

These are all numbers that describe a measurement of something.

In this case, the distance between values is meaningful and it can be computed.

And in general, with quantitative attributes,

you can perform any kind of arithmetic operation among them.
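
To make the contrast with the other two types concrete, here is a small Python sketch of a quantitative attribute. The height values are invented for illustration. Differences and averages are meaningful because the values are measurements.

```python
from statistics import mean

# Hypothetical heights of four people, in centimeters.
heights_cm = [160.0, 172.5, 181.0, 169.5]

# The distance between two values is meaningful and can be computed.
difference = heights_cm[2] - heights_cm[0]

# Arithmetic summaries such as the average are meaningful too.
average = mean(heights_cm)

print(difference, average)
```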

Now, let's go to a few examples.

And I'm going to use a new dataset to show you how to identify attribute types.

In this new dataset,

we have a collection of product sales.

Imagine data coming from a company that keeps track of sales.

You can imagine, for instance,

something like Amazon or eBay or any other online retailer data.

Okay. In this dataset,

every single item is one order and

every column represents attributes of this order.

So here, I'm going to show you examples of

these attributes and tell you what type of attribute each one is.

So the first one is product category.

Product category defines what type of products are included in the order.

So for instance, here, we have office supplies,

technology, furniture, and so on.

This is a perfect example of a categorical attribute.

It's a set of categories, and there's really no inherent order among them.

Next one, order priority,

low, medium, high, and critical.

Again, these are categories,

but these categories can be ordered.

And for this reason,

this attribute is considered ordinal.

The next one is sales.

The attribute sales captures the dollar amount

for that order and is a typical example of a quantitative attribute.

It's measuring a very specific quantity.
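
The three examples above can be sketched as a small typed schema in Python. Everything here is an illustrative assumption rather than the actual dataset: the field names (`product_category`, `order_priority`, `sales`), the `AttributeType` enum, and the three sample orders are all made up. The point is only that each attribute type allows different operations.

```python
from enum import Enum


class AttributeType(Enum):
    CATEGORICAL = "categorical"
    ORDINAL = "ordinal"
    QUANTITATIVE = "quantitative"


# Hypothetical schema: one entry per column of the orders table.
SCHEMA = {
    "product_category": AttributeType.CATEGORICAL,
    "order_priority": AttributeType.ORDINAL,
    "sales": AttributeType.QUANTITATIVE,
}

# Ordering of the ordinal attribute, from lowest to highest priority.
PRIORITY_ORDER = ["low", "medium", "high", "critical"]

# Hypothetical sample orders; each item is one order.
orders = [
    {"product_category": "office supplies", "order_priority": "high", "sales": 120.5},
    {"product_category": "technology", "order_priority": "low", "sales": 899.0},
    {"product_category": "furniture", "order_priority": "critical", "sales": 45.25},
]

# Categorical: collect the distinct categories (no ordering implied).
categories = {order["product_category"] for order in orders}

# Ordinal: sort orders by priority using the declared ordering.
by_priority = sorted(orders, key=lambda o: PRIORITY_ORDER.index(o["order_priority"]))

# Quantitative: arithmetic such as a sum is meaningful.
total_sales = sum(order["sales"] for order in orders)

print(categories)
print([order["order_priority"] for order in by_priority])
print(total_sales)
```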