[MUSIC] We know that tabular data is two-dimensional data, data that has rows and columns. And there are lots of different ways that we can store such data in Python, and in this video we're going to take a look at one of them. We're going to look at how we can use nested lists to store tabular data. That is, we're going to have lists inside of lists, right? So, let's take a look.

Now, I really want to emphasize that there's no one correct way of storing tabular data in Python. In fact, there's not even one correct way of storing tabular data as lists of lists, okay? There are some choices that need to be made. So I don't want you to take this video as "this is how you absolutely have to store your tabular data", but rather, "this is one way you can store your tabular data", right? And you need to think about the data that you have and its properties, and how you might want to go about storing it in your programs.

So, let's look at a table here. I have a table of the popularity of different programming languages over the years. And I've called that table popularity, and you can see that popularity is a list. You can see the open and close brackets enclosing the contents of this list. Now each element in my popularity list is a list itself, okay? So the overall list is going to be a list of rows in the table, all right? And each inner list is a single row in the table, right? And I've also made a choice here that the first row in the table is actually the headers. So you can see, here, it says language and then a bunch of different years, those are the different columns in the table.

All right, so let's look at a particular programming language instead. The first row after the headers is the Java programming language. Okay, so the first element, the one that appears in the first column of that row, is the string Java. And then I have a bunch of numbers, and these are the ranks of the Java programming language in each of the years for the different columns.
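The layout described so far can be sketched roughly as below. This is a cut-down, illustrative version of the table, not the one shown on screen: it keeps only the 1997 and 1987 columns. The Java ranks and the 1997 ranks for Python and PHP are the ones quoted in the video; the 1987 entries for Python and PHP are an assumption (0, the "no data" marker, since neither language had been released by then).

```python
# Cut-down sketch of the popularity table as a list of rows.
# The first inner list is the header row; every other inner list is one
# language's row: its name (a string) followed by its rank in each year.
popularity = [
    ["Language", 1997, 1987],  # header row
    ["Java", 15, 0],           # ranks quoted in the video; 0 = no ranking
    ["Python", 27, 0],         # 1987 value is an assumption (see lead-in)
    ["PHP", 0, 0],             # 1987 value is an assumption (see lead-in)
]

headers = popularity[0]         # the header row
first_data_row = popularity[1]  # the Java row

print(headers)
print(first_data_row[0])  # the language name, a string
print(first_data_row[1])  # Java's rank in 1997
```

Because the header row sits at index 0, the data rows start at index 1.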
So you can see in 2017, Java is the number one programming language, meaning it's the most popular in that year, right? And if we go back to 1997, it's the 15th most popular language. Now there are 0s in this table in certain locations. And those actually indicate that either the language was unranked that year, or it didn't even exist, all right? And so in Java's case, in 1987, Java did not exist as a programming language, so I've got to make some choices here. How would I indicate that it doesn't have a rank? In this case I chose to just use zero. So a zero here means there is no ranking for that language in that year. This is one of the kinds of choices you would need to make: how are you going to indicate the absence of data in your table?

Notice also that my rows have both strings and numbers in them. So the inner lists are not consistent. They don't have things of all the same type. All right, maybe you want to do that. Frequently, however, you want to make them all be the same type, and maybe they should all be strings, at least for the headers. But the data's actually different, so I made the choice to have different types inside the list. This is dangerous. If you recall, when I told you about lists, I said not to do this, so be careful about this particular choice. But if you look, I have the programming language C: the name of the language is inherently a string. And the ranking, well, that's really a number, so I want it to be a number.

Now, how would I go about printing out this table in a way such that someone using this program could actually read it nicely? Obviously inside the program here, the way it's structured, it's kind of hard to figure out what the popularity of PHP was in 2002, right? It's going to take some effort to figure that out. Well, I'm going to use a format string here.
And if you've forgotten what format strings are and how they work, you should think about what you've learned in our previous courses, or you can go look it up in the Python documentation.

All right, and so what I'm going to do now is actually print the table out. The first thing I'm going to do is recognize that the first row of the table is actually the headers, all right? So we take that out and we say here's the headers. And for the header row, I'm going to create a string here using that format string. And you'll notice that I pass *headers to format. What is that? Well, if you remember how the format method works, it takes a bunch of arguments, and each argument goes into one of the fields in the format string. But headers is actually a list, so that's not going to work nicely. Well, in Python, if you have a sequence and put a star in front of it in a function call, it expands it out into its constituent elements. So *headers is just a way to use a list where I actually needed individual arguments. It'll take my headers list and turn it into eight individual arguments that it can then pass to the format method of my format string. This creates a new string, which I'm going to call header_row. I can print it out, and then I can take its length to put a nice line beneath it.

Then I'm going to iterate over the subsequent rows, okay? So I'm going to use list slicing to skip the header row in my iteration, and again I'm going to use the format_string. And again I'm going to use the star operator here to expand each row list into its constituent elements, so I can pass them to format. And let's take a look at what happens when we do this. Okay, now my table looks much nicer. You can print it out and you see that hey, I do have my two-dimensional tabular data stored in lists of lists. And it is possible to print it out nicely and get a nice looking table here.
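The printing steps just described might look something like the following. This is only a sketch: the format string used in the video is not shown in the transcript, so the field widths here are assumptions, and the table is a cut-down, illustrative one with three columns instead of eight (the 1987 entries for Python and PHP are assumed to be 0 because neither language existed yet).

```python
# Cut-down, illustrative table: header row first, then one row per language.
popularity = [
    ["Language", 1997, 1987],
    ["Java", 15, 0],
    ["Python", 27, 0],  # 1987 value is an assumption
    ["PHP", 0, 0],      # 1987 value is an assumption
]

# One field per column (widths are assumptions): the language name is
# left-aligned in 10 characters, each rank is right-aligned in 6.
format_string = "{:<10}{:>6}{:>6}"

# *headers expands the list into separate arguments for format().
headers = popularity[0]
header_row = format_string.format(*headers)
print(header_row)
print("-" * len(header_row))  # a line as long as the header row

# Slice off the header row and format every remaining row the same way.
for row in popularity[1:]:
    print(format_string.format(*row))
```

The number of fields in the format string has to match the number of elements in each row, which is one more reason to keep every row the same length.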
And now I can see, yes, back in 1997, PHP, its ranking was 0, meaning it was unranked.

Now it's important to be able to print out my tabular data, but in general I probably actually want to use the tabular data inside my program. And so I want to be able to access the different elements of the table, all right? It is a list of lists, so I can just index each list. So if I want to find Python's popularity in 1997, well, I have the popularity list. I know that Python is in row 5 of the table, counting from zero. So I can do popularity[5]. And then I want to find out what was going on in 1997. So I figure out that's column 5, again counting from zero. I got back a list when I did popularity[5], which was the row. I index into that list with another [5], which gives me the element at index 5 of that row, and this should tell me what Python's popularity was in 1997. It was ranked 27th.

Well, hopefully you immediately see the problem here, right? How did I figure out to use five and five? Well, I actually had to look at the data myself, as a human, not the program. I'm looking at it and I'm counting zero, one, two, three, four, five to find the row and column indices. This is not really viable, okay [LAUGH]? Unless you somehow knew ahead of time in your program that you wanted row 5 and column 5, you're not really going to be able to use your tabular data very effectively in this way.

All right, so how can we do better? Well, we can exploit the fact that my rows and columns actually have headers, right? I have a header row up at the top, and the first column of each row is the name of the language, so that's effectively a header for each row. Okay, so I can write functions here that take a table and a column name, or take a table and a row name, and return the index of that column or that row. And so if I use these functions, I can find the index for 1997, right? I can find which column is 1997 by calling find_col(popularity, 1997).
I can find the index of the Python row by calling find_row(popularity, "Python"), okay? And then I can use those indices to figure out the popularity of Python in 1997. Let's do that and make sure it works. Okay, I still get 27, which if you look at the table is correct. And now I can programmatically find information about any programming language from any year simply by using those functions. Now this only works, all right, because my table is formatted in a very particular way. So if you're going to do this, you need to make sure all your data looks consistent. And then you can write functions that use these header rows and header columns to figure out where anything is inside of your table.

As I've said, there is no one correct way to store tabular data in Python. In this video, we saw one way to use lists of lists to store our two-dimensional tabular data. And I made choices here, right? My data had a header row and a header column, so I was able to store those, and that allowed me to more easily access data within the table. I also made another choice where I stored both strings and integers within the table. Remember, that could be a little bit dangerous. So you need to make sure that your data is consistent and that you're able to use it properly if you do something like that, okay?

I encourage you strongly to study the code that was used in this video. Make sure that you understand it, so that when you have to make the decision you can say, yes, this makes sense, and here's how I'm going to access data in the table. I also don't want you to walk away thinking, hey, Python is really not a popular language, it was 27th. Notice that it was rising [LAUGH] in the rankings and it's actually pretty high up there now, right?
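For study purposes, here is one possible way to write the two lookup helpers. The names find_col and find_row and their (table, name) arguments come from the video, but the bodies are not shown in the transcript, so what follows is a guess at a reasonable implementation. Raising ValueError for a missing name is an assumption, and the table is a cut-down, illustrative one (the 1987 entries for Python and PHP are assumed to be 0 because neither language existed yet).

```python
# Cut-down, illustrative table: header row first, then one row per language.
popularity = [
    ["Language", 1997, 1987],
    ["Java", 15, 0],
    ["Python", 27, 0],  # 1987 value is an assumption
    ["PHP", 0, 0],      # 1987 value is an assumption
]


def find_col(table, col_name):
    """Return the index of the column whose header equals col_name.

    Relies on the header row being the first row of the table.
    list.index raises ValueError if the header is not present.
    """
    return table[0].index(col_name)


def find_row(table, row_name):
    """Return the index of the row whose first element equals row_name.

    Relies on the first column of every row acting as the row's header.
    Raises ValueError if no row matches (an assumed design choice).
    """
    for idx, row in enumerate(table):
        if row[0] == row_name:
            return idx
    raise ValueError("no row named {!r}".format(row_name))


col = find_col(popularity, 1997)
row = find_row(popularity, "Python")
print(popularity[row][col])  # Python's rank in 1997
```

In this cut-down table the indices come out as row 2 and column 1 rather than 5 and 5, but the program never needs to know that, which is the point of looking positions up by header instead of counting by hand.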