>> As I suggested at the end of the last video,
there must be a better way of dealing with CSV files than trying to write
our own complicated code to deal with all the possible corner cases.
As I've said, CSV files are very widely
used by a variety of different programs for storing tabular data.
Well, Python is going to help us out here.
Python has a CSV module designed to help us read and write CSV files.
So in this video, I'm going to introduce you to this CSV module,
and we'll see a little bit of what it can do.
So, we know how to import modules in Python.
We have the import statement.
So up at the top of my file here,
I'm going to import csv.
So that imports the CSV module for me,
and now I can make use of it.
I'm going to write a parse function similar to what we saw in the last video,
except I'm going to use the CSV module now.
So again, I'm going to take a CSV filename.
This is the name of the file that contains the data which we want to read.
And again, I'm going to parse it as a list of lists.
So I start out initializing my variable table to be an empty list.
And then, I open my CSV file.
Now, things get a little different.
I call a function called reader.
So I have csvreader = csv.reader here.
And that takes an open file as an argument.
And then, there's a whole list of options that you can also pass.
Here, I'm just going to pass one,
to show you that you can do so. I have skipinitialspace=True.
And what that's telling the csvreader is that,
"If there are spaces at the beginning of a field,
just throw them away.
I don't care about them."
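To make the effect of skipinitialspace concrete, here is a small sketch that reads the same row with and without the option. The row text is made up for illustration, and io.StringIO stands in for an open file.

```python
import csv
import io

# Hypothetical row with a space after each comma (not taken from the course files).
text = "Houston, 62, 65\n"

# Without the option, the leading spaces stay in the fields.
default_row = next(csv.reader(io.StringIO(text)))

# With skipinitialspace=True, spaces right after each comma are dropped.
trimmed_row = next(csv.reader(io.StringIO(text), skipinitialspace=True))

print(default_row)
print(trimmed_row)
```

Only whitespace immediately after the delimiter is affected; spaces elsewhere in a field are left alone.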
Okay, so now I have this csvreader object,
I can iterate over it,
and that gives me the rows of the CSV file.
And in fact, it gives me each one as a list. All right.
So all I have to do here,
is loop over the rows in my csvreader,
append it to my table, and I'm done.
Hey, that's a lot easier.
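The function just described might look roughly like the sketch below. The names parse, table, and csvreader follow the narration; the newline="" argument to open is an addition that the csv documentation recommends, not something mentioned in the video.

```python
import csv

def parse(csvfilename):
    """Read the named CSV file and return its contents as a list of lists."""
    table = []
    # newline="" lets the csv module handle line endings itself.
    with open(csvfilename, "r", newline="") as csvfile:
        # Drop spaces that appear directly after a comma.
        csvreader = csv.reader(csvfile, skipinitialspace=True)
        for row in csvreader:
            # Each row arrives as a list of strings, one per field.
            table.append(row)
    return table
```

Every field comes back as a string, so numeric columns would still need converting with int or float if you want to do arithmetic on them.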
Let's step back for a second.
I want to make sure that you internalize what's actually happening here.
What does it mean when I call csv.reader?
Well, remember that CSV is a Python module.
I imported it. I have import csv.
So when I have csv.something,
it now means access something from within the CSV module.
So reader is a function inside the CSV module.
So when I call csv.reader,
that means call the reader function of the CSV module.
So reader is this function inside the CSV module that creates one of
these csvreader objects for us that we can then
iterate over to get the rows in our CSV file.
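As a small illustration of that module.function access, the sketch below calls csv.reader on in-memory text and then iterates over the object it returns. The sample data and the use of io.StringIO in place of an open file are assumptions made for the example.

```python
import csv
import io

# Hypothetical in-memory data standing in for an open CSV file.
csvfile = io.StringIO("a,b,c\n1,2,3\n")

# csv.reader means: look up the name "reader" inside the csv module, then call it.
csvreader = csv.reader(csvfile)

# Iterating over the reader object yields one list of strings per row.
rows = []
for row in csvreader:
    rows.append(row)

print(rows)
```

The reader object is an iterator, so once the loop has consumed it, looping over it a second time produces no further rows.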
In order to see how well this works,
we're going to want to print out the table that we get.
So we have the print table function which is
exactly the same as it was in the last video.
It takes in a list of lists,
and prints it out as a nicely formatted two dimensional table.
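The print table function from the previous video is not reproduced here, so the following is only one possible version. The helper name format_table and the two-space column gap are choices made for this sketch, and it assumes every row has the same number of fields.

```python
def format_table(table):
    """Return the rows of a list of lists as left-aligned, padded text lines."""
    if not table:
        return []
    # Assumption: every row has as many fields as the first row.
    num_cols = len(table[0])
    # Each column is as wide as its longest entry.
    widths = [max(len(str(row[col])) for row in table) for col in range(num_cols)]
    lines = []
    for row in table:
        cells = [str(field).ljust(width) for field, width in zip(row, widths)]
        # Join columns with two spaces and drop padding at the end of the line.
        lines.append("  ".join(cells).rstrip())
    return lines

def print_table(table):
    """Print a list of lists as a two-dimensional table."""
    for line in format_table(table):
        print(line)

print_table([["City", "Jan"], ["Houston", "62"]])
```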
So here, I have my table created by calling parse with hightemp.csv.
Again, exactly the same CSV file we used in the previous video,
and I'm going to print it out.
And let's see what happens.
Okay, that looks nice.
All right, it printed things out nicely,
but that's not that impressive.
We didn't have to write code that was very complicated to accomplish this.
I was simply splitting the lines on commas.
Where the CSV module is really going to shine,
is when we have CSV files that are not sort
of perfectly formatted like that hightemp.csv file.
Remember, when we tried to parse hightemp2.csv,
well, things weren't so pretty.
Now, let's see how the CSV module in Python does.
Oh, wow. That looks almost exactly the same, right?
It has the addition of the country names with the cities,
but that's because they were in the file.
You'll notice that the quotes are not actually appearing in
the table because the quotes had to do with the CSV file.
They were simply telling you,
"This is what is in this particular field," or, "this particular column."
And so, the quotes are gone when we print things out.
Everything is lined up nicely because
the csvreader understood what the different fields were.
It got rid of the extra spaces because I had skipinitialspace set to
True, and everything looks exactly the same as it did in the previous table,
except for the things that were actually different in the file which were the city names.
I just want to remind you what hightemp2.csv looks like.
And I encourage you to take a look at this file and
think about the code that we wrote to manually
parse the CSV file and then the code that we wrote using the CSV module.
And hopefully, you can now start to appreciate the fact that you don't have to
write a lot of complicated code to make things work out nicely here.
And this is only scratching the surface of what you can do with the CSV module.
Notice here, we have extra spaces,
notice that we have quotes around the column names in the header row.
So city is in quotes for example.
Remember, like I said,
the cities now have their countries with them,
separated by a comma.
So we need to be able to differentiate between that comma which is inside
the quoted column value and the comma
that is actually separating the fields inside of the row.
The CSV module handled all this seamlessly for us,
so that we didn't have to worry about it.
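To see the difference between the two kinds of comma, the sketch below parses one messy line both ways: by splitting on commas, and with csv.reader. The line is invented in the style just described and is not the actual contents of hightemp2.csv.

```python
import csv
import io

# Hypothetical line: a quoted city with an embedded comma, plus extra spaces.
line = '"Houston, USA",   62,  65\n'

# Splitting on every comma breaks the quoted city into two pieces
# and leaves the quotes and leading spaces in place.
naive = line.strip().split(",")

# csv.reader treats the comma inside the quotes as part of the field,
# removes the quotes, and (with skipinitialspace) drops the leading spaces.
parsed = next(csv.reader(io.StringIO(line), skipinitialspace=True))

print(naive)
print(parsed)
```

The split version produces four fields for what should be a three-column row, which is the kind of misalignment the manual parser ran into.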
Now you've seen that Python has a nice CSV module that will help us read CSV files.
It'll actually also help us write CSV files but we'll get to that.
Now, the real power of the CSV module,
is that it handles messy CSV files gracefully.
As we've seen, it's not too difficult to write
our own parser for a CSV file that's formatted very nicely.
But once things get a little messy,
well, then it gets a little bit harder.
And so, this helps you deal with the CSV files
like the ones I wrote perhaps, hightemp2.csv.
Maybe there are other people like me that write things messily like that.
And if so, the csvreader is really going to come to our rescue and clean that up
for us so that we can get nice tabular data into our Python programs.
Now, the csvreader also takes a variety of other options that give you
even more flexibility for dealing with CSV files that are formatted in different ways.
And we'll explore that in future videos.