So here's the dilemma: you have quite a few tables that you need to bring together. This is the syntax for the UNION. You select all the fields you want from one of these tables, write UNION DISTINCT, then another table, UNION DISTINCT, another table, and so on. If you have more than 10 tables, your fingers will get tired of typing all of them out over time. So I don't really want to type out all those UNIONs; it makes the query extremely long to read. There must be a better way.

And now we introduce the concept of a table wildcard. This is pretty cool. Much like using that LIKE operator, when we looked at finding the charities that had "help" in the name, you can actually operate over the table names in your FROM statement and do a wildcard match, effectively an implicit UNION, through the use of this asterisk here.

So let's cover the SQL syntax there on the right. Same thing: you're selecting the fields you want from all the tables. Then you're adding the FROM clause with the dataset there, and for that table, you see it begins with the prefix gsod and then an asterisk. Any tables in that dataset that match the prefix are going to be included. And because the dataset is very rigid and structured, with the four-digit year appended to each table name, it's going to bring together all the historical weather data tables from 1929 to the current year. You're talking about a lot of data, millions and millions of records, with just one single line of code there.

All right, so what happens if you want to filter down to just a subset of the tables? Say you just wanted to find and match together all the temperatures from 1950 or after. Here's another reserved keyword that's specific to BigQuery: you can use _TABLE_SUFFIX to grab whatever you're actually matching on in that wildcard, and then use that suffix in normal SQL operations. So here you can say, all right, match everything, but only include the tables after the year 1950. And, of course, you generally want to make that prefix as long as you can for performance reasons, since that limits how many tables you're actually matching on.

So _TABLE_SUFFIX and the wildcard are two very useful concepts, and that's the subject of our next key message. If you have many, many UNIONs across datasets that have tables with structured, standardized names, make use of that table wildcard. And what's also really nice is _TABLE_SUFFIX: you can use it to filter which tables are actually included. And in addition to filtering in your WHERE clause, you can actually return the table name in your SELECT statement just by calling that _TABLE_SUFFIX.

So let me hop over to a quick BigQuery demo, and we'll show you exactly that. Here we are in our demo. As I mentioned, the public dataset for weather is extremely long. Look at this: it goes from 1929 all the way to the current year, and the scroll bar is real on this one, guys. So I can execute this query using the table wildcard, and that's fantastic. But say I only wanted the temperature data after the year 1999. What I can do is add WHERE _TABLE_SUFFIX is greater than 1999, and let's just do a LIMIT 10. I run that, and, interesting, right?
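To make this concrete, here's a sketch of roughly what the demo query ends up looking like, assuming the public NOAA GSOD dataset; the selected fields are illustrative, and the quoted suffix value is the fix the demo arrives at in just a moment:

```sql
-- A sketch of the demo query over the public NOAA GSOD dataset;
-- the selected fields are illustrative.
SELECT
  stn,   -- station identifier
  year,
  mo,
  da,
  temp   -- mean temperature for the day
FROM
  `bigquery-public-data.noaa_gsod.gsod*`  -- wildcard over gsod1929..current
WHERE
  _TABLE_SUFFIX > '1999'  -- the suffix is a STRING, so the value is quoted
LIMIT 10;
```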
So what this is telling me is that the suffix is actually stored as a string. So we're going to encapsulate that value and quote it as '1999', and then this matches anything that occurs after 1999. And you can see the first row I pulled just happened to be from 2012.

Now, what happens if I wanted to do an interesting data quality exercise? You're assuming that the year stored on each record, the year that's given to you in the data, matches the year of the table it's stored in. One of the things we mentioned is that you can actually store and recall the table suffix itself, so let me give it a separate name as an alias, and let's take a look at what that returns. As you might hopefully expect, the year being stored with each temperature reading matches the suffix of the table as it's actually stored in this public dataset. And that's great news.

So, to apply that check across all the tables we're looking at, we're going to remove the filter. We want to see if there are any cases where there was a human error, or a temperature reading recorded on, say, New Year's Eve that somehow got included in the next year. So what we can do is add a condition where the table suffix is not equal to the year stored in that record. Hopefully, at the end of this, again seeing what we're doing here, we're finding rows where the table itself does not match the year of the temperature readings. If all goes well, this should actually return zero results.

But we're operating across all the tables here, so let's see how much data this is actually going to process. We have four gigabytes of data, and for every record it's comparing the stored year against the table suffix, so it's doing a lot of iterative operations over the data. And after 32 seconds, the query is finished. Thankfully, our data quality check passes: the query returned zero results. It's also giving me a notification option, basically saying, "Hey, if you don't want to wait around for those 30 seconds, we can actually notify you as well."

But again, keep in mind what we're doing. We're doing a massive UNION of a whole bunch of these tables here on the left, and then applying this filter operation, so it has to go through every single record. Now, technically you wouldn't need the LIMIT, but in case there were more than 10 results, I only wanted to see those top 10 anomalies. Lucky for us, it returned zero.

So, key concepts to review here: the wildcard UNION; _TABLE_SUFFIX, which lets you access whatever is actually being matched by that asterisk, that wildcard; and the fact that you can actually use _TABLE_SUFFIX as part of your SELECT statement, which is pretty cool, treating it just like you would any other field. You can use it in concatenation, or it could be part of your composite key. It's very fun to play around with.
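Pulling that demo together, here's a sketch of the data quality query, again assuming the public NOAA GSOD dataset, where the year field happens to be stored as a string (otherwise you'd need a CAST); the alias table_year is mine, not from the demo:

```sql
-- Data quality check: find readings filed under a table whose year
-- doesn't match the year stored on the record itself.
SELECT
  _TABLE_SUFFIX AS table_year,  -- the year the wildcard matched
  year,                         -- the year stored on the record
  temp
FROM
  `bigquery-public-data.noaa_gsod.gsod*`
WHERE
  _TABLE_SUFFIX != year  -- both are strings in this dataset
LIMIT 10;  -- zero rows returned means the check passes
```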
Let's hop back into the lecture slides. So here are some pitfalls when it comes to doing UNIONs inside of SQL. Keep in mind the duplicate records piece we mentioned before: when you want to remove duplicate records as you're mashing multiple tables together, you'll be using UNION DISTINCT, and when you want to keep all the records, no matter if there are duplicates across the multiple tables, you'll be using UNION ALL. And it does require that you choose either the DISTINCT or the ALL option.

Now, as you might imagine, you're hoping that in the way these temperature recordings were captured over time, the field names haven't changed, and the number of required fields hasn't changed either. Because if they have, when you're mashing these tables together vertically and the tables don't have the same number of columns, you're going to get a mismatch and an error in your UNION. So it's very good to do a preprocessing exercise to make sure everything lines up, because unlike what we saw in Cloud Dataprep, where you can have fields that are present in one dataset but not in the other, in a true SQL UNION those tables need to match up exactly in the count of columns. Now, technically you can have different column names and match them together based on what's called the index, the position where each column appears; but generally, what I like to do just for my own sanity is make sure the names match as well as the count of columns.

So, to recap: we have joined, UNION'd rather, all that historical data from the past all the way to the current year, and that's a lot of temperature readings. But we still don't know where those temperature readings came from. And that's where we're actually going to join this massive amount of consolidated temperature recordings with another dataset, which is just a single table with the station information.
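Before moving on, here's a minimal sketch of the manual UNION the wildcard saves you from, with hypothetical table names; it illustrates the pitfalls above, since BigQuery matches the inputs by position and requires the same column count with compatible types in each:

```sql
-- Hypothetical yearly tables; columns are matched by position, so each
-- input needs the same number of columns with compatible types.
SELECT year, mo, da, temp
FROM `my-project.weather.gsod1950`

UNION ALL  -- keep every record; UNION DISTINCT would drop duplicates

SELECT year, mo, da, temp
FROM `my-project.weather.gsod1951`;
```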