Data scientists often have to integrate and harmonize data from a variety of sources, and the data that you're integrating, is it always formatted in the same way? This can obviously cause problems when you're trying to blend this data together. Knowing a few tricks and how to change the fields you're looking at, whether it's a string or date, is very useful to know as you start to learn to combine data from multiple sources together. Over the next few lessons, we're going to go through a few ways that you can start to manipulate the data in order to blend it to make it work for your analysis. In this video, specifically, we're going to talk about things you can do with text strings. At the end of this lesson, you'll be able to concatenate or combine text strings together, trim text strings, use the substring function and change the case of your strings. All right. So let's begin. Strings are really important because we're going to encounter them a lot in text and categorical data. Strings pop up everywhere and are pieces of data we frequently work with. It's important, as we've discussed before, to retrieve the data in the format you need. So if there's something that you can process and do on the server versus your client application you're working with, then it's best to go ahead and do it there on the server. SQL supports a couple of different string functions. It can concatenate, which is a fancy data science word for linking things together. It can use substrings or SQL can trim the beginning and end of the string. So, let's discuss a few examples to learn these various functions. Let's go over concatenating something first. This function is really helpful when you're coming up with a unique ID for something. Maybe you want to concatenate the first couple letters of a person's first name and the first couple letters of the last name, or maybe you just want a unique category or field. To do this, we're going to do is use the pipe or vertical bar key. On most keyboards in the US, this is typically right above the backslash key and is usually accessible by holding Shift when hitting this key. In this example, I wanted to concatenate the company name and the contact name. What I've done is, I've used my select statement and I have pulled apart the company name and added to it the contact name. You can see, I called the company name here. Then I used the pipe to indicate that I want to concatenate, and then in parentheses, I indicate what I want the company name to concatenate with. In this case, it's contact name. Here, in this example, you can see I pulled in the company name and the contact name just so you can see them individually. Then you can see on this side the ultimate result from the concatenation. It's important to know that the different relational database management systems use different formatting than this. SQL server, for example, uses the plus sign instead of a pipe. Just be sure to look this up based on the type of application you're using. The next thing we're going to look at is trimming our strings. With this function, you can either trim everything off the front and the back, or you can just trim it from the right or left. To do this, we're going to use the simple function called trim. We also have Rtrim and Ltrim, for right trim and left trim respectively. Here, you can see my string is the thing that says, 'you the best' and there's trailing spaces before and after that. When I put it in parentheses and I have called the trim function before it, we can see that the end result takes care of all of the trailing spaces. This is just a really easy way to clean up your data and that will save you a lot of hassle in the long run. Finally, we're going to work with the substring. Substring is a useful function that allows you to pull apart just a portion of the string that you're looking at. Again, we talked about creating multiple unique IDs. Or maybe you want to combine the first couple of letters of somebody's name with the last few, or you're just trying to shorten a name because you only have so much space for it. If you use a function substr, it's going to return a specific number of characters for a particular position in the string. You can designate where you want it to start and where you want it to end. How this works is, you write out the function, substr. The first thing you're going to give this function is the string name. So where do you want it to pull it from? And then you want to say where is the string position. So that's what's the string starting position. Then what are the number of characters that you want returned? In this example, I'm going to look at the first name, then I'm going to have my substring start at the third character and I wanted to pull four characters starting there. As you can see, in the first example with Nancy, there aren't four characters. It just gives me whatever it can fill in with it. Then you can see where it starts. It didn't give me the N or the A. It started with the second N and then gave me this, NCY. The name, Andrew, is a little bit better example because I can see all four characters that I'm getting. But then if you go down to number nine, you can see that with N, I'm only going to be getting two characters. Another example of this is, again, if we wanted to just pull the first three characters of a person's name, say, I want it from first name and I'm going to start with the first character and go all the way through the third character. Finally, let's wrap up this lesson with one more quick way to clean up your data. Sometimes strings need to have their case changed. You get into this information with the free form fields or people's names. What happens is, some people use all caps or some people will capitalize their first name. It gets messy. Just to standardize your data and make it so that you can do comparisons or blend it with another field, it's really helpful to change everything to either upper or lower case. For this, again, it's just a simple function. You're going to say upper and then the column name that you want to change to uppercase. An alternative function that does the same thing is UCASE. Then, to convert strings to lowercase, you can use the function, LOWER. Okay, we've covered a lot in this lesson, so let's stop here. You should now know how to concatenate and trim strings, use the substring function to get a portion of a string and how to change a string's case when necessary. In our next few lessons, we're going to continue our discussion of data manipulation and we're going to move on to talk about day and time strings.