Welcome back. In this demo, we are going to look at compute autoscaling. Autoscaling is also referred to as horizontal scaling, where you scale out and scale in your compute instances to match the traffic these instances are getting. How does it work in practice? Well, in a previous demo, we had a setup where we had a couple of web servers running behind a load balancer for high availability. Now, it's very difficult to provision capacity like this because you don't know in advance how many web servers you would need. What we do for practical purposes is replace these individual web servers with an autoscaled pool of web servers. What I mean here is, as you can see from the icon, the number of instances can scale out and scale in, meaning go up or go down depending on the traffic you are getting. This happens automatically, so you don't have to manually scale these out or in. The way it works is, first you create something called an instance configuration. It's nothing but a template that defines the settings to use when creating these compute instances. Then you create something called an instance pool. You use instance pools to manage multiple compute instances as a group. Finally, you create something called an autoscaling configuration. You use an autoscaling configuration to automatically adjust the number of compute instances in an instance pool. This helps you provide consistent performance when demand is high and reduce your cost when demand is low, and it all happens automatically. Let's look at this in action. I'm logged in to my Oracle Cloud account. To bring up instance configurations, instance pools, and autoscaling configurations, click on Compute from the navigation menu. You will see all these features listed under Compute here. The first thing we are going to do is create an instance configuration.
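The same three-step flow can also be driven from the OCI CLI instead of the console. This is only a rough sketch: the `file://` payloads are placeholders, `$COMPARTMENT_OCID` and `$INSTANCE_CONFIG_OCID` are assumed environment variables, and the exact flag names should be double-checked against each command's `-h` output before use.

```shell
# Step 1: instance configuration -- a template for launching instances.
# Launch details (image, shape, subnet, SSH key) go in a JSON file.
oci compute-management instance-configuration create \
    --compartment-id "$COMPARTMENT_OCID" \
    --instance-details file://instance-details.json

# Step 2: instance pool -- manage multiple instances as a group.
oci compute-management instance-pool create \
    --compartment-id "$COMPARTMENT_OCID" \
    --instance-configuration-id "$INSTANCE_CONFIG_OCID" \
    --placement-configurations file://placement.json \
    --size 2

# Step 3: autoscaling configuration -- grow and shrink the pool
# automatically based on a policy.
oci autoscaling configuration create \
    --compartment-id "$COMPARTMENT_OCID" \
    --policies file://policies.json \
    --resource file://resource.json
```

These calls require a configured tenancy and credentials, so treat them as an outline of the flow rather than something to paste verbatim.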
Click on Instance Configurations and this will bring up the landing page. Click on Create Instance Configuration here. Provide a name; we will say this is instance-configuration-demo. We are creating this in a sandbox compartment, and that is fine. Placement is okay; for right now, I'm using a specific AD. Then for the image, instead of Oracle Linux, I want to use an Ubuntu image, so I'll select that. Then for the shape, let me just change to an Intel flex shape. That's fine; one CPU and eight GB of memory is fine, so I select the shape here. Then down below, you can see that I have already created a network called autoscaling VCN, and I've added a public subnet called autoscaling subnet in here. Because it's a public subnet, you can see I'm assigning a public IPv4 address. This is all good for me. As far as the SSH keys are concerned, let me just bring up our Cloud Shell. In a previous demo, we created an SSH private and public key pair; let's use that for this demo as well. These are the keys we generated, so I'll copy the public key and paste it here. Scrolling down, everything else looks good and I'll click "Create" here. What this will do is create an instance configuration, a template which defines what settings we are going to use when creating more compute instances. I could also have done this using an already existing instance; taking the configuration from a running instance as a template would have worked just as well. With the instance configuration completed, let's go ahead and do the next step, which is creating an instance pool. I'll click on Instance Pools from the navigation on the left-hand side and click on "Create instance pool". Give it a name here; let's say instance-pool-demo. Sandbox compartment is fine. Then it's asking to pick an instance configuration, the one we just created earlier.
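The demo reuses a key pair generated in an earlier demo. If you need to create a fresh pair in Cloud Shell instead, a minimal sketch (the file name `demo_key` is my choice, not from the demo):

```shell
# Generate an RSA key pair with no passphrase; the public half
# (demo_key.pub) is what gets pasted into the instance configuration.
ssh-keygen -t rsa -b 2048 -f demo_key -N "" -q

# Print the public key so it can be copied into the console form.
cat demo_key.pub
```

Whatever private key you use here is the one you'll need later when SSHing into the pool's instances.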
Then it's asking what's the default number of instances you want to be provisioned. Let's start with two instances. If you scroll down here, you can see all the details, like the operating system I'm running is Ubuntu, the particular shape I'm running, and so on. You can get all the details here. With two instances, I'm going to click "Next", and then it's asking how I want my pool to be placed. This is where the graphic I was showing you at the beginning is relevant. I can choose to spread these instances across two availability domains. I'll pick my VCN, which we just created, and it has a regional subnet; I'll use that. Then I will add another availability domain here: I'll pick AD 2, I'll pick the same VCN, and I'll pick the subnet within that region. I can also attach a load balancer here. If you recall from the graphic, we had a load balancer, and these instances were running behind it. I'm going to generate load for these instances using a different mechanism, so I'm going to skip it here. But in a typical situation, you would be running these behind a load balancer: when your demand is high and you get a lot of traffic, you will scale out, and when you have less traffic, you will scale in. That's basically the functionality here. You could attach a load balancer. I'll click "Next" and then I'll click "Create" here. This will create my instance pool. Remember again, the whole idea of creating an instance pool is to manage multiple compute instances as a group. Right now it's in a provisioning state; once it's provisioned, you can click on attached instances here and see the instances that spun up. Let me just hit pause here for a few seconds. It takes a few seconds to get provisioned, and once it's active, we'll resume the demo. You can see that my instance pool is now in a running state, and if I scroll down and click on attached instances, I can see two instances are attached.
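Spreading the pool across two availability domains, as done above in the console, maps to the pool's placement configuration when working programmatically. A sketch of what that payload might look like; the AD names and subnet OCIDs are placeholders, and the field names follow the OCI instance pool API as best I recall, so verify them against the API reference:

```shell
# placement.json: one entry per availability domain the pool spreads over.
# Both entries point at the same regional subnet, as in the demo.
cat > placement.json <<'EOF'
[
  {
    "availabilityDomain": "Uocm:PHX-AD-1",
    "primarySubnetId": "ocid1.subnet.oc1.phx.exampleuniqueid"
  },
  {
    "availabilityDomain": "Uocm:PHX-AD-2",
    "primarySubnetId": "ocid1.subnet.oc1.phx.exampleuniqueid"
  }
]
EOF
```

With a regional subnet, the same subnet OCID can serve both ADs, which is why only the availability domain differs between the two entries.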
We have two instances because we gave the initial size for the pool as two instances. If I click on Work requests here, we can see the progress. We created a couple of instances, and you can see the history. If you click on each of these, you can see logs and so on, the whole process of how the instance got provisioned. This is good. We can now manage multiple instances as a group because we created an instance pool. The last step is to create an autoscaling configuration. Click on "Create Autoscaling Configuration" and give it a name. We'll say this is autoscaling-config-demo, and now we need to pick a pool. We created this instance pool earlier, so we pick that and click Next. Down below, you can see that I can do autoscaling based on metrics like CPU utilization, or I could do autoscaling based on a schedule; if I know the traffic is going to be heavy on Mondays, I could actually do that as well. Down below here is the instance pool which we just created, and everything looks good there. Right below, I have something called an autoscaling policy. Basically, it says how I want to perform autoscaling. For the performance metric, I can choose CPU or memory utilization. Let's choose CPU and say that anytime CPU is greater than, I'll say, 70 percent, add one instance, and anytime CPU is less than 50 percent, remove one instance. I can also provide scaling limits: the minimum number of instances I always want, let's say one, and the maximum, let's say three. The initial number of instances, which we already gave in the instance pool, was two, so let's keep that. With this setup, click "Create" and my autoscaling configuration is now set up. It's active. What it's going to do is look at the pool and the multiple instances running in it, and it will look at the aggregate CPU utilization.
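The threshold policy just configured in the console (add one instance above 70 percent CPU, remove one below 50 percent, limits one to three, initial size two) corresponds roughly to a policy payload like the one below. The field names follow the OCI autoscaling API as best I recall, so treat this as a sketch and check the API reference before using it:

```shell
# policies.json: one threshold policy with a scale-out and a scale-in rule.
cat > policies.json <<'EOF'
[
  {
    "policyType": "threshold",
    "capacity": { "initial": 2, "min": 1, "max": 3 },
    "rules": [
      {
        "displayName": "scale-out",
        "action": { "type": "CHANGE_COUNT_BY", "value": 1 },
        "metric": {
          "metricType": "CPU_UTILIZATION",
          "threshold": { "operator": "GT", "value": 70 }
        }
      },
      {
        "displayName": "scale-in",
        "action": { "type": "CHANGE_COUNT_BY", "value": -1 },
        "metric": {
          "metricType": "CPU_UTILIZATION",
          "threshold": { "operator": "LT", "value": 50 }
        }
      }
    ]
  }
]
EOF
```

Note that scale-in is expressed as `CHANGE_COUNT_BY` with a negative value, and the `capacity` block carries the same minimum, maximum, and initial sizes chosen in the console.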
If it is more than 70 percent, it's going to add one more instance, and if it's less than 50 percent, it's going to remove one instance. If you just scroll down and look at Attached instances, you can see both these instances are running. If I click on "Metrics" here, you can see the CPU utilization is around 28 percent. Let's give it a couple of minutes and we'll see how the CPU behaves; depending on that, it will scale out or scale in. We had to wait a few minutes. As you can see here, the status for my instance pool changed from running to scaling, and it gives me a message here about service limits. If I scroll down and click on "Attached instances", I can see this instance is getting terminated right now. You might ask, why is this happening? Why is this called scale in? Well, the reason is that if I click on "Metrics", I can see my CPU utilization is trending close to one percent. We added a rule which said that if the CPU utilization is less than 50 percent, then remove one instance, and that's basically what is happening here. This one is already terminated, and there is only one instance running now. It still shows the status as scaling. This is scale in, where you remove one instance. Let's also do a scale out. Let me SSH into this other instance and try to increase the CPU utilization and see what happens. You can see now the status has changed from scaling to running. We will use the keys we generated earlier and SSH into this particular instance. On this instance, I have already installed a package called stress, which can basically be used to spawn CPU threads, as you can see here. That will simulate load and increase the CPU utilization. Hopefully, because of that, we'll see a scale-out in action. I'll spawn 20 threads and give it a few seconds, because it takes a little bit of time to run.
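The load-generation step above, as a sketch. The demo uses the `stress` package; if it isn't installed, a plain shell busy loop has the same effect. The duration here is shortened for illustration, whereas the demo keeps the load running for several minutes so the autoscaler has time to react:

```shell
#!/bin/sh
# Drive CPU utilization toward 100% so it crosses the 70% scale-out threshold.
WORKERS=20   # number of CPU-bound workers (the demo spawns 20)
DURATION=5   # seconds; the demo runs the load far longer than this

pids=""
if command -v stress >/dev/null 2>&1; then
    # the tool used in the demo: spins WORKERS CPU hogs for DURATION seconds
    stress --cpu "$WORKERS" --timeout "$DURATION"
else
    # portable fallback: busy-looping subshells, cleaned up afterwards
    i=0
    while [ "$i" -lt "$WORKERS" ]; do
        ( while :; do :; done ) &
        pids="$pids $!"
        i=$((i + 1))
    done
    sleep "$DURATION"
    kill $pids
    wait 2>/dev/null || true
fi
```

Either way, the aggregate CPU metric for the pool climbs well past 70 percent while the load runs, which is what triggers the scale-out you see next.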
What you will see now, if I go back to my instance pool, is that the status is running and it will again change to scaling. If you click on CPU utilization right now, you'll see it's trending close to one percent, but it'll increase significantly, and when that happens, autoscaling will be adding one more instance. Let me hit pause again and we'll come back once we see that scaling in action. That took a few minutes. As you can see here, the status for the instance pool has again changed from running to scaling, and this time it's scaling out, meaning it's adding one more instance. What's going on? If you look at the metrics, because the stress command is still running, as you can see in Cloud Shell, my CPU utilization has hit 100 percent. Anything beyond 70 percent, we said, add one more instance. Because we met that criterion, you can see that the scaling is in progress. If I click on "Attached instances", you can see that now there are two instances running; earlier, only one was running. If I go to Work requests, you can see Create Instance in Pool: a new instance is getting spun up, and that's basically what you're seeing here. One more instance got spun up because we crossed the CPU utilization threshold, in this case 70 percent. This was a quick demo on how autoscaling works and how easy it is to set up autoscaling in OCI. I hope you found this useful. Thanks for your time.