In a previous module, you leveraged pre-trained ML APIs to process natural language text. Those are great options for seeing whether your use case can be served by a model that Google has already created and trained on its own data. But you may want a more tailored model trained on your own data. For that, you need a custom model. Let's talk about the different ways of building custom models.

As we covered previously, here are the three ways you can do machine learning on GCP. You've already looked at pre-trained models on the right and did a few labs invoking those APIs. Now we're going to visit the other side of the spectrum: building your own custom model and productionizing it on GCP. There are a few ways of doing custom model development, training, and serving. We'll highlight the major ones and then focus on four for the remainder of this course: AI Platform, Kubeflow, BigQuery ML, and AutoML. Let's briefly discuss each of these in turn.

You worked with the Notebooks component of AI Platform earlier in the course when you connected to BigQuery and ran commands in notebook cells. Just as you wrote SQL in the notebooks to process data, data scientists and ML engineers can write custom TensorFlow code to train and evaluate their models right within the notebooks. Creating custom TensorFlow models in notebooks is out of scope for this data engineering course, but we do have a series of dedicated TensorFlow courses for data engineers who want to be cross-trained as ML specialists as well. For now, let's highlight where you, as a data engineer, are likely to be involved in Cloud AI Platform projects.

What is AI Platform exactly? It's a fully managed service for custom machine learning models, covering both training and serving predictions. It can scale from the experimentation stage all the way to production. Using features of TensorFlow, you can also include transformations on input data and perform hyperparameter tuning to choose the best model for your use case. You can deploy your models to AI Platform to serve predictions, and serving will autoscale to the demands of your clients. AI Platform also supports Kubeflow, Google's open-source framework for building ML pipelines, and you'll have a lab on this later. Essentially, AI Platform is the engine behind doing machine learning at scale on GCP. A data scientist can train and deploy production models from an AI Platform notebook with just a few commands. Getting the data ready and managing the entire machine learning pipeline is a much broader task, and as a data engineer, you will likely encounter and create Kubeflow pipelines with your ML teams.

Kubeflow is an open-source project that packages machine learning code for Kubernetes. Kubeflow Pipelines is a platform for composing, deploying, and managing end-to-end machine learning workflows. The main components of Kubeflow Pipelines include a user interface for managing and tracking experiments, jobs, and runs; an engine for scheduling multi-step ML workflows; an SDK for defining and manipulating pipelines and components; and notebooks for interacting with the system using the SDK. These tools are used to define, experiment with, run, and share pipelines. A pipeline consists of components, which are the individual ML steps, assembled into a graph that describes the execution order. The key benefits are reusability and portability: you can run pipelines on GCP or on other cloud providers, so you're not locked in.
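To make the serving side of AI Platform a bit more concrete, here is a minimal sketch of requesting online predictions from a deployed model using the Google API Python client. The project name, model name, and instance fields are hypothetical placeholders, and it assumes application default credentials are available in the environment.

```python
# Minimal sketch: online prediction against a model deployed to AI Platform.
# "my-project", "my_model", and the instance fields are hypothetical.
from googleapiclient import discovery

service = discovery.build('ml', 'v1')
name = 'projects/my-project/models/my_model'  # optionally append '/versions/v1'

# Each instance must match the input signature the model was exported with.
instances = [{'feature_1': 42.0, 'feature_2': 'some_value'}]

response = service.projects().predict(
    name=name,
    body={'instances': instances},
).execute()

if 'error' in response:
    raise RuntimeError(response['error'])
print(response['predictions'])
```

And to give a flavor of what a Kubeflow pipeline definition looks like, here is a minimal sketch using the v1-style kfp SDK. The container images, bucket paths, and step arguments are assumptions for illustration; a real pipeline would point at images your ML team has built and pushed.

```python
# Minimal sketch of a two-step Kubeflow pipeline with the kfp SDK (v1-style).
# Image names and GCS paths below are placeholders.
import kfp
from kfp import dsl

@dsl.pipeline(
    name='taxi-training-pipeline',
    description='Preprocess data, then train a model.'
)
def train_pipeline(data_path: str = 'gs://my-bucket/raw/'):
    preprocess = dsl.ContainerOp(
        name='preprocess',
        image='gcr.io/my-project/preprocess:latest',   # hypothetical image
        arguments=['--input', data_path, '--output', 'gs://my-bucket/clean/'],
    )
    train = dsl.ContainerOp(
        name='train',
        image='gcr.io/my-project/train:latest',        # hypothetical image
        arguments=['--data', 'gs://my-bucket/clean/'],
    )
    train.after(preprocess)  # the execution graph: train runs after preprocess

# Compile to a package that can be uploaded through the Pipelines UI or SDK.
kfp.compiler.Compiler().compile(train_pipeline, 'train_pipeline.tar.gz')
```

The compiled package can then be uploaded and run through the Kubeflow Pipelines user interface, where each run of the graph is tracked alongside its experiments and jobs.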
Before we dive more into Kubeflow, let's look at the other two ways of building custom models on GCP, which will be the focus of future modules: BigQuery ML and AutoML. BigQuery ML allows your teams to easily build machine learning models on structured data using a SQL-like syntax. You can quickly get a model created for forecasting, classification, and recommendations right where your data already lives, in BigQuery. Teams use BigQuery ML as a quick prototyping tool to see whether machine learning will work for their dataset and project. Then you can dive into more of the advanced features of BigQuery ML, like hyperparameter tuning and dataset-splitting methods, to fine-tune your model's performance. We won't dive into the details here, but we'll see how we can easily create, evaluate, and then make predictions on our data, all in BigQuery. It's just SQL commands like CREATE MODEL, ML.EVALUATE, or ML.PREDICT, written just as you would write a normal SQL query. You'll do labs on this later.

Lastly, another option for custom model building is AutoML. Assuming we have labeled training data, we can train, deploy, and serve predictions using AutoML without having to write any code, and then generate predictions with an easy-to-use REST API. And again, since we're using AI Platform and Kubeflow, we will often be working with TensorFlow models. However, this isn't the course to dive into the details of TensorFlow; one place you can learn more is the Intro to ML on GCP specialization on Coursera, linked here.
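To make the BigQuery ML flow concrete, here is a minimal sketch of the create, evaluate, and predict steps driven from a notebook with the google-cloud-bigquery client. The dataset, table, model, and column names are hypothetical placeholders.

```python
# Minimal sketch of the BigQuery ML workflow from a notebook.
# Dataset, table, model, and column names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

# 1. Train: CREATE MODEL is just another SQL statement.
client.query("""
    CREATE OR REPLACE MODEL `my_dataset.churn_model`
    OPTIONS (model_type = 'logistic_reg', input_label_cols = ['churned']) AS
    SELECT churned, tenure_months, monthly_spend
    FROM `my_dataset.customers`
""").result()

# 2. Evaluate the trained model.
eval_rows = client.query(
    "SELECT * FROM ML.EVALUATE(MODEL `my_dataset.churn_model`)"
).result()

# 3. Predict on new rows.
pred_rows = client.query("""
    SELECT * FROM ML.PREDICT(
        MODEL `my_dataset.churn_model`,
        (SELECT tenure_months, monthly_spend FROM `my_dataset.new_customers`))
""").result()
```

The same statements could just as easily be run directly in the BigQuery console; the client library is simply one convenient way to issue them from a notebook.

And as a rough sketch of what an easy-to-use REST API means for AutoML, here is one way to call a deployed model's predict endpoint with the requests library. The project, region, model ID, and payload shape (a text classification request) are assumptions for illustration and will differ by AutoML product.

```python
# Rough sketch: calling an AutoML model's predict endpoint over REST.
# The project, location, model ID, and payload shape are hypothetical.
import subprocess
import requests

# Borrow the caller's gcloud credentials for the Authorization header.
access_token = subprocess.run(
    ['gcloud', 'auth', 'print-access-token'],
    capture_output=True, text=True, check=True,
).stdout.strip()

endpoint = ('https://automl.googleapis.com/v1/projects/my-project/'
            'locations/us-central1/models/MODEL_ID:predict')
body = {'payload': {'textSnippet': {'content': 'I love this product!',
                                    'mimeType': 'text/plain'}}}

response = requests.post(
    endpoint,
    headers={'Authorization': f'Bearer {access_token}'},
    json=body,
)
response.raise_for_status()
print(response.json())  # predicted labels and confidence scores
```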