So we use machine learning to analyze data and try to make some sense of it. Again, we want to uncover hidden insights, and with very large datasets of very high dimensionality that can be hard to do. I'm not saying it's impossible; it's just that as the number of dimensions goes up, the difficulty of the problem goes up. Whether that growth looks like a gently sloping exponential, a straight line, or a steeply increasing exponential depends on the dataset. We use non-traditional programming approaches like the ones we saw in the machine learning material: linear regression, support vector machines, k-means, and so on. We can also bring in, or augment with, purpose-programmed algorithms. Based on my research, that seems to be the exception rather than the rule: when someone is having trouble with their machine learning algorithm and isn't getting the outcomes they want, they might build some purpose-built helper algorithms, potentially for pre-processing the data. We'll look at more of that coming up, but there can be purpose-built programming involved.

So, data and types of data. There's static data: a database someplace, a folder of files, the Iris dataset that you can load with the scikit-learn Python library. The point is, you've got a bunch of data you want to look at, it's static, there it is. Then there's the streaming kind that's constantly changing: Facebook updates, Twitter feeds, stock prices while the market is open, and so on. In IIoT there are messages containing temperature, velocity, position, pressure, voltage, current, and vibration, going all the way back to the vibration sensors on those coal pulverizers at the power plant I referred to. Many of these are analog phenomena that we digitize into binary values a computer program can look at and a machine learning algorithm can process.

In a streamed-data analysis approach, we examine the newest data points and make decisions about the state of the model and its next move, its next decisions, its next theta values if it's, say, a linear regression. One way is to look only at the newest values as they come in; another is to re-evaluate the entire dataset, or some subset of it, each time a new data point arrives. So there's the notion of "I've got a bunch of data that I trained the algorithm with, and now I'm only looking at the new data coming in." An equally valid approach is to say, "Okay, I've got new data coming in. I'm going to back up some distance into my previous data and make a prediction based on what happened before, in combination with the new values that just arrived."

This is revisiting the five tribes: symbolic reasoning; connectionism, modeled on neurons, and you know what I think about that; evolutionary algorithms; Bayesian inference, which is most of what we've looked at; and learning by analogy.

So, in coding a solution, there's traditional programming, and I'm stating the obvious here: we've got two inputs, 2 and 3, we implement an add function, we add them together, and we output 5. It's one of the simplest programs ever, like Hello World, but it's purpose-built, purpose-programmed, traditional programming. In machine learning, we get inputs, which are the features in our dataset, and we have some outputs, and the function connecting them is what gets learned; a rough sketch of that contrast follows below.
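To make that contrast concrete, here's a minimal sketch in Python. The add function mirrors the 2-plus-3 example above; the scikit-learn LinearRegression model is just one convenient stand-in for "a function discovered from example inputs and outputs," not the specific algorithm used later in the course, and the example data is mine.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Traditional, purpose-built programming: we know the rule, so we code it.
def add(a, b):
    return a + b

print(add(2, 3))  # 5

# Machine learning: we only have example inputs (features) and outputs,
# and we train a model to discover the underlying function.
X = np.array([[1, 1], [2, 3], [4, 5], [10, 7], [0, 6]])  # feature pairs
y = X.sum(axis=1)                                        # observed outputs

model = LinearRegression().fit(X, y)
print(model.predict([[2, 3]]))  # ~[5.], learned rather than hand-coded
```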
Sometimes we don't know the outputs at all. In the Iris dataset case, for example, we use k-means and other unsupervised learning algorithms and techniques to try to find structure in the data. We may not know the function, and there's a lot of human involvement here: we try many different functions, and some problems can be very difficult. I used a support vector machine because it was available for an upcoming example, but you may have to back up and revisit your algorithm choices. All of this is an attempt to discover some function that produces the results we're interested in. So we train the function, we visualize its output, and we might want to visualize the features, the data that goes into it, as well as the output values, and then test it. Again, testing and validation are so important. How do you know you're getting good outputs? You have to measure the results you're getting, see if they make sense, see if they're within some error threshold that's acceptable for whatever your purposes are.

The secret to all of this in machine learning is the ability to make generalizations and discriminations. We saw that with the support vector machine when we went to that website and started plotting dots: we did both of those things. One set of dots was up in the left-hand corner, the other set was in the lower right-hand corner, and it drew a nice hyperplane between them. That's making a discrimination.

Learning algorithms rely on three components. First, representation: how do we represent the data? Part of representation is discovering which features to use for the learning process. This can be a real challenge, and a lot of time can be spent here; I can attest to that. When you get to Thursday and see the analytics problem I gave myself as an experiment, to see if I could make these determinations, it's tricky. It's not always clear which features from a dataset to feed into your learning process, your analytics process. So representation is a big part. Second, evaluation: an evaluation function scores a model, because the model doesn't know good results from bad ones, so you need some cost or error function that is integral to the learning process, so you can check your results as you go. Third, optimization: training, tuning, and making model selections. Remember, in my linear regression I set the learning rate to one and my thetas oscillated back and forth, getting bigger and bigger, a runaway that exceeded double-precision floating point, and I had to turn the rate down to 0.0000001, six or seven zeros and then a one. So, again, there's a lot of human involvement in this process. You've heard me say it already: you can't just take a bunch of data, throw it into an algorithm, and say, "Oh, wow! Look what we discovered." No. It's labor-intensive.
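Here's a small sketch of that learning-rate behavior, assuming a one-feature linear regression trained with plain batch gradient descent on synthetic data. The data and the specific rates are illustrative, not the exact values from my demo; the point is just that a rate that's too large makes the thetas run away, while a much smaller rate lets them settle.

```python
import numpy as np

# Synthetic data: y is roughly 3*x + 2 plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 2.0 + rng.normal(0, 1, size=200)

def gradient_descent(x, y, learning_rate, iterations):
    """Fit y = theta0 + theta1 * x by batch gradient descent on squared error."""
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        error = (theta0 + theta1 * x) - y
        # Gradients of the mean squared error cost with respect to each theta.
        grad0 = error.mean()
        grad1 = (error * x).mean()
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return float(theta0), float(theta1)

# Too-large rate: the thetas oscillate and grow until they overflow
# double precision (expect overflow warnings and nan).
print(gradient_descent(x, y, learning_rate=1.0, iterations=500))

# Much smaller rate: the thetas settle toward roughly (2, 3).
print(gradient_descent(x, y, learning_rate=0.01, iterations=5000))
```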