[MUSIC] Welcome back. In this video we're going to look at the different types of analyses you can perform once you have identified the business problem or opportunity, developed a hypothesis, and collected relevant data. The continued increase in processing capacity has opened the door to a broad range of advanced algorithms and modeling techniques that organizations can use to produce valuable insights from data. We will discuss a series of analytical techniques and how they are used in the real world. In the course materials, you have access to a quick reference sheet that lists all of the techniques for easy future reference.
>> Thanks, Dan. I'm Lorie Wijntjes, managing director in our data and analytics practice, with almost 30 years' experience as a statistician. At PwC, I have worked on a wide variety of business problems involving predictive analytics, data management, statistical sampling, and survey design. Some have been fairly straightforward and many have been complex, across a wide range of industries including healthcare, financial services, and retail and consumer. In this video I will give you a high-level overview of the different types of analysis that you can perform on data. As part of the course you will find supplemental reading that covers each of these analysis types and how they are used. Now, it's important to keep in mind that the analysis you choose to perform will depend on a couple of things: first, the problem you are trying to solve, and second, the data you can use to solve that problem.
I'm going to start by talking about cluster analysis. Cluster analysis is when you group a set of objects in such a way that objects in the same group, or cluster, are more similar to one another than to those in other clusters. Cluster analysis is often used in market research when working with data from focus groups and surveys.
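As a rough illustration of the grouping idea just described, here is a minimal k-means sketch in plain Python. K-means is only one of several clustering methods and is not named in the video; the respondent data, the starting centroids, and the function name `kmeans` are all hypothetical choices made for this example.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of the points assigned to it."""
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # centroids stopped moving, so the grouping is stable
        centroids = new_centroids
    return labels, centroids

# Hypothetical survey respondents: (age, monthly spend in dollars).
respondents = [(22, 40), (25, 55), (24, 50), (58, 210), (61, 190), (63, 200)]

# Assumption: start from one younger and one older respondent.
labels, centers = kmeans(respondents, centroids=[respondents[0], respondents[3]])
print(labels)   # cluster index for each respondent
print(centers)  # average (age, spend) of each cluster
```

In practice the variables would usually be put on a comparable scale first, since a variable with a large numeric range (here, spend) otherwise dominates the distance calculation.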
A cluster analysis can be used to segment a population of consumers into market groups to better understand the relationships between different groups of consumers. This analysis can help answer questions such as: Who are my target customers? How are they differentiated on behavioral, psychographic, and demographic characteristics? Are there groups with similar attributes, so that products, services, and price offerings can be customized for each segment?
Now, let's move on to decision tree analysis, a decision support tool that uses a tree-like graph of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. Decision tree analysis is often used to assist healthcare practitioners who are considering varying treatments, along with each one's associated costs and probability of a successful outcome. For example, healthcare providers can use this analysis to assess options and deliver more cost-effective treatments that minimize the risk of hospital readmission.
To analyze large numbers of dependent and independent variables, we might use factor analysis. This type of analysis can help detect which aspects of the independent variables are related to the dependent variables. When we receive data sets that are fairly wide, meaning that they have more variables than observations or records, we need a way to identify the core set of variables, or drivers, that will help us gain meaningful insight. Factor analysis can help identify that reduced subset of variables: the variables that are kept capture relationships similar to those of the variables left out, but perhaps more strongly.
Machine learning is a type of artificial intelligence that provides computers with the ability to learn without being explicitly programmed to do so. For forward-thinking retailers, the possibilities for advanced machine learning are limitless. Take, for example, a company trying to predict what customers will be buying next spring.
Machine learning algorithms can determine the availability of materials from outside vendors, incorporate various supply chain scenarios, and recommend the quantity, price, shelf placement, and marketing channel that would best reach the target consumer in a particular geographic area. These algorithms can also be used to optimize sales for an individual store.
Regression analysis is a statistical process for estimating relationships between a dependent variable and one or more independent variables. Variables are the pieces of information being analyzed. This type of analysis helps you understand how the value of a dependent variable changes when any one of the independent variables changes. For example, a large insurance company might want to identify the characteristics (age group, income, gender, educational level, and so on) of customers who tend to make the most automobile claims. This type of analysis can be used to assess risk, and also to assist with determining pricing for various automobile insurance products.
Multivariate analysis is the observation and analysis of more than one statistical outcome variable at a time. As a first step, this often includes correlation analysis, which can help you understand and visualize relationships between pairs of variables. Multivariate regression is a technique that estimates a single regression model with more than one outcome variable. When there is more than one predictor variable in a multivariate regression model, the model is a multivariate multiple regression. To understand how effective a particular medical treatment is, one may also need to account for confounding variables such as age, weight, gender, or other medications the patient may be receiving. There may be multiple ways to assess the outcome, and thus more than one dependent variable as well as multiple independent variables.
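To make the regression and correlation ideas above a little more concrete, here is a minimal sketch in plain Python with a single independent variable. The ages and claim counts are invented for illustration and do not come from the video; the helper names `fit_line` and `pearson_r` are likewise assumptions of this example. A real analysis would use many more records and several predictors.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares with one predictor: y ~ intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def pearson_r(x, y):
    """Pearson correlation coefficient between two variables."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: driver age (independent variable) and
# claims per 100 policies (dependent variable).
ages = [20, 30, 40, 50, 60]
claims = [30, 26, 20, 16, 8]

slope, intercept = fit_line(ages, claims)
r = pearson_r(ages, claims)

print(f"Each extra year of age changes expected claims by {slope:.2f}")
print(f"Correlation between age and claims: {r:.3f}")
print(f"Predicted claims at age 45: {intercept + slope * 45:.1f}")
```

The slope is the quantity the speaker describes: how much the dependent variable is expected to change when the independent variable changes by one unit. The correlation coefficient summarizes how closely the pair of variables moves together.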
Segmentation analysis divides a broad category into subsets that have, or are perceived to have, common features, needs, interests, or priorities. Often, segmentation analysis is used to better understand customer needs by dividing a large number of individuals into smaller groups based on a logical scheme. Segmentation provides a convenient mechanism around which to develop products, construct programs, and execute marketing tactics. Imagine that a bank was developing a strategy to become a leader in mobile payments and mobile banking. Traditionally, product penetration was driven by the bank's relationship managers and its branches. Segmentation analysis could help the bank gain market share by identifying key customer segments and developing product recommendations for those that are more likely to use mobile banking.
Sentiment analysis is a process of identifying and categorizing opinions expressed in a piece of text to determine whether the writer's attitude towards a topic or issue is positive, negative, or neutral. This analysis often relies on natural language processing, or NLP, which is used to mine data from the Internet. This allows companies to better understand what their consumers are saying about their product offerings and to adjust strategies where feasible to improve customer sentiment. Categorization of the information scraped from the Internet can then be used to develop models.
Sometimes it's hard to determine whether a system or a process will react to a change the way we think it will. One way to test changes to a system or process is by performing a simulation. A simulation is the imitation of the operation of a real-world process or system over time. It requires a model that represents the key characteristics or behaviors of the selected system or process. Think about hospital admissions. For a hospital wanting to figure out how to reduce wait time, analytic tools could help simulate the admission process.
This allows the hospital to change the values of certain variables and see the impact on a patient's wait time.
The last type of analysis we want to discuss is time series analysis. Time series analysis can be used to design a methodology to identify the factors affecting airline passenger demand on routes by leveraging macroeconomic, demographic, and other external data at a local, state, and national level. Such models can be developed to produce a route-level forecast of total demand for air travel, helping to optimize route and capacity planning and to identify new routes for market entry. Time series analysis comprises methods for analyzing data that are collected over time to extract meaningful statistics. Stock prices, sales volumes, interest rates, and quality measurements are all typical examples. Because of the sequential nature of the data, special statistical techniques that account for its dynamic nature are required. Now, let's answer one last assessment question for this segment.
>> Lori covered some great material in this session. And as you can see, there are many different analytical techniques that can be used to address a problem or opportunity. As I mentioned at the beginning of this video, there is a quick reference guide available as part of this week's materials. Use the guide to review the different analytic techniques. In the upcoming videos you're going to hear from some of PwC's subject matter specialists on tools used for data and analytics and for visualizing data. [MUSIC]