0:04

The results section of your final report is where you summarize and interpret the results of each statistical analysis. Typically, when we develop a results section, we start with the simplest analysis, which is usually descriptive statistics, in which we give the reader a little more information about the distributions of our predictors and response variable. Then we move on to bivariate analyses, where we describe the associations between each predictor and the response variable. Finally, we report our multivariable analyses, typically our primary analysis with multiple predictors. The results of our descriptive and bivariate analyses can be brief. What we really want to do is provide most of the detail for the results of the multivariable analyses, since this is typically the goal of our study.

One thing to keep in mind is that tables can be useful for summarizing a lot of results. In many of the statistical analyses that we do, particularly machine learning methods, we may have a lot of predictors, and writing out the descriptive statistics in text for each of these variables can make the results section very long and difficult to read. Summarizing all that information in a table makes it a lot easier. If you have your statistics in a table, then you really don't need to report those numbers again in the text unless, of course, you want to highlight some particular variable.
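As a rough sketch of how such a table could be put together, the snippet below summarizes each quantitative variable in one row. The variable labels and the values are made up for illustration and are not the data from the example report; the helper names `describe` and `format_table` are hypothetical.

```python
import statistics

def describe(columns):
    """Return one summary row (n, mean, SD, min, max) per quantitative variable."""
    rows = []
    for label, values in columns.items():
        rows.append({
            "variable": label,
            "n": len(values),
            "mean": statistics.mean(values),
            "sd": statistics.stdev(values),  # sample standard deviation
            "min": min(values),
            "max": max(values),
        })
    return rows

def format_table(rows):
    """Lay the summary rows out as a plain-text table with a header line."""
    header = f"{'Variable':<30}{'N':>4}{'Mean':>9}{'SD':>9}{'Min':>9}{'Max':>9}"
    lines = [header, "-" * len(header)]
    for r in rows:
        lines.append(
            f"{r['variable']:<30}{r['n']:>4}{r['mean']:>9.2f}"
            f"{r['sd']:>9.2f}{r['min']:>9.2f}{r['max']:>9.2f}"
        )
    return "\n".join(lines)

# Illustrative values only (assumed, not taken from the lecture).
example = {
    "Lead time (hours)": [10.0, 12.0, 14.0, 16.0],
    "Ingredient units in stock": [40, 30, 20, 10],
}
rows = describe(example)
table = format_table(rows)
print(table)
```

Using descriptive labels rather than raw variable names in the first column means the reader does not need a code book to follow the table.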

Figures are also very important.

1:56

Most of the time, final reports tend to be pretty brief, so we don't always have room for a lot of figures. So we pick and choose figures carefully, choosing the ones that really show an important result or describe a more complex result. When we include figures in the results section, we want to make sure that we refer to them in the text so that we can point readers to the figures while we're describing the results.

Here's an example of a results section. We're starting out with our descriptive statistics. You can see that I've presented the descriptive statistics for the quantitative data analytic variables in a table, and I refer the reader to the table. You can also see that I am not repeating the actual statistics in the text, because they're in the table, although I do so for the response variable in order to highlight that particular variable. In addition, there were a couple of binary variables. The descriptive statistics for those variables are not in the table, so I present percentages and the number of observations for each of the two binary variables.

Then we move on to describing our bivariate analyses. Here I present some scatter plots, which provide a great visualization of the association between our manufacturing lead time response variable and the quantitative predictors. Again, you can see that I've written Figure 1 in parentheses to refer the reader to the figure while I discuss the results. Then I talk about what the figures are showing. The scatter plots reveal that manufacturing lead times were shorter when there was a greater number of ingredient units in stock. Then I add the results of my Pearson correlation analysis: r = -0.79, which is the value of the Pearson correlation coefficient, and then the p value associated with that correlation coefficient. The scatter plots also show that manufacturing lead times increased when production workers had worked more hours on their shift before beginning production. And again I provide the Pearson correlation coefficient.

4:02

Note that I talk about the direction of the association. Rather than just saying that manufacturing lead times are significantly associated with the number of ingredient units in stock, I talk about the direction of the association. Otherwise, it's not especially informative. When I talk about the direction of the association, what I mean is that I'm indicating whether it's a positive or negative association. Manufacturing lead time was negatively associated with ingredient units in stock, and I described that negative association by writing that manufacturing lead times were shorter when there was a greater number of ingredient units in stock. I do the same thing with the association between the number of hours production workers had worked on their shift before beginning production and manufacturing lead times, which was a positive association: manufacturing lead times increased when production workers had worked more hours on their shift before beginning production.

Then I write that manufacturing lead time was not significantly associated with the number of steps involved in the production of a batch, and I provide the Pearson correlation coefficient, -0.05, and the p value. It also was not associated with the number of hours of sleep that production workers reported getting the night before batch production began. And again, I report the Pearson correlation coefficient: r = 0.01, p = 0.71.
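To make the idea of reporting direction concrete, here is a minimal sketch that computes a Pearson correlation coefficient and turns its sign into a sentence. The data values and the function names `pearson_r` and `describe_association` are assumptions for illustration. The sketch leaves out the p value, which in practice would come from a statistics package.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def describe_association(predictor, response, r):
    """State the direction of the association instead of only 'associated'."""
    direction = "positively" if r > 0 else "negatively"
    return f"{response} was {direction} associated with {predictor} (r = {r:.2f})."

# Illustrative values only (assumed, not the lecture's data).
units_in_stock = [1, 2, 3, 4, 5]
lead_time = [10, 8, 7, 4, 2]

r = pearson_r(units_in_stock, lead_time)
sentence = describe_association("ingredient units in stock",
                                "Manufacturing lead time", r)
print(sentence)
```

The sign of r is what tells the reader whether lead times got shorter or longer as the predictor increased, which is the part a bare "significantly associated" leaves out.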

Note that the figure has a title that describes what's going on in the figure. Note also that I follow the standard graphing convention, where my predictor variable is plotted on the horizontal, or x, axis and my response variable is plotted on the vertical, or y, axis. Note also that the variable labels are explained. Rather than just providing the actual variable name, I use labels that let the reader understand what the variable means without having to refer to a code book. These are important characteristics of figures. You want to make sure that you have a title, and you want to make sure that the variable labels are informative.

So in the previous slide, I summarized the association between each quantitative predictor and the quantitative response variable using scatter plots and a Pearson correlation coefficient. In this next section, I discuss the association between each categorical predictor and the quantitative response variable. The appropriate bivariate analysis when you have a categorical predictor and a quantitative response variable is analysis of variance. So I write that analysis of variance indicated that average manufacturing lead times did not differ significantly as a function of equipment failure, and I provide the F statistic, with the associated degrees of freedom in parentheses, and the p value associated with the F statistic. And then finally, the R-square, which is the proportion of the variance in manufacturing lead times that is accounted for by equipment failure. I also write that trainee involvement in production is not significantly associated with manufacturing lead time. And again I provide the F statistic, the p value, and the R-square. Then I point the reader to Figures 2 and 3 to give them a visual of the association.

Notice also that when I describe the results of the analysis of variance, I do not include the means in the text. That's because I have provided the means in the figures.
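The quantities reported for the analysis of variance can be sketched as follows. This is a minimal one-way ANOVA written from the textbook sums of squares, assuming made-up lead times for two groups; the function name `one_way_anova` is hypothetical, and the p value is left to a statistics package.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, R-square) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # Proportion of variance in the response accounted for by the predictor.
    r_square = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, r_square

# Illustrative lead times (assumed): batches without and with equipment failure.
no_failure = [10.0, 12.0, 14.0]
failure = [11.0, 13.0, 15.0]

f_stat, df1, df2, r_square = one_way_anova([no_failure, failure])
print(f"F({df1}, {df2}) = {f_stat:.2f}, R-square = {r_square:.3f}")
```

The printed line follows the reporting pattern described above: the F statistic with its degrees of freedom in parentheses, followed by the R-square.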

Finally, I discuss the results of my multivariable analysis. I first point the reader to Figure 4, which shows that five of the six variables were retained in the model selected by the lasso regression analysis. Only the number of production steps predictor was excluded. The number of ingredient units in stock and the number of shift hours employees worked before beginning production were most strongly associated with manufacturing lead time, followed by equipment failure, trainee involvement in production, and the number of hours of sleep that production workers reported getting the night before their shifts began, and this is shown in Table 2. Table 2 shows the LAR, or least angle regression, variable selection summary: the average squared error, which is also the mean squared error, associated with each variable as it was entered into the model.

Then I provide a little bit of information about the direction of the association between each variable and the response variable, manufacturing lead time. Manufacturing lead times were shorter for batches that had a greater number of ingredients in stock and when production operators reported sleeping for more hours the night prior to batch production. On the other hand, working more shift hours prior to manufacturing, equipment failure, and having trainees involved in batch production were associated with increased lead times. Together, these five predictors accounted for 93.3% of the variance in manufacturing lead time.

Then I report the mean square error for the test data set compared to the mean square error for the training data set to show that they differed very little, which suggests that predictive accuracy did not decline when the lasso regression algorithm developed on the training data set was applied to predict manufacturing lead times in a test data set. And again, I refer the readers to Figure 4, which shows the mean square error rates for both the test and training data sets.
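The comparison of training and test error can be sketched in a few lines. The observed and predicted values below are invented for illustration and stand in for whatever a fitted model would produce; the function names are hypothetical, and no lasso model is actually fit here.

```python
def mean_squared_error(observed, predicted):
    """Average of the squared differences between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def r_squared(observed, predicted):
    """Proportion of variance in the observed values accounted for by the predictions."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    ss_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1 - ss_error / ss_total

# Illustrative values only (assumed): lead times and model predictions.
train_observed = [10.0, 12.0, 14.0]
train_predicted = [11.0, 12.0, 13.0]
test_observed = [9.0, 15.0]
test_predicted = [10.0, 14.0]

train_mse = mean_squared_error(train_observed, train_predicted)
test_mse = mean_squared_error(test_observed, test_predicted)
train_r2 = r_squared(train_observed, train_predicted)

print(f"Training MSE = {train_mse:.2f}, test MSE = {test_mse:.2f}")
print(f"Training R-square = {train_r2:.2f}")
```

When the two mean square errors are close, as the report describes, that is the evidence that the model predicts about as well on data it has not seen as on the data used to develop it.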
