0:00

The big question for this segment is, how are statistics used in management?

[MUSIC]

To make decisions in a complex world, businesses require information on which to

base their consideration of alternative options.

There is often a proliferation of data available, which may be either primary or

secondary data.

Primary data is collected specifically for the problem or

question being investigated,

whilst secondary data is obtained from already available sources,

such as reports, other records, or documents.

A business conducting a market research study to understand

the spending intentions of consumers over the next 12 months,

is an example of primary data collection.

However, if a business wanted to understand the percentage market share

of the largest companies in a particular industry,

it may turn to an already published industry report to locate those figures.

And that would be an example of secondary data.

The first point to bear in mind is that data by itself is not information.

The data must be collected and perhaps cleaned or simplified.

It must be analyzed and interpreted.

1:20

Statistical methods and techniques are analytical tools applied to

gain insight and support management decision making for

a range of different problem objectives such as, firstly, describing a population.

If a telecommunications company were interested in the time

that 18 to 25 year-olds spend using their mobile phones,

we could represent that data in a frequency distribution, or

we could describe the center of the data by calculating the mean, median, or mode.

If we are interested in measures of dispersion or variability,

we could calculate the range, variance, and standard deviation.
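As a rough illustration of those descriptive measures, here is a minimal Python sketch. The usage figures are invented for the example and are not real telecommunications data, and the 60-minute bin width for the frequency distribution is an arbitrary assumption.

```python
import statistics
from collections import Counter

# Hypothetical daily mobile phone usage, in minutes, for ten users aged 18 to 25.
usage_minutes = [95, 120, 120, 150, 180, 200, 210, 240, 240, 240]

# Frequency distribution: count users in 60-minute bins, keyed by the bin's lower edge.
frequency = Counter((m // 60) * 60 for m in usage_minutes)

# Measures of center.
mean_usage = statistics.mean(usage_minutes)
median_usage = statistics.median(usage_minutes)
mode_usage = statistics.mode(usage_minutes)

# Measures of dispersion (sample variance and standard deviation, dividing by n - 1).
usage_range = max(usage_minutes) - min(usage_minutes)
variance = statistics.variance(usage_minutes)
std_dev = statistics.stdev(usage_minutes)

print(dict(frequency), mean_usage, median_usage, mode_usage)
print(usage_range, round(variance, 1), round(std_dev, 1))
```

Whether the sample or the population formula is the right one depends on whether the figures are a sample or the whole group of interest; the sketch assumes a sample.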

Secondly, comparing two populations.

If a pharmaceutical company needed to compare the effectiveness of two drugs,

we could use the z-test for, and an estimator of, the difference between two proportions.
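To make the comparison concrete, here is a minimal sketch of a pooled two-proportion z-test in Python. The function name `two_proportion_z` and the trial counts are hypothetical, chosen only for illustration, and the sketch assumes samples large enough for the normal approximation to be reasonable.

```python
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Return (estimated difference, z statistic, two-sided p-value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis that both drugs are equally effective.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, z, p_value

# Hypothetical trial: drug A helps 120 of 200 patients, drug B helps 100 of 200.
difference, z, p_value = two_proportion_z(120, 200, 100, 200)
print(round(difference, 3), round(z, 2), round(p_value, 3))
```

The point estimate of the difference is simply the gap between the two sample proportions; the z statistic asks how large that gap is relative to the variation we would expect by chance.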

And thirdly, analyzing the relationship between variables.

To analyze the relationship between two variables,

we could use simple linear regression.

To analyze the relationship between multiple variables,

we could use multiple regression.
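As a sketch of the simple, two-variable case, the following Python fits a least-squares line by hand. The function name `fit_line` and the advertising and sales figures are invented for illustration; multiple regression extends the same idea to several explanatory variables and is normally done with a statistics library.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend and resulting sales, both in thousands.
advertising = [1, 2, 3, 4, 5]
sales = [12, 14, 17, 19, 23]

slope, intercept = fit_line(advertising, sales)
print(round(slope, 2), round(intercept, 2))
```

A fitted slope describes association in the data; reading it as cause and effect requires the assumptions of the model to hold.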

The key point is that for any statistical problem,

it is essential to identify the appropriate technique,

calculate the statistic with care, and interpret the results with

an understanding and appreciation of the assumptions inherent in, and

the applicability of, the method or modeling used.

2:47

The use of index numbers illustrates the temptation to stretch the utility of

a statistic beyond what is appropriate.

An index number is a statistic designed to show changes in a variable,

often price, volume or value, over a period.

A base year can be represented as an index number of 100, with subsequent percentage

increases or decreases pushing the index number up or down.
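The base-100 calculation can be sketched in a few lines of Python. The helper name `index_series` and the prices are hypothetical; the sketch assumes a simple single-item price index rather than a weighted basket such as a consumer price index.

```python
def index_series(values, base_position=0):
    """Express each value relative to the base period, with the base set to 100."""
    base = values[base_position]
    return [round(100 * v / base, 1) for v in values]

# Hypothetical price of one item, in cents, over four consecutive years.
prices = [250, 260, 275, 270]

series = index_series(prices)
print(series)
```

An index of 110 in the third year says only that the price is 10 percent above the base year; it says nothing on its own about why the price moved.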

A collection of index numbers is an index series.

Some index series compare different geographies at a point in time.

The Economist magazine's Big Mac Index, which illustrates the concept of

purchasing power parity, is an example of such a geographic comparison.

Other index series compare changes over time.

The Consumer Price Index, for instance,

measures changes in retail prices paid by consumers.

Economic index series track wholesale prices, house prices,

production, sales, purchasing, and costs,

whilst various financial index series based on stock

exchange prices are widely reported and well-known.

However, such index numbers are often overinterpreted

and used as barometers of the broader economic environment, or of the state of

business sentiment, and even as proxy judgements on government policy.

There are major issues with drawing conclusions that a statistical tool or

technique was never designed to support.

And management's awareness of this shift from the supported to the speculative

is crucial.

Correct use of statistics can solve problems.

Misuse of statistics can create problems.

4:43

The normal distribution, or bell curve,

with its characteristic thin tails, is very important in statistics.

Nassim Nicholas Taleb develops the concept of four quadrants.

In the first three quadrants, the statistical models either work, or

there is no harm in being wrong.

In the fourth quadrant, however,

statistical models based on the normal distribution don't work.

And the consequences of being wrong are severe.

This is the area of black swans,

where we mistake absence of evidence for evidence of absence.

Taleb describes the great turkey problem,

where a turkey is fed every day for 1,000 days by a butcher.

Each day strengthens the turkey's belief, with increasing statistical confidence,

that butchers love turkeys, until Thanksgiving,

when the turkey suffers a revision of belief with catastrophic consequences,

a black swan event for the turkey.

Taleb argues that in complex domains there is a great degree of

interdependence between elements,

and mechanisms are subject to positive, reinforcing feedback loops.

This gives rise to distributions with fat tails

and blocks the operation of the central limit theorem, which states that

the distribution of the sum or average of a large number of independent variables

with finite variance will be approximately normal, regardless of the underlying distribution.

In fact, it will converge closer to the normal as the sample size increases.

The importance of the central limit theorem is hard to overstate

because it is the reason that many statistical procedures work.
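A small simulation can make the contrast visible. The sketch below is illustrative only: it assumes an exponential distribution as the thin-tailed example and a standard Cauchy distribution as an extreme fat-tailed example, and it uses the interquartile range as a measure of spread. The helper names are hypothetical.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

def thin_tailed():
    # Exponential draw: skewed, but with finite variance.
    return random.expovariate(1.0)

def fat_tailed():
    # Standard Cauchy draw: tails so heavy that the variance is not finite.
    return math.tan(math.pi * (random.random() - 0.5))

def sample_means(draw, sample_size, trials=2000):
    # Average `sample_size` independent draws, repeated `trials` times.
    return [sum(draw() for _ in range(sample_size)) / sample_size
            for _ in range(trials)]

def iqr(values):
    # Interquartile range: a spread measure that exists even for fat tails.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

spread_thin_1 = iqr(sample_means(thin_tailed, 1))
spread_thin_50 = iqr(sample_means(thin_tailed, 50))
spread_fat_1 = iqr(sample_means(fat_tailed, 1))
spread_fat_50 = iqr(sample_means(fat_tailed, 50))

print("thin tails:", round(spread_thin_1, 2), "->", round(spread_thin_50, 2))
print("fat tails: ", round(spread_fat_1, 2), "->", round(spread_fat_50, 2))
```

For the thin-tailed draws, averaging 50 observations shrinks the spread sharply, as the central limit theorem suggests. For the Cauchy draws, the averages stay about as widely spread as single observations, so collecting more data does not tame the tails.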

6:35

Effectively, the low volatility the turkey experienced for

1,000 days, followed by the big jump in volatility the turkey

experienced at Thanksgiving, should give us cause to question any

risk modeling using volatility as an indicator of stability.
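The turkey's position can be caricatured in a few lines of Python. All numbers here are invented: the daily "wellbeing" gains, their noise level, and the size of the Thanksgiving shock are assumptions made purely to show how a volatility estimate built on calm history misjudges the unobserved event.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical daily change in the turkey's wellbeing over 1,000 quiet days.
daily_change = [random.gauss(1.0, 0.1) for _ in range(1000)]

# What a volatility-based risk model would learn from that history.
estimated_mean = statistics.fmean(daily_change)
estimated_vol = statistics.stdev(daily_change)

# Hypothetical day 1,001: an event of a kind that never appears in the history.
thanksgiving_change = -1000.0

# How many historical standard deviations away the shock sits.
z_score = (thanksgiving_change - estimated_mean) / estimated_vol

print(round(estimated_mean, 2), round(estimated_vol, 2), round(z_score))
```

Under a normal model fitted to the first 1,000 days, a move this many standard deviations from the mean would be treated as effectively impossible, which is exactly the model's blind spot.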

Taleb famously argued in 2007 that the complacency of both financial

regulators and financial institutions in underestimating exposure to,

and risk of, financial crisis was an example of precisely this mistake.

7:10

In the fourth quadrant,

what you do not know is far more relevant than what you do know.

This means that any risk management modeling

based on the normal distribution is invalid for what we do not know.

Quite simply, it is impossible to model and

predict what Donald Rumsfeld famously called the unknown unknowns.

However, in this realm, where the power of white swan statistics in modeling

cannot be called upon,

the role of management is to assess exposure to the shock of

unexpected and previously unobserved events

and to implement strategies to build resilience and

robustness to limit the downside impact of that exposure.

[MUSIC]