It's also likely then that our model would be statistically non-significant.
If an odds ratio is greater than one,
it means that the probability of becoming nicotine dependent is higher among those
with social phobia than among those without.
In contrast, if the odds ratio is below one,
it means that the probability of becoming nicotine dependent is lower among those
with social phobia than among those without.
So how do we calculate the odds ratio?
It is possible to do this by hand.
The odds ratio is the natural exponentiation of our parameter estimate.
Thus, all that we need to do is raise e, the base of the natural log,
to the power of our parameter estimate.
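As a rough sketch of the by-hand calculation, the snippet below raises e to the power of a parameter estimate using Python's standard math module. The value 1.22 is a hypothetical estimate chosen for illustration; it is not the coefficient from the lecture's output.

```python
import math

# Hypothetical parameter estimate (log odds), for illustration only
beta = 1.22

# The odds ratio is e raised to the power of the parameter estimate
odds_ratio = math.exp(beta)
print(round(odds_ratio, 2))
```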
However, we could also let Python do this for
us by adding the following Python code.
First, we ask Python to print the title "Odds Ratios".
Then, in the second line of code, we ask Python to print the odds ratios,
which are computed using the numpy.exp, or exponentiate, function.
In parentheses, we add the object that contains the parameter estimates,
P-A-R-A-M-S, from our lreg1 model.
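The two lines just described might look like the following sketch. The fitted lreg1 model is not reproduced here, so a stand-in object that only mimics its params attribute takes its place. The row labels and values are hypothetical, not the lecture's actual output.

```python
import numpy
import pandas
from types import SimpleNamespace

# Stand-in for the fitted model: only the .params attribute is mimicked.
# Labels and values are hypothetical, not the lecture's actual output.
lreg1 = SimpleNamespace(
    params=pandas.Series({"Intercept": 0.0, "social_phobia": 1.22})
)

# Exponentiate every parameter estimate to get the odds ratios
odds_ratios = numpy.exp(lreg1.params)

print("Odds Ratios")
print(odds_ratios)
```

Because numpy.exp works element by element, every parameter estimate in the object is converted in one call.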
Here are the results.
Because both my explanatory and
response variables in the model are binary, coded zero and one,
I can interpret the odds ratio in the following way.
Young adult daily smokers in my sample with social phobia are 3.4 times more
likely to have nicotine dependence than young adult daily smokers without social phobia.
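With a binary explanatory variable and a binary response, the same kind of odds ratio can also be read off a two-by-two table of counts. The sketch below uses made-up counts (not the lecture's data) and assumes a model with a single binary predictor, in which case the slope equals the difference in log odds between the two groups.

```python
import math

# Hypothetical counts, for illustration only (not the lecture's data)
dep_with, nodep_with = 30, 10          # social phobia: dependent / not dependent
dep_without, nodep_without = 400, 450  # no social phobia: dependent / not dependent

# Odds of nicotine dependence within each group
odds_with = dep_with / nodep_with
odds_without = dep_without / nodep_without

# The odds ratio compares the two groups' odds
odds_ratio = odds_with / odds_without

# Difference in log odds; exponentiating it recovers the odds ratio
log_odds_diff = math.log(odds_with) - math.log(odds_without)

print(round(odds_ratio, 3), round(math.exp(log_odds_diff), 3))
```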
We can also get a confidence interval for our odds ratio.
Remember that our data set is just a sample of a population.
We do not have every young adult daily smoker in the US in our sample.
Even though the odds ratio for our sample is 3.4, the true population
odds ratio might be slightly different due to random variation in sampling.
The code to print the confidence intervals for the odds ratio is here.
In the first line of code, we create an object called params,
P-A-R-A-M-S,
that includes the parameter estimates from our lreg1 logistic regression model.
In the second line of code, we create an object called conf that uses the statsmodels
conf_int() command to return the confidence intervals for
the parameter estimates.
In the third and fourth lines of code, we create an odds ratio object
with column labels of 'Lower CI', 'Upper CI', and 'OR'.
Finally, we print the confidence intervals, using the numpy.exp function to compute
the odds ratios from the parameter estimates in the conf object.
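One plausible way to write the steps just described is sketched below. The fitted lreg1 model is not reproduced here, so a stand-in object mimics its params attribute and conf_int() method. The stand-in values are the natural logs of the figures reported in this lecture (3.4, 1.78, and 6.61). The row label is hypothetical, and adding the 'OR' column by assignment is an assumption about how the third line works.

```python
import numpy
import pandas
from types import SimpleNamespace

# Stand-in for the fitted model (assumption): only the social phobia row is
# included, and its values are logs of the odds ratios quoted in the lecture.
_params = pandas.Series({"social_phobia": numpy.log(3.4)})
_ci = pandas.DataFrame(
    {0: [numpy.log(1.78)], 1: [numpy.log(6.61)]}, index=["social_phobia"]
)
lreg1 = SimpleNamespace(params=_params, conf_int=lambda: _ci.copy())

# Steps described in the lecture
params = lreg1.params                          # parameter estimates
conf = lreg1.conf_int()                        # lower and upper bounds
conf['OR'] = params                            # add the estimates as a column
conf.columns = ['Lower CI', 'Upper CI', 'OR']  # label the columns

# Exponentiate everything to move from log odds to odds ratios
result = numpy.exp(conf)
print(result)
```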
The confidence interval indicates that there's a 95% certainty that
the true population odds ratio falls between 1.78 and 6.61.
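To see where an interval like this comes from, the sketch below works backward from the reported bounds. It assumes the interval is a standard Wald interval, that is, the parameter estimate plus or minus 1.96 standard errors on the log odds scale, then exponentiated. That is an assumption about how the interval was computed, and the implied values are back-derived for illustration rather than taken from the lecture's output.

```python
import math

lower, upper = 1.78, 6.61  # 95% interval reported in the lecture
z = 1.96                   # normal critical value for 95% (assumed Wald interval)

# Move the bounds to the log odds scale, where the interval is symmetric
log_lower, log_upper = math.log(lower), math.log(upper)

# Midpoint is the implied parameter estimate; half-width / z is the implied SE
beta_implied = (log_lower + log_upper) / 2
se_implied = (log_upper - log_lower) / (2 * z)

# Rebuild the interval: exponentiate estimate +/- z * SE
rebuilt = (
    math.exp(beta_implied - z * se_implied),
    math.exp(beta_implied + z * se_implied),
)

print(round(math.exp(beta_implied), 2), round(se_implied, 3))
print(rebuilt)
```

The implied odds ratio lands close to the reported 3.4, and the interval is not symmetric around it because the symmetry holds on the log odds scale, not on the odds ratio scale.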
It's important to keep in mind that the odds ratio is
simply a statistic calculated for the sample.
>> So looking at the confidence interval, we can get a better picture of how much
this value would change for a different sample drawn from the population.
Based on our model, those with social phobia
are anywhere from 1.78 to 6.61 times
more likely to have nicotine dependence than those without social phobia.
The odds ratio is a sample statistic and
the confidence intervals are an estimate of the population parameter.