In today's world, data is everywhere, but interpreting it takes an understanding of some of the pitfalls of population health reporting. So in this module, we're going to explore some of the difficulties that we encounter when we're using or interpreting population health indicators. We often run into paradoxical findings. We have to remember that the variation that we see is probably just regression to the mean. And then we're also going to take a look at some of the distortions that come from using rankings rather than the actual population health statistics themselves.
So paradoxical findings come in a couple of different forms. One happens when people physically migrate from one geographical area to another and it has an influence on the population health statistics. Another is when diagnostic criteria for diseases change. When those diagnostic criteria change, who gets counted changes across time. Will Rogers, an irreverent cowboy humorist from many decades ago, once said, "When the Okies left Oklahoma and moved to California, they raised the average intelligence in both states." That's the kind of paradoxical finding that I'm talking about, and no ill intent toward California or Oklahoma. I'm sure your average intelligence is quite high.
Well, the reason that we can see paradoxical findings is that distributions of individuals take different shapes, and when we have migration or changes in diagnostic criteria, it really can make a big difference in both populations. So what I have represented here are some population pyramids. Each one is an age distribution showing how many people are in different age groups, for Christian County, Kentucky and for Ouray County, Colorado, both in 2015. And what you can see is that there are a lot more older people in the Colorado county and more younger people in the Kentucky county.
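To make the migration paradox concrete, here is a minimal sketch with made-up ages; these are not the actual county data, and `move` is a hypothetical helper written just for this illustration. The one person who moves is older than the average of the younger county and younger than the average of the older county, so both averages shift in the same direction. It is the same mechanism as the Will Rogers quip, just mirrored: there both averages went up, here both go down.

```python
from statistics import mean

def move(source, dest, movers):
    """Return new (source, dest) lists after moving `movers` out of source and into dest."""
    remaining = list(source)
    for person in movers:
        remaining.remove(person)
    return remaining, list(dest) + list(movers)

# Made-up ages for illustration only (assumption, not real county data).
younger_county = [5, 15, 25, 35, 55]
older_county = [45, 65, 70, 75, 80]

new_younger, new_older = move(younger_county, older_county, [55])

print("younger county mean age:", mean(younger_county), "->", mean(new_younger))
print("older county mean age:  ", mean(older_county), "->", mean(new_older))
```

Nobody's age changed and nobody was added to or removed from the combined population, yet both county averages moved.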
So, if we were to see a migration of people in, let's say, just the 50-to-59-year-old group from Kentucky to Colorado for these counties, it would lower the average age in both counties. How can that be? Well, because a 50-to-59-year-old is below the average age in the Colorado county and above the average age in the Kentucky county. So that kind of migration lowers the average in both counties.
Another big contributor to paradoxical findings is when the underlying definitions change. Our case definitions are what we rely on for population health outcome statistics. What's represented here is the incidence rate of diabetes, expressed in person-years. This is from a nice study that was looking at data from the CDC and noting where the changes in the criteria came. In 1997, there was a change in the diagnostic criterion for fasting blood glucose, moving from 140 milligrams per deciliter to a new cutoff of 126. And so of course what you see is a jump in the number of people diagnosed with diabetes as the diagnostic criterion became less stringent. In 2010, you can see the impact of adding a component to the diagnosis for diabetes. Here, they added to the fasting plasma glucose a criterion that somebody's hemoglobin A1C had to be 6.5% or higher. It's probably a better definition.
But the thing is that in terms of our population health statistics, we would think, wow, there's a steep increase in the incidence rate of diabetes up until 2010, and then something that we're doing must be helping, because now there's a decrease in incidence, when really the only thing that happened was that the diagnostic criteria changed. What looks like a great improvement in incidence is actually just a change in diagnostic criteria, and I hope that makes you aware that you should really look underneath these population health statistics. And take into account the fact that it takes years to roll out a change in a guideline.
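Here is a small sketch of how a change in case definition alone changes the count. The glucose values are made up, `count_cases` is a hypothetical helper, and the only numbers taken from the lecture are the two fasting glucose cutoffs of 140 and 126 milligrams per deciliter. I am also assuming that a value exactly at the cutoff counts as a case.

```python
def count_cases(fasting_glucose_mg_dl, threshold):
    """Count people whose fasting glucose is at or above the diagnostic cutoff."""
    return sum(1 for value in fasting_glucose_mg_dl if value >= threshold)

# Made-up fasting glucose values (mg/dL) for one unchanged population.
population = [92, 101, 118, 127, 131, 138, 142, 155, 160, 99]

cases_old = count_cases(population, threshold=140)  # cutoff before the 1997 change
cases_new = count_cases(population, threshold=126)  # cutoff after the 1997 change

print(f"cutoff 140 mg/dL: {cases_old} cases")
print(f"cutoff 126 mg/dL: {cases_new} cases")
```

The same ten people, with the same blood glucose, produce a different number of cases depending only on which definition is in force.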
So even the time period right after a change, for instance after 1997 or 2010, could just be an artifact of the change in our case definitions.
Another thing that we often forget when we're looking at time trends in population health indicator data is that there's an underlying variation over time that could just be normal variation, and we ought to be focusing more on the average of that process. Sometimes we don't have statistics that are well suited, because we're looking at rare events and short time frames, where we probably ought to be looking at longer time frames. Or maybe we should be finding ways to plot that average so that the rare events don't look like jagged variation. Then even if there's wild fluctuation, you can get a sense of how the average trend is moving. And that tends to be the much more stable population health indicator to rely on when you're making predictions or assessments, or trying to figure out the real process going on in a population.
The last pitfall of population health indicators that we're going to dig into is using the indicator to rank different groups, or to create what are called league tables. These often come with a sort of traffic-light grading of performance, the green, yellow, red, but they're often misleading, because what if all the subgroups or all the countries have a high prevalence of disease? Or maybe all of them are low. We can't tell that by looking at rankings. The rankings don't show the valuable variation that's related to the meaningful differences that we're trying to discern.
Here's an example of what I mean by being ranked. This is a set of developed countries, and they're all ranked for 2019 and 2017. We can see the change in rank, and we can look at their health grade, their health score and their health risk penalties. But when I look at this entire summary of what is happening within these countries, I have no idea if they're all good, or if they're all bad.
Nor can I tell what this variation really means in terms of, for instance, the number of sick people or the number of people who have died prematurely. So this is conveying some information, but often not information that we can use easily.
This is another example, now for healthcare systems, ranking them across many different elements. So we've got a better picture of what they're looking at: care processes, access, administrative efficiency. But again, because these are ranks, all we can really tell is the relative position of each system. We can't really see whether the care and the access are good or not.
So a better approach than league and ranking tables is to get the real population health statistics, or to try a different way of plotting. Control charts and funnel plots are two other graphical methods that allow you to plot many different populations together by a particular population health statistic. A lot of what we're looking for are ways to take uncertainty into account and to be able to see that variation. We also want to see the normal population variation patterns, so it's not just the statistic but often the standard deviation, the confidence interval, or the range of values that different groups and subgroups in a population are taking. And then we can look at whether or not there's greater than expected variation, because we're comparing apples to apples.
So here is an example where plotting the actual population health indicator statistics gives us a much better view than just the ranks. It's so much better to be able to see over time what is happening with the overdose rates involving opioids, and what you can see is this huge acceleration that we've called the opioid epidemic. And if you just look at the lines, you can see that the basic rank of the causes of overdose doesn't really change much until the last five to seven years.
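The gap between ranks and actual rates can be sketched in a few lines. The numbers below are made up, the cause names are placeholders, and `rank_by_rate` is a hypothetical helper; none of this is the overdose data from the figure. The sketch only shows that an ordering can stay identical while the underlying rates change several-fold.

```python
def rank_by_rate(rates):
    """Return the names ordered from highest to lowest rate."""
    return sorted(rates, key=rates.get, reverse=True)

# Made-up death rates per 100,000 for three placeholder causes at two time points.
earlier = {"cause_a": 3.0, "cause_b": 2.0, "cause_c": 1.0}
later = {"cause_a": 15.0, "cause_b": 9.0, "cause_c": 4.0}

same_order = rank_by_rate(earlier) == rank_by_rate(later)
fold_change = {cause: later[cause] / earlier[cause] for cause in earlier}

print("ranking unchanged:", same_order)
for cause, ratio in fold_change.items():
    print(f"{cause}: rate multiplied by {ratio:.1f}")
```

A table that reported only the ranks would show no change at all between the two time points.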
They all keep the same ranks, but the actual number of deaths per hundred thousand is strikingly different after the year 2010. This is where we really start to gain our awareness of the opioid epidemic as a country and start to marshal resources. If we just kept a view of the ranks, we wouldn't know that there was something drastically changing in terms of the number of people who were dying from overdoses.
Here's an example of what I was trying to describe in terms of the funnel plots. Typically on the x-axis we have some health statistic, in this case a log odds ratio, and on the y-axis we've got the standard error of that health statistic. This one is on a set of studies that were looking at the association between a health factor and a health outcome, but it could just as well be a population health indicator. And then you can draw the 95% confidence interval. This allows us to see whether or not we've got bias or heterogeneity, because we're expecting that as the population health statistic and its standard error change, the points are going to have this funnel shape. And so any big deviation outside of that funnel, which usually represents some kind of change in a process or, for instance, an error, can be easily detected.
Last but not least are the control charts. These are great for interpreting trends. They can also help us with that regression to the mean, because a control chart lets you see, here is my average line, and here are my upper and lower 95% confidence limits. Whether or not you have outliers exceeding those limits is an important thing that a control chart can easily tell you.
As you can see, there's a whole wild world underneath the population health statistics. So don't be surprised if you run into things that don't make sense at first. There are probably some good reasons, and you just need to take a little bit of time and dig down into the details.
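As a rough illustration of the control chart idea, here is a minimal sketch that computes an average line and upper and lower limits from a baseline period and then flags new observations that fall outside them. The data are made up, `control_limits` and `flag_outliers` are hypothetical helpers, and the limits are set at 1.96 standard deviations to match the 95% limits described above. That multiplier is an assumption of this sketch; many control charts in practice use three standard deviations instead.

```python
from statistics import mean, stdev

def control_limits(baseline, z=1.96):
    """Return (lower, center, upper) limits from baseline observations."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation of the baseline
    return center - z * spread, center, center + z * spread

def flag_outliers(values, lower, upper):
    """Return (index, value) pairs for observations outside the limits."""
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]

# Made-up monthly event counts: a baseline period, then four new months.
baseline = [10, 12, 11, 9, 10, 11, 12, 10, 9, 11]
new_points = [10, 12, 18, 11]

lower, center, upper = control_limits(baseline)
outliers = flag_outliers(new_points, lower, upper)

print(f"center {center:.2f}, limits [{lower:.2f}, {upper:.2f}]")
print("points outside the limits:", outliers)
```

Points that stay inside the limits are treated as ordinary variation around the average; only the point outside them is a signal worth digging into.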