Every field needs good data. Population health needs data that everybody can see: community members, city council members, businesses. We're incredibly lucky here in the United States, because the federal agencies have created a rich set of online tools and resources, so just about anybody can see the data about their home state or county. In this module, I'm going to outline the online resources that are available to assess population health indicators, and we're going to talk through the differences between primary and secondary sources of data.
So where can you get good information on health in the United States? Well, the Centers for Disease Control and Prevention by and large has the best collection. They've put it under one system called the CDC WONDER online database, because it is a wonder. It has things like the Behavioral Risk Factor Surveillance System, the NHANES study, the National Notifiable Diseases Surveillance System results, or things from the National Center for HIV/AIDS, Viral Hepatitis, Sexually Transmitted Diseases and TB Prevention, as well as the Morbidity and Mortality Weekly Report. All of this under one database makes it an incredible one-stop shop.
So the Behavioral Risk Factor Surveillance System, or BRFSS, has been around for decades. It's the largest telephone health survey in the world. They sample over 350,000 individuals annually. It's a state-based system of health surveys that collects information on health risk behaviors, preventive health practices, health care access, and chronic disease and injury. One of the nice things about the BRFSS is that there are some standard questions that every state uses, and then every year states get to select particular questions where they're interested in getting feedback from their communities. So it does both: it can be compared across all the different states, and it provides local information on emerging topics.
Another major study that has been in the field for decades is the National Health and Nutrition Examination Survey, or NHANES. It assesses the health and nutritional status of US adults and children by combining interviews and physical examinations. Literally, the NHANES tractor trailers move into parking lots, set up an interconnected system, and then invite members of the community in a sort of systematic way to come through for surveys and evaluations: what they're eating, blood tests, eye tests, foot tests, sort of head-to-toe evaluations of the health of people in the United States.
Some of the important data they collect that we use all the time in population health include things like their estimates of the percent of adults over the age of 20 who have undiagnosed diabetes. Another thing that NHANES does in a very detailed way is give us good measures of the physical exercise of individuals in the United States. Everything from sedentary and light to moderate and heavy exercise is well detailed in their surveys and in their evaluations. Another major contribution of the NHANES study is their evaluation of caloric intake and the kinds of foods that people in the United States are eating. They do a very detailed sort of plate evaluation of what people eat in a regular day or week. And then last but not least is their evaluation of the percent of children under the age of five who have elevated blood lead levels. This is really important because of the impact of lead on developmental outcomes for children, and we know that there's a lot of vulnerability in our populations, whether it's lead in the water or lead in the homes or schools of children. So trying to figure out which child populations are at greatest risk is one of the things that the NHANES study does on a regular basis. Another important database that's at our disposal is the National Notifiable Diseases Surveillance System.
This is a mandatory reporting surveillance system where over 60 notifiable diseases are represented, and the CDC receives these reports from state surveillance and puts them out within the Morbidity and Mortality Weekly Report, the MMWR. To give you a flavor, some of the notifiable diseases include things like gonorrhea and hepatitis A, things like rabies and diphtheria, and even plague. So this is an indication of some of the more severe diseases, and that's why there's this notifiable disease surveillance system. This is not for everyday diseases or chronic diseases. This is really about those things that pose an imminent threat to the population if we don't know that they are actually occurring.
Another major source of high-quality data is the US Census Bureau. Most of us wouldn't think of it in terms of its relationship to population health, but it actually is the best source of information about the social determinants of health. The Census Bureau provides it in a number of different ways. One is through a survey called the American Community Survey. About 100,000 people or more are surveyed every year, and we've got very good data on their demographics and their socioeconomic information, as well as housing and transportation. The American Community Survey has a very nice design, where its national survey is designed to provide communities with information on how they're changing. In order to do this in a sensible way, every year there's a one-year estimate for geographic areas that have populations over 65,000. They create three-year estimates for communities that have 20,000 or more. And then they produce five-year estimates for all geographic locations in the United States. So if there's some major population health improvement plan, let's say a policy has been implemented, you can go to the American Community Survey and actually look at the one-year, three-year, and five-year estimates in order to really see whether it had an impact.
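To make those population cutoffs concrete, here is a minimal Python sketch of which American Community Survey estimates you could expect for a community of a given size. It only encodes the thresholds as stated in this lecture. The function name `available_acs_estimates` is hypothetical, not part of any Census Bureau tool, and treating the 65,000 cutoff as "65,000 or more" is an assumption.

```python
from typing import List


def available_acs_estimates(population: int) -> List[str]:
    """Return the ACS estimate periods described in the lecture for an area.

    Thresholds come from the lecture text only (assumption: both cutoffs
    are treated as inclusive, i.e. "N or more").
    """
    if population < 0:
        raise ValueError("population cannot be negative")

    estimates = []
    if population >= 65_000:
        estimates.append("1-year")   # larger areas get an estimate every year
    if population >= 20_000:
        estimates.append("3-year")   # mid-sized communities
    estimates.append("5-year")       # produced for all geographic areas
    return estimates


if __name__ == "__main__":
    # Hypothetical community sizes, for illustration only
    for size in (5_000, 30_000, 100_000):
        print(size, available_acs_estimates(size))
```

The practical point is that a small county may only have the five-year estimate, so a before-and-after comparison around a policy change has to be read with that longer window in mind.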
The Census Bureau has more than just the American Community Survey. It has an online tool system called the American FactFinder that allows people to access more than just what is happening in that one survey. It includes things like the Puerto Rico Community Survey and population estimates. There's a whole program for getting good population estimates, so you know how many people are actually in your county or in your state. There's a nice way to access economic information, things around labor and job resources or unemployment, as well as an annual economic survey. So I hope what you can see is that while we think of the census as kind of looking at people and their demographics, it actually is a huge tool for the social determinants of health.
One last important resource is the Community Health Data Initiative that was launched by the US Department of Health and Human Services. It's a collaboration between governmental and non-governmental agencies. The real goal there is having community health data with a standard interface across many of the different sets of state and county level data resources. Sometimes they don't all look the same, so this is a place where they're harmonized, and it's got a nice way for you to look both across time and across geographical regions. Probably its only drawback is that sometimes it doesn't have the most recent data that's published by a state department of health. So if you need something that is much more current, you may have to reach out to your state department of health.
In order to use this data not just in a superficial way, but to actually download it and interpret it, you need to be sensitive to the case definitions, because they differ across surveys and across the ways in which the data is collected. I'll give the example of hypertension based on a survey question like, "Have you ever been told by a physician that you have high blood pressure?"
Well, you're going to have people that have forgotten. You're going to have people that didn't hear it that way, and so while they might have hypertension, they're not going to tell you that they have it because they don't think they do. Compare that to somebody who, for instance, went through the NHANES system and had their blood pressure taken three times, where they would then evaluate whether the average of those three resting blood pressures is greater than 140 over 90, which is the current standard for defining hypertension, and whether they're currently taking antihypertensive therapies. You would think of that data point within a data set as a pretty strong clinical definition of hypertension relative to the survey response that an individual would give. So case definitions matter.
And you're going to find the case definitions inside of code books, which describe the list of variables in the database, the complex abbreviations that are used in naming those variables, and how they relate to where the data came from. So for a given variable you might see whether a person's response came from a survey question, a laboratory measurement, an ICD-9 code, a diagnostic code, or maybe a derivative of the case definition. So we want you to be sure that when you're thinking about high-quality data sources and using them, you keep in mind the case definitions and look at the code book.
The last piece you have to consider when we're talking about high-quality data sources is whether it's primary or secondary data, and whether it's a primary source or a secondary source. So what do I mean by this? Primary data is really data that you collected yourself, and secondary data is data that somebody else collected that you're now analyzing. So if you've downloaded the BRFSS data, you could easily say, "Well, I'm conducting secondary data analysis using the BRFSS data."
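Going back to the hypertension example from a moment ago, here is a minimal Python sketch of why case definitions matter: the same person can be a case under one definition and not under the other. The `Participant` class and its field names are hypothetical and are not real BRFSS or NHANES variable names. The sketch assumes an exam-based case is anyone whose average reading is at or above 140 systolic or 90 diastolic, or who is currently on antihypertensive medication. That combination rule is an assumption, so check the code book of the actual data set for its exact definition.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence, Tuple


@dataclass
class Participant:
    # Hypothetical fields for illustration; real surveys use coded variable names
    told_high_bp: bool                   # survey answer: ever told by a physician?
    readings: Sequence[Tuple[int, int]]  # resting (systolic, diastolic) readings
    on_bp_medication: bool               # currently taking antihypertensives?


def self_report_case(p: Participant) -> bool:
    """Survey-style definition: relies entirely on what the person recalls."""
    return p.told_high_bp


def exam_case(p: Participant) -> bool:
    """Exam-style definition (assumed rule): elevated average reading OR on medication."""
    avg_systolic = mean(s for s, _ in p.readings)
    avg_diastolic = mean(d for _, d in p.readings)
    return avg_systolic >= 140 or avg_diastolic >= 90 or p.on_bp_medication


if __name__ == "__main__":
    # Made-up person who was never told they have high blood pressure
    person = Participant(
        told_high_bp=False,
        readings=[(150, 95), (146, 92), (148, 90)],
        on_bp_medication=False,
    )
    print("self-report case:", self_report_case(person))
    print("exam-based case:", exam_case(person))
```

A prevalence estimate built on the first function would miss this person, and one built on the second would count them, which is why two data sets can report different hypertension rates for the same community.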
Well, we could also talk about primary and secondary sources. A primary source of data might indeed be the BRFSS data. That's where I have direct access to the actual individual-level data, and I'm analyzing that individual-level data under an agreement between my institution and the source of that data. A secondary data source often means that we're using a derivative of that data: a summary, an interpretation, a statistic. So I could say something like, "I'm using the summary statistics from the BRFSS data that were published in another source." Let's say it's the County Health Rankings. So how we use the terms primary and secondary tells people whether we have access to the data and at what level we're analyzing it.
There are many ways to use these data sources. We can use them in community health assessments or to communicate some kind of urgency or need for change. We could also use these data to look at trends over time or to compare neighboring counties or states. And ultimately, a lot of people use them to prioritize areas for population health improvement, to help them focus on the key things that are going to help their community.