Laboratory tests are an important data type for Computational Phenotyping. A number of clinical diagnoses use laboratory values in the criteria for diagnosis. For example, the American Diabetes Association requires a specific hemoglobin A1C level to diagnose a patient with diabetes. Laboratory data in general are fairly accurate, though as with all test, incorrect values may occasionally be recorded. In the context of Computational Phenotyping, lab data is typically very specific, especially when those data are part of the diagnostic criteria for the disease. However, the tests are not always very sensitive. For example, patients who have already been diagnosed with diabetes are discouraged from getting routine measurements of hemoglobin A1C. So, if the patient was diagnosed many years ago, they may not have the lab measurement in their EHR. The challenge with using laboratory data in Computational Phenotyping is actually the complexity of these data. The same lab value may be present in multiple test results. For example, blood glucose measurements may be ordered and resulted as a single test, or as part of a basic or complex metabolic panel. Depending on how the laboratory data is stored in your clinical data warehouse, it may be difficult to identify and combine all similar test values. Remember that Computational Phenotyping looks at data in the entire EHR, including for many years ago. Testing methods and reference ranges change over time. So, it may be difficult to compare the same laboratory test over different years. Similarly, the units of measure for a test may change based on the type of tests performed or it may not even be recorded. Thus, complex programming rules must be written based on the time and the type of the laboratory test to make this data usable for Computational Phenotyping. Another crucial nuance used in laboratory data is in identifying controls for your phenotype of interest. Let's say that we want to identify controls for my diabetes study. It may be tempting to require your controls to have a normal blood glucose measure, or a normal hemoglobin A1C to prove that they are not diabetic. However, it's important to remember that in medicine, we only order tests if we have a clinical suspicion for a disease. Patients who are getting tested for diabetes likely have some risk factor or symptoms that make the provider consider that they might be diabetic. Although they tested negative, these individuals are more likely to go on to develop diabetes and likely won't be good biological controls for your population. All of that said. As long as you are selective about the laboratory tests you use in your algorithms, you consider all possible sources for those data and to adjust for differences in values and reference ranges over time. These data can be a valuable addition to Computational Phenotyping algorithms.