Hi, and welcome to unit 8 office hours. This is our last unit, so we'll be touching on a bunch of comments to kind of clean up from previous units. We're going to start off with one about the effect of assortative mating on heritability. How is the fact that tall women usually choose tall partners, and university graduates usually choose a partner who also has a higher education, accounted for in heritability? Since both parents would carry the genes for above-average height or above-average general cognitive ability, aren't the chances that their children get the same phenotypes increased? >> Well, who's that from? >> This is from Haleen. >> Haleen. That's a good observation from Haleen. She's right. >> Mm-hm. >> If there is assortative mating, then you increase the chance that the characteristics of the parents are transmitted to the offspring. We didn't get into assortative mating in this course, and the effects of assortative mating on heritability are a little bit complex, but I'll try to characterize them very briefly. First of all, if there is assortative mating, then you increase genetic variance, and a consequence of that is what Haleen is referring to there. And that is that if you have tall parents who tend to marry other tall parents, then they're more likely to have tall children. That's really reflected in increasing the variance of the trait. So you could say that assortative mating leads to higher heritability in that sense. However, the effect of assortative mating on the estimation of heritability is different. It increases the true genetic variance, but in a twin study, which is the most common way that behavioral geneticists estimate heritability, it actually results in an underestimation of heritability. The true heritability goes up. The estimate from twin studies goes down.
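A small numerical sketch may help make this underestimation concrete. It assumes a simple additive model in which the MZ correlation is a² + c² and the DZ correlation is g·a² + c², where g is the genetic correlation between DZ twins, and it applies the standard Falconer formulas, which take g to be one half. The function names, the "true" values of 0.6 and 0.1, and the value g = 0.6 standing in for assortative mating are all hypothetical, chosen only for illustration. The true a² is held fixed here to isolate the estimation bias, even though assortative mating would also raise it.

```python
# Sketch only: all parameter values are hypothetical, chosen for illustration.

def twin_correlations(a2, c2, dz_genetic_corr):
    """Expected MZ and DZ twin correlations under a simple additive model."""
    r_mz = a2 + c2                      # MZ twins share all genes and the shared environment
    r_dz = dz_genetic_corr * a2 + c2    # DZ twins share only part of the genetic variance
    return r_mz, r_dz


def falconer(r_mz, r_dz):
    """Falconer estimates, which assume a DZ genetic correlation of exactly 0.5."""
    a2_hat = 2 * (r_mz - r_dz)          # estimated heritability
    c2_hat = 2 * r_dz - r_mz            # estimated shared environment
    return a2_hat, c2_hat


TRUE_A2, TRUE_C2 = 0.6, 0.1

# 0.5 corresponds to random mating; 0.6 is a hypothetical value under assortative mating.
for dz_corr in (0.5, 0.6):
    r_mz, r_dz = twin_correlations(TRUE_A2, TRUE_C2, dz_corr)
    a2_hat, c2_hat = falconer(r_mz, r_dz)
    print(f"DZ genetic correlation {dz_corr}: "
          f"a2 estimate {a2_hat:.2f}, c2 estimate {c2_hat:.2f}")
```

With these made-up inputs, the heritability estimate falls from 0.60 to 0.48 and the shared-environment estimate rises from 0.10 to 0.22 once the DZ genetic correlation exceeds one half, even though the true values have not changed.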
And the reason for that is that if parents become genetically similar to one another via assortative mating, then that increases the genetic correlation between dizygotic twins. It's no longer one-half. It'll be a number greater than a half. And you could understand that if you take it to the extreme. If you had two monozygotic twins mating, and of course you can't have that, but if you had two monozygotic twins mating, then any children they had would also be monozygotic twins. They'd have all the same genome. So genetically similar parents increase the genetic correlation of DZ twins. But in the standard Falconer model, the ACE model we talked about, that genetic correlation is assumed to be a half. So, because you misstate that correlation, if you go through the mathematics of it, you end up underestimating heritability. And actually, interestingly enough, you overestimate the shared environmental effect. >> Interesting. All right, and kind of keeping on different variance measures, our next questions are about shared and non-shared environment. And we'll start off with this one: it's a bit hard for me to understand how environmental factors can be so neatly packaged into shared and non-shared. I find it curious that, while the effect of shared factors on psychological factors is usually low, the effect of non-shared factors is significant. Since the measurement of some psychological factors can only have moderate reliability, how do we know that what we think are the effects of non-shared factors is really not just measurement error? And kind of a follow-up to that is, why is all of measurement error applied to the non-shared environment? >> And that's from? >> It's a mix from Benjamin and Karen. >> Benjamin and Karen. So maybe the second one first. Why is measurement error included in the estimate of non-shared environment?
Measurement error, and this is the assumption in psychometrics, which is the area of psychology devoted to evaluating the adequacy of psychological measures, is assumed to be random. >> Mm-hm. >> And so measurement error would actually make an individual different at two different time points when they take a test. Because it makes the same person different over two administrations of the test, it's obviously also going to, or at least to me it seems obvious that it's also going to, make monozygotic twins different when they take the test. So since the non-shared environmental component is everything that causes differences in monozygotic twins, the measurement error would be included in it. Now the other aspect of the question is more substantive. And that is, if measurement error is included with the non-shared environment, how do we know then that there are really any important non-shared environmental effects? >> Mm-hm. >> And maybe the best way to understand that, or try to respond to it, is to contrast it with general cognitive ability, where the monozygotic twin correlation is on the order of about 0.8 to 0.85. And there are good studies now that suggest measurement error is probably about 10% for good measures of general cognitive ability. So even though the correlation for monozygotic twins is 0.85, suggesting that 15% of the variance in general cognitive ability is associated with non-shared environmental effects, probably the vast majority of that, like the question asker is asking, is really just measurement error, and there's probably not a lot there for real non-shared environmental factors. There's probably some, but it's probably pretty small. Contrast that with personality. For personality, monozygotic twin correlations are on the order of 0.5. Again, estimates of measurement error are usually on the order of about 10%.
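The comparison being drawn here is simple arithmetic, and a short sketch may make it easier to follow. It takes one minus the MZ correlation as the non-shared environment estimate (which includes measurement error) and subtracts the roughly 10% measurement error quoted above. The function name is hypothetical, and treating error as a flat 10% of variance that can simply be subtracted is a simplifying assumption for illustration.

```python
# Sketch only: rough figures quoted in the discussion, not results from a specific study.

def nonshared_breakdown(r_mz, measurement_error):
    """Split 1 - rMZ into measurement error and what is left for real non-shared effects."""
    nonshared_total = 1.0 - r_mz                    # everything that makes MZ twins differ
    remainder = nonshared_total - measurement_error  # what is left after removing error
    return nonshared_total, remainder


MEASUREMENT_ERROR = 0.10  # "about 10%" for good measures, per the discussion

traits = {
    "general cognitive ability": 0.85,  # MZ correlation of about 0.8 to 0.85
    "personality": 0.50,                # MZ correlation of about 0.5
}

for trait, r_mz in traits.items():
    total, remainder = nonshared_breakdown(r_mz, MEASUREMENT_ERROR)
    print(f"{trait}: 1 - rMZ = {total:.2f}, "
          f"left after measurement error = {remainder:.2f}")
```

Under these assumptions, about 5% of the variance is left for real non-shared environmental effects on general cognitive ability, against about 40% for personality.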
So, the estimate of non-shared environment for personality is about 50%. Measurement error is about 10%. So there, for personality, non-shared environmental factors appear to be very important. >> Mm-hm. >> So it's that comparison of what the psychometricians tell us is the component of measurement error to 1 minus the MZ correlation. That tells us that probably they're very important for something like personality, and maybe not so important for something like general cognitive ability. >> Okay. And now moving on to shared environment. I know Professor McGue said it wasn't important at this point in our course to fully understand the formula, which is very lucky for me, but this one is bugging me because I keep trying to explain it to myself in words but can't do it. Can anyone explain why shared environmental influence is twice the DZ correlation minus the MZ correlation? Or to put it another way, why is shared environment not a factor when MZ correlations are at least twice the DZ correlations? >> Okay, well, sometimes the formulas are not intuitive, and I would say that there's nothing that is especially intuitive about that particular formula. In the ACE or the Falconer model, I think the heritability estimate is kind of intuitive. It's twice the difference between the MZ correlation and the DZ correlation. And I think one can understand why that's a reasonable estimate of a squared. I brought some slides on this, Bridget, because you showed me the question ahead of time, so just very briefly. We don't need to know this, and if you don't get the conceptual basis for this, I wouldn't worry too much about it. But if we're estimating c squared, there are two factors that contribute to MZ twin similarity: genetic factors, we call that A or A squared, and shared environmental factors, C or C squared. So the correlation in MZ twins has two components to it, genetics and shared environment.
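For readers who want the algebra written out, the following is a sketch of the standard Falconer derivation. It assumes the usual model in which MZ twins share all of their genetic effects and DZ twins share half of them, with both kinds of twins sharing the shared environment equally. The symbols r_MZ, r_DZ, a², c², and e² are notation introduced here for the quantities discussed in the session.

```latex
\begin{align}
r_{MZ} &= a^2 + c^2 \\
r_{DZ} &= \tfrac{1}{2}\,a^2 + c^2 \\
r_{MZ} - r_{DZ} &= \tfrac{1}{2}\,a^2
  \quad\Longrightarrow\quad a^2 = 2\,(r_{MZ} - r_{DZ}) \\
c^2 &= r_{MZ} - a^2 = r_{MZ} - 2\,(r_{MZ} - r_{DZ}) = 2\,r_{DZ} - r_{MZ} \\
e^2 &= 1 - r_{MZ}
\end{align}
```

Read this way, c² is whatever MZ similarity is left after the genetic part is removed. When r_MZ is exactly twice r_DZ, the formula gives c² = 0, and when r_MZ is more than twice r_DZ it gives a negative number, which is read as no evidence of a shared environmental effect.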
In the standard Falconer model, we estimate the genetic component as 2 times r of MZ minus r of DZ. >> Uh-huh. >> And that's shown on the screen here. Well, then, if that's A squared, and we take that A squared away from the monozygotic twin correlation, what's left should be C squared. So take the monozygotic twin correlation and subtract off A squared, or take r of MZ and subtract off that formula. If you do the algebra, you get the answer. I don't know that that's intuitive at all, but that's why it works out. >> Okay. Our next question is about Western sample biases. It comes from Doug, and he asks: Another apology for studies with a Western sample bias. Is there some reason for expecting that studies with Western samples and studies with non-Western samples would give different results? So is it wrong to generalize from Western samples, or is it that we shouldn't assume that the results will be the same without verifying it? The results may or may not be the same. Or is it that we should try to be more unbiased in choosing samples, and that the small representation of non-white or non-Western samples represents the predominance of Western-funded research efforts that simply do not bother to seek a representative global sample? >> It's a good question from Doug. And I guess the short answer, which of course I'll expand on more, >> [LAUGH] >> But the short answer is yes to all of what he said. It's a little bit of all of the things he talks about in his question. I can't remember if I gave the statistics when we went through this, going back to I think unit five or unit six, when we talked about race. >> Mm-hm. >> But it's probably on the order of maybe 85 to 90% of GWAS studies that are based on European samples. And then of the ones that aren't based on European samples, most of them are actually based on East Asian samples, so there is demonstrably that bias. I don't necessarily mean that as discriminatory. I mean that is just the way it is.
Not all world population groups are being represented. Why should we care about it? Now, Doug had identified several reasons why. First of all, I have a slide on this that I'll show just briefly. But we talked about this when we talked about race. If we're thinking about common variants, and actually what GWAS assesses is common variants, then in all likelihood common variants exist in all world population groups. >> Mm-hm. >> They might exist, and they probably do exist, at somewhat different frequencies, but because they're common, that probably means that they existed when humans all lived in Africa. And because of that they probably exist in virtually every population group. So if you want to study those common variants, in principle, and it might not be the most efficient way to do this, you could study them in Europeans or Africans or Chinese or Asians, because they should exist in all of them. And what the slide shows, Bridget, is maybe a little bit complicated, but I just want to make the point. This is a study published a couple of years ago, and what they're doing there is plotting significant GWAS effects where there are GWAS on both European samples and East Asian samples. And they're just plotting the effect size. And what they show is that there's striking convergence across the two populations. So that gets to Doug's point: well, okay, maybe the results do really generalize. This slide suggests they do generalize from East Asians to Europeans, and one would like to see the same thing for Africans. That's common variants. It's kind of different, in fact it's fundamentally different, when we're talking about rare variants. Because rare variants can be population specific, and we have many examples of that. We've seen a couple of those, or at least I mentioned them in this course. Tay-Sachs disease is a population-specific disease. It happens in certain Mediterranean groups.
Sickle cell anemia. Certain types of cystic fibrosis. Something that I didn't mention, but is one of my favorite genetic variants, and some people that have taken the course are probably familiar with it, and of course you're familiar with it, is a variant that exists only in East Asians and affects the way they metabolize alcohol. If you've inherited this variant, it makes you more likely to get sick, to have a flushing response when you drink, and therefore you don't drink as much. You're less likely to drink. That's a variant that really only exists in East Asians. You couldn't study that variant in other populations. So studying these variants, which scientists certainly believe are important, really requires going beyond just studying Europeans, the kind of convenient sample. One last thing on this. I think you're familiar with this paper. It gets way beyond the genetics here. But this was a paper published, I can't read the date, I won't try to read it here, in Behavioral and Brain Sciences a couple of years ago by, I think, Canadian psychologists, and they talked about what they called WEIRD population samples, or WEIRD samples, in most psychological research. And WEIRD is an acronym: Western, Educated, Industrialized, Rich, and Democratic. And the point is that when they look at the field of psychology, and we're not talking about behavioral genetics here, when we look at the field of psychology, it's overwhelmingly biased towards WEIRD, that's their clever >> Mm-hm. >> acronym, WEIRD samples. And there is an issue of to what extent these phenomena that psychologists study really generalize. Apart from the genetics, we can live in quite different cultures. >> Mm-hm. >> And it's very constraining just to study one culture, which apparently psychologists have done. >> Okay, our next question is about within-sibship comparisons, and it's coming from Phil.
I never got around to asking these questions in time last week. When I first looked at the sibship graph for the mothers who had quit smoking, I thought, hey, there is a connection between maternal smoking during pregnancy and ADHD, then I heard the Professor say there wasn't. The sibship results, the pink columns for moderate smokers and heavy smokers, are very similar, but both are much lower than the total blue columns. Also, I thought I would see a blue column for non-smoking mothers, and that for sibship graphs there would be a column for the children gestated when the mother was smoking and one for when the mother wasn't smoking. >> Okay. >> So I think that each of the columns must already represent a variance between progeny of smoking and non-smoking mothers, and I think I need to have the math spelled out for me. Also an explanation of why the sibship columns are so much lower than the other columns. >> Okay, actually I brought the graph, so this is just the figure. >> It must have been unit 7. So this is exactly what they've seen before. And so the first thing is that I wasn't actually reporting in this graph the rate of ADHD in children. Rather, I'm reporting in the graph the odds ratios for having a mother who's smoking. So the blue bars are the odds ratios associated with maternal smoking in the total population. So this is not within sibship. And the way to interpret it, and it's a little bit of a fudge, but it makes it easiest to understand what an odds ratio is, is that we can kind of think of them as risk ratios. >> Mm-hm. >> So an odds ratio of 1 would mean no increase in risk. But the odds ratios for maternal smoking are between about 1.5 and 2. Those are the blue bars in the graph that Phil >> Mm-hm. >> referred to. That means that if your mother smoked, your risk for having ADHD was between 1.5 and 2 times as great as the population rate. >> Mm-hm.
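To see why an odds ratio can be read loosely as a risk ratio, here is a minimal sketch with made-up counts. The numbers below are hypothetical and are not taken from the study on the slide. They are chosen only so that the outcome is fairly uncommon and the odds ratio falls in the 1.5 to 2 range described for the blue bars.

```python
# Sketch only: the counts are hypothetical, not data from the study discussed.

def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of the outcome among the exposed divided by odds among the unexposed."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed


def risk_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Risk of the outcome among the exposed divided by risk among the unexposed."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return risk_exposed / risk_unexposed


# Hypothetical counts: children with / without ADHD, by maternal smoking in pregnancy.
counts = dict(
    exposed_cases=80, exposed_noncases=920,        # mother smoked
    unexposed_cases=250, unexposed_noncases=4750,  # mother did not smoke
)

print(f"odds ratio: {odds_ratio(**counts):.2f}")
print(f"risk ratio: {risk_ratio(**counts):.2f}")
```

With these counts the odds ratio is about 1.65 and the risk ratio is 1.60. The two stay close as long as the outcome is uncommon, which is why treating one as the other is only "a little bit of a fudge."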
>> So there's an increased risk. The red bars that he's referring to are within sibship. The red bars are not significantly different from 1. In fact they're a little bit less than 1 in a couple of the samples, and a little bit greater than 1 in the other. The way the red bars are computed is that you're taking mothers who had two children, one when she smoked during her pregnancy and one when she did not. I won't go back and get into the studies, but these are based in Scandinavia, or I guess in the Nordic countries, since someone corrected me that Finland's not in Scandinavia. The Nordic countries have good information on this. So she has two offspring: with one she smoked during the pregnancy, with the other she didn't. And what we're comparing is the rate of ADHD in the sibling when she smoked to the rate when she didn't smoke. But now we're controlling for all the family factors, right? Because you have the same family, same mother. In this case the mother's smoking is not associated with offspring risk for ADHD. The sibling who experienced maternal smoking in utero does not have a higher rate of ADHD compared to his or her sibling who did not experience maternal smoking. Okay, so sorry I didn't explain that well the first time. >> All right. Our last question is about genetic prediction, and it's coming from Hazel. And it reads: in your study, how can Naomi Wray forecast that we will in future be able to predict schizophrenia on the basis of genes with almost 100% accuracy if, according to twin studies, the heritability of schizophrenia is only about 50%? >> Okay, that's [LAUGH] good for Hazel, because she caught me. It's a very good observation, and I think the short answer here, and I'll give a little bit longer answer again, is that I didn't really fully explain the graph from Wray. Or actually, I made the graph. >> Mm-hm. >> I took the data from Wray's paper and put it into a graph.
Here's the graph, for the students watching. And we'll take schizophrenia, which is the example she's using. And right, this green bar is what Wray projects to be the measure of predictability when we know all about the genetics, when we've found every possible genetic variant, and it's above 90%. But as Hazel points out, the heritability of schizophrenia is 50%. If one MZ twin has schizophrenia, the risk to his or her co-twin is 50%, so how could genetics ever get us up to 90%? The answer to that, and I don't think I should get into the mathematics of decision-theoretic rules, is that what's being reported by Wray isn't actually a measure of accuracy in the way Hazel's thinking about it. >> Mm-hm. >> It's a different measure of accuracy. It's called area under the curve. >> Okay. >> And I think it's probably better not to get into exactly how area under the curve is computed. But maybe Hazel and other students can begin to understand the issue this way. Say I wanted to predict whether or not somebody in the population develops schizophrenia. If I predict that nobody develops schizophrenia, how often am I right? Well, from the very get-go I'm 99% correct, because 99% of the population doesn't get schizophrenia. But that doesn't help me very much, because what I really want to do is predict the individuals who are at risk for developing schizophrenia. So in this decision-theoretic framework, this statistic called the area under the curve gives you a way of telling the extent to which the information you have improves upon just chance prediction. The statistic varies from 50%, which is pure chance, to 100%, where you're perfectly accurate. But you can't interpret it as saying that if it's 95%, then 95% of the time I'm right. It really doesn't quite translate that way. I tried to give rubrics when I presented this. There are general guidelines for using area under the curve.
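The difference between plain accuracy and area under the curve can be seen in a small sketch. It uses a made-up population of 1,000 people with 10 cases, matching the roughly 1% rate mentioned above, and it computes area under the curve as the probability that a randomly chosen case has a higher score than a randomly chosen non-case, with ties counted as half. That is one standard way of reading the statistic. The function names and all of the risk scores are hypothetical, invented only for illustration.

```python
# Sketch only: the population and the risk scores are made up for illustration.

def accuracy(labels, predictions):
    """Fraction of people whose predicted status matches their true status."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def area_under_curve(case_scores, noncase_scores):
    """Probability that a random case outscores a random non-case (ties count as half)."""
    wins = 0.0
    for case in case_scores:
        for noncase in noncase_scores:
            if case > noncase:
                wins += 1.0
            elif case == noncase:
                wins += 0.5
    return wins / (len(case_scores) * len(noncase_scores))


# 1,000 people, 10 of whom develop the disorder (about 1%).
labels = [1] * 10 + [0] * 990

# Rule 1: predict that nobody develops the disorder.
predict_nobody = [0] * 1000
print(f"accuracy of 'nobody': {accuracy(labels, predict_nobody):.2f}")

# The same rule gives everyone the same score, so it cannot rank cases above non-cases.
print(f"AUC of 'nobody':      {area_under_curve([0] * 10, [0] * 990):.2f}")

# Rule 2: a hypothetical risk score that tends to be higher in cases.
case_scores = [6, 7, 8, 9, 10] * 2
noncase_scores = [i % 10 for i in range(990)]   # scores 0 through 9, spread evenly
print(f"AUC of risk score:    {area_under_curve(case_scores, noncase_scores):.2f}")
```

Predicting that nobody is affected is right 99% of the time but has an area under the curve of 0.50, which is pure chance. The hypothetical risk score reaches 0.84 because it usually ranks cases above non-cases, which is a statement about ranking and not about how often an individual prediction is right.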
If it's above 75%, that's thought to be pretty good for screening populations to identify at-risk individuals. In order to make individual diagnoses, it should be above 99%. So what Wray is saying is that probably we'll never get there. I mean, you're not seeing this, but it gets pretty high. >> Mm-hm. >> But it doesn't get above 99%. Based on her calculations, genetics could, I guess that's maybe a better way of putting it, if we could find all the genetic variants, become very good at screening populations to identify at-risk individuals, but it's not ever going to be a diagnostic tool. And we know that she's right. She's absolutely right. We know that from the MZ twins, because two people with the same genome are concordant for schizophrenia only 50% of the time. So I thank her for the question. >> That is a great question. All right. Well, that wraps it up for this. It's been a great session. Many thanks to Matt for running this. [LAUGH] >> Yeah, and thank you, Bridget, for organizing these, and thanks to the students for >> Yes. A lot of wonderful feedback and comments, so thank you guys very much. >> Thank you.