So now we're gonna give you an extended example of people analytics at work and follow that with some detailed issues involved in many analytics projects. This example comes from my work with the National Football League. As a result of some research I did a few years ago, I've been pulled in to work with some professional teams on their evaluation of players, and especially their evaluation of college players. One team in particular asked me, who is good at evaluating college players? Which teams should they be paying attention to, which teams should they be copying? So, I'll tell you about that analysis, but first I wanna give you just a little bit of background on the NFL draft. This is the National Football League, this is American football. And we will talk some about football, but you should think about this as any hiring exercise. This is a law firm hiring new law school graduates. This is Google hiring new programmers to work out in Mountain View. This is Credit Suisse looking for new financial analysts. Think of this as recruiting quite generally. In the NFL, the draft involves looking at college players, here Jamaal Charles from the University of Texas. Teams watch a player's tape from when he was in college, then they bring players into the combine and measure them in all kinds of ways: how high they can jump, how fast they can run. They slot them into an order, the preferences they have for these players. So here's a typical war room from the NFL. This is Jerry Jones, who's the owner of the Dallas Cowboys, with the list of players. And all the teams have these war rooms with lists of players all year round. Eventually they make their choices. This is the NFL Draft, which happens each April, now maybe May, and is a major sporting event in and of itself. And then finally, they get to see that player on the professional field. So this is Jamaal Charles playing for the Kansas City Chiefs.
And the big question is, to what extent does what they see in college relate to what they see in the pros? To what extent can we observe a student performing in law school and forecast how good they're gonna be as an actual lawyer? That's what's interesting about studying the NFL: we have better data and more observable outcomes than almost anywhere else. So we can really dig into these decisions, evaluate how good a job the teams are doing, and identify where the mistakes are. So, this is the process the team comes to me and asks me about. Who does this well? Which of the 32 National Football League teams does this well? One measure of how a team is doing in the NFL draft is how many of their players make the all-star game. In American football that is called the Pro Bowl, and so we can look at how many of the players they draft eventually make the Pro Bowl. And we can compare across the 32 teams which teams are good at drafting players that make the Pro Bowl and which teams are not so good. So in this chart, we look at 11 years of NFL drafts and we follow the players' performance for their entire career, or however long we get to observe them through 2009. So we begin with the 1997 draft and go through the 2007 draft, and we observe those players through 2009. And we count up the number of Pro Bowls played by players that were drafted by each team. In a year, a team drafts seven or eight players on average. And on average, a player might play three or four years, sometimes longer for very successful players. So, we can add these up over time and get some sense of which teams are having more or less success in the NFL draft. If you do that, between 1997 and 2007, this is what you see. At the top, with the Indianapolis Colts, you see that they drafted players that made 35 Pro Bowls over the time that we observed. And that ranges all the way down to the bottom teams, Cleveland and Detroit, who drafted players that made only five and six Pro Bowls.
So there are huge differences in the quality of the players drafted by Indianapolis and those drafted by Cleveland and Detroit, with a big spread in between, all the other teams falling in between. This analysis, in fact, was first done by a journalist, Rick Reilly, who at the time was with Sports Illustrated. And when he saw those numbers, he said the Indianapolis Colts, not just with Peyton Manning, but Dwight Freeney, Edgerrin James, Reggie Wayne, these were genius picks. On the other hand, the Cleveland Browns screwed the Chihuahua. Their run of number one picks from 99 to 02 is the single worst stretch of drafting since the Iraqi Republican Guard. Were they using a Ouija board? So clearly Reilly feels you can draw some very strong inferences from this data. And I wanna give Reilly credit, because at least he quotes the data, right? One of the things he was saying in this article was, fans, don't worry too much at the time the players are drafted. You don't really know, there's so much uncertainty, you have to wait a while. And so he waited a while and he crunched the numbers and now he's certain. But the trouble is, there's reason for skepticism. The reasons for skepticism include some work I did a few years ago with Richard Thaler on how overconfident teams are in the NFL draft. That work led me to appreciate how much uncertainty there really is, and given that, it seems unlikely to me that some teams are so much better than others. Baron and Hershey (our colleague Jack Hershey is here in my department, actually, OID here at Wharton) studied something years ago that came to be known as the outcome bias: that we tend to judge employees, individuals, organizations by what happens. Good outcomes lead us to believe that the employees are good, bad outcomes lead us to believe the employees are bad.
That holds regardless of how much control they had over the process. So maybe we're guilty of outcome bias when we look at the Indianapolis Colts and see that Peyton Manning has been in all these Pro Bowls. And then finally, Matthew Rabin, a behavioral economist, did some work where he used the law of small numbers to evaluate some decision-making biases that individuals showed. And he found that there's less difference in skill than people believe. He called it fictitious variation. And he said fictitious variation is the most important economic consequence of the law of small numbers, the idea that we infer too much from small samples. That's something we're gonna talk about a little bit more at another place in the module. But this academic research gives us reason to be skeptical that these differences are truly the product of some teams being better than other teams at selecting. So, to dig into this in a little more detail, we'll drop into the data. And we're gonna get into more detail in this part of the performance evaluation module than anywhere else in the course. So, let's actually crunch some numbers. Here, we will look at performance for players drafted 1991-2004. These are players we can observe perform for at least five years, so we have a pretty good sense of how they're gonna be as professional football players. We can look at how they perform in a few different ways. The patterns are always the same. The guys taken at the top of the draft are better than the guys taken at the bottom of the draft. So in this graph, we show you the number of starts that the typical player taken at the start of the draft gets relative to the guys taken at the bottom. About 250 players are drafted each year. If you look at the blue line, it starts with the number of games started by the first player taken in the draft.
And that drifts down to where, by the end of, say, the fifth round, 160 picks later, those guys are starting about 20% as many games as the guys taken at the start of the draft. So, a big drop in performance. The drop is even steeper for the more extreme performance measure. So, Pro Bowls, we were just talking about Pro Bowls. Does the player make an all-star game? The Pro Bowl measure drops quickly from the guys taken at the top of the draft. By the time you're down to the bottom of the second round, about the 65th pick, you're below 20%. It's a very steep performance drop-off. This suggests that there is skill involved in drafting players, that the scouts and the general managers involved with identifying college players know what they're doing, because the guys taken at the top are better than the guys down below. But does that mean that there are differences in skill? Remember that the team that asked me to look into this wanted to know which teams are better at this, which teams are worse at this, and who they should be paying attention to. That question presumes that some teams are better than others. Skill is involved, but are there differences in skill? This goes back to our question of whether you can separate signal from noise. Can you really infer whether the teams are good at this, or whether they just got lucky? So, that's where we go next. And we're gonna do one thing first. I said in an earlier module that you have to be very careful, in performance evaluation, comparing employees to one another. Unless they're in very similar situations, you will have trouble comparing them one to another. So we're gonna make one adjustment before we start comparing these teams to each other. And that is, we're gonna control for where they picked in the draft. Some teams pick earlier in the draft, some teams pick later in the draft, and so we want to norm for where they actually picked. We're gonna do that by looking at what we would expect.
This line shows you what you would expect for a performance measure, in this case the games started per season. What would you expect for the first player taken in the draft, based on history? What would you expect for the 33rd player taken in the draft, again based on history? This line just charts that. So, on average, over a long period of time (we've observed fourteen, fifteen years), the top player in the draft averages about 11 starts per season. Eleven starts out of 16, this is most of the season, and he's starting for five years. This is a very successful player. And then it drifts down rather smoothly to, say, two starts a year by the guys taken 160th. So you get this smooth decline. This means that when a team drafts a player 33rd, you should expect that he's gonna start, what does it look like, eight and a half, eight and a quarter games per season. So, that's expectation. We wanna know what our expectation is for any given decision, so that we can evaluate it in terms of deviation from that expectation. We don't want to reward a team for picking a good player just because they were fortunate enough to have the first pick in the draft. We don't want to penalize a team for picking a bad player just because they had a later pick in the draft. So it's very important to contextualize in this way. We'll talk a lot more about contextualization elsewhere in people analytics, but I want to use it here because, if we are going to compare across teams, we need to norm for their position. So that's what we do. This is what it looks like. This is an example of one team, the New England Patriots in 2003, their draft that year. They had a lot of picks that year. This looks like about ten picks. And in each case, we plot where the player was drafted. As you move from left to right, the top player was Warren, taken at about pick number ten. And it goes all the way down to Kelley, taken late in the draft.
And then we plot on the Y-axis how well that player performed over the first five years, which is a period of time when we can be very confident he was with the team that actually drafted him. So the team that drafted him has gotten the benefit of the draft. And what do we see? We see in general the guys taken early in the draft did better and the guys taken later in the draft did worse, but there's lots of variation, right? So, consider Koppen. Koppen was the sixth player the team drafted; he wasn't taken until a hundred and sixty-second, a hundred and sixty-third. But he was the best performer over the next five years of anybody in the draft class. He started almost every game for almost the whole five years. This was a wildly successful selection late in the draft, and so you get a very positive deviation when you evaluate that pick. Contrast that, for example, with Johnson. Johnson was drafted 37th, 38th, and when a person's drafted at that level we expect him to start about seven games a year. He only started two. So this is way below expectations. We code that as an unsuccessful pick. So, now what we want to know is, do teams tend to have relatively successful picks? Do they tend to have relatively unsuccessful picks? Or is it just randomly distributed around what would be expected? That's where we're going when we're trying to determine, is there skill here, or is there luck? So, here's an example of one of the most famous draft classes in National Football League history: the Pittsburgh Steelers' draft class of 1974. They drafted four Hall of Fame players: Jack Lambert, Lynn Swann, John Stallworth, Mike Webster. All these guys were hugely successful, not just for a year, but for a career. Now, suppose the draft involves skill, and this is a team that drafted these players because they're especially good at their job. What would you expect to happen the next year, in 1975?
So, assume you've got the same scouts, you've got the same general manager. They select all these great players in '74. If that's the result of skill, what would you expect to happen in '75? Or, what would you expect to happen in '73? Let's look at '73. In '73, their second-round pick never played a game. Their third- and fourth-round picks were average at best. And yet, that was just one year before they had this Hall of Fame class. What about '75? What about the next year? In '75 it was even worse. Out of 21 picks in this enormous draft, not a single player drafted became a starter. Even with picks in each of the top six rounds, the whole class only played a total of 24 games for the team. So, what do we think about a process where you draft one of the best classes ever, probably the best class ever, in '74, and yet the '73 draft was completely average and the '75 draft was actually tragically bad by any measure? What does that say about how much skill is involved in this? How much credit should we give them for that Hall of Fame class in '74? That's the idea, and it's a very general idea, maybe the biggest lesson in performance evaluation. The question is, does it persist? Skill persists. Chance doesn't persist. And if the challenge is to parse skill from chance, the single most important test is persistence. Do you see it across periods? Do you see it over time? Do you see the positive performance measures persist? That's what we're gonna do here. This is one look at that for the NFL draft. This is the only real analytics we're gonna do in this example. We're gonna take all the years that we observed teams draft and we're gonna code them up in just the way we described. We're going to evaluate whether each player does better or worse than expectations. And for that year we're going to add up all of those deviations, the positive and the negative, whether they add up to zero or whatever. We're going to evaluate every team in each year of the draft.
And we're going to rank the league, one to 32 within a year, on how they performed in a draft. And then we're gonna ask what happens the next year. So for all the teams that did best in a year, what happens to them in the next year's draft? And we're gonna do the same for the teams that rank 16th in a given year: what happens in the next year's draft? That's gonna give us a test of persistence. If this is a skill-based task, those teams that do well in the draft in a given year will do well the following year. Those teams that do poorly will do poorly. If it's completely chance, how a team performs in one year will have nothing to do with how they perform the next year. There'll be no correlation between the two. And if it's a mix of skill and chance, you'll see something in between: the teams that do well one year will tend to do better the next year and the teams that do poorly will tend to do poorly, but they'll regress to the mean. So that's the test. We'll find out what happens. This is what we find. I've shown all the teams, over all the years in our study, here in gray. But then I've highlighted the bunch that were rated number one, at the top, in green. And then the middle in blue, and the bottom in red. And what do we see? What do we see here about the relation between how a team does in one year and the following year? At a high level, we see essentially no relation. Consider, for example, the teams at the very top in green. These are teams that, in a given season, were the single top-performing team in drafting players. What happens to them the next year? Well, one team was again the top-performing team. But one team was also the worst-performing team. And you can see that there's a full spread: from that number one position, they went to every other position from one to thirty-two. There was no predictive quality in their first-year performance for the next year. And conversely, the same at the bottom.
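A minimal sketch of the persistence test just described, in Python. Everything in it is an assumption made for illustration: the straight-line `expected_starts` curve (anchored only on the roughly 11 starts for the first pick and roughly 2 for the 160th mentioned earlier), the simulated picks, and the `skill_sd` and `noise_sd` parameters. It is not the analysis behind the chart, just the same steps on made-up data: score each pick as a deviation from expectation, add the deviations up by team and year, rank teams within each year, and correlate a team's rank with its rank the following year.

```python
import random
from collections import defaultdict


def expected_starts(pick_no):
    """Hypothetical expectation curve: 11 starts at pick 1, 2 at pick 160."""
    return 11.0 - 9.0 * (pick_no - 1) / 159.0


def simulate_picks(skill_sd, noise_sd, n_teams=32, n_years=11,
                   picks_per_year=7, seed=0):
    """Made-up draft data: rows of (team, year, pick_no, starts per season)."""
    rng = random.Random(seed)
    # Each team gets a fixed "skill" bonus that is the same every year.
    skill = [rng.gauss(0.0, skill_sd) for _ in range(n_teams)]
    picks = []
    for year in range(n_years):
        for team in range(n_teams):
            for _ in range(picks_per_year):
                pick_no = rng.randint(1, 160)
                starts = (expected_starts(pick_no) + skill[team]
                          + rng.gauss(0.0, noise_sd))
                picks.append((team, year, pick_no, starts))
    return picks


def draft_scores(picks, expected):
    """Sum of (actual - expected) starts for each (team, year)."""
    totals = defaultdict(float)
    for team, year, pick_no, starts in picks:
        totals[(team, year)] += starts - expected(pick_no)
    return dict(totals)


def ranks_by_year(scores):
    """Rank teams within each year; rank 1 is the best draft score."""
    by_year = defaultdict(list)
    for (team, year), score in scores.items():
        by_year[year].append((score, team))
    ranks = {}
    for year, rows in by_year.items():
        for rank, (_, team) in enumerate(sorted(rows, reverse=True), start=1):
            ranks[(team, year)] = rank
    return ranks


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def persistence(ranks):
    """Correlation between a team's rank in one year and in the next."""
    pairs = [(rank, ranks[(team, year + 1)])
             for (team, year), rank in ranks.items()
             if (team, year + 1) in ranks]
    this_year, next_year = zip(*pairs)
    return pearson(this_year, next_year)


if __name__ == "__main__":
    for skill_sd in (0.0, 3.0):
        picks = simulate_picks(skill_sd=skill_sd, noise_sd=3.0)
        r = persistence(ranks_by_year(draft_scores(picks, expected_starts)))
        print(f"skill_sd={skill_sd}: year-to-year rank correlation = {r:.2f}")
```

With `skill_sd=0` the year-to-year correlation should hover near zero; raising `skill_sd` relative to `noise_sd` should push it up. That is the sense in which skill persists and chance doesn't.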
You can look at the teams in red, which is 28th or so, and in one case, 28th was again 28th, almost perfect persistence. But in the other cases, they drifted up: some were kinda middling the next year, and some were actually quite good in the following year. So, this tells us that overall the correlation is slightly negative. Not different from zero, but slightly negative. Essentially zero: there's no correlation between how a team drafts in one year and how they draft the next year. And when there's no correlation, we can be sure that the differences we observe are not the product of skill; the differences we observe are the product of chance. So, this is one performance measure, it's starts. You can do this for other performance measures, such as how much a player receives in compensation when he reaches free agency. You can use any number of performance measures and you get the same result. Most of the deviation goes away the next year, which means that most of the differences between teams are the result of chance and not skill. So, different performance stats, different player career stages. You can norm for additional factors like a player's position. You can evaluate not the team-level performance but the person in charge, whether it's a general manager or an owner, and you can track that individual's performance over his career, and again you don't see persistence. The vast majority of the variation is purely a product of chance. So, you might say, fine, fine, fine, this fancy regression-to-the-mean analysis, persistence analysis, but what about this chart? Because it's hard to make sense of this chart, right? That's a lot of variation. Can you really get that much variation from chance? So, quick demonstration. In this case, we charted every player by what team drafted him. Let's do something else. Instead, let's look at every player by what day of the month he was born on.
And we can ask, are there differences in the likelihood of making the Pro Bowl as the result of being born on the 3rd of the month, or the 28th of the month, or the 10th of the month? We do that because we know that's random. We know that can't possibly be related to player performance. And we're curious, if you do that, what pattern do you see? Well, you see a very similar pattern to what happens when you bin by which team chose them. So, if we get the exact same pattern, and in fact it's the exact same pattern, when we bin by the day of the month they were born on as when we bin by the team that drafted them, it ought to give us caution in drawing any inferences about those teams. Because it shows that even these dramatically disparate outcomes can be purely the product of chance. And this is something we need to be reminded of again and again. We'll drop into some of those details subsequently, but we need to be reminded again and again that big variations in performance measures can be purely the product of chance. And so, we want to be very careful about drawing strong inferences about the skill or the effort that underlies the performance measure. One last note before we go. You might ask, well, maybe people already know this, people that follow football. Maybe you don't follow football, so you don't know. Do people think this? I have a quick study with Berkeley Dietvorst, a PhD student here, on what people believe to be the driver of NFL draft performance. We asked people whether draft outcomes are completely due to random chance or completely due to drafting skill. Where on that continuum, skill to chance, do you believe draft outcomes fall? And we asked this of NFL fans, and we screened them for actually following the NFL. And what do you see? You see people don't think it's all skill. But the vast majority of people believe that it is on the skill side of the continuum. Almost nobody says that it's chance-related.
By far the most say two-thirds, three-quarters skill. And these are folks that actually follow the NFL, follow the draft. They know that there's a little chance involved, but they greatly underestimate the amount of chance that's involved. I wanna close this extended example with a quote from Michael Lewis. The author Michael Lewis started his career with Liar's Poker, about Salomon Brothers, wrote more recently on Wall Street, and is probably best known for his book Moneyball, on analytics and baseball. He said, when talking after that book was published, that people ask him, Lewis, you've studied more serious topics, why are you studying baseball now? And he said the following, which I think is very instructive. He said, if professional baseball players, whose achievements are endlessly watched, discussed and analyzed by tens of millions of people, can be radically mis-valued, who can't be? If such a putatively meritocratic culture as professional baseball can be so sloppy and inefficient, what can't be? So this, again, should give us great pause. In this world of football, there are very good performance measures on the way in, great outcome measures on the way out, millions of dollars at stake, and millions of people watching these decisions. And yet we still see these kinds of mistakes made and not appreciated. We still see people misunderstanding skill versus chance in this very important task in that culture. What does that say about what's going on in firms where we don't have as many people watching, where we don't have as good an outcome measure? It should give us pause at the quality of the inferences that are being drawn in the organizations that you and I work in.
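To make the earlier birth-day demonstration concrete, here is a minimal Python sketch. It is not the data from the chart. The number of players, the chance of becoming a Pro Bowl player (`p_star`), and the chance of repeating (`p_repeat`) are all made-up assumptions, and the label is a uniformly random stand-in for something like day of birth. Every simulated player has exactly the same odds, so any spread between labels is chance by construction.

```python
import random


def simulate_pro_bowls(n_players=2600, p_star=0.08, p_repeat=0.65, seed=1):
    """Career Pro Bowl counts for hypothetical players with identical odds."""
    rng = random.Random(seed)
    careers = []
    for _ in range(n_players):
        bowls = 0
        if rng.random() < p_star:           # player makes a first Pro Bowl
            bowls = 1
            while rng.random() < p_repeat:  # and may keep making them
                bowls += 1
        careers.append(bowls)
    return careers


def tally_by_random_label(careers, n_labels, seed=2):
    """Assign each player a random label and total Pro Bowls per label."""
    rng = random.Random(seed)
    totals = {label: 0 for label in range(n_labels)}
    for bowls in careers:
        totals[rng.randrange(n_labels)] += bowls
    return totals


if __name__ == "__main__":
    careers = simulate_pro_bowls()
    for name, n_labels in (("team-sized bins", 32), ("day-of-month bins", 31)):
        totals = tally_by_random_label(careers, n_labels)
        print(name, "lowest:", min(totals.values()),
              "highest:", max(totals.values()))
```

Because a handful of players account for many Pro Bowls each, the totals per label can differ a lot even though the labels mean nothing, which is the caution the demonstration is meant to give about reading skill into team-by-team totals.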