Let's work with network measures in R. You need the following libraries, so make sure that you have them loaded up front. We'll start with dyads, triads, and other local measures. Before we start, we need to load all of our networks back in, the ones that we worked with in our previous lecture: friendship, professional, boss, and support. I've already done so. Make sure you do too. I'll wait for you until you're ready. Ready? Let's go.
First we need to dichotomize our data. If you haven't done so already, here is the code that will allow us to do that. We're using both the sna and the igraph libraries for now, and we're dichotomizing based on any values present. That means that if there is a connection, no matter the strength, we turn it into 1; otherwise it's 0. We dichotomize the friendship, the boss, the professional, and the support data, for both the graphs and the networks. It's a simple ifelse command, and we're using the support matrix, boss matrix, friendship matrix, and other matrices we created earlier. Then we turn our matrices into networks using the as.network command, or into graph objects using graph.adjacency.
Let's look at our networks with the sna package first. Why do we start with sna? The sna and igraph packages were created by different people for different purposes. Even though they have a lot of the same functions coded in, some of those functions work differently. There are some functions that I prefer in sna and some that I prefer in igraph. Let's start with sna. We're detaching igraph so it doesn't interfere. Some commands, such as closeness or betweenness centrality, have the same name in both packages, so they mask each other and R may not call the version you expect. If you're working with networks and you get error messages for a command that should work, try unloading all of your packages and loading only the one that you're currently working with.
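Here is a minimal sketch of the dichotomizing step. It assumes a toy 3-node weighted matrix rather than the course data, and the object names (friend_matrix, friend_bin, friend_net, friend_graph) are placeholders, not the names used in the lecture script. The package calls are guarded so the block still runs if network or igraph is not installed; graph_from_adjacency_matrix is the newer igraph name for graph.adjacency.

```r
# Toy weighted adjacency matrix (hypothetical data, not the course data).
friend_matrix <- matrix(c(0, 3, 0,
                          1, 0, 0,
                          0, 2, 0),
                        nrow = 3, byrow = TRUE)

# Any tie, whatever its strength, becomes 1; absent ties stay 0.
friend_bin <- ifelse(friend_matrix > 0, 1, 0)
print(friend_bin)

# Optional: build package objects only if the packages are installed.
if (requireNamespace("network", quietly = TRUE)) {
  friend_net <- network::as.network(friend_bin, directed = TRUE)
}
if (requireNamespace("igraph", quietly = TRUE)) {
  friend_graph <- igraph::graph_from_adjacency_matrix(friend_bin,
                                                      mode = "directed")
}

# If both sna and igraph are attached and function names clash,
# detach the one you are not using, for example:
# detach("package:igraph", unload = TRUE)
```

Calling functions with the package prefix (network::, igraph::) is another way to sidestep the masking problem, at the cost of more typing.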
Hopefully, we won't have that problem. I'll just detach igraph and keep sna and network. We start with the network statistics on size and density. First, the network size. We actually know it already, but let's check it again. Here we have 122 nodes. We can look at the number of edges, which means how many connections we have in each network. We use the command called network.edgecount. It's very easy to see in the output window that the largest network we have is the professional network, which is the way it should be; after all, this is an organization. Then, in terms of size, we have the friendship network, then the support network, and then the boss network. They seem reasonable, at least from the standpoint of our understanding of what is happening in that organization. Would I have liked to see the support network larger than the friendship network? Probably. But we can talk about that when we talk about the theoretical implications of our study.
Network density is a direct function of the number of edges that we have. Of course, the densest network is going to be the professional one. But let's verify that. No surprises here. None of these networks is very dense: the density is only about 0.02. It's almost 0.03 for the professional network, 0.023 for friendship, 0.020 for support, and lower still for the boss network. Density is very much a function of the number of nodes: the more nodes we have, the less dense the network tends to be. I'm not concerned about the size or the density of these networks; I think they seem reasonable.
Now, you know what? You can compare the networks and you can correlate them. Can you imagine that you can actually find the correlation coefficient for networks? I know I didn't talk about that in the theoretical part of the material, but I'll show you much more in our R code anyway.
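To make the link between edges and density concrete, here is a small sketch that computes size, edge count, and density by hand for a directed network with no self-ties. The 4-node matrix is made up for illustration; the guarded calls to network.size, network.edgecount, and gden are printed only for comparison when the packages are installed.

```r
# Toy directed, dichotomized adjacency matrix (hypothetical data).
m <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 0, 0, 0,
              0, 0, 1, 0),
            nrow = 4, byrow = TRUE)

n_nodes <- nrow(m)                      # network size
n_edges <- sum(m)                       # number of directed ties
# Directed density: observed ties over the n * (n - 1) possible ties.
density <- n_edges / (n_nodes * (n_nodes - 1))
cat("nodes:", n_nodes, "edges:", n_edges, "density:", density, "\n")

# Optional comparison with the packages used in the lecture.
if (requireNamespace("network", quietly = TRUE)) {
  net <- network::as.network(m, directed = TRUE)
  print(network::network.size(net))
  print(network::network.edgecount(net))
}
if (requireNamespace("sna", quietly = TRUE)) {
  print(sna::gden(m))
}
```

Because the denominator grows with n * (n - 1), the same number of ties gives a much lower density in a 122-node network than in a small one.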
Here is the simple function gcor that can calculate the correlation between the friendship network and the professional network, and they correlate at 0.84. The suggestion is that we actually build the matrix of correlations. When we look at densities alone, for example the density of friendship next to the density of professional, or the density of friendship next to the density of support, what we do not know is whether those ties fall between the same nodes or between different nodes. Network correlations can help us figure this out.
This is a little bit involved, because first I calculate all pairwise correlations between all of our matrices. Then I generate a vector of names. Then I create a correlation vector and load that correlation vector into a correlation matrix. Then I add the names, and there we can look at our correlation matrix, the kind you're used to seeing. The dimension of the matrix matches the length of our names vector, and here is the correlation matrix.
I think it helps if we look at the correlation matrix together with the network density numbers. Let's rerun them. Notice that the density of the friendship network is 0.023 and the density of the professional network is 0.028, and there is a very high correlation between them. The density of the friendship network is 0.023 and it is 0.020 for the support network, but the correlation between them is only 0.707. And if you look at the correlation of professional and support, it's only 0.61. Even though the densities are close together, we can pretty much see from the correlation coefficients that the connections that generate these density numbers probably happen between different nodes. We can examine this in a lot more detail, but I wanted to show you that the correlations we do on regular data are also possible with networks. We can even draw the correlation matrix, and I find this the most fun.
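The steps above can be sketched as follows. This is an illustration under stated assumptions, not the lecture's script: the three 4-node matrices are made up, and graph_cor is a hypothetical helper that takes the ordinary Pearson correlation of the off-diagonal cells of two adjacency matrices. The call to gcor is guarded and printed only for comparison.

```r
# Hypothetical helper: correlate two adjacency matrices cell by cell,
# ignoring the diagonal (no self-ties).
graph_cor <- function(a, b) {
  off_diag <- row(a) != col(a)
  cor(a[off_diag], b[off_diag])
}

# Toy dichotomized networks on the same four nodes (made-up data).
nets <- list(
  friend  = matrix(c(0,1,0,0, 1,0,1,0, 0,0,0,0, 0,0,1,0), 4, byrow = TRUE),
  prof    = matrix(c(0,1,1,0, 1,0,1,0, 0,1,0,1, 0,0,1,0), 4, byrow = TRUE),
  support = matrix(c(0,0,0,1, 0,0,1,0, 0,0,0,0, 1,0,0,0), 4, byrow = TRUE)
)

# Fill a named correlation matrix with all pairwise correlations.
k <- length(nets)
cor_mat <- matrix(NA_real_, k, k,
                  dimnames = list(names(nets), names(nets)))
for (i in seq_len(k)) {
  for (j in seq_len(k)) {
    cor_mat[i, j] <- graph_cor(nets[[i]], nets[[j]])
  }
}
print(round(cor_mat, 3))

# Optional comparison with sna, if it is installed.
if (requireNamespace("sna", quietly = TRUE)) {
  print(sna::gcor(nets$friend, nets$prof))
}
```

Two networks with similar densities can still correlate weakly here, because the correlation depends on which cells hold the ties, not on how many ties there are.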
Of course it's more fun when we have positive and negative correlations. Here all of our correlations are positive, but it's fun nonetheless to see those correlations as bubbles. Keep this handy tool in mind.
Now that we've looked at the basic measures, let's turn to the concepts we've talked about the most: the dyad and triad census. We'll start with the dyads. We have 14,762. Why so many? We have 122 nodes, so do me a favor and calculate 122 times 121. Here's that number. That's because each one of our 122 nodes can make connections to the remaining 121 nodes, and that's the total possible number of directed connections in our network. Does the census report that many dyads? Well, apparently not. It reports a lot, but only about half of that number. Why? Because a dyad is a pair of nodes: a null dyad covers both directions, and so does a mutual one; only an asymmetric dyad goes from one node to the other and not the other way round.
The next thing we can look at is the mutuality index. The mutuality index will not give you any more information than you already have here for mutual dyads. But so that you know that's true, let me run this command. Not surprisingly, the most connected is our professional network, but we knew that already. It's still interesting to see which dyads are present and which ones are not.
Much more interesting is the index of reciprocity. The index is very high. Why is it so high? Reciprocity is the proportion of dyads that are symmetric. We have two types of symmetric dyads, mutual and null. With mutual, we have connections going both ways; with null, we have the absence of connections going both ways. Therefore, do not trust the index of reciprocity blindly, because it can be misleading. This number is so high because we have so many null dyads. It's not the true index of reciprocity if we think only of connected dyads. Now, let's look at the triads.
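A small worked example shows why the reciprocity index gets inflated by null dyads. The 5-node matrix is made up; the counts are computed by hand in base R, and the guarded sna calls (dyad.census, mutuality, grecip with its "dyadic" and "dyadic.nonnull" measures) are printed only for comparison when sna is installed.

```r
# Toy directed network (hypothetical): 1<->2, 2->3, 4->3, node 5 isolated.
m <- matrix(0, 5, 5)
m[1, 2] <- 1; m[2, 1] <- 1; m[2, 3] <- 1; m[4, 3] <- 1

n <- nrow(m)
ordered_pairs <- n * (n - 1)        # possible directed ties
total_dyads   <- ordered_pairs / 2  # unordered pairs of nodes

upper  <- upper.tri(m)              # look at each pair once
mutual <- sum((m * t(m))[upper])            # ties in both directions
linked <- sum(((m + t(m)) > 0)[upper])      # ties in at least one direction
asym   <- linked - mutual
null   <- total_dyads - linked

# Reciprocity counting null dyads as symmetric: (M + N) / (M + A + N).
recip_dyadic  <- (mutual + null) / total_dyads
# Reciprocity among connected dyads only: M / (M + A).
recip_nonnull <- mutual / (mutual + asym)
cat("M:", mutual, "A:", asym, "N:", null, "\n")
cat("dyadic:", recip_dyadic, "non-null:", recip_nonnull, "\n")

# Optional comparison with sna, if it is installed.
if (requireNamespace("sna", quietly = TRUE)) {
  print(sna::dyad.census(m))
  print(sna::mutuality(m))
  print(sna::grecip(m, measure = "dyadic"))
  print(sna::grecip(m, measure = "dyadic.nonnull"))
}
```

In this toy network seven of the ten dyads are null, so the first index is 0.8 even though only one of the three connected dyads is actually reciprocated.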
For triads, the command is triad.census. Again, it's interesting to look at these networks. Not surprisingly, the largest number, hundreds of thousands of triads, are null triads, where we have zero mutual, zero asymmetric, and three null dyads. We also have quite a few triads with only one connection: zero mutual, one asymmetric, and two null. We have far fewer of the other triad types, of course. It would be interesting to analyze in detail which ones are expected by chance and which ones are present for some serious reason. I'm pretty sure that the network we have is not random; it's highly unlikely that our triad counts match what a random network would give us. Unfortunately, we don't have time to go through all of it now. I want to show you the commands and what the output looks like, but you can easily explore this on your own.
Now that you know how to do this, we have other interesting things to talk about. One of them is components. How many components do we have in our network? For the friendship network, we have 12. Is that true? Let's draw the network real quick and count those components. Here's one connected component, then 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12. Those are our components, including disconnected nodes. Remember, a single node not connected to anyone else is a component. We can calculate components similarly for all the other networks, of course, but we can also calculate the connectedness of the network. It looks reasonable given the number of components and the number of connections within the existing components.
We can also identify the cut points. Remember, cut points are the nodes that break the network up into two or more components if you remove them. Here are those cut points, and in fact there aren't that many of them. You can identify these nodes and look at who those people are, so that you can see where the cut points are.
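Here is a sketch of the component count, done by hand so you can see what is being counted. It assumes weak components (tie direction ignored), which may or may not be the rule used in the lecture output; count_components is a hypothetical helper and the 5-node matrix is made up. The guarded sna calls are printed only for comparison.

```r
# Hypothetical helper: count weak components with a breadth-first search.
count_components <- function(m) {
  und <- (m + t(m)) > 0              # ignore tie direction
  n <- nrow(und)
  label <- rep(NA_integer_, n)       # component label per node
  current <- 0L
  for (start in seq_len(n)) {
    if (!is.na(label[start])) next   # already assigned to a component
    current <- current + 1L
    label[start] <- current
    queue <- start
    while (length(queue) > 0) {
      v <- queue[1]
      queue <- queue[-1]
      nbrs <- which(und[v, ] & is.na(label))
      label[nbrs] <- current
      queue <- c(queue, nbrs)
    }
  }
  current
}

# Toy directed network: 1<->2, 2->3, 4->3, and node 5 is an isolate.
m <- matrix(0, 5, 5)
m[1, 2] <- 1; m[2, 1] <- 1; m[2, 3] <- 1; m[4, 3] <- 1

n_components <- count_components(m)
cat("components:", n_components, "\n")

# Optional comparison with sna, if it is installed.
if (requireNamespace("sna", quietly = TRUE)) {
  print(sna::triad.census(m))
  print(sna::components(m, connected = "weak"))
  print(sna::connectedness(m))
  print(sna::cutpoints(m, connected = "weak"))
}
```

The isolate counts as its own component, which is why the toy network has two components rather than one.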
We can also look at the geodesic distances, which are the lengths of the shortest paths emanating from or ending at any particular node. This particular measure is returned as an object. If you look at it, it's a list, which means it contains several different pieces of information. Please go ahead and explore them on your own. What I want to show you is the summary of geodesic distances for node 3, and we'll have two measures. Why? Think of it this way: our networks are built from row to column. When we index the distances, the first number represents a row and the second one represents a column. When we have number 3 in the column position, those are all the geodesic distances going to node 3. They range from zero to a maximum of five; that's the length of the geodesics ending at node 3. When we have number 3 in the row position, those are the geodesic distances from node 3. They range from zero to 13. Even though it's the same node, the geodesics that end at that node and the ones that start at it are actually quite different.
You can obtain lots of statistics with the sna package. Please feel free to explore it on your own. One interesting thing I want to show you requires the igraph package, so we have to detach sna and attach igraph. I want to show you the largest cliques. It gives you an error message, but nonetheless it provides the list of the largest cliques. For the friendship graph, the largest clique has 7 of the 122 nodes, and the output lists the nodes that make up the clique. A clique, remember, is a structure where everyone is connected to everyone. In our friendship network, the largest clique is of size 7. For the boss network, surprisingly, we have cliques of size 4. For professional, we have cliques of size 4 as well. That's actually quite interesting.
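To see why the distances to a node and from a node differ, here is a sketch that computes them by hand with a breadth-first search. bfs_dist is a hypothetical helper and the 5-node matrix is made up. The guarded calls show the same row and column indexing on the gdist element returned by geodist, and largest_cliques on an undirected version of the graph; building the undirected graph first is my choice for the sketch, not necessarily what the lecture script does.

```r
# Hypothetical helper: shortest path lengths from one node, following
# ties from row to column. Unreachable nodes stay at Inf.
bfs_dist <- function(m, from) {
  d <- rep(Inf, nrow(m))
  d[from] <- 0
  queue <- from
  while (length(queue) > 0) {
    v <- queue[1]
    queue <- queue[-1]
    nbrs <- which(m[v, ] > 0 & is.infinite(d))
    d[nbrs] <- d[v] + 1
    queue <- c(queue, nbrs)
  }
  d
}

# Toy directed network: 1<->2, 2->3, 4->3, and node 5 is an isolate.
m <- matrix(0, 5, 5)
m[1, 2] <- 1; m[2, 1] <- 1; m[2, 3] <- 1; m[4, 3] <- 1

from_3 <- bfs_dist(m, 3)      # distances from node 3 (row view)
to_3   <- bfs_dist(t(m), 3)   # distances to node 3 (column view)
print(from_3)
print(to_3)

# Optional comparison with sna: rows are senders, columns are receivers.
if (requireNamespace("sna", quietly = TRUE)) {
  gd <- sna::geodist(m)$gdist
  print(gd[3, ])   # from node 3
  print(gd[, 3])   # to node 3
}

# Optional: largest cliques with igraph on the undirected version.
if (requireNamespace("igraph", quietly = TRUE)) {
  und <- 1 * ((m + t(m)) > 0)
  g <- igraph::graph_from_adjacency_matrix(und, mode = "undirected")
  print(igraph::largest_cliques(g))
}
```

Node 3 only receives ties in this toy network, so every other node is unreachable from it while three nodes can reach it. That is the same asymmetry as the 0 to 5 versus 0 to 13 ranges described above.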
As you look at those cliques, you can try to figure out why people form them, and maybe get additional information from who the members of those cliques are. With that, we're done with the simple measures. I hope you will explore the igraph package as well, and look at all the other network measures that igraph provides.