Next, we will talk about connectivity analytics with Cypher. If you remember, in module two we talked about connectivity analytics in terms of network robustness, in other words, a measure of how resistant a graph network is to being disconnected. Specifically, we used two kinds of methods: one computed the eigenvalues, and the second computed the degree distribution. For these examples, we're going to use the second one, degree distributions, and we will use the same graph network we've used previously, a simple graph representing a road network. Here's a listing of the query examples we're going to be applying to our network.

My first query example finds the outdegree of all nodes. If you'll notice, this query consists of two parts, because there's a specific type of node, a leaf node, which has no outgoing edges and so does not conform to this particular constraint. Our first match statement finds all nodes with outgoing edges; as you can see here, this is a directed edge. Then we return the names of the nodes and the count as the variable outdegree, and for convenience, we order by outdegree. We need to combine that with a specific query dealing with leaf nodes, and we're familiar with how to do that from past examples. So we'll match all leaf nodes and return the name and the value 0 for the outdegree. When we submit this query, we get this listing right here. The node P has 0 for its outdegree, and all of the other nodes are as we might expect, ordered by their value of outdegree.

Our next query finds the indegree of all nodes, which is very similar to our previous example. But in this case, as you might expect, we're going to take into account root nodes instead of leaf nodes, and so our match involves incoming edges. The indegree of a node counts the edges coming into that node.
We return similar results, and we union that with the specific query commands to find all of the root nodes, for which we return the names and 0 as the value of indegree. When we submit this query, here are our results. In this case, H is our only root node, so it has a value of 0 for indegree, and all the other nodes are as we might expect.

Our third query example finds the degree of all nodes, which is a combination of outdegree and indegree. So in this case we're not including any specific direction in our match statement, and we're returning the name and the count for all of our edges. But we're using the distinct keyword; otherwise we would be counting some nodes twice. Then, for convenience, we order this by the value of degree. When we submit this query, we get the results shown here. We have one column with the name and the other column with the degree, and the values are as we would expect: we have a leaf node P with a degree of 1 and a root node H with a degree of 1.

Our next query example generates a degree histogram of the graph. Since we're able to calculate the degree of each node, we can group the nodes by their degree values. If we look at the distribution of degree among our nodes, we see there are 2 nodes with degree 1, 3 nodes with degree 2, 4 nodes with degree 3, and 2 nodes with degree 4. So we're going to group those in the form of a histogram. When we submit this query, we get this table. The first column lists the degree values in ascending order, and the second column lists the counts of the nodes that have each degree value. Those of you who are familiar with SQL might recognize this as similar to the group by command; it performs a similar function.

Our next query example saves the degree of each node as a new node property.
This provides an added convenience, so that we don't have to calculate the degree of a node every time we're performing some sort of analysis. We match all nodes with edges, and there's no direction in this particular edge definition. Then we compute the distinct count for each node's degree, create a new property called deg, and assign the value of degree to it. Then we return the names and the degree values, and when we submit this query, we see this distribution right here, with the names in the left column and the values of degree in the right column. We can verify that if we issue a command to return all of the properties of a specific node. In this case I issued a command to match the node named D and return all of its properties. And sure enough, we see that it has a name property and a degree property.

Before we go to the last two examples, there's a philosophical issue that we need to remember with all databases. Every database will let you do some analytical computation inside it, and the remainder of the analytical computations must be done outside of the database. However, it is always a judicious idea to get the database to produce an intermediate result formatted in the way that you need for the next computation, and then use that intermediate result as the input to the next computation. We've seen that a number of computations in graph analytics start with the adjacency matrix, so we should be able to force Cypher to produce an adjacency matrix. And this is what we're doing here. Think of a matrix as a three-column table: here's one column for the first node, here's another column for the second node, and the third column holds the value that we calculate when we determine whether the two nodes have an edge between them. We're also introducing a new construct in Cypher called case. This allows us to evaluate conditions and return one result or a different result depending on the condition.
Here, we're specifying that when there is an edge between nodes n and m, we return a value of 1; otherwise we return a value of 0. We'll output those results as the column value. When we submit this query, we get our three-column table, in which the first column is the name of our first node, the second column is the name of our second node, and the value is either a 1 or a 0, depending on whether the nodes have an edge between them. In this case we see nodes A and C have an edge, A and L have an edge, and so on, as we would expect.

If we can calculate the adjacency matrix, then we can calculate other matrices in the same way. You might remember from our module two lecture that we learned about a more complex structure called the normalized Laplacian matrix. So let's go ahead and calculate that. We'll do something very similar to what we did in the previous example. We'll match all nodes for the first column and all nodes for the second column. We'll return the names of those nodes, and then we'll use the case structure again to compare the names of each pair of nodes and determine whether we have the same node. If we do have the same node, then that entry is on the diagonal of the matrix and should get a value of 1. If they are different nodes and have an edge between them, then we calculate the normalized Laplacian entry with this equation here. You'll also want to notice that here we're using the degree property that we assigned to the nodes in a previous example. This is an example of how that can become a convenient option. When the calculation is performed, that value is returned. If there's no edge between the two nodes, then a value of 0 is returned. These values end up in the value column. When we submit this query, here's the table that gets returned: the first column with the source node, the second column with the target node, and the values.
In the first row, the first node is P and the second node is P, so they're identical, which means this entry is on the diagonal of the matrix. Likewise for A in this row down here. And then the first off-diagonal value of the Laplacian is calculated between nodes A and C, and so on. That concludes our examples of how to perform connectivity analytics in Neo4j with Cypher.
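As a rough illustration of the computations walked through above, here is a minimal Python sketch that performs the same steps outside the database: outdegree with 0 for leaf nodes, total degree, the degree histogram, the three-column adjacency table, and the three-column normalized Laplacian table. This is not the Cypher shown in the lecture. The edge list is a small hypothetical graph, not the lecture's road network, and all function names are made up for illustration. The off-diagonal Laplacian entry is assumed to be -1 / sqrt(deg(n) * deg(m)), which is the standard normalized Laplacian definition; the lecture's own equation is not reproduced in this transcript. The edge test is also assumed to be directed (n -> m), while deg counts edges in both directions.

```python
from collections import Counter
from math import sqrt

# Hypothetical directed edge list for illustration only (NOT the lecture's graph).
# H is a root node (no incoming edges); P is a leaf node (no outgoing edges).
EDGES = [("H", "A"), ("A", "C"), ("A", "L"), ("C", "L"), ("L", "P")]


def outdegrees(edges):
    """Outdegree of every node; leaf nodes get 0, like the UNION branch in the lecture."""
    nodes = {name for edge in edges for name in edge}
    out = Counter(src for src, _ in edges)
    return {n: out[n] for n in sorted(nodes)}


def degrees(edges):
    """Total degree (indegree + outdegree) of every node."""
    deg = Counter()
    for src, dst in edges:
        deg[src] += 1
        deg[dst] += 1
    return dict(deg)


def degree_histogram(deg):
    """Map each degree value to the number of nodes with that degree, ascending."""
    return dict(sorted(Counter(deg.values()).items()))


def adjacency_rows(edges):
    """Three-column table (n, m, value): 1 if there is an edge n -> m, else 0."""
    nodes = sorted({name for edge in edges for name in edge})
    present = set(edges)
    return [(n, m, 1 if (n, m) in present else 0) for n in nodes for m in nodes]


def normalized_laplacian_rows(edges):
    """Three-column table (n, m, value) for the normalized Laplacian.

    Assumed entries: 1 on the diagonal, -1 / sqrt(deg(n) * deg(m)) when there
    is an edge n -> m, and 0 otherwise.
    """
    deg = degrees(edges)
    rows = []
    for n, m, has_edge in adjacency_rows(edges):
        if n == m:
            value = 1.0
        elif has_edge:
            value = -1.0 / sqrt(deg[n] * deg[m])
        else:
            value = 0.0
        rows.append((n, m, value))
    return rows


if __name__ == "__main__":
    print(degree_histogram(degrees(EDGES)))
    for row in normalized_laplacian_rows(EDGES):
        print(row)
```

In practice, the three-column tables would come from the database query results rather than from a hard-coded edge list; the sketch only shows the shape of the intermediate results and how the values are derived.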