Now let's talk about how random graphs help us model social processes. We already know that random graphs can help with tie formation, but what do they do with something more complex? Would anyone suggest that a network is explained only by reciprocity, for example, or only by transitive closure, preferential attachment, brokerage, or homophily? Not really. We know that all of these processes are important for network formation; therefore, we need a model that can examine a network for all of these processes at the same time. The basis for the statistical modeling of networks rests on random graph theory. Simply put, random graph theory asks: what properties do we expect when ties x_ij form at random? Then we compare our network to those random networks. We talked about the fact that the simplest random graph is the Bernoulli random graph, where the probability of a tie x_ij is constant and independent of the other ties. Put simply, each edge of the graph has an independent probability of being one or zero. Typically, this is an uninteresting distribution of graphs, and we want to know what the graph looks like conditional on other features of the graph. For example, if we have a graph with 122 nodes that has certain characteristics, such as the number of ties, we probably want to build the random graph with those characteristics as opposed to just any random graph with 122 nodes. We start with our observed graph, where some set of ties forms our network, and we simulate another graph where the same number of ties is randomly assigned to random pairs of actors. We don't get a graph identical to ours; in fact, we can get a graph very different from ours. What is similar is that we have the same number of nodes and the same characteristics we conditioned on, such as the number of ties. Everything else can of course differ, because triads and other structures form differently. What we do then is compare the random network to our observed network.
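To make the simulation step concrete, here is a minimal Python sketch, not the procedure of any particular study. It places a fixed number of arcs at random among the ordered pairs of actors and then counts two structures we would compare against an observed network. The function names are made up for illustration, the arc count of 300 is a hypothetical value (the example above gives only the 122 nodes), and "transitive triple" is counted under one common convention (i sends to j, j sends to k, and i sends to k).

```python
import random
from itertools import permutations

def random_digraph(n, m, seed=None):
    """Place m arcs uniformly at random among the n*(n-1) ordered pairs."""
    rng = random.Random(seed)
    possible = list(permutations(range(n), 2))  # all (i, j) with i != j
    return set(rng.sample(possible, m))

def count_mutual(arcs):
    """Number of reciprocated dyads: both i->j and j->i are present."""
    return sum(1 for (i, j) in arcs if i < j and (j, i) in arcs)

def count_transitive_triples(arcs):
    """Number of ordered triples (i, j, k) with i->j, j->k and i->k."""
    out = {}
    for i, j in arcs:
        out.setdefault(i, set()).add(j)
    total = 0
    for i, j in arcs:
        for k in out.get(j, ()):
            if k != i and (i, k) in arcs:
                total += 1
    return total

N_NODES = 122  # from the example above
N_ARCS = 300   # hypothetical: the tie count is not given in the lecture

g = random_digraph(N_NODES, N_ARCS, seed=1)
print(len(g), count_mutual(g), count_transitive_triples(g))
```

Running the same two counting functions on the observed arc set gives the numbers to set side by side with the simulated ones.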
Notice that we have the same number of actors and the same number of arcs in this published study. Notice that starting with reciprocated arcs, transitive triads, and in- and out-two-stars, the numbers start to differ. For the random network, the number of reciprocated (mutual) ties is only six, whereas in the network that the study is talking about, we have 44. Then the numbers differ even more for transitive triads: it's 53 versus 212. See how, with the same number of actors and the same number of ties, we can get very different network structures. We compare our network to a random network generated with some of the same parameters, and we try to see how different our network is from random. In the process, if we determine that our network is not different from random, then there is nothing interesting about that network. A random graph distribution here is the set of all possible graphs, in this case on 38 nodes, with a probability assigned to each graph. We can start with a uniform distribution of graphs with 44 edges: each graph has exactly 44 edges, and each graph has equal probability. Or we can have a Bernoulli distribution of graphs, where each edge in the graph occurs independently with a fixed probability. It's like tossing a coin many times. We can get the probability for each graph. For 38 nodes, if we make this probability 0.06259, then across the entire distribution the average density will be 0.06259, and the average number of edges will be the 44 that we had. Now that's what the distribution of graphs would look like. Notice how we actually get different numbers of edges from graph to graph. The mean in a simulated sample can differ a little from 44, and we have a different distribution than in the uniform case. A Bernoulli graph is only conditional on the expected number of edges. Effectively, we ask the question: what is the probability of observing the graph we have, given the set of all possible graphs with the same expected number of edges? Now that may not be a very good question.
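The arithmetic behind the 38-node example can be checked with a short Python sketch. It assumes the graph is undirected, which is the reading under which 0.06259 times the number of pairs comes out at 44. The function names, the seed, and the 2000 draws are illustrative choices, not part of the original example.

```python
import random
from math import comb, log

N_NODES = 38
P_EDGE = 0.06259  # tie probability from the example above

n_dyads = comb(N_NODES, 2)         # 703 unordered pairs of nodes
expected_edges = P_EDGE * n_dyads  # approximately 44

def bernoulli_edge_count(n, p, rng):
    """Toss one independent coin per pair and count the edges that appear."""
    return sum(1 for _ in range(comb(n, 2)) if rng.random() < p)

def log_prob_of_graph(m_edges, n, p):
    """Log-probability of one specific graph with m_edges edges under Bernoulli(p)."""
    return m_edges * log(p) + (comb(n, 2) - m_edges) * log(1 - p)

rng = random.Random(0)
counts = [bernoulli_edge_count(N_NODES, P_EDGE, rng) for _ in range(2000)]
mean_edges = sum(counts) / len(counts)

# The edge count varies from draw to draw, while its average stays near 44.
print(n_dyads, round(expected_edges, 2), round(mean_edges, 2), min(counts), max(counts))
print(log_prob_of_graph(44, N_NODES, P_EDGE))
```

In the uniform distribution every graph has exactly 44 edges; here only the average is pinned down, which is the contrast the paragraph above draws.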
Think of how many possible networks we can draw that would have nothing to do with our network. We might instead want to condition on the degree distribution, let's say of ties sent or received, or on all graphs with a particular dyad distribution: the same number of mutual, asymmetric, and null dyads. Closed-form solutions for some graph statistics, like the triad census, are known when we condition on out-degree, in-degree, or MAN (mutual, asymmetric, null), but not on all three simultaneously. Once we move beyond simple random graph models, we introduce dependencies among network tie variables. These express various types of network self-organization. A dependence assumption picks out certain types of network patterns, that is, the network configurations that are possible in the model. In other words, we assume that the network is built up of these configurations. For example, let's say we have a certain number of triads of a certain type. We can actually condition on that one configuration and say, let's build all the random networks with the same types of triads. Will those random networks be different? Of course, because ties would be formed between different actors; it is just that some of the configurations, specifically the triad configurations, would be the same. There are actually four generations of dependence assumptions. The first is Bernoulli graphs: network variables are independent of each other. The second, for directed graphs, is dependence within dyads. Then there is Markov dependence, where network variables are conditionally independent unless they share at least one node. And there is social circuit dependence, where network variables are conditionally dependent if they create a four-cycle. There's also dependence arising from actor attributes. We have talked about the fact that in social network analysis, attributes are paramount to our understanding of why networks form. We can actually condition on certain attributes of nodes to see if they're responsible for the tie formation.
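As an illustration of conditioning on the dyad distribution, the following Python sketch counts the mutual, asymmetric, and null dyads of a directed graph and then draws a random graph with the same counts. The function names and the tiny four-node example are invented for illustration. The sampler simply picks which pairs are mutual and which are asymmetric at random, then flips a coin for the direction of each asymmetric tie.

```python
import random
from itertools import combinations

def dyad_census(n, arcs):
    """Return (mutual, asymmetric, null) dyad counts for a directed graph."""
    mutual = asym = 0
    for i, j in combinations(range(n), 2):
        ties = ((i, j) in arcs) + ((j, i) in arcs)  # 0, 1 or 2 arcs in the dyad
        if ties == 2:
            mutual += 1
        elif ties == 1:
            asym += 1
    return mutual, asym, n * (n - 1) // 2 - mutual - asym

def sample_given_man(n, mutual, asym, seed=None):
    """Draw a random directed graph with the given mutual and asymmetric counts."""
    rng = random.Random(seed)
    dyads = list(combinations(range(n), 2))
    chosen = rng.sample(dyads, mutual + asym)  # randomly ordered, no repeats
    arcs = set()
    for i, j in chosen[:mutual]:               # these dyads become mutual
        arcs.update([(i, j), (j, i)])
    for i, j in chosen[mutual:]:               # these get one arc, random direction
        arcs.add((i, j) if rng.random() < 0.5 else (j, i))
    return arcs

observed = {(0, 1), (1, 0), (1, 2), (3, 2)}    # hypothetical toy network
m, a, nul = dyad_census(4, observed)
simulated = sample_given_man(4, m, a, seed=3)
print((m, a, nul), dyad_census(4, simulated))
```

The simulated graph shares the dyad census with the toy network, but the ties generally fall between different actors, which is the point made above about conditioning on a configuration.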
Let's look again at the dependence assumptions. Suppose the edges are conditionally dependent if and only if they share a node. That sounds simple, but the number of structures that arise from that assumption is actually quite large. Frank and Strauss in 1986 showed that the configurations for this model comprise edges, stars, and triangles. An edge is just a tie within a dyad. Stars can be two-stars, three-stars, four-stars, and so on, and then of course there are triangles, which we know already. What we need is a statistical model for networks that can be parametrized so that important effects in the data are not extreme when the model is simulated. In other words, if we have those two-, three-, and four-stars in our network, we should be able to reproduce them in the random network generation. Important effects here means the presence of small subgraphs, called configurations, that we can reproduce. Exponential random graph models, or ERGMs, can provide such models for a range of configurations relevant to social network theory. ERGMs are now widely used to model social networks and now exist with a variety of extensions. Now that we have all of our foundation in order, let's move to one of the most important and most exciting families of models: exponential random graph models.
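Before moving on, the Markov configurations named above (edges, stars, and triangles) can be counted directly. This is a minimal Python sketch for an undirected graph; the function name, the dictionary keys, and the four-node example are illustrative assumptions. A k-star is counted as any set of k edges meeting at one node, so a node of degree d contributes "d choose k" of them.

```python
from itertools import combinations
from math import comb

def markov_counts(n, edges, max_star=4):
    """Count edges, k-stars (k = 2..max_star) and triangles in an undirected graph."""
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    counts = {"edges": sum(len(nbrs) for nbrs in adj.values()) // 2}
    for k in range(2, max_star + 1):
        # a node with degree d is the centre of comb(d, k) k-stars
        counts[f"{k}-stars"] = sum(comb(len(adj[v]), k) for v in adj)
    # visit each triple i < j < k once and check that all three edges exist
    counts["triangles"] = sum(
        1 for i, j, k in combinations(range(n), 3)
        if j in adj[i] and k in adj[i] and k in adj[j]
    )
    return counts

toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2)]  # hypothetical four-node network
print(markov_counts(4, toy_edges))
# {'edges': 4, '2-stars': 5, '3-stars': 1, '4-stars': 0, 'triangles': 1}
```

Counts like these, computed on the observed network and on simulated ones, are what we compare when we ask whether a model reproduces the important effects in the data.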