In previous modules we talked about hierarchical data structures. Different places where you might find hierarchical data, whether it's through your file structure, through genomes, through family trees. And then we talked about, sort of, no-link diagrams for visualizing hierarchical data. As well as methods for taking any, sort of, generic data set and doing a hierarchical clustering on it. In this module we want to talk about another way to represent hierarchical data through the use of what we call a treemap. So previously we talked about node-link diagrams. And some of the problems with the node-link diagrams were, essentially, all of the, sort of, white space that might be left over in an image. Especially if the tree gets really long one direction we wind up with all of this, sort of, whitespace left on the screen. So people started thinking about what might be a space-filling representation. And so a space-filling representation, one idea was to create an icicle plot and so an icicle plot is shown here on the right. And how do we interpret this icicle plot. Well, each level is a level in our node-link diagram so this top level has just one bar so that's our root node. So let's see if we can [COUGH] understand how this looks. So our root node here. And then on the next level we have three nodes one, two, three. So that's how we have three splits here so one, two, three. And then the next split is one, two, three here so we have one, two, three. Again, under here we have one, two, three, one, two, three and here we have one, two. And now, with an icicle plot you might notice some of these bars are longer than the others. So in icicle plot we might encode different information to the hierarchy with respect to length, or width, or those sorts of things. Now, of course, challenges might be depending on what I'm encoding the data with. It could be that one of these becomes really, really small, it could be that one part of the tree goes deeper than the others. So I still wind up with [COUGH] elements that aren't there and being filled in and so forth. So people have been trying to come up with different ideas for space-filling representation. And the most popular became the treemap and this a space filling representation developed by Johnson and Schneiderman. Where children are drawn inside of their parent, essentially they alternate horizontal and vertical slicing at each successive level of the tree. Similar to, sort of, how we thought about those mosaic plots in our statistical graphics. But here we use area to encode other variables of data items, we might use color, we might be able to add text. So let's take a look at our picture here. This is [COUGH] a data wheel via Wikipedia, Wikimedia Commons so lets think about how we can interpret this. So first, we have to try to find our splits so remember, it alternated between horizontal and vertical slices. So the only place I see a split here with a vertical slice is right here, it splits our data into two sides. So we've got our root node and we've got our two sides. Okay, so let's just start with this side first and let's call this side A so this is A. So now, we had a vertical slice now, we want to look for horizontal slices that go all the way across. So we've got that one. So this also has two nodes and this splits between cashier and miscellaneous managers. So we've got cashier and we've got miscellaneous managers. Let's go ahead and look at this side too, as well. This side had one horizontal split because, remember, we're alternating between horizontal and vertical. So we get, again, two nodes. And one node is this, I got to erase this so I can read this, is the driver sales. So sales and this other one doesn't have a label yet. Okay, so let's continue down that tree a little bit so let me get some space here on my screen. To draw this a little bit better. Okay, so interpretation of a treemap. So, first, we have to decide if the first slice was horizontal and vertical. So I only see one vertical slice slicing at my whole tree so that's my first cut. So my tree gets split into two elements, this is my blue green side and this is my yellow purple side. Okay, so on the blue green side, since we did vertical first, now we do horizontal. And so this has two nodes where this is the miscellaneous and the cashiers. On the yellow purple side our vertical or horizontal slice goes there so we wind up with the drivers as a node. But notice this node has two elements, but the next vertical slice splits it there so we can go here like this. So we've got cooks and we've got construction. And we can continue going back and forth between these vertical and horizontal flips to get the whole hierarchical structure from the treemap. And now, really though, we want to be able to take a structure like this and go to this. And you want to be able to read a structure like this in this sort of manner, all right. So we want an algorithm to be able to draw this and the reason we talked about interpretation is because you want to be able to read the treemap. So with the treemap algorithm what we're going to do is first, take the root. And we're going to decide if our first cut is going to be horizontal or vertical, then we take our square and we split. So let's say we have a tree that looks. Like this. So here's our screen space so let's make our treemap. So we start with the parent and let's say we decide that our first cut is going to be horizontal. Okay, so we have three children so one, two, three, okay. So now for this level we're going to do vertical cuts so first we look at this guy so that's this box right here. So now we're going to do vertical cuts, it has two children so we're going to split that into two. Okay, this one had no children so no cuts are necessary. And this one had two children so, again, we split this into two. And, now, where we put this split may depend on different properties of the nodes. For now we're not going to worry about that, we're just walking through an example. Okay, so now we split it and we flip, again. We go to horizontal so we split this into two nodes so we've got our node here and our node here. So this node had two children, this node had zero children so we have our split. This one didn't have any depth, this one didn't have any further children. And so we wind up with our treemap that looks like this for this data source. And as all we're doing is just flipping back and forth between horizontal and vertical cuts. Now, we can also add some embellishments to our treemap, we can do a nested versus a non-nested tree-map. So, for example, here is a traditional tree-map. And, remember, the size corresponds to different properties of elements. And nesting is trying to add some sort of border here to give some embellishments to the visualization to make things pop out a little bit easier. And you can see the difference between the non-nested tree-maps on the left and the nested tree-maps on the right. And we can use the tree-map idea in a variety of domains. So people have used this in file directory structures, sports analysis, stock market data, software diagrams. And you might think to yourself, well, it makes sense for me for file directories, but what about sports analysis. Well, again, with sports you've got your game. Inside of a game you might have, so let's think about football for example, so you've got quarter one, quarter two, quarter three, quarter four. And inside of quarter one you may have different scores that occurred by team or by other things. So you can break this down into different chunks that way. This works for tennis, soccer, all sorts of different things. Stock market data, you might hierarchically organize the stocks into different commodities, [COUGH] show them that way. So here's an example of visualizing a tennis match from Jin and Banks TennisViewer. So this was published in IEEE Computer Graphics and Applications. So we've got the match level here at the top, and then the set level, and the game level. So they're going to analyze reviews and browse a tennis match using a space-filling treemap to show the competition tree. And the lens can show us shot patterns, color can encode the players. And so we wind up with this composite of game, sets, and match. And so we can get this nice tree view for showing even different things like sports. So we can start thinking about how we might be able to use this space-filling visualization for other sorts of problems. And here we can see the final of the Golden Triangle Open from Bond versus Michel. We could have, sort of, different overviews and cutouts to allow zooms and features for browsing. We can use different colors to represent different players to show who won. The different sizes may represent the number of serves or things like this. And so we can think about how we can encode all of our different visual variables into these elements to help us explore and understand our data. [COUGH] The benefits of a treemap is this is a good representation of two attributes beyond node and link, we can nicely encode color and area. The problem is it's not quite as good at representing structure. People have to think quite a bit about how to remap the tree back to its hierarchical structure. And if the tree is perfectly balanced you can get weird sorts of things in the treemap. So, for example, if I do a perfectly balanced tree and my splits always wind up like this, it becomes very difficult to reinterpret that. Likewise, we can also get really long and thin aspect ratios. The borders can help on a small tree, but they take up too much area on large and deep trees. So there's pros and cons. And the main thing you want to think about for treemaps is also even though you can apply a treemap to different representations. Say for example, in Tableau, Tableau will let you make a treemap out of almost any data types. It doesn't always make sense. And so I see treemaps being used with a lot of data that doesn't actually have some sort of underlying hierarchy there. It really then just becomes some sort of odd space filling plot that doesn't map to any sort of hierarchy. So we need to understand the tradeoffs, and benefits, and when to use a treemap. How to draw them and what are some of the different constraints. So in the next module we'll go briefly through some advanced treemap algorithms that you can go online and find code for and try out yourself.