We are now at the module where we will discuss computing resources. You may have experienced, if you are following the honors track, that some of the exercises become very heavy computationally, and that you run into problems doing the actual computations efficiently and quickly, because the size of the problems grows very fast; or you may have experienced this in your own work, in your own laboratory. So this is where we need to address the issue of computing resources, which can be a matter not only of having the computational power to treat the problems, but also of having the memory to store the entire volumetric dataset that you are doing computations on. Not only that: sometimes, as you heard in the case of the North Sea chalk, you need to do many experiments to get a statistically significant dataset that can represent your large volume of material, and to do that you not only need to do the experiments many times, you also need to do the analysis many times. This you would rather not do manually, one by one, but in an automated fashion, in something we typically refer to as a workflow. Brian Vinter from the Niels Bohr Institute at the University of Copenhagen has some more details about what we mean when we talk about workflows.

We might have thousands or hundreds of thousands of images, and a lot of you will have tried to read in an image, run the script, take the next image, run the script, and we end up waiting for hours, or even days, running through our whole stack of images in the 3D volume. We don't want that repetitive work, we need to automate it, and that is known as workflows. A workflow is also referred to as a pipeline; it is the same thing: we put something in, a lot of things happen to it, and at the end we get our results out. Now, if we put thousands of images into our pipeline, we will have thousands of results coming out.

An example of such a workflow is the one that has been developed for the North Sea chalk case that you heard about earlier. Here you can see an outline of how this workflow or pipeline may look for such a system. Up in the left corner we have the projection reconstruction module, where the raw data come in and are reconstructed into three-dimensional volumes. This is typically quite a computationally heavy step that requires a lot of computing resources, especially in terms of data volume as well. Depending on which reconstruction method you are using, it may require more or less in terms of CPU or graphics processing unit (GPU) power: traditional filtered back projection does not require that much computational power, whereas the algebraic reconstruction techniques, where you put known information about your sample into the reconstruction model, for example to suppress noise, require a lot more. So this can be something that requires a lot of computing power. The next step is the image processing and segmentation part, where the 3D volumes, as you learned about in the segmentation module, are divided and labeled into domains that we allocate to certain phases. This gives us a segmented dataset, which, by the way, may or may not require a lot of computing resources depending on whether the segmentation is done, for example, on 2D slices or on the complete three-dimensional volume.
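To make the difference between filtered back projection and the algebraic techniques a bit more concrete, here is a minimal Python sketch that reconstructs a single slice with scikit-image's iradon (filtered back projection) and iradon_sart (an algebraic, iterative technique). The sinogram shape, the angle list and the random stand-in data are assumptions for illustration only; this is not the actual chalk workflow code.

    import numpy as np
    from skimage.transform import iradon, iradon_sart

    # Stand-in data: one sinogram of shape (detector_pixels, n_angles).
    # Real data would be the ~1,800 measured projections per sample.
    angles = np.linspace(0.0, 180.0, 1800, endpoint=False)
    sinogram = np.random.rand(256, 1800)

    # Filtered back projection: a single, relatively cheap pass over the data.
    fbp_slice = iradon(sinogram, theta=angles)

    # SART (an algebraic technique): each call is one iteration, and several
    # iterations are usually needed, which is why it costs much more compute.
    sart_slice = iradon_sart(sinogram, theta=angles)
    sart_slice = iradon_sart(sinogram, theta=angles, image=sart_slice)

For a full 3D volume, the algebraic route multiplies this cost by the number of slices and iterations, which is exactly where GPU power and a lot of memory become important.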
Once the segmentation is done, we actually have a reduced data volume, because the segmented volume requires less data to represent, and this means that from the segmented dataset we can relatively easily extract metrology parameters such as sizes, length scales and porosity. Whereas if we go in the other direction, into the module we call modelling, which is addressed in the final week of the course, the computational power required typically increases a lot again, because some of the modelling calculations are very heavy. In the following, [inaudible], representing the North Sea chalk case, will give a bit more detail about what is happening in the reconstruction module, in the segmentation module, and finally in the module where we extract parameters from our data.

What we have worked with is also to automate the processes. We have not developed all the different pieces ourselves; some we have developed further, or tested, or tried to change or tweak towards how we would like them to be. Here is a small schematic of the procedure for the fully automated data handling. You can see up in the corner where you have the projection data; this is what we collected at the synchrotron, where we get about 1,800 projections for each sample as we rotate it around, and essentially that is almost all we do. We put that into the computer, tell it where the data is, and then the procedure runs on its own. So, in order to handle all of that data, and to be able to do it in a reasonable time, we follow this data pathway, putting in the raw projections as measured on our samples with the X-rays, and then the whole path works on its own, providing us with all of the properties in the end. In the first part of the pathway we reconstruct the data; the reconstructed 3D image is then passed on to the next module, which you see roughly in the middle here, called image processing, where we try to remove artifacts in the image. These can be ring artifacts, or different lines, or also salt-and-pepper noise, which is not good when we want to do the segmentation into what is mineral and what is pores; if we don't remove them, we will make some mistakes there. After having done this image processing, we try to span the biggest box we can inside the sample, one that is fully encompassed by it, so we don't have air outside the sample as part of our image, which would otherwise ruin our porosity analysis afterwards. Having done this and having done the segmentation, we can start calculating all the properties.

The last part that Henning referred to here, the extraction of parameters describing the material and the modelling based on the extracted 3D volumes, is something we will return to in the modules covered in the last week of the course, on modelling. In the beginning of Henning's talk you heard him say that this workflow is assembled from a number of modules created in different groups, not only their own. This is very similar to what you may have experienced if you are following the honors track: when you are using, for example, Python programming, you are also drawing on a number of packages made by different people, and very much the same ideology is followed in the creation of these workflows. Brian has a few more words on this.

So, think of it as a meta-program: it is a program specified as a higher-level description of your workflow.
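As a rough sketch of what such a meta-program could look like in Python: the workflow is written down as an ordered list of named steps that is then applied to every sample automatically. The module name chalk_workflow and the functions reconstruct, clean, segment and extract_parameters are hypothetical placeholders standing in for the real modules, not the actual code behind the chalk pipeline.

    import glob

    # Hypothetical placeholders for the real modules developed by the different groups.
    from chalk_workflow import reconstruct, clean, segment, extract_parameters

    # The workflow as data: an ordered list of named steps (the "meta-program").
    PIPELINE = [
        ("reconstruction", reconstruct),               # projections -> 3D volume
        ("image processing", clean),                   # remove ring and salt-and-pepper artifacts
        ("segmentation", segment),                     # label voxels into phases
        ("parameter extraction", extract_parameters),  # porosity, sizes, length scales
    ]

    def run_pipeline(projection_file):
        data = projection_file
        for name, step in PIPELINE:
            data = step(data)  # each module consumes the previous module's output
            print(f"{projection_file}: finished {name}")
        return data

    # Apply the same chain to every measured sample instead of running the steps by hand.
    results = {path: run_pipeline(path) for path in glob.glob("samples/*_projections.h5")}

The point is not these particular function names but that the whole chain, once written down, runs unattended over thousands of datasets.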
There are many workflow tools out there, and there are many textbooks on how to work efficiently with workflows. In our world, where we have reasonably small teams doing the same thing, it is normally rather easy to do workflows. If you are in another field, where one person might write a workflow and thousands of people may use it, it actually becomes equivalent to writing new programs.

As you may have understood from Henning's presentation and from Brian's words, you may easily end up working with a workflow that has been created by many different scientists, from fields other than the one you are working in yourself, and you can apply it to your data and get results directly. But of course it is important to understand what is happening in the various steps of the workflow, in order to be able to recognize when something is going wrong, or when you are getting results that may not actually be representative of the volume you are investigating. In that case, once you have identified the problem, you of course need to go back to the relevant field experts or programmers, whoever made the various modules, talk to them, and figure out what needs to be adjusted and modified to make the workflow appropriate for your particular case. In the next video, we will hear more from Brian about how such modules in the workflow are created in an efficient way.
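As a small illustration of the kind of check that helps you notice when a workflow step is going wrong, here is a minimal sketch of a segmentation step with a built-in sanity test on the resulting porosity. It assumes the reconstructed grey-level volume is available as a NumPy array, uses a median filter against salt-and-pepper noise and a global Otsu threshold as a simple stand-in for the real segmentation module, and the porosity bounds are purely illustrative.

    import numpy as np
    from scipy.ndimage import median_filter
    from skimage.filters import threshold_otsu

    def segment_and_check(volume, porosity_bounds=(0.05, 0.6)):
        # Suppress salt-and-pepper noise with a small 3D median filter.
        filtered = median_filter(volume, size=3)

        # Simple global Otsu threshold: bright voxels -> mineral, dark -> pores.
        solid = filtered > threshold_otsu(filtered)

        # Porosity is the fraction of voxels classified as pore space.
        porosity = 1.0 - solid.mean()

        # Crude sanity check so a silently failing step does not go unnoticed;
        # the bounds are illustrative, not a material property.
        low, high = porosity_bounds
        if not low < porosity < high:
            raise ValueError(f"Porosity {porosity:.2f} outside the expected range; inspect this step")
        return solid, porosity

If a check like this fails, that is the point where you go back to whoever wrote the module and work out what needs to be adjusted.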