This course covers the design, acquisition, and analysis of Functional Magnetic Resonance Imaging (fMRI) data. A book related to the class can be found here: https://leanpub.com/principlesoffmri


From the course by Johns Hopkins University

Principles of fMRI 1


From the lesson

Week 4


- Martin Lindquist, PhD, MSc, Professor of Biostatistics, Bloomberg School of Public Health | Johns Hopkins University
- Tor Wager, PhD, Department of Psychology and Neuroscience, The Institute of Cognitive Science | University of Colorado at Boulder

Hi, in this module we're going to keep talking about group-level analysis, but we're going to focus on the more statistical aspects of the problem. Whenever we apply statistics to real-world problems, we usually distinguish between the model used to describe the data, the method of parameter estimation, and the algorithm used to obtain the estimates. The model uses probability theory to describe the parameters of some unknown distribution that is thought to be generating the data. The method defines the loss function that's minimized in order to find the unknown model parameters. Finally, the algorithm defines the manner in which the chosen loss function is minimized. Each of these components comes into play whenever we're doing a statistical analysis, and that's going to become important in this module.

Before performing a group analysis, we have to perform several preprocessing steps in order to ensure the validity of the results. These preprocessing steps include motion correction, which is intra-subject registration; spatial normalization, which is inter-subject registration; and a bit of spatial smoothing to overcome limitations in the spatial normalization. Motion correction is always important when we're doing analysis, but here spatial normalization is critical, because we need the voxels to align across subjects. If I take a specific voxel at a specific location in one subject, I want to be able to obtain the same voxel in another subject, so the brains have to be aligned to one another. However, spatial normalization isn't perfect, so we need a degree of spatial smoothing in order to overcome its limitations. All of these preprocessing steps should be performed before a group analysis.

When performing group analysis, we often use multi-level models, as Tor described in the previous modules, and they are often performed in two levels. The first level deals with the individual subjects, and the second level deals with the group of subjects. Here's a little cartoon of that: we have a first-level model for each individual subject, and then a second-level model where we combine the subject data. And again, all inference is typically performed in this massively univariate setting, which is why it's so important that all the brains are aligned to one another: we want the voxels to mean the same thing across subjects.

For the first-level model, we basically go back to the traditional GLM approach that we used when we were talking about single-subject data; it's going to be the same thing here. Suppose we have data from n different subjects, and for each subject we use the following model: Y_k = X_k beta_k + e_k. This is just a standard GLM analysis of the kind we talked about in the single-subject setting, but now we put a subscript k on everything to indicate which subject we're interested in. So again, X_k is the design matrix for that subject, and it may look like this. In the first level we have autocorrelated data, but with a relatively large number of observations.
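As a minimal sketch of the first-level fit (the design matrix and data here are made up for illustration, not from the course; a real analysis would use a convolved task regressor and handle autocorrelation), one subject's model Y_k = X_k beta_k + e_k can be estimated by ordinary least squares:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative first-level model for one subject k: Y_k = X_k beta_k + e_k.
T = 100                                   # number of time points (scans)
X_k = np.column_stack([np.ones(T),        # intercept column
                       rng.random(T)])    # stand-in for a task regressor
beta_true = np.array([1.0, 2.0])          # hypothetical subject-level betas
Y_k = X_k @ beta_true + rng.normal(scale=0.5, size=T)

# OLS estimate beta_hat = (X'X)^{-1} X'Y; lstsq is the stable way to compute it.
beta_hat, *_ = np.linalg.lstsq(X_k, Y_k, rcond=None)
print(beta_hat)  # close to [1.0, 2.0]
```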

And so, we can combine the first-level models for all subjects by concatenating all the data across subjects, from Y_1 up to Y_n, into one matrix Y; we do this purely for bookkeeping purposes. We can likewise combine all the design matrices for the subjects into a grand design matrix X for the whole sample, and we can do the same with beta and the noise e. The variance-covariance matrices can also be combined in the same manner. So now Y, X, beta, e, and V contain information about every subject in our study, and the full first-level model using this notation can be written as Y = X beta + e. Note that this model is separable, and it's possible to fit each subject's data individually; the reason we keep it in this form is simply bookkeeping.
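Purely as a bookkeeping illustration (two tiny hypothetical subjects, dimensions invented), the grand design matrix is block-diagonal, with zeros off the per-subject blocks, which is exactly why the stacked model stays separable:

```python
import numpy as np
from scipy.linalg import block_diag

# Two hypothetical subjects, each with their own T x p design matrix X_k.
T, p = 4, 2
X1 = np.arange(T * p, dtype=float).reshape(T, p)
X2 = X1 + 100.0

# Grand design matrix X = blockdiag(X1, X2): the off-diagonal blocks are zero,
# so Y = X beta + e still decomposes subject by subject.
X = block_diag(X1, X2)
print(X.shape)  # (8, 4)
```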

Now, once we have the first-level model fixed like this, we can move to the second-level model. Here we take the beta parameters estimated for the different subjects and tie them together in a group setting. The second-level model can be written as beta = X_g beta_g + eta. Here X_g is a new design matrix, the group-level design matrix; beta_g is the vector of group-level parameters; and eta is normally distributed with mean 0 and variance-covariance matrix V_g, the group-level variance. For example, the second-level design matrix might separate cases from controls. In the second-level model we typically have IID data, so usually the errors are independent, but we have relatively few observations; maybe we only have 20 subjects or so.

And so, here's an example of what the design matrix might look like if, say, we had four subjects: two cases and two controls. We might have that beta_g0 is the amplitude for cases and beta_g1 is the amplitude for controls. In this case we want to separate cases and controls and get separate estimates for each, and this is the way we might set up the design matrix.
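For that four-subject example, a sketch of the group design matrix might look like this (the subject ordering, cases first, and the numeric betas are assumptions for illustration):

```python
import numpy as np

# Rows: subjects 1-4 (first two cases, last two controls; ordering assumed).
# Columns: [case amplitude beta_g0, control amplitude beta_g1].
X_g = np.array([[1.0, 0.0],
                [1.0, 0.0],
                [0.0, 1.0],
                [0.0, 1.0]])

# With beta = X_g beta_g + eta, OLS on this design simply averages
# the cases and the controls separately.
beta = np.array([2.0, 4.0, 1.0, 3.0])   # hypothetical first-level estimates
beta_g_hat, *_ = np.linalg.lstsq(X_g, beta, rcond=None)
print(beta_g_hat)  # [3.0, 2.0] -> case mean, control mean
```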

So, the second-level model relates the subject-specific parameters, contained in beta, to the population parameters, which we're calling beta_g here. It assumes that the first-level parameters are randomly sampled from a population of possible regression parameters. This assumption is what eventually allows us to generalize the results to the whole population, and that's what we want to do.

Okay, so let's depict this problem pictorially. Suppose that we have a population of subjects that we want to study, and suppose that each subject has a specific beta parameter associated with them. If we look at the distribution of this beta parameter across the entire population, we might get a normal distribution like the one seen here. Let's assume that this normal distribution has mean beta_g and variance sigma-squared_g. So beta_g is the population average, and sigma-squared_g describes the variance in the population. These are the parameters that we want to be able to estimate. However, in general we don't have access to this distribution, so what we need to do is take a random sample from it. Let's say that we have a study where we take a random sample of seven subjects, and that the seven subjects included in our study take the beta values depicted by the red crosses here.

Now, some of these are people with low values of beta, on the left-hand side; some have high values of beta, on the right-hand side; and some are in the middle, around the average value. Here's how this maps onto the first-level results: a person whose beta value is beta_1 is going to tend to be a low responder. Their beta value is going to be smaller, and there's going to be less activation from that subject. If you look at another person with a higher beta value, they're going to tend to have a higher amplitude. So each subject has their own amplitude, but these amplitudes are drawn from a larger population. Here, that population is described by the parameters beta_g and sigma-squared_g, and these are the parameters we want to make inference about. By making inferences about the population parameters, we can extend our conclusions to the whole population of people that we're interested in. That's one of the powers of these multi-level models: rather than making inferences only about the subjects in our study, we're also making inferences about the people who aren't in our study, by assuming that there is a population distribution and that the people in our study are a random sample from it.

So statistically, we can now summarize our entire model by the first-level model, which is just Y = X beta + e, and the second-level model, which is beta = X_g beta_g + eta. So we have the first-level model for the individual subjects and the second-level model with the group parameters. This model can be expanded further to incorporate more levels if we have, for example, multiple sessions per subject. The two-level model can also be combined into a single-level model: if we take the first-level model and substitute in the second-level expression for beta, where beta = X_g beta_g + eta, we can re-express the full model as Y = X X_g beta_g + X eta + e. This is what in statistics we call a mixed-effects model. In general we can write this as a single model, saying that Y follows a normal distribution with the corresponding mean and variance. In statistics there are a lot of different ways of estimating such mixed-effects models.
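Written out, the substitution described above gives the following single-level form (this is the standard collapse of a two-level Gaussian model; the slide's exact notation is assumed to match):

```latex
% First level:  Y = X\beta + e,              e    \sim N(0, V)
% Second level: \beta = X_g \beta_g + \eta,  \eta \sim N(0, V_g)
Y = X X_g \beta_g + X\eta + e
\quad\Longrightarrow\quad
Y \sim N\!\left(X X_g \beta_g,\; X V_g X^{\top} + V\right)
```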

So now we come back to the terminology that we talked about before. We have the model; now we have to figure out a way to estimate it, and this is where the statistical methods, and also the algorithms, come into play. Again, the statistical methods define the loss function that should be minimized in order to find the parameters of interest in our model. Just like when we were talking about the single-subject GLM, commonly used methods include maximum likelihood estimation and restricted maximum likelihood estimation. Algorithms define the manner in which the chosen loss functions are minimized; commonly used algorithms here include Newton-Raphson, Fisher scoring, the EM algorithm, and IGLS/RIGLS.

So let's talk about some of these statistical methods. Maximum likelihood basically maximizes the likelihood of the data, and as we've discussed, it produces biased estimates of the variance components. Restricted maximum likelihood, on the other hand, maximizes the likelihood of the residuals, and this produces unbiased estimates of the variance components. So typically in our group-level models we want to use restricted maximum likelihood as the loss function if we want to get unbiased estimates of the variance components.
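The bias ML induces in variance estimates shows up even in the simplest case, estimating a variance around a sample mean: the ML estimate divides by n, while the restricted (REML-style) estimate divides by n - 1, correcting for the degree of freedom spent on the mean. A toy numeric illustration (not the full mixed-model machinery, just the analogy):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])   # toy sample, n = 4
n = x.size

# ML variance estimate (divides by n): biased downward because it ignores
# the degree of freedom used to estimate the mean.
var_ml = np.var(x, ddof=0)           # = 1.25

# REML-style estimate (divides by n - 1): unbiased for the population variance.
var_reml = np.var(x, ddof=1)         # = 5/3

print(var_ml, var_reml)
```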

Now, as for the algorithms, there's a whole cottage industry of different algorithms that we can use to find the maximum likelihood estimate or the restricted maximum likelihood estimate, and many of them are common across disciplines. For example, Newton-Raphson is an iterative procedure that finds estimates using the derivative at the current solution. Fisher scoring is very similar to Newton-Raphson, but finds the estimates using the Fisher information. And finally, we have the EM algorithm, which is also an iterative procedure that finds estimates for models that depend on unobserved latent variables, for example the second-level random effects.

In general, how this is done depends on what software package you use. Different neuroimaging software packages have implemented different types of mixed-effects models, like the ones we have discussed, and they differ in which method and algorithm they ultimately apply. However, as Tor mentioned, a simple non-iterative two-stage least squares approach is what's most commonly used in fMRI data analysis. This is the so-called summary statistics approach: results from the individual subjects are reused in the second level, and this reduces the computational burden of fitting a full model.

So just to summarize the summary statistics approach: we fit a model to each subject's data, then we construct contrast images for each subject, and then we conduct a t-test on the contrast images. Note that only the contrasts are recycled from the first-level models, and only one contrast can be estimated at a time. This makes a number of assumptions, but if those assumptions hold true, this is a very simple and straightforward way of doing a multi-level model, one which circumvents some of the computational difficulties of fitting a full mixed-effects model.
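The three steps above can be sketched end to end on simulated data (everything here, the designs, the contrast, the group effect size, is invented for illustration): fit each subject's GLM, compute one contrast value per subject, then run a one-sample t-test across subjects.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

n_subj, T = 20, 60
c = np.array([0.0, 1.0])                 # contrast: task effect only
con_vals = np.empty(n_subj)

for k in range(n_subj):
    # First level: Y_k = X_k beta_k + e_k, fit by OLS for each subject.
    X_k = np.column_stack([np.ones(T), rng.random(T)])
    beta_k = np.array([1.0, rng.normal(2.0, 0.5)])   # beta drawn from a group
    Y_k = X_k @ beta_k + rng.normal(scale=0.3, size=T)
    beta_hat, *_ = np.linalg.lstsq(X_k, Y_k, rcond=None)
    con_vals[k] = c @ beta_hat           # one contrast value per subject

# Second level: one-sample t-test on the contrast values (images, in practice).
t_stat, p_val = stats.ttest_1samp(con_vals, popmean=0.0)
print(t_stat, p_val)
```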

When using temporal basis sets at the first level, it can sometimes be difficult to summarize the response with a single number, and this makes group inference difficult, because oftentimes we want to make inference on, say, the amplitude for each subject. But if we use a basis set, the amplitude is going to depend on all of the different basis functions, and that makes second-level analysis very tricky in that setting. One option is to perform the group analysis using just the main basis function. For example, if we're using the canonical HRF and its derivatives, sometimes we just use the amplitude corresponding to the canonical HRF and carry that into the second-level analysis in the summary statistics approach. Another way is to use all of the basis functions and do an F-test. And finally, a third way would be to reparameterize the fitted response: basically, you take the different basis functions, reconstruct the HRF, estimate the magnitude of the reconstructed HRF, and then use that information at the second level. These are all different ways of addressing this problem.

Okay, so that's the end of this module, and this is the last module on group analysis. In the next couple of modules, we're going to be talking about the multiple comparisons problem and how we deal with it in neuroimaging, and fMRI in particular. Okay, see you then. Bye.

Â Coursera provides universal access to the worldâ€™s best education,
partnering with top universities and organizations to offer courses online.