We will now move from analyzing ideal spectra to analyzing real spectra. This is one of the T-Rex spectra. As soon as we come up with an idea of what the peptide that generated this spectrum looks like, we can immediately start annotating peaks in the spectrum. For example, this peak has a mass explained by a prefix peptide of length 10, while this peak has a mass explained by a suffix peptide of length 3. In total, we were able to annotate six peaks in the spectrum using this conjecture about the peptide that generated it, and we say that the "Shared Peak Count" for this spectrum and peptide is 6. This is another idea about the peptide that generated the same spectrum. In this case, the shared peak count is higher: we are able to annotate 10 rather than 6 peaks in the spectrum. Which peptide, the first one or the second one, did Asara propose as the peptide generating DinosaurSpectrum? You may be surprised by the answer, because it may look like the peptide at the bottom is definitely a better candidate for explaining DinosaurSpectrum. These two alternatives bring up the question of how we should score an annotated spectrum. Should we score it by the shared peak count? Or maybe we should simply sum up the intensities of all explained peaks? Both options have disadvantages. For example, the shared peak count ignores intensity. The sum of intensities, on the other hand, may lead to problems when a few large peaks dominate the score. Our goal is to come up with a probabilistic model of spectra in which large peaks contribute to the score but do not dominate it. To address this challenge, mass spectrometrists came up with the concept of a "Spectral Vector," a transformation of a spectrum of mass m into an m-dimensional vector: <s1, ..., si, ..., sm>. The value si, which is known as the amplitude, approximates the likelihood that i is the mass of a prefix of the unknown peptide that generated the spectrum.
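The two scoring options discussed above, the shared peak count and the sum of explained intensities, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of the lecture: masses are integers, a peak is "explained" if its mass exactly equals the mass of a proper prefix or suffix of the candidate peptide (no ion offsets, no mass tolerance), and the spectrum is a dictionary from mass to intensity. The names `MASS`, `fragment_masses`, `shared_peak_count`, and `intensity_sum` are hypothetical.

```python
# Toy table of integer amino acid masses (only a few residues, for illustration).
MASS = {"G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101}


def fragment_masses(peptide):
    """Return the set of masses of all proper prefixes and suffixes of peptide."""
    residues = [MASS[aa] for aa in peptide]
    total = sum(residues)
    masses = set()
    running = 0
    for r in residues[:-1]:
        running += r
        masses.add(running)          # mass of a prefix
        masses.add(total - running)  # mass of the complementary suffix
    return masses


def shared_peak_count(peptide, spectrum):
    """Number of peaks whose mass is explained by a prefix or suffix."""
    explained = fragment_masses(peptide)
    return sum(1 for mass in spectrum if mass in explained)


def intensity_sum(peptide, spectrum):
    """Total intensity of the peaks explained by a prefix or suffix."""
    explained = fragment_masses(peptide)
    return sum(i for mass, i in spectrum.items() if mass in explained)


if __name__ == "__main__":
    # Made-up spectrum: mass -> intensity. The peak at 128 is much larger.
    spectrum = {57: 1.0, 87: 2.0, 128: 10.0, 300: 50.0}
    for candidate in ("GAS", "AGS"):
        print(candidate,
              shared_peak_count(candidate, spectrum),
              intensity_sum(candidate, spectrum))
```

In this made-up example the two candidates differ by one shared peak, yet their intensity sums are nearly equal because the single large explained peak at mass 128 dominates both. That is the kind of behavior a better scoring model is meant to temper.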
You may be wondering how we can transform a spectrum into a spectral vector if we don't know the peptide that generated the spectrum, but that is the subject of a detour that I will not cover in this lecture. To elaborate on the concept of spectral vectors, we will first introduce the simpler notion of a "Peptide Vector." Given a peptide, we will take each amino acid of mass m in this peptide and represent it as an m-dimensional binary vector with m-1 0s at the beginning and a single 1 at the end. Afterwards, we will concatenate the vectors corresponding to the individual amino acids into the peptide vector shown on this slide. You can easily solve the following problem: given a peptide, convert it into a peptide vector. It is clear that we can also solve the reverse problem: given a peptide vector, transform it into a peptide. We will now turn to the more difficult problem of converting a spectrum into a spectral vector. Let's start by analyzing this peak and converting it into an amplitude. In this case, it turns out that the amplitude for this peak is +9. Note that, importantly, +9 is not the intensity of this peak. It reflects the likelihood that this peak will be annotated by a prefix of the unknown peptide that generated this spectrum. Let's repeat this for another peak, a very short one that is hardly visible. Its amplitude will be -5; an amplitude may be negative or positive. For this peak, it is +7, and for this peak, it is +3. We will combine them all into a single spectral vector, which is an integer-valued vector with m coordinates. Typically, the larger the peak at mass i, the larger the amplitude at coordinate i of the spectral vector, but the dependencies between intensities and amplitudes are complex.
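The peptide vector construction described above, and its reverse, can be sketched as follows. This is a minimal illustration and not the lecture's own code: it assumes integer amino acid masses and a toy mass table in which every mass is distinct, and the names `MASS`, `RESIDUE`, `peptide_to_vector`, and `vector_to_peptide` are hypothetical.

```python
# Toy table of integer amino acid masses (distinct masses only, for illustration).
MASS = {"G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101}
# Reverse lookup from mass to amino acid; valid only because masses are distinct.
RESIDUE = {mass: aa for aa, mass in MASS.items()}


def peptide_to_vector(peptide):
    """Concatenate, for each amino acid of mass m, m-1 zeros followed by a 1."""
    vector = []
    for aa in peptide:
        m = MASS[aa]
        vector.extend([0] * (m - 1) + [1])
    return vector


def vector_to_peptide(vector):
    """Recover the peptide by measuring the gap between consecutive 1s."""
    peptide = []
    run = 0  # number of coordinates seen since the previous 1
    for bit in vector:
        run += 1
        if bit == 1:
            peptide.append(RESIDUE[run])  # KeyError if the gap is not a known mass
            run = 0
    return "".join(peptide)


if __name__ == "__main__":
    v = peptide_to_vector("GA")
    print(len(v))                                 # total mass 57 + 71
    print([i + 1 for i, b in enumerate(v) if b])  # prefix masses marked by 1s
    print(vector_to_peptide(v))
```

The 1s in the peptide vector sit exactly at the prefix masses of the peptide, which is why the reverse problem is easy. With a full integer mass table the reverse mapping is not unique, since some amino acids share an integer mass (for example I and L, or K and Q); the toy table here sidesteps that.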