Hi. Welcome back to Sequence and Time Series lectures.

In this lecture, we'll focus on what is known as

the Time Series Motifs Search problem and we'll learn

about algorithms to implement time-series Motif Search.

So first of all, let's remember what a time-series Motif is.

So, in this slide we have three time series,

the first red series,

once again if you remember from the earlier lectures,

is the popularity of the keyword Big Data over time.

So, in this case we are looking at the time series from 2013 up to today,

and we see how the popularity of

the term Big Data changes over time.

The second blue time series is the popularity of the term Machine Learning.

Once again, we track the popularity of

the term Machine Learning over time from 2013 until today.

The third time series, Deep Learning,

which in this case is plotted with a yellow curve,

keeps track of the popularity of the keyword Deep Learning over time.

As you see here,

the time series that we have as examples show certain repeated patterns.

For example, in the case of Big Data,

we see a repeated pattern that occurs over time, almost identically.

We see a similar pattern in the keyword Machine Learning.

So, we see that basically almost around the same timeframe,

the keyword Machine Learning shows a repeated pattern.

We see the same repeated pattern,

but much less strongly, in the keyword Deep Learning.

If you look very closely,

we see that the keyword Deep Learning also shows a similar pattern

at almost the same time as Big Data and Machine Learning,

but not as strongly.

So, for example, while we see very slight patterns earlier,

those patterns are not very strong.

Even at later stages the pattern that we see

is not very easy to identify.

In the case of Big Data,

the pattern is very easy to notice.

In the case of Machine Learning,

the pattern is very easy to notice.

In the case of Deep Learning,

the pattern is not easy to notice, right?

So, these patterns, these repeated patterns,

we call them Motifs.

So, a Motif is essentially nothing but a repeated pattern in a time series.

So, these repeated patterns are often important to understand because usually the data,

the time series that we have,

cover a long period, and a repeated pattern can tell us about

the occurrence of the same event or similar events over time.

So, being able to identify a Motif is important

for detecting the occurrence of similar events over time.

So, Motif Search is an important tool we have

in the analysis of time series and in decision-making based on time series.

Of course, the example also shows that, furthermore,

finding motifs can be easy or hard depending on the time series.

So, in the case of the Big Data,

this Motif is very well-defined and easy to identify.

Whereas, in the case of Deep Learning,

the Motif, while it seems to exist,

is much less powerful,

much less strong and correspondingly,

it may be more difficult to identify.

So, what we'll do in this lecture,

we will first give a framework to Search for motifs.

Then we will discuss metrics or measures to quantify

the quality or strength of a Motif, and we will see how

this algorithmic framework can be modified to

better identify stronger or more interesting motifs in time.

Okay. So, let's basically take a look at the General Motif Search Algorithm.

Motif Search Algorithms are actually very simple.

So, essentially all Motif Search algorithms are variants

of one very simple basic algorithm, and this general algorithm essentially has two steps.

In the first step, what we do is enumerate or

identify subsequences of the time series, and usually we assume that we are given a length.

So, given that length and given my time series,

essentially what I do is,

I enumerate its subsequences

of the same length,

say W. So, that's the first step,

I create a set of subsequences for my time series.
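
As a rough illustration of this first step, here is a minimal Python sketch. It assumes the subsequences are taken with a sliding window that moves one position at a time, which the lecture does not state explicitly. The function name `enumerate_subsequences` and the parameter `w` (standing for the length W) are hypothetical names chosen for this example.

```python
import numpy as np

def enumerate_subsequences(series, w):
    """Return every contiguous subsequence of length w, one per row.

    Assumption: a sliding window with step 1; row i starts at position i.
    """
    series = np.asarray(series, dtype=float)
    if w < 1 or w > len(series):
        raise ValueError("w must be between 1 and the length of the series")
    # A series of length n has n - w + 1 windows of length w.
    return np.array([series[i:i + w] for i in range(len(series) - w + 1)])

# Usage: a series of length 5 with W = 3 gives 3 subsequences.
subs = enumerate_subsequences([1, 2, 3, 4, 5], w=3)
print(subs)
```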

In the second step,

we apply a clustering algorithm to the set of

subsequences to identify groups of similar subsequences.
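
The lecture does not name a particular clustering algorithm at this point, so the sketch below uses a very simple stand-in: a greedy "leader" clustering under Euclidean distance. The function name `group_similar`, the parameter `radius`, and the toy series are all assumptions made for illustration; any other clustering algorithm could be substituted.

```python
import numpy as np

def group_similar(subsequences, radius):
    """Group subsequences with a greedy 'leader' clustering (a stand-in).

    Each subsequence joins the first group whose leader (its first member)
    lies within `radius` in Euclidean distance; otherwise it starts a new
    group. Groups are returned as lists of subsequence start positions.
    """
    subsequences = np.asarray(subsequences, dtype=float)
    leaders = []  # index of the first member of each group
    groups = []   # start positions belonging to each group
    for i, s in enumerate(subsequences):
        for g, leader in enumerate(leaders):
            if np.linalg.norm(s - subsequences[leader]) <= radius:
                groups[g].append(i)
                break
        else:
            # No existing group is close enough: start a new one.
            leaders.append(i)
            groups.append([i])
    return groups

# Toy series in which the shape (0, 1, 0) occurs twice.
series = [0, 1, 0, 5, 5, 5, 0, 1, 0]
w = 3
subs = [series[i:i + w] for i in range(len(series) - w + 1)]
groups = group_similar(subs, radius=0.5)

# Groups with more than one member are candidate motifs.
print([g for g in groups if len(g) > 1])
```

In this toy case the only group with more than one member contains the two occurrences of the repeated shape, at start positions 0 and 6.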

That is, I was given a number of subsequences.