Now we're going to look at a comparison of a deep learning system's performance against that of humans.
A representative example is Google AlphaGo, which was developed by the Google DeepMind team in London. AlphaGo is the first Go-playing program to defeat professional human players.
Now, Go is a Chinese board game played on
a 19x19 grid with black and white polished stones,
as you can see right here.
It is considered the most challenging of the classical games because on a 19x19 board there are up to 361 factorial different orders in which moves can be made. This number is not infinite, but it is astronomically large. So, therefore, the possible moves are considered very complex and very diverse.
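To get a feel for how large 361 factorial is, here is a quick Python check. It counts the digits of 361!, which is an upper bound on move orderings rather than an exact count of legal Go games.

```python
import math

# Number of decimal digits in 361!: floor(log10(361!)) + 1.
# math.lgamma(n + 1) returns ln(n!), so divide by ln(10) to get log10.
digits = int(math.lgamma(362) / math.log(10)) + 1
print(f"361! has about {digits} digits")  # roughly 769 digits, i.e. about 10^768
```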
In October 2015, AlphaGo won all five games against the European Go champion Fan Hui. This AlphaGo Fan system used 1,202 CPUs and 176 GPUs.
A CPU is a Central Processing Unit, which is the general-purpose processing device used in all kinds of computing systems, and a GPU is a Graphics Processing Unit, which is a task-specialized processing device.
Now, in March 2016, AlphaGo won all but the fourth game against Lee Sedol of South Korea, who is one of the world's top Go players. This AlphaGo Lee system used 50 first-generation TPUs. TPU stands for Tensor Processing Unit, which is a processing unit specialized for deep learning.
In January 2017, AlphaGo won all 60 of its unofficial online matches against the world's top Go players. This AlphaGo Master system used one second-generation TPU. In May 2017, AlphaGo won all three games against the world's number one ranked player, Ke Jie.
And here is one of the images from that event, where for this match AlphaGo played the white stones and Ke Jie played the black stones.
Now, we've been talking about TPUs,
the Tensor Processing Units.
A TPU is Google's machine learning ASIC, a processor designed specifically for TensorFlow operations.
Now, looking at the AlphaGo Lee system, as you can see here, this unit right here is what performs the integer calculations, operating at a clock speed of 700 megahertz. Fifty of these multi-core computing units are included in this server-type system. And this here is the first-generation TPU.
Then there is the second-generation TPU, which was used here in the AlphaGo Master system. It has a total performance of 11.5 petaFLOPS, where peta stands for 10 to the power of 15. Now, this computation capability is truly tremendous. Where does it come from? It comes from a 256-chip pod. And what makes up these 256 chips? Four-chip modules, 64 of them, integrated into the pod in an 8x8 structure, where each chip is rated at 45 teraFLOPS.
And, FLOPS stands for Floating Point Operations Per Second.
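As a quick sanity check, the arithmetic behind that 11.5 petaFLOPS figure works out directly from the slide's numbers:

```python
# Second-generation TPU pod arithmetic, using the figures from the slide.
modules_per_pod = 64      # four-chip modules in an 8x8 structure
chips_per_module = 4
tflops_per_chip = 45      # teraFLOPS per chip

chips = modules_per_pod * chips_per_module        # 256 chips per pod
total_tflops = chips * tflops_per_chip            # 11,520 teraFLOPS
print(chips, "chips ->", total_tflops / 1000, "petaFLOPS")  # about 11.5 PFLOPS
```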
Now, looking into the architecture, this is how one chip looks. Four chips on one board look like this. And as you can see right here, you have an 8x8 grid in the middle area, which holds the 64 four-chip modules, so the pod's 256 chips are all included in here, with networking and power supply coming in from the outside to support the overall system.
Now, as you can see, AlphaGo's key technologies depend on two major components: the ATS, the Advanced Tree Search, and the DNN, the Deep Neural Network.
AlphaGo works with two types of network engines: one is the Policy Network and the other is the Value Network. The Policy Network is the deep neural network that selects the next move to play, and the Value Network is the DNN that predicts the game winner. These two work together to make AlphaGo's processing work.
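To make this more concrete, here is a minimal sketch in Python with TensorFlow's Keras API of what these two roles look like as networks. The layer sizes and the three input feature planes are my own illustrative assumptions; the actual AlphaGo networks are far deeper and use many more input features.

```python
import tensorflow as tf

BOARD = (19, 19, 3)  # 19x19 board; 3 feature planes is an illustrative assumption

# Policy network: outputs a probability for each of the 361 board points,
# i.e. it selects the next move to play.
policy_net = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu",
                           input_shape=BOARD),
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(361, activation="softmax"),
])

# Value network: outputs a single score in [-1, 1] predicting the game winner.
value_net = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu",
                           input_shape=BOARD),
    tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="tanh"),
])

policy_net.summary()
value_net.summary()
```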
The training process is based on the ATS and the DNN. Initial training was based on supervised learning using data from 160,000 games, which is about 30 million human moves. Then, advanced training on the AlphaGo Master system, using the second-generation TPU, was based on reinforcement learning, where AlphaGo played games against itself.
Now, what are learning and training? Learning is a method used in training the weights of the deep learning neural network to make it perform in a desired way, such that it builds up the intelligence to make very accurate decisions in complex situations. Supervised learning is training that uses labeled data, and labeled data is data that has a desired output, a target output result.
For example, it works like this. You put information in at the input layer, it goes through the hidden layers in between, and then something comes out at the output layer. This is the current output. Then we have the desired output, the labeled data. Now, does the current output match the labeled data, the desired output? If not, there is an error. This error is fed back into the system to retrain it so that its output matches the desired output. This process is what we call supervised learning, because it has this right here: the desired output, the labeled data.
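As a minimal sketch of this feedback loop, here is a tiny supervised learning example in Python with Keras. The data is made up purely for illustration: each input comes with a labeled desired output, and the error between the current output and that label is fed back to adjust the weights.

```python
import numpy as np
import tensorflow as tf

# Made-up labeled data: 100 samples with 4 input features each,
# plus a desired output (the label) for every sample.
x = np.random.rand(100, 4).astype("float32")
y = (x.sum(axis=1) > 2.0).astype("float32")  # target output

# A small network: input layer -> hidden layer -> output layer.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# The loss measures the error between the current output and the label;
# fitting feeds that error back to retrain the weights.
model.compile(optimizer="sgd", loss="binary_crossentropy")
model.fit(x, y, epochs=5, verbose=1)
```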
Now, compared to this, reinforcement learning is different, because it learns from trial and error. The best method of operation is discovered by the DNN: feedback is given back into the system, and no labeled data is used. This is where it differs significantly from supervised learning. Maze path finding is a good example of reinforcement learning, because there is no labeled data; through trial and error, the system keeps trying until it figures out the optimal path, as in the sketch below.
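Here is that maze idea as a minimal sketch in Python, using tabular Q-learning, one standard reinforcement learning method. The maze layout, rewards, and parameters are all illustrative assumptions. Notice there is no labeled data anywhere: the agent only receives reward feedback from trial and error.

```python
import random

# A tiny 4x4 maze: the agent starts at state 0 and the goal is state 15.
# Actions: 0=up, 1=down, 2=left, 3=right.
SIZE, GOAL = 4, 15

def step(state, action):
    row, col = divmod(state, SIZE)
    if action == 0: row = max(row - 1, 0)
    if action == 1: row = min(row + 1, SIZE - 1)
    if action == 2: col = max(col - 1, 0)
    if action == 3: col = min(col + 1, SIZE - 1)
    new_state = row * SIZE + col
    reward = 1.0 if new_state == GOAL else -0.01  # reward feedback, no labels
    return new_state, reward

# Q-table: one value per (state, action) pair, learned by trial and error.
Q = [[0.0] * 4 for _ in range(SIZE * SIZE)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # illustrative learning parameters

for episode in range(500):
    state = 0
    while state != GOAL:
        # Explore randomly sometimes; otherwise act on what was learned so far.
        if random.random() < epsilon:
            action = random.randrange(4)
        else:
            action = max(range(4), key=lambda a: Q[state][a])
        new_state, reward = step(state, action)
        # Q-learning update: adjust the estimate from the reward received.
        Q[state][action] += alpha * (reward + gamma * max(Q[new_state])
                                     - Q[state][action])
        state = new_state

best = max(range(4), key=lambda a: Q[0][a])
print("Learned first move from the start:", ["up", "down", "left", "right"][best])
```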
Now, in the training process, one of the things you'll see is that supervised learning makes up the majority of the overall training process, taking up at least about 85% and maybe even up to 99% of it. Then, if needed at the end, reinforcement learning is added on to make the system even better.
These are the references that I used and I recommend them to you.
In addition, I have to mention that in the later chapters of this course, Deep Learning for Business, I'm going to get into further details of the neural network, the deep learning neural network, and its related technologies. We'll go into further details there, and I hope you join me in those lectures as well.
These are the references. Thank you.