About this Course
73,438 recent views

100% online

Start instantly and learn at your own pace.

Flexible deadlines

Reset deadlines in accordance with your schedule.

Advanced level

Approx. 40 hours to complete

Suggested: 6 weeks of study, 3-6 hours/week for base track, 6-9 with all the horrors of honors section...

English

Subtitles: English, Korean

Syllabus - What you will learn from this course

5 hours to complete

Intro: why should I care?

In this module we're going to define and "taste" what reinforcement learning is about. We'll also learn one simple algorithm that can solve reinforcement learning problems with embarrassing efficiency.

13 videos (Total 84 minutes), 8 readings, 3 quizzes
13 videos
Reinforcement learning vs all (3m)
Multi-armed bandit (4m)
Decision process & applications (6m)
Markov Decision Process (5m)
Crossentropy method (9m)
Approximate crossentropy method (5m)
More on approximate crossentropy method (6m)
Evolution strategies: core idea (6m)
Evolution strategies: math problems (5m)
Evolution strategies: log-derivative trick (8m)
Evolution strategies: duct tape (6m)
Blackbox optimization: drawbacks (4m)
8 readings
What you're getting into (1m)
Setting up course environment (10m)
Note: this course vs GitHub course (1m)
Lecture slides (10m)
Course teaser placeholder (10m)
About honors track (1m)
3 hours to complete

At the heart of RL: Dynamic Programming

This week we'll consider the reinforcement learning formalisms in a more rigorous, mathematical way. You'll learn how to effectively compute the return your agent gets for a particular action, and how to pick the best actions based on that return.
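
As a rough illustration of the two ideas in this description (not taken from the course materials), the sketch below computes a discounted return from a list of rewards and picks the action with the highest estimated value. The function names, the discount factor `gamma`, and the example numbers are all assumptions made for this example.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum the rewards, discounting each by gamma per step into the future."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def greedy_action(action_values):
    """Return the action whose estimated value is highest."""
    return max(action_values, key=action_values.get)


# Hypothetical episode: rewards received at steps 0, 1 and 2.
print(discounted_return([1.0, 0.0, 2.0], gamma=0.5))  # 1 + 0.5*0 + 0.25*2 = 1.5

# Hypothetical action-value estimates for a single state.
print(greedy_action({"left": 0.2, "right": 1.5}))  # right
```

Working backwards through the rewards avoids computing powers of `gamma` explicitly; dynamic programming methods such as value iteration rest on the same recursive structure.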

5 videos (Total 54 minutes), 2 readings, 4 quizzes
5 videos
State and Action Value Functions (13m)
Measuring Policy Optimality (6m)
Policy: evaluation & improvement (10m)
Policy and value iteration (8m)
2 readings
Advanced Reward Design (10m)
Discrete Stochastic Dynamic Programming (10m)
3 practice exercises
Reward design (8m)
Optimality in RL (10m)
Policy Iteration (14m)
5 hours to complete

Model-free methods

This week we'll find out how to apply last week's ideas to real-world problems: ones where you don't have a perfect model of your environment.

6 videos (Total 47 minutes), 1 reading, 4 quizzes
6 videos
Monte-Carlo & Temporal Difference; Q-learning (8m)
Exploration vs Exploitation (8m)
Footnote: Monte-Carlo vs Temporal Difference (2m)
Accounting for exploration. Expected Value SARSA. (11m)
On-policy vs off-policy; Experience replay (7m)
1 reading
1 practice exercise
Model-free reinforcement learning (10m)
5 hours to complete

Approximate Value Based Methods

This week we'll learn to scale things even further up by training agents based on neural networks.

9 videos (Total 104 minutes), 3 readings, 5 quizzes
9 videos
Loss functions in value based RL (11m)
Difficulties with Approximate Methods (15m)
DQN – bird's eye view (9m)
DQN – the internals (9m)
DQN: statistical issues (6m)
Double Q-learning (6m)
More DQN tricks (10m)
Partial observability (17m)
3 readings
TD vs MC (10m)
DQN follow-ups (10m)
3 practice exercises
MC & TD (8m)
SARSA and Q-learning (8m)
5 hours to complete

Policy-based methods

We spent the previous 3 modules on value-based methods: learning state values, action values and whatnot. Now it's time to see an alternative approach that doesn't require you to predict all future rewards to learn something.

11 videos (Total 68 minutes), 1 reading, 3 quizzes
11 videos
All Kinds of Policies (4m)
Policy gradient formalism (8m)
The log-derivative trick (3m)
Advantage actor-critic (6m)
Duct tape zone (4m)
Policy-based vs Value-based (4m)
Case study: A3C (6m)
A3C case study (2/2) (3m)
Combining supervised & reinforcement learning (6m)
1 reading
1 practice exercise
A policy-based quiz (14m)
5 hours to complete


In this final week you'll learn how to build better exploration strategies, with a focus on the contextual bandit setup. In the honors track, you'll also learn how to apply reinforcement learning to train structured deep learning models.

10 videos (Total 85 minutes), 4 readings, 4 quizzes
10 videos
Regret: measuring the quality of exploration (6m)
The message just repeats. 'Regret, Regret, Regret.' (5m)
Intuitive explanation (7m)
Thompson Sampling (5m)
Optimism in face of uncertainty (5m)
Bayesian UCB (11m)
Introduction to planning (17m)
Monte Carlo Tree Search (10m)
4 readings
Extras: exploration (10m)
Extras: planning (10m)
2 practice exercises
56 reviews


started a new career after completing these courses


got a tangible career benefit from this course


got a pay increase or promotion

Top reviews from Practical Reinforcement Learning

by AK, May 28th 2019

This is one of the Best Course available on Reinforcement Learning. I have gone through various study material but the depth and practical knowledge given in the course is awesome.

by FZ, Feb 14th 2019

A great course with very practical assignments to help you learn how to implement RL algorithms. But it also has some stupid quiz questions which makes you feel confusing.



Pavel Shvechikov

Researcher at HSE and Sberbank AI Lab
HSE Faculty of Computer Science

Alexander Panin

HSE Faculty of Computer Science

About National Research University Higher School of Economics

National Research University - Higher School of Economics (HSE) is one of the top research universities in Russia. Established in 1992 to promote new research and teaching in economics and related disciplines, it now offers programs at all levels of university education across an extraordinary range of fields of study including business, sociology, cultural studies, philosophy, political science, international relations, law, Asian studies, media and communications, mathematics, engineering, and more. Learn more on

About the Advanced Machine Learning Specialization

This specialization gives an introduction to deep learning, reinforcement learning, natural language understanding, computer vision and Bayesian methods. Top Kaggle machine learning practitioners and CERN scientists will share their experience of solving real-world problems and help you to fill the gaps between theory and practice. Upon completion of 7 courses you will be able to apply modern machine learning methods in enterprise and understand the caveats of real-world data and settings....
Advanced Machine Learning

Frequently Asked Questions

  • Once you enroll for a Certificate, you'll have access to all videos, quizzes, and programming assignments (if applicable). Peer-reviewed assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you enroll in a course, you get access to all of the courses in the Specialization, and you earn a Certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

Do you have more questions? Visit the Learner Help Center.