About this Course

69,047 recent views
Shareable certificate
Earn a certificate upon completion
100% online
Start right away and learn at your own pace.
Flexible deadlines
Reset deadlines to fit your schedule.
Intermediate level

Probabilities & Expectations, basic linear algebra, basic calculus, Python 3.0 (at least 1 year), implementing algorithms from pseudocode.

Approx. 22 hours to complete
English

Skills you will gain

Artificial Intelligence (AI), Machine Learning, Reinforcement Learning, Function Approximation, Intelligent Systems

Offered by

University of Alberta

Alberta Machine Intelligence Institute

Syllabus - What you will learn in this course

Content rating: 92% (2,172 ratings)
Week 1

Welcome to the Course!

1 hour to complete
2 videos (12 minutes total), 2 readings
2 videos
Meet your instructors! (8 min)
2 readings
Read Me: Pre-requisites and Learning Objectives (10 min)
Reinforcement Learning Textbook (10 min)
On-policy Prediction with Approximation

5 hours to complete
13 videos (69 minutes total), 2 readings, 2 quizzes
13 videos
Generalization and Discrimination (5 min)
Framing Value Estimation as Supervised Learning (3 min)
The Value Error Objective (4 min)
Introducing Gradient Descent (7 min)
Gradient Monte Carlo for Policy Evaluation (5 min)
State Aggregation with Monte Carlo (7 min)
Semi-Gradient TD for Policy Evaluation (3 min)
Comparing TD and Monte Carlo with State Aggregation (4 min)
Doina Precup: Building Knowledge for AI Agents with Reinforcement Learning (7 min)
The Linear TD Update (3 min)
The True Objective for TD (5 min)
Week 1 Summary (4 min)
2 readings
Module 1 Learning Objectives (10 min)
Weekly Reading: On-policy Prediction with Approximation (40 min)
1 practice exercise
On-policy Prediction with Approximation (30 min)
Week 2

Constructing Features for Prediction

5 hours to complete
11 videos (52 minutes total), 2 readings, 2 quizzes
11 videos
Generalization Properties of Coarse Coding (5 min)
Tile Coding (3 min)
Using Tile Coding in TD (4 min)
What is a Neural Network? (3 min)
Non-linear Approximation with Neural Networks (4 min)
Deep Neural Networks (3 min)
Gradient Descent for Training Neural Networks (8 min)
Optimization Strategies for NNs (4 min)
David Silver on Deep Learning + RL = AI? (9 min)
Week 2 Review (2 min)
2 readings
Module 2 Learning Objectives (10 min)
Weekly Reading: On-policy Prediction with Approximation II (40 min)
1 practice exercise
Constructing Features for Prediction (28 min)
Week 3

Control with Approximation

6 hours to complete
7 videos (41 minutes total), 2 readings, 2 quizzes
7 videos
Episodic Sarsa in Mountain Car (5 min)
Expected Sarsa with Function Approximation (2 min)
Exploration under Function Approximation (3 min)
Average Reward: A New Way of Formulating Control Problems (10 min)
Satinder Singh on Intrinsic Rewards (12 min)
Week 3 Review (2 min)
2 readings
Module 3 Learning Objectives (10 min)
Weekly Reading: On-policy Control with Approximation (40 min)
1 practice exercise
Control with Approximation (40 min)
Week 4

Policy Gradient

6 hours to complete
11 videos (55 minutes total), 2 readings, 2 quizzes
11 videos
Advantages of Policy Parameterization (5 min)
The Objective for Learning Policies (5 min)
The Policy Gradient Theorem (5 min)
Estimating the Policy Gradient (4 min)
Actor-Critic Algorithm (5 min)
Actor-Critic with Softmax Policies (3 min)
Demonstration with Actor-Critic (6 min)
Gaussian Policies for Continuous Actions (7 min)
Week 4 Summary (3 min)
Congratulations! Course 4 Preview (2 min)
2 readings
Module 4 Learning Objectives (10 min)
Weekly Reading: Policy Gradient Methods (40 min)
1 practice exercise
Policy Gradient Methods (45 min)

Reviews

Top reviews of PREDICTION AND CONTROL WITH FUNCTION APPROXIMATION

About the Reinforcement Learning Specialization

Reinforcement Learning

Frequently Asked Questions

Have more questions? Visit the Learner Help Center.