About this Course
4.7
402 ratings
87 reviews
Specialization
100% online

Start right away and learn at your own pace.
Flexible deadlines

Reset deadlines to fit your schedule.
Advanced level

Approx. 48 hours to complete

Suggested: 6-10 hours/week...
Available languages

English

Subtitles: English

Skills you will gain

Data Analysis, Feature Extraction, Feature Engineering, XGBoost

Syllabus - What you will learn in this course

Week 1
6 hours to complete

Introduction & Recap

This week we will introduce you to competitive data science. You will learn about competition mechanics, the difference between competitions and real-life data science, and the hardware and software that people usually use in competitions. We will also briefly recap the major ML models frequently used in competitions....
8 videos (Total 46 min), 7 readings, 6 quizzes
Video: 8 videos
Meet your lecturers (2 min)
Course overview (7 min)
Competition Mechanics (6 min)
Kaggle Overview [screencast] (7 min)
Real World Application vs Competitions (5 min)
Recap of main ML algorithms (9 min)
Software/Hardware Requirements (5 min)
Reading: 7 readings
Welcome! (10 min)
Week 1 overview (10 min)
Disclaimer (10 min)
Explanation for quiz questions (10 min)
Additional Materials and Links (10 min)
Explanation for quiz questions (10 min)
Additional Material and Links (10 min)
Quiz: 5 practice exercises
Practice Quiz (8 min)
Recap (8 min)
Recap (12 min)
Software/Hardware (6 min)
Graded Soft/Hard Quiz (8 min)
2 hours to complete

Feature Preprocessing and Generation with Respect to Models

In this module we will summarize approaches to working with features: preprocessing, generation and extraction. We will see that the choice of machine learning model affects both the preprocessing we apply to the features and our approach to generating new ones. We will also discuss feature extraction from text with Bag of Words and Word2vec, and feature extraction from images with Convolutional Neural Networks....
7 videos (Total 73 min), 4 readings, 4 quizzes
Video: 7 videos
Numeric features (13 min)
Categorical and ordinal features (10 min)
Datetime and coordinates (8 min)
Handling missing values (10 min)
Bag of words (10 min)
Word2vec, CNN (13 min)
Reading: 4 readings
Explanation for quiz questions (10 min)
Additional Material and Links (10 min)
Explanation for quiz questions (10 min)
Additional Material and Links (10 min)
Quiz: 4 practice exercises
Feature preprocessing and generation with respect to models (8 min)
Feature preprocessing and generation with respect to models (8 min)
Feature extraction from text and images (8 min)
Feature extraction from text and images (8 min)
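
As a rough illustration of the Bag of Words idea named in this module, the sketch below turns a few short documents into a matrix of word counts. It is a minimal standard-library example and is not taken from the course materials: the function name `bag_of_words` and the whitespace tokenization are assumptions made for illustration.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count_matrix) for a list of text documents."""
    # Assumption: lowercase and split on whitespace. Real pipelines
    # usually tokenize more carefully (punctuation, stemming, n-grams).
    tokenized = [doc.lower().split() for doc in docs]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    column = {token: i for i, token in enumerate(vocabulary)}

    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocabulary)
        for token, count in Counter(tokens).items():
            row[column[token]] = count
        matrix.append(row)
    return vocabulary, matrix


vocab, counts = bag_of_words(["the cat sat", "the cat saw the dog"])
```

Each row of `counts` describes one document and each column corresponds to one word in `vocab`, so the result can be fed to any model that expects numeric features.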
29 minutes to complete

Final Project Description

This is just a reminder that it is better to start the final project soon! The final project is in fact a competition, and in this module you can find information about it....
1 video (Total 4 min), 2 readings
Video: 1 video
Reading: 2 readings
Final project (10 min)
Final project advice #1 (10 min)
Week 2
2 hours to complete

Exploratory Data Analysis

We will start this week with Exploratory Data Analysis (EDA). It is a very broad and exciting topic and an essential component of the solving process. Besides the regular videos, you will find a walkthrough of the EDA process for the Springleaf competition data and an example of prolific EDA for the NumerAI competition with extraordinary findings....
8 videos (Total 80 min), 2 readings, 1 quiz
Video: 8 videos
Building intuition about the data (6 min)
Exploring anonymized data (15 min)
Visualizations (11 min)
Dataset cleaning and other things to check (7 min)
Springleaf competition EDA I (8 min)
Springleaf competition EDA II (16 min)
Numerai competition EDA (6 min)
Reading: 2 readings
Week 2 overview (10 min)
Additional material and links (10 min)
Quiz: 1 practice exercise
Exploratory data analysis (12 min)
2 hours to complete

Validation

In this module we will discuss various validation strategies. We will see that the strategy we choose depends on the competition setup and that a correct validation scheme is one of the building blocks of any winning solution. ...
4 videos (Total 51 min), 3 readings, 2 quizzes
Video: 4 videos
Validation strategies (7 min)
Data splitting strategies (14 min)
Problems occurring during validation (20 min)
Reading: 3 readings
Validation strategies (10 min)
Comments on quiz (10 min)
Additional material and links (10 min)
Quiz: 2 practice exercises
Validation (8 min)
Validation (8 min)
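
To make the point about setup-dependent validation concrete, here is a minimal sketch of two common data splitting strategies: a K-fold split and a time-based holdout. It is not taken from the course materials. The helper names `k_fold_indices` and `time_based_split` are hypothetical, and the time-based split assumes the rows are already sorted chronologically.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, valid_idx) pairs; each sample is validated exactly once."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        for i in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, valid
        start += size


def time_based_split(n_samples, valid_fraction):
    """Hold out the most recent rows (assumes rows are sorted by time)."""
    cut = n_samples - int(n_samples * valid_fraction)
    return list(range(cut)), list(range(cut, n_samples))


folds = list(k_fold_indices(10, 3))
train_idx, valid_idx = time_based_split(10, 0.2)
```

A K-fold split suits data whose rows are exchangeable, while a time-based holdout is the safer choice when the test set lies in the future relative to the training set.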
5 hours to complete

Data Leakages

Finally, in this module we will cover something unique to data science competitions. We will see examples of how it is sometimes possible to get a top position in a competition with very little machine learning, just by exploiting a data leak. ...
3 videos (Total 26 min), 3 readings, 3 quizzes
Video: 3 videos
Leaderboard probing and examples of rare data leaks (9 min)
Expedia challenge (9 min)
Reading: 3 readings
Comments on quiz (10 min)
Additional material and links (10 min)
Final project advice #2 (10 min)
Quiz: 1 practice exercise
Data leakages (8 min)
Week 3
3 hours to complete

Metrics Optimization

This week we will first study another component of competitions: the evaluation metrics. We will recap the most prominent ones and then see how we can efficiently optimize a metric given in a competition....
8 videos (Total 83 min), 3 readings, 2 quizzes
Video: 8 videos
Regression metrics review I (14 min)
Regression metrics review II (8 min)
Classification metrics review (20 min)
General approaches for metrics optimization (6 min)
Regression metrics optimization (10 min)
Classification metrics optimization I (7 min)
Classification metrics optimization II (6 min)
Reading: 3 readings
Week 3 overview (10 min)
Comments on quiz (10 min)
Additional material and links (10 min)
Quiz: 2 practice exercises
Metrics (12 min)
Metrics (12 min)
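
One simple way to see why the target metric matters is to look at the best constant prediction under two regression metrics. The sketch below is an illustration and is not taken from the course materials. It uses the standard result that the mean minimizes mean squared error while the median minimizes mean absolute error; the toy data and function names are assumptions.

```python
from statistics import mean, median


def mse_of_constant(y_true, constant):
    """Mean squared error when every prediction equals `constant`."""
    return sum((y - constant) ** 2 for y in y_true) / len(y_true)


def mae_of_constant(y_true, constant):
    """Mean absolute error when every prediction equals `constant`."""
    return sum(abs(y - constant) for y in y_true) / len(y_true)


# Toy targets with one large outlier (made-up data).
y = [1.0, 2.0, 3.0, 4.0, 100.0]

best_for_mse = mean(y)    # pulled strongly toward the outlier
best_for_mae = median(y)  # robust to the outlier
```

On this data the two metrics prefer very different constants, which suggests that a model tuned for one metric can be clearly suboptimal when scored on the other.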
4 hours to complete

Advanced Feature Engineering I

In this module we will study a very powerful technique for feature generation. It goes by many names, but here we call it "mean encodings". We will see the intuition behind them and how to construct, regularize and extend them. ...
3 videos (Total 27 min), 2 readings, 2 quizzes
Video: 3 videos
Regularization (7 min)
Extensions and generalizations (10 min)
Reading: 2 readings
Comments on quiz (10 min)
Final project advice #3 (10 min)
Quiz: 1 practice exercise
Mean encodings (8 min)
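
A mean encoding replaces each value of a categorical feature with a statistic of the target for that value. The sketch below shows one possible construction with additive smoothing toward the global mean as a simple form of regularization. It is an assumption-laden illustration, not the course's own implementation: the function name `smoothed_mean_encoding` and the parameter `alpha` are hypothetical.

```python
from collections import defaultdict


def smoothed_mean_encoding(categories, targets, alpha=10.0):
    """Map each category to a smoothed target mean.

    encoding = (sum of targets in category + alpha * global mean)
               / (category count + alpha)

    A larger alpha pulls rare categories closer to the global mean.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    return {
        category: (sums[category] + alpha * global_mean) / (counts[category] + alpha)
        for category in counts
    }


# Toy data (made up): category "a" appears twice, "b" once.
encoding = smoothed_mean_encoding(["a", "a", "b"], [1, 1, 0], alpha=1.0)
```

Because the encoding is computed from the target, applying it to the same rows it was fitted on can leak target information. A common precaution is to compute the encoding for each row from the other folds only.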
Week 4
3 hours to complete

Hyperparameter Optimization

In this module we will talk about the hyperparameter optimization process. We will also have a special video with practical tips and tricks, recorded by four instructors....
6 videos (Total 86 min), 4 readings, 2 quizzes
Video: 6 videos
Hyperparameter tuning II (12 min)
Hyperparameter tuning III (13 min)
Practical guide (16 min)
KazAnova's competition pipeline, part 1 (18 min)
KazAnova's competition pipeline, part 2 (17 min)
Reading: 4 readings
Week 4 overview (10 min)
Comments on quiz (10 min)
Additional material and links (10 min)
Additional materials and links (10 min)
Quiz: 2 practice exercises
Practice quiz (6 min)
Graded quiz (8 min)
4 hours to complete

Advanced Feature Engineering II

In this module we will learn about a few more advanced feature engineering techniques....
4 videos (Total 22 min), 2 readings, 2 quizzes
Video: 4 videos
Matrix factorizations (6 min)
Feature Interactions (5 min)
t-SNE (5 min)
Reading: 2 readings
Comments on quiz (10 min)
Additional Materials and Links (10 min)
Quiz: 1 practice exercise
Graded Advanced Features II Quiz (12 min)
10 hours to complete

Ensembling

Nowadays it is hard to find a competition won by a single model! Every winning solution incorporates ensembles of models. In this module we will talk about the main ensembling techniques in general and, of course, about how best to ensemble models in practice. ...
8 videos (Total 92 min), 4 readings, 4 quizzes
Video: 8 videos
Bagging (9 min)
Boosting (16 min)
Stacking (16 min)
StackNet (14 min)
Ensembling Tips and Tricks (14 min)
CatBoost 1 (7 min)
CatBoost 2 (7 min)
Reading: 4 readings
Validation schemes for 2nd-level models (10 min)
Comments on quiz (10 min)
Additional materials and links (10 min)
Final project advice #4 (10 min)
Quiz: 2 practice exercises
Ensembling (8 min)
Ensembling (12 min)
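
The simplest form of ensembling is to average the predictions of several models. The sketch below is a minimal illustration of weighted averaging and is not taken from the course materials. The function name `blend` and the toy predictions are assumptions, chosen so that the two models err in opposite directions.

```python
def blend(predictions, weights=None):
    """Weighted average of several models' predictions (one list per model)."""
    n_models = len(predictions)
    if weights is None:
        # Default to a plain average when no weights are given.
        weights = [1.0 / n_models] * n_models
    n_samples = len(predictions[0])
    return [
        sum(w * model_preds[i] for w, model_preds in zip(weights, predictions))
        for i in range(n_samples)
    ]


# Made-up predictions: model_a overshoots and model_b undershoots the targets.
y_true = [1.0, 2.0, 3.0]
model_a = [1.5, 2.5, 3.5]
model_b = [0.5, 1.5, 2.5]

blended = blend([model_a, model_b])
```

When model errors are not perfectly correlated, averaging tends to cancel part of them. Bagging, boosting and stacking are more structured ways of combining models than this plain average.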
4.7
87 reviews
Career direction

33%

started a new career after completing these courses
Career benefit

83%

got a tangible career benefit from this course

Top reviews

by MS, Mar 29th 2018

Top Kagglers gently introduce one to Data Science Competitions. One will have a great chance to learn various tips and tricks and apply them in practice throughout the course. Highly recommended!

by MM, Nov 10th 2017

This course is fantastic. It's chock full of practical information that is presented clearly and concisely. I would like to thank the team for sharing their knowledge so generously.

Instructors

Dmitry Ulyanov

Visiting lecturer
HSE Faculty of Computer Science

Alexander Guschin

Visiting lecturer at HSE, Lecturer at MIPT
HSE Faculty of Computer Science

Mikhail Trofimov

Visiting lecturer
HSE Faculty of Computer Science

Dmitry Altukhov

Visiting lecturer
HSE Faculty of Computer Science

Marios Michailidis

Research Data Scientist
H2O.ai

About National Research University Higher School of Economics

National Research University - Higher School of Economics (HSE) is one of the top research universities in Russia. Established in 1992 to promote new research and teaching in economics and related disciplines, it now offers programs at all levels of university education across an extraordinary range of fields of study including business, sociology, cultural studies, philosophy, political science, international relations, law, Asian studies, media and communications, IT, mathematics, engineering, and more. Learn more on www.hse.ru...

About the Advanced Machine Learning Specialization

This specialization gives an introduction to deep learning, reinforcement learning, natural language understanding, computer vision and Bayesian methods. Top Kaggle machine learning practitioners and CERN scientists will share their experience of solving real-world problems and help you to fill the gaps between theory and practice. Upon completion of 7 courses you will be able to apply modern machine learning methods in enterprise and understand the caveats of real-world data and settings....

Frequently Asked Questions

  • Once you enroll for a Certificate, you will have access to all videos, quizzes, and programming assignments (if applicable). Peer-reviewed assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing it, you may not be able to access certain assignments.

  • When you enroll in a course, you get access to all of the courses in the Specialization, and you earn a Certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.