Volver a Machine Learning: Classification

4.7

estrellas

3,266 calificaciones

•

540 revisiones

Case Studies: Analyzing Sentiment & Loan Default Prediction
In our case study on analyzing sentiment, you will create models that predict a class (positive/negative sentiment) from input features (text of the reviews, user profile information,...). In our second case study for this course, loan default prediction, you will tackle financial data, and predict when a loan is likely to be risky or safe for the bank. These tasks are an examples of classification, one of the most widely used areas of machine learning, with a broad array of applications, including ad targeting, spam detection, medical diagnosis and image classification.
In this course, you will create classifiers that provide state-of-the-art performance on a variety of tasks. You will become familiar with the most successful techniques, which are most widely used in practice, including logistic regression, decision trees and boosting. In addition, you will be able to design and implement the underlying algorithms that can learn these models at scale, using stochastic gradient ascent. You will implement these technique on real-world, large-scale machine learning tasks. You will also address significant tasks you will face in real-world applications of ML, including handling missing data and measuring precision and recall to evaluate a classifier. This course is hands-on, action-packed, and full of visualizations and illustrations of how these techniques will behave on real data. We've also included optional content in every module, covering advanced topics for those who want to go even deeper!
Learning Objectives: By the end of this course, you will be able to:
-Describe the input and output of a classification model.
-Tackle both binary and multiclass classification problems.
-Implement a logistic regression model for large-scale classification.
-Create a non-linear model using decision trees.
-Improve the performance of any model using boosting.
-Scale your methods with stochastic gradient ascent.
-Describe the underlying decision boundaries.
-Build a classification model to predict sentiment in a product review dataset.
-Analyze financial data to predict loan defaults.
-Use techniques for handling missing data.
-Evaluate your models using precision-recall metrics.
-Implement these techniques in Python (or in the language of your choice, though Python is highly recommended)....

Jun 15, 2020

A very deep and comprehensive course for learning some of the core fundamentals of Machine Learning. Can get a bit frustrating at times because of numerous assignments :P but a fun thing overall :)

Oct 16, 2016

Hats off to the team who put the course together! Prof Guestrin is a great teacher. The course gave me in-depth knowledge regarding classification and the math and intuition behind it. It was fun!

Filtrar por:

por Alex H

•Feb 08, 2018

Relying on a non-open source library for all of the code examples vitiates the value of this course. It should use Pandas and sklearn.

por Saqib N S

•Oct 16, 2016

Hats off to the team who put the course together! Prof Guestrin is a great teacher. The course gave me in-depth knowledge regarding classification and the math and intuition behind it. It was fun!

por Lewis C L

•Jun 13, 2019

First, coursera is a ghost town. There is no activity on the forum. Real responses stopped a year ago. Most of the activity is from 3 years ago. This course is dead.

Two, this course seems to approach the topic as teaching inadequate ways to perform various tasks to show the inadequacies. You can learn from that; we will make mistakes or use approaches that are less than ideal. But, that should be a quick "don't do this," while moving on to better approaches

Three, the professors seem to dismiss batch learning as a "dodgy" technique. If Hinton, Bengio, and other intellectual leaders of the field recommend it as the preferred technique, then it probably is.

Four, the professors emphasize log likelihood. Mathematically, minus the log likelihood is the same as cross-entropy cost. The latter is more robust and applicable to nearly every classification problem (except decision trees), and so is a more versatile formulation. As neither actually plays any roll in the training algorithm except as guidance for the gradient and epsilon formulas and as a diagnostic, the more versatile and robust approach should be preferred.

The professors seem very focused on decision trees. Despite the "apparent" intuitive appeal and computational tractability, the technique seems to be eclipsed by other methods. Worth teaching and occasionally using to be sure, but not for 3/4 of the course.

There are many mechanical problems that remain in the material. At least 6 errors in formulas or instructions remain. Most can be searched for on the forum to find some resolution, through a lot of noise. Since the last corrections were made 3 years ago, the UW or Coursera's lack of interest shows.

It was a bit unnecessary to use a huge dataset that resulted in a training matrix or over 10 billion cells. Sure, if you wanted to focus on methods for scaling--very valuable indeed--go for it. But, this lead to unnecessary long training times and data issues that were, at best, orthogonal to the overall purpose of highlighting classification techniques and encouraging good insights about how classification techniques work.

The best thing about the course was the willingness to allow various technologies to be used. The developers went to some lengths to make this possible. It was far more work to stray outside the velvet ropes of the Jupiter notebooks, but it was very rewarding.

Finally, the quizzes were dependent on numerical point answers that could often be matched only by using the same exact technology and somewhat sloppy approaches (no lowercase for word sentiment analysis, etc.). It does take some cleverness to think of questions that lead to the right answer if the concepts are implemented properly. It doesn't count when the answers rely precisely on anomalies.

I learned a lot, but only because I wrote my own code and was able to think more clearly about it, but that was somewhat of a side effect.

All in all, a disappointing somewhat out of date class.

por RAJKUMAR R V

•Oct 02, 2019

It will definitely help you in understanding the basics to dept of most of the algorithms. Even though you are already aware of most of the things covered elsewhere related to Classification, this course will add up up a considerable amount of extra inputs which will help to understand and explore more things in Machine learning.

por Christian J

•Jan 25, 2017

Very impressive course, I would recommend taking course 1 and 2 in this specialization first since they skip over some things in this course that they have explained thoroughly in those courses

por Ian F

•Jul 18, 2017

Good overview of classification. The python was easier in this section than previous sections (although maybe I'm just better at it by this point.) The topics were still as informative though!

por Jason M C

•Mar 29, 2016

This continues UWash's outstanding Machine Learning series of classes, and is equally as impressive, if not moreso, then the Regression class it follows. I'm super-excited for the next class!

por Feng G

•Jul 12, 2018

Very helpful. Many ThanksSome suggestions:1.Please add LDA into the module.2.It is really important if you guys can provide more examples for pandas and scikit-learn users in programming assignments like you do in regression module.

por Saransh A

•Oct 31, 2016

Well this series just doesn't seize to amaze me! Another great course after the introductory and regression course. Though I really missed detailed explanations of Random Forest and other Ensemble methods. Also, SVM was not discussed, but there were many other topics which all other courses and books easily skips. The programming assignments were fine, more focused on teaching the algorithms than trapping someone in the coding part. This series is the series for someone who really wants to get a hold of what machine learning really is. One thing which I really like about this course is that there are optional videos from time to time, where they discuss the mathematical aspects of the algorithms that they teach. Which really quenches my thirst for mathematical rigour. Definitely continuing this specialisation forward

por Sauvage F

•Mar 29, 2016

Excellent Course, I'm very found of Carlos jokes mixed with the hard concepts ^^. Lectures are precise, concise and comprehensive. I really enjoyed diving in depths of the algorithms' mechanics (like Emily did in the Regression Course). I also deeply appreciated the real-world examples in the lectures and real world datasets of assignments.

Some may regret the absence of a few "classic" algorithms like SVM but Carlos definitely made his point about it in the forum and did not exclude the addition of an optional module about it.

I found some of the assignments less challenging than during the Regression Course, but maybe I'm just getting better at Machine-Learning and Python ^^.

Thanks again to Emily and Carlos for the brilliant work at this very promising specialization.

por Daisuke H

•May 18, 2016

I really love this Classification course as well as Regression course!! This course is covering both mathematical background and practical implementation very well. Assignments are moderately challenging and it was a very good exercise for me to have a good intuition about classification algorithms. I only used standard Python libraries such as numpy, scikit-learn, matplotlib and pandas, and there were no problems for me to complete all of the assignments without any use of IPython, SFrames, GraphLab Create at all. I would say thank you so much to Carlos and Emily to give me such a great course!!

P.S. This course would be perfect if it covered bootstrap and Random Forest in details.

por Ridhwanul H

•Oct 16, 2017

As usual this was also a great course, except

⊃゜Д゜）⊃ decision trees ⊂（゜Д゜⊂

I am not saying presently anythings bad or incorrect, but I just dont feel familiar with this. It is one tough topic to understand. I think it would have been great if there were some videos and lectures where some programming example were also given, this would have helped out a lot in programming assignments.

Also there is another thing that I think should have been addressed (at least in one of the courses, unless you did it in course 4 the last one which I havent done yet) : vectorisation - instead of looping through each weight how it could be achieved at once through vectorisation.

por Gerard A

•May 18, 2020

So, there appear to be a lot of smarter people than me out there. Learnt some good python basics and the skeleton approach is quite OK as doing it from scratch for persons who studied maths at uni many years ago is may be a bridge too far. Carlos is great but I feel that 1) the ADABoost could have had an example to relate to - I looked on youtube and it clicked then 2) I miss the discussion on gini coeff. and when to use which type of Decision trees 3) SVM, Baysian missing meaning 2 courses instead of 1 here really required. 4) no tutors so how many are taking the course - few and why? 5) dropping the original 2 last modules seems not a great idea.

por Apurva A

•Jun 14, 2016

This course is very nice and covers some of the very important concepts like decision trees, boosting, and online learning apart form logistic regression. More importantly, everything here has been implemented from scratch and so the understanding of codes becomes very easy.

The lectures and slides were very intuitive. Carlos has explained everything very properly and even some of the very tough concepts have been explained in a proper manner from figures and graphs.

There are lots lots of python assignments to review what have we learned in the lectures.

Overall, its a must take course for all who wants an insight about classification in ML.

por Edward F

•Jun 25, 2017

I took the 4 (formerly 6) courses that comprised this certification, so I'm going to provide the same review for all of them.

This course and the specialization are fantastic. The subject matter is very interesting, at least to me, and the professors are excellent, conveying what could be considered advanced material in a very down-to-Earth way. The tools they provide to examine the material are useful and they stretch you out just far enough.

My only regret/negative is that they were unable to complete the full syllabus promised for this specialization, which included recommender systems and deep learning. I hope they get to do that some day.

por Benoit P

•Dec 29, 2016

This whole specialization is an outstanding program: the instructors are entertaining, and they strike the right balance between theory and practice. Even though I consider myself quite literate in statistics and numerical optimization, I learned several new techniques that I was able to directly apply in various part of my job. We really go in depth: while other classes I've taken limit themselves to an inventory of available techniques, in this specialization I get to implement key techniques from scratch. Highly, highly recommended.

FYI: the Python level required is really minimal, and the total time commitment is around 4 hours per week.

por Liang-Yao W

•Aug 12, 2017

The course walk through (and work through) concepts of linear classifier, logistic regression, decision trees, boosting, etc. For me it is a good introduction to these fundamental ideas with depth but not too deep to be distracted.

I personally become interested in knowing a bit more theoretical basis of the tools or concepts like boosting or maximum likelihood. The course understandably doesn't go that much into math and theory which leaves me a bit unsatisfied :P. But that is probably too much to ask for a short course and I do think the course covers great materials already.

por Paul C

•Aug 13, 2016

This Machine Learning class and the rest of the Machine Learning series from the University of Washington is the best material on the subject matter. What really sets this course and series apart is the case-base methodology as well as in-depth technical subject matter. Specifically, the step through coding of the algorithms provides key insight that is seriously missed in other classes even in traditional academic settings. I highly encourage the authors and other Coursera publishers to continue to publish more educational material in the same framework.

por Sean S

•Mar 09, 2018

I am generally very happy with the style, pace, and content of this entire specialization. This course is no exception and exposed me to a lot of new concepts and helped me to improve my python programming skills. I am left wondering if the programming assignments were made easier over time given all of the hints and "checkpoints" for code that was already supplied. I understand this is not a programming course but I probably would have been okay with toiling away at the algorithms for a few more hours without the hints. But that's just me. Great course.

por Ferenc F P

•Jan 18, 2018

This is a very good course in classification. Starts with logistic regression (w. and wo. regularization) and then makes a very good introduction to decision trees and boosting. Also has a very good explanation about stochastic gradient descent. The only drawback is that for some Quizes the result is different with scikit-learn than with Graphlab while the Quiz is prepared for Graphlab results. Thus, with scikit-learn one may fail some of them.

por Samuel d Z

•Jul 10, 2017

AWESOME!!! Very well structured. Concepts are explained in small and short videos which focus on one thing at the time. Unnecessary clutter is removed and deep dives can now be done with this solid foundation. Also the Python programming part teaches so much and again, only asked to program the essentials and non essentials or "special tricks" are done, so you can see and learn from them without having to search on the web. THANKS.

por Yifei L

•Mar 27, 2016

This is a very good course on classification as previous two.

Good explanation on topics like logistic regression, stochastic gradient descent. The assignments are well designed.

However the decision tree part should introduce entropy and gini which are mainly used for choosing the splitting feature. Also the random forest is worth discussing.

Overall, this is a good course which contains a handful of knowledge.

por Dauren B

•Jan 27, 2018

I loved this course! It is designed in a way that both beginner and more advanced student can grasp knowledge. New things for me like boosting (ensemble models), decision trees, stochastic gradient descent, online learning (which is not used much by big systems, instead they tend to do something different for incoming new data) and much more are introduced and explained in this course. Recommend 10/10.

por Rajkumar K

•Apr 01, 2017

This is a great course on ML - Classification that introduces one to the various techniques available in classification and to understand the algorithm under the hood. The course also explains the process, approach for each technique along with the methods to evaluate the results. Overall this takes the student through the next steps of learning classification algorithm from the foundational courses.

por Norman O

•Feb 19, 2018

I really liked this section on classification. Like with the regression course, complex concepts were explained well with nice examples and assignments. The only issue I had was that some of the coursework can be computing intensive (no surprise there). On the other hand, you really do learn by doing. And, of course, in the real world, computing resources (though plentiful) aren't infinite.

- IA para todos
- Introducción a TensorFlow
- Redes neurales y aprendizaje profundo
- Algoritmos, parte 1
- Algoritmos, parte 2
- Aprendizaje Automático
- Aprendizaje automático con Python
- Aprendizaje automático con Sas Viya
- Programación R
- Introducción a la programación con Matlab
- Análisis de datos con Python
- Aspectos básicos de AWS: El paso a la nube nativa
- Aspectos básicos de la plataforma en la nube de Google
- Ingeniería de confiabilidad del sitio
- Hablar inglés de manera profesional
- La ciencia del bienestar
- Aprendiendo a aprender
- Mercados financieros
- Prueba de hipótesis en el área de la salud pública
- Aspectos básicos del liderazgo diario

- Aprendizaje profundo
- Python para todos
- Ciencia de Datos
- Ciencias de los Datos Aplicada con Python
- Aspectos básicos de los negocios
- Arquitectura con Google Cloud Platform
- Ingeniería de datos en la plataforma en la nube de Google
- Excel para MySQL
- Aprendizaje automático avanzado
- Matemática aplicada al aprendizaje automático
- Automóviles de auto conducción
- Revolución de la cadena de bloques para la empresa
- Análisis comercial
- Habilidades de Excel aplicadas para los negocios
- mercadeo digital
- Análisis estadístico con R para el área de la salud pública
- Aspectos básicos de la inmunología
- Anatomía
- Gestión de la innovación y del pensamiento de diseño
- Aspectos básicos de la psicología positiva

- Soporte de TI de Google
- Especialista en compromiso con el cliente de IBM
- Ciencia de datos de IBM
- Administrador de proyectos aplicado
- Certificado profesional de IA aplicada de IBM
- Aprendizaje automático para análisis
- Análisis y visualización de datos espaciales
- Gestión e ingeniería de construcción
- Diseño instruccional

- Maestría en Ciencia de Datos
- Licenciatura en Ciencias de la Computación
- Títulos de Ciencias de la Computación e Ingeniería
- Maestría en Aprendizaje Automático
- Maestría en Administración de Empresas y títulos de estudios de negocios
- Maestría en Ingeniería Eléctrica
- Maestría en Salud Pública
- Maestría en Tecnología de la Información