Applied Machine Learning, Part 2: ROC Curves

From the series: Applied Machine Learning

Seth DeLand, MathWorks

Use ROC curves to assess classification models. ROC curves plot the true positive rate vs. the false positive rate for different values of a threshold.  

This video walks through several examples that illustrate broadly what ROC curves are and why you’d use them. It also outlines interesting scenarios you may encounter when using ROC curves. 

ROC curves are an important tool for assessing classification models. They're also a bit abstract, so let's start by reviewing some simpler ways to assess models. 

Let's use an example that has to do with the sounds a heart makes. Given 71 different features from an audio recording of a heart, we try to classify whether the heart sounds normal or abnormal.

One of the easiest metrics to understand is the accuracy of a model – or, in other words, how often it is correct. The accuracy is useful because it’s a single number, making comparisons easy. The classifier I’m looking at right now has an accuracy of 86.3%.

What the accuracy doesn’t tell you is how the model was right or wrong. For that, there’s the confusion matrix, which shows things such as the true positive rate. In this case, it is 74%, meaning the classifier correctly predicted abnormal heart sounds 74% of the time. We also have the false positive rate of 9%. This is the rate at which the classifier predicted abnormal when the heart sound was actually normal.
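As a rough illustration of how these three numbers relate, here is a minimal Python sketch that derives accuracy, true positive rate, and false positive rate from confusion-matrix counts. The labels are made up for illustration (they are not the heart sound data), the function names are hypothetical, and the code assumes both classes are present in the test set.

```python
def confusion_counts(y_true, y_pred, positive="abnormal"):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred, positive="abnormal"):
    """Return (accuracy, true positive rate, false positive rate)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    tpr = tp / (tp + fn)  # share of truly abnormal recordings flagged abnormal
    fpr = fp / (fp + tn)  # share of truly normal recordings flagged abnormal
    return accuracy, tpr, fpr


# Made-up labels for illustration only (assumption, not the video's data).
y_true = ["abnormal"] * 4 + ["normal"] * 6
y_pred = ["abnormal", "abnormal", "abnormal", "normal",
          "abnormal", "normal", "normal", "normal", "normal", "normal"]

accuracy, tpr, fpr = summarize(y_true, y_pred)
print(f"accuracy={accuracy:.2f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

Two classifiers with the same accuracy can have very different true and false positive rates, which is why the confusion matrix is worth looking at alongside the single accuracy number.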

The confusion matrix gives results for a single model at a single threshold. But most machine learning models don’t just classify things; they actually calculate probabilities. The confusion matrix for this model shows the result of classifying anything with a probability of >=0.5 as abnormal, and anything with probability <0.5 as normal. But that 0.5 doesn’t have to be fixed, and in fact we could threshold anywhere in the range of probabilities between 0 and 1.

That’s where ROC curves come in.  The ROC curve plots the true positive rate vs. the false positive rate for different values of this threshold.

Let’s look at this in more detail.

Here’s my model, and I’ll run it on my test data to get the probability of an abnormal heart sound.  Now let’s start by thresholding these probabilities at 0.5.  If I do that, I get a true positive rate of 74% and a false positive rate of 9%.

But what if we wanted to be very conservative, and classify a heart sound as abnormal even if its probability of being abnormal was just 10%?

If we do that, we get this point.

What if we wanted to be really certain, and only classify sounds with at least a 90% probability as being abnormal? Then we’d get this point, which has a much lower false positive rate, but also a lower true positive rate.
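The effect of moving the threshold can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the probabilities below are invented, `rates_at_threshold` is a hypothetical helper, and a score equal to the threshold is treated as abnormal, matching the >=0.5 rule described earlier.

```python
def rates_at_threshold(labels, scores, threshold):
    """Return (FPR, TPR) when scores >= threshold are called positive.

    labels: 1 for abnormal (positive class), 0 for normal.
    scores: predicted probability of the positive class.
    """
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    positives = sum(labels)
    negatives = len(labels) - positives
    return fp / negatives, tp / positives


# Made-up probabilities for illustration only (assumption, not the video's data).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.30, 0.70, 0.40, 0.20, 0.15, 0.05, 0.02]

for threshold in (0.1, 0.5, 0.9):
    fpr, tpr = rates_at_threshold(labels, scores, threshold)
    print(f"threshold={threshold:.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

On this toy data, lowering the threshold to 0.1 catches every abnormal example at the cost of more false alarms, while raising it to 0.9 removes the false alarms but misses most abnormal examples.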

Now, if we were to create a bunch of values for this threshold between 0 and 1, say 1000 trials evenly spaced, we would get lots of these ROC points, and that’s where we get the ROC curve from. The ROC curve shows us the tradeoff between the true positive rate and false positive rate for varying values of that threshold.

There will always be a point on the ROC curve at (0, 0). In our case, that is where everything is classified as “normal”. And there will always be a point at (1, 1), where everything is classified as “abnormal”.
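The threshold sweep can be written out by hand as follows. This is a minimal Python sketch, not the code used in the video: the data is invented, `roc_points` is a hypothetical helper, and it assumes no predicted probability is exactly 1, so that the highest threshold classifies nothing as abnormal.

```python
def roc_points(labels, scores, n_thresholds=1000):
    """Sweep evenly spaced thresholds in [0, 1]; return a list of (FPR, TPR)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for i in range(n_thresholds):
        threshold = i / (n_thresholds - 1)  # runs from 0.0 up to 1.0
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
        points.append((fp / negatives, tp / positives))
    return points


# Made-up probabilities for illustration only (assumption, not the video's data).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.30, 0.70, 0.40, 0.20, 0.15, 0.05, 0.02]

points = roc_points(labels, scores)
print(points[0])    # threshold 0: everything is called abnormal -> (1.0, 1.0)
print(points[-1])   # threshold 1: nothing is called abnormal -> (0.0, 0.0)
print(len(set(points)), "distinct ROC points from", len(points), "thresholds")
```

Most of the 1000 thresholds land on the same point, because the rates only change when the threshold crosses one of the predicted probabilities.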

The area under the curve (AUC) is a metric for how good our classifier is. A perfect classifier would have an AUC of 1. In this example, the AUC is 0.926.
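One common way to get the area from a set of ROC points is the trapezoidal rule. The Python sketch below is an assumption-laden illustration: it uses invented data, hypothetical helper names, and a threshold sweep written by hand, so its result is unrelated to the 0.926 quoted for the heart sound model.

```python
def roc_points(labels, scores, n_thresholds=1000):
    """Sweep evenly spaced thresholds in [0, 1]; return a list of (FPR, TPR)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for i in range(n_thresholds):
        threshold = i / (n_thresholds - 1)
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Area under a ROC curve given as (FPR, TPR) points, by the trapezoidal rule."""
    ordered = sorted(set(points))  # left to right; vertical steps have zero width
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up probabilities for illustration only (assumption, not the video's data).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.30, 0.70, 0.40, 0.20, 0.15, 0.05, 0.02]

auc = auc_trapezoid(roc_points(labels, scores))
print(f"AUC = {auc:.3f}")
```

When the positive and negative scores are perfectly separated, the same function returns 1, which matches the perfect-classifier case.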

In MATLAB, you don’t need to do all of this by hand like I’ve done here.  You can get the ROC curve and the AUC from the perfcurve function.

Now that we have that down, let’s look at some interesting cases for an ROC curve:

  • If a curve is all the way up and to the left, you have a classifier that for some threshold perfectly labeled every point in the test data, and your AUC is 1. You either have a really good classifier, or you may want to be concerned that you don’t have enough data or that your classifier is overfit.

  • If a curve is a straight line from the bottom left to the top right, you have a classifier that does no better than a random guess (its AUC is 0.5). You may want to try some other types of models or go back to your training data to see if you can engineer some better features.

  • If a curve looks kind of jagged, that is sometimes due to the behavior of different types of classifiers. For example, a decision tree only has a finite number of decision nodes, and each of those nodes has a specific probability. The jaggedness comes from when the threshold value we talked about earlier crosses the probability at one of the nodes. Jaggedness also commonly comes from gaps in the test data.
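The three cases above can be reproduced on toy data. The Python sketch below is illustrative only: all scores are invented, the helper names are hypothetical, and the AUC is computed with the pairwise-ranking formulation (the fraction of positive/negative pairs in which the positive example scores higher, with ties counted as half), which gives the same value as the area under the ROC curve.

```python
def auc_pairwise(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def distinct_roc_points(labels, scores):
    """ROC points from thresholding at each distinct score, plus the (0, 0) corner."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = {(0.0, 0.0)}
    for t in set(scores):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        points.add((fp / negatives, tp / positives))
    return sorted(points)


# Made-up scores for illustration only (assumption, not the video's data).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
perfect = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]        # classes fully separated
uninformative = [0.5] * 8                                  # same score for everything
tree_like = [0.9, 0.9, 0.5, 0.1, 0.5, 0.5, 0.1, 0.1]      # only three distinct scores

for name, scores in [("perfect", perfect),
                     ("uninformative", uninformative),
                     ("tree-like", tree_like)]:
    pts = distinct_roc_points(labels, scores)
    print(f"{name}: AUC={auc_pairwise(labels, scores):.2f}, {len(pts)} ROC points")
```

The tree-like scores take only three distinct values, so the curve has just four points joined by long straight segments, which is the kind of jaggedness described in the last bullet.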

As you can see from these examples, ROC curves can be a simple yet nuanced tool for assessing classifier performance.

If you want to learn more about machine learning model assessment, check out the links in the description below.

Related Products

  • Statistics and Machine Learning Toolbox

Learn More

Performance curves
Perfcurve Documentation
Model Building and Assessment
ROC Curve
Related Information
MATLAB for Machine Learning
