
  • Course Skill Level: Foundational

  • Course Duration: 3 days

  • Course Delivery Format: Live, instructor-led

  • Course Category: Big Data & Data Science

  • Course Code: ENLEWRL21E09

Who should attend & recommended skills

Those experienced in Python with basic IT and Linux skills.

  • This course is geared for experienced Python developers, analysts, and others who intend to explore powerful R packages for creating predictive models using ensemble methods.
  • Skill level: foundation-level Ensemble Learning with R skills for intermediate-skilled team members. This is not a basic class.
  • IT skills: basic to intermediate (1-5 years’ experience)
  • Linux: basic (1-2 years’ experience), including familiarity with command-line commands such as ls, cd, cp, and su
  • Attendees without a programming background (e.g., Python) may follow the labs as demonstrations or team with others to complete them

About this course

Ensemble techniques combine two or more similar or dissimilar machine learning algorithms to create a stronger model. Such a model delivers superior predictive power and can boost the accuracy of your results. Ensemble Learning with R begins with the important statistical resampling methods. You will then walk through the central trilogy of ensemble techniques – bagging, random forest, and boosting – and learn how they can provide greater accuracy on large datasets using popular R packages. You will learn how to combine model predictions from different machine learning algorithms to build ensemble models, and explore how to improve their performance. By the end of this course, you will have learned how machine learning algorithms can be combined to reduce common problems and to build simple, efficient ensemble models, with the help of real-world examples.
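To illustrate the core idea of combining model predictions, here is a minimal base-R sketch (not course material) of ensembling by majority vote. The three prediction vectors stand in for the outputs of three hypothetical classifiers on five cases:

```r
# Toy predicted class labels from three hypothetical classifiers.
pred_a <- c("yes", "no",  "yes", "yes", "no")
pred_b <- c("yes", "yes", "no",  "yes", "no")
pred_c <- c("no",  "no",  "yes", "yes", "yes")

# Stack the prediction vectors row-wise, then take the modal label per case.
majority_vote <- function(...) {
  preds <- rbind(...)
  apply(preds, 2, function(votes) names(which.max(table(votes))))
}

ensemble <- majority_vote(pred_a, pred_b, pred_c)
ensemble
# The combined vote corrects cases where a single classifier disagrees
# with the other two.
```

Bagging, random forests, and boosting covered in the course refine this idea: they generate the base models systematically (via resampling or reweighting) rather than taking them as given.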

Skills acquired & topics covered

Working in a hands-on learning environment led by our Ensemble Learning with R instructor, students will learn about and explore:

  • Implementing machine learning algorithms to build ensemble-efficient models
  • Exploring powerful R packages to create predictive models using ensemble methods
  • Learning to build ensemble models on large datasets using a practical approach
  • Carrying out an essential review of resampling methods: the bootstrap and the jackknife
  • Exploring the key ensemble methods: bagging, random forests, and boosting
  • Using multiple algorithms to make strong predictive models
  • Enjoying a comprehensive treatment of boosting methods
  • Supplementing methods with statistical assessments, such as ROC curves
  • Walking through data structures in classification, regression, survival, and time series data
  • Using the supplied R code to implement ensemble methods
  • Learning the stacking method to combine heterogeneous machine learning models

Course breakdown / modules

  • Datasets
  • Statistical/machine learning models
  • The right model dilemma!
  • An ensemble purview
  • Complementary statistical tests

  • Technical requirements
  • The jackknife technique
  • Bootstrap – a statistical method
  • The boot package
  • Bootstrap and testing hypotheses
  • Bootstrapping regression models
  • Bootstrapping survival models
  • Bootstrapping time series models

  • Technical requirements
  • Classification trees and pruning
  • Bagging
  • k-NN classifier
  • k-NN bagging

  • Technical requirements
  • Random Forests
  • Variable importance
  • Proximity plots
  • Random Forest nuances
  • Comparisons with bagging
  • Missing data imputation
  • Clustering with Random Forest

  • Technical requirements
  • The general boosting algorithm
  • Adaptive boosting
  • Gradient boosting
  • Using the adabag and gbm packages
  • Variable importance
  • Comparing bagging, random forests, and boosting

  • Technical requirements
  • Why does boosting work?
  • The gbm package
  • The xgboost package
  • The h2o package

  • Technical requirements
  • Why does ensembling work?
  • Ensembling by voting
  • Ensembling by averaging
  • Stack ensembling

  • Technical requirements
  • What is ensemble diagnostics?
  • Ensemble diversity
  • Pairwise measure
  • Interrater agreement

  • Technical requirements
  • Pre-processing the housing data
  • Visualization and variable reduction
  • Regression models
  • Bagging and Random Forests
  • Boosting regression models
  • Stacking methods for regression models

  • Core concepts of survival analysis
  • Nonparametric inference
  • Regression models – parametric and Cox proportional hazards models
  • Survival tree
  • Ensemble survival models

  • Technical requirements
  • Time series datasets
  • Time series visualization
  • Core concepts and metrics
  • Essential time series models
  • Bagging and time series
  • Ensemble time series models