

Course Skill Level:


Course Duration:

4 days

  • Course Delivery Format:

    Live, instructor-led.

  • Course Category:

    Big Data & Data Science

  • Course Code:


Who should attend & recommended skills:

Those with Python experience and basic IT & Linux skills

Who should attend & recommended skills

  • Python-experienced developers, analysts, or others with Python skills who wish to master the craft of predictive modeling in R by developing strategy, intuition, and a solid foundation in essential concepts.
  • Skill level: Foundation-level R skills for intermediate-skilled team members. This is not a basic class.
  • IT: Basic to Intermediate (1-5 years’ experience)
  • Linux: Basic (1-2 years’ experience), including familiarity with command-line commands such as ls, cd, cp, and su
  • Attendees without a background in a programming language such as Python may treat the labs as follow-along exercises or team up with others to complete them

About this course

R offers a free and open-source environment that is perfect for both learning and deploying predictive modeling solutions. With its constantly growing community and plethora of packages, R offers the functionality to deal with a truly vast array of problems.
The course begins with a dedicated chapter on the language of models and the predictive modeling process. You will learn how to tidy data and how to read learning curves. Each subsequent chapter tackles a particular type of model, such as neural networks, and focuses on three important questions: how the model works, how to use R to train it, and how to measure and assess its performance using real-world datasets. The course also shows you how to train models that can handle very large datasets. Finally, you will tackle deep learning by implementing applications of word embedding and recurrent neural networks. By the end of this course, you will have explored and tested the most popular modeling techniques on real-world datasets and mastered a diverse range of techniques in predictive analytics using R.

Skills acquired & topics covered

  • Grasping the major methods of predictive modeling and moving beyond black box thinking to a deeper level of understanding
  • Leveraging the flexibility and modularity of R to experiment with a range of different techniques and data types
  • Practical advice and tips explaining important concepts and best practices to help you understand quickly and easily
  • Mastering the steps involved in the predictive modeling process
  • Growing your expertise in using R and its diverse range of packages
  • Learning how to classify predictive models and distinguish which models are suitable for a particular problem
  • Steps for tidying data and improving performance metrics
  • Recognizing the assumptions, strengths, and weaknesses of a predictive model
  • Understanding how and why each predictive model works in R
  • Selecting appropriate metrics to assess the performance of different types of predictive model
  • Exploring word embedding and recurrent neural networks in R
  • Training models in R that can work on very large datasets

Course breakdown / modules

  • Gearing Up for Predictive Modeling
  • Models
  • Types of model
  • The process of predictive modeling

  • Getting started
  • Tidying data
  • Categorizing data quality
  • Performance metrics
  • Cross-validation
  • Learning curves

  • Introduction to linear regression
  • Simple linear regression
  • Multiple linear regression
  • Assessing linear regression models
  • Problems with linear regression
  • Feature selection
  • Regularization
  • Polynomial regression

  • Classifying with linear regression
  • Introduction to logistic regression
  • Predicting heart disease
  • Assessing logistic regression models
  • Regularization with the lasso
  • Classification metrics
  • Extensions of the binary logistic classifier
  • Poisson regression
  • Negative Binomial regression

  • The biological neuron
  • The artificial neuron
  • Stochastic gradient descent
  • Multilayer perceptron networks
  • The back propagation algorithm
  • Predicting the energy efficiency of buildings
  • Predicting glass type revisited
  • Predicting handwritten digits
  • Radial basis function networks

  • Maximal margin classification
  • Support vector classification
  • Kernels and support vector machines
  • Predicting chemical biodegradation
  • Predicting credit scores
  • Multiclass classification with support vector machines

  • The intuition for tree models
  • Algorithms for training decision trees
  • Predicting class membership on synthetic 2D data
  • Predicting the authenticity of banknotes
  • Predicting complex skill learning
  • Improvements to the M5 model

  • Defining dimensionality reduction (DR)

  • Bagging
  • Boosting
  • Predicting atmospheric gamma ray radiation
  • Predicting complex skill learning with boosting

  • A little graph theory
  • Bayes’ theorem
  • Conditional independence
  • Bayesian networks
  • The Naïve Bayes classifier

  • An overview of topic modeling
  • Latent Dirichlet Allocation
  • Modeling the topics of online news stories
  • Modeling tweet topics

  • Rating matrix
  • Collaborative filtering
  • Singular value decomposition
  • Predicting recommendations for movies and jokes
  • Loading and pre-processing the data
  • Exploring the data
  • Other approaches to recommendation systems

  • Starting the project
  • Characteristics of big data
  • Training models at scale
  • A path forward
  • Alternatives

  • Machine learning or deep learning
  • What is deep learning?