
Scikit-learn Cookbook

  • Course Code: Data Science - Scikit-learn Cookbook
  • Course Dates: Contact us to schedule.
  • Course Category: Big Data & Data Science
  • Duration: 3 Days
  • Audience: This course is geared for those who want to learn to use scikit-learn operations and functions for machine learning and deep learning applications.

Course Snapshot 

  • Duration: 3 days 
  • Skill-level: Foundation-level scikit-learn skills for intermediate-level team members. This is not a basic class. 
  • Targeted Audience: This course is geared for those who want to learn to use scikit-learn operations and functions for machine learning and deep learning applications. 
  • Hands-on Learning: This course has an approximately 50% hands-on lab to 50% lecture ratio, combining engaging lectures, demos, group activities and discussions with machine-based student labs and exercises. Student machines are required. 
  • Delivery Format: This course is available for onsite private classroom presentation. 
  • Customizable: This course may be tailored to your specific training objectives, tools of choice and learning goals. 

Python is quickly becoming the go-to language for analysts and data scientists due to its simplicity and flexibility, and within the Python data space, scikit-learn is the unequivocal choice for machine learning. This course includes walkthroughs and solutions to common as well as not-so-common problems in machine learning, and shows how scikit-learn can be leveraged to perform various machine learning tasks effectively. The course begins by taking you through recipes on evaluating the statistical properties of data and generating synthetic data for machine learning modelling. As you progress through the lessons, you will come across recipes that teach you to implement techniques like data pre-processing, linear regression, logistic regression, K-NN, Naïve Bayes, classification, decision trees, ensembles and much more. Furthermore, you’ll learn to optimize your models with multi-class classification, cross-validation and model evaluation, and dive deeper into implementing deep learning with scikit-learn. Along with covering enhanced features for model selection, the API and new features like classifiers, regressors and estimators, the course also contains recipes on evaluating and fine-tuning the performance of your model. By the end of this course, you will have explored a plethora of features offered by scikit-learn for Python to solve any machine learning problem you come across. 
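
As a quick illustration of the workflow described above, here is a minimal sketch of the standard scikit-learn pattern of loading data, splitting it, fitting an estimator and scoring it. The dataset and model choices below are illustrative assumptions, not the course lab code.

    # Minimal scikit-learn workflow: load data, split, fit, score.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

    model = LogisticRegression(max_iter=1000)   # every estimator exposes fit/predict/score
    model.fit(X_train, y_train)
    print("Test accuracy:", model.score(X_test, y_test))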

Working in a hands-on learning environment, led by our Scikit-learn Cookbook expert instructor, students will learn about and explore: 

  • Handle a variety of machine learning tasks effortlessly by leveraging the power of scikit-learn 
  • Perform supervised and unsupervised learning with ease, and evaluate the performance of your model 
  • Apply practical, easy-to-understand recipes aimed at helping you choose the right machine learning algorithm 

Topics Covered: This is a high-level list of topics covered in this course. Please see the detailed agenda below.

  • Build predictive models in minutes by using scikit-learn 
  • Understand the differences and relationships between classification and regression, two types of supervised learning. 
  • Use distance metrics to make predictions in clustering, a type of unsupervised learning. 
  • Find points with similar characteristics with Nearest Neighbors. 
  • Use automation and cross-validation to find the best model and focus on it for a data product. 
  • Choose the best algorithm from among many, or use several together in an ensemble. 
  • Create your own estimator with the simple syntax of sklearn 
  • Explore the feed-forward neural networks available in scikit-learn 

Audience & Pre-Requisites 

This course is designed for developers who want to use scikit-learn operations and functions for machine learning and deep learning applications. 

Pre-Requisites: Students should be familiar with: 

  • The basics of Python (working knowledge of Python is assumed) 

Course Agenda / Topics 

  1. High-Performance Machine Learning – NumPy 
  • Introduction 
  • NumPy basics 
  • Loading the iris dataset 
  • Viewing the iris dataset 
  • Viewing the iris dataset with Pandas 
  • Plotting with NumPy and matplotlib 
  • A minimal machine learning recipe – SVM classification 
  • Introducing cross-validation 
  • Putting it all together 
  • Machine learning overview – classification versus regression 
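
To give a flavor of this lesson, here is a minimal sketch of an SVM classification recipe on the iris dataset with cross-validation; the kernel and parameter values are illustrative assumptions rather than the exact lab code.

    # A minimal machine learning recipe: SVM classification with cross-validation.
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)
    clf = SVC(kernel="linear", C=1.0)
    scores = cross_val_score(clf, X, y, cv=5)          # 5-fold cross-validation
    print("Fold accuracies:", np.round(scores, 3))
    print("Mean accuracy:", scores.mean())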
  2. Pre-Model Workflow and Pre-Processing 
  • Introduction 
  • Creating sample data for toy analysis 
  • Scaling data to the standard normal distribution 
  • Creating binary features through thresholding 
  • Working with categorical variables 
  • Imputing missing values through various strategies 
  • A linear model in the presence of outliers 
  • Putting it all together with pipelines 
  • Using Gaussian processes for regression 
  • Using SGD for regression 
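
A possible sketch of the pipeline recipes in this lesson, chaining imputation, scaling and SGD regression; the synthetic dataset and parameter values are assumptions for illustration only.

    # Pre-processing chained into a pipeline: impute missing values, scale, then regress.
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.impute import SimpleImputer
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import SGDRegressor
    from sklearn.pipeline import Pipeline

    X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
    X[::20, 0] = np.nan                                  # inject some missing values

    pipe = Pipeline([
        ("impute", SimpleImputer(strategy="mean")),
        ("scale", StandardScaler()),
        ("sgd", SGDRegressor(max_iter=1000, random_state=0)),
    ])
    pipe.fit(X, y)
    print("R^2 on training data:", pipe.score(X, y))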
  3. Dimensionality Reduction 
  • Introduction 
  • Reducing dimensionality with PCA 
  • Using factor analysis for decomposition 
  • Using kernel PCA for nonlinear dimensionality reduction 
  • Using truncated SVD to reduce dimensionality 
  • Using decomposition to classify with DictionaryLearning 
  • Doing dimensionality reduction with manifolds – t-SNE 
  • Testing methods to reduce dimensionality with pipelines 
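
A minimal sketch of PCA-based dimensionality reduction as covered in this lesson; the choice of the iris dataset and two components is an illustrative assumption.

    # Reducing dimensionality with PCA: project the iris data onto two components.
    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA

    X, y = load_iris(return_X_y=True)
    pca = PCA(n_components=2)
    X_2d = pca.fit_transform(X)

    print("Reduced shape:", X_2d.shape)                      # (150, 2)
    print("Explained variance ratio:", pca.explained_variance_ratio_)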
  4. Linear Models with scikit-learn 
  • Introduction 
  • Fitting a line through data 
  • Fitting a line through data with machine learning 
  • Evaluating the linear regression model 
  • Using ridge regression to overcome linear regression’s shortfalls 
  • Optimizing the ridge regression parameter 
  • Using sparsity to regularize models 
  • Taking a more fundamental approach to regularization with LARS 
  • References 
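
A brief sketch of how ridge regression can address plain linear regression’s shortfalls, using cross-validated selection of the regularization parameter; the synthetic data and alpha grid are illustrative assumptions.

    # Ridge regression with a cross-validated regularization parameter.
    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression, RidgeCV
    from sklearn.model_selection import train_test_split

    X, y = make_regression(n_samples=300, n_features=20, n_informative=5, noise=15.0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    ols = LinearRegression().fit(X_train, y_train)
    ridge = RidgeCV(alphas=[0.1, 1.0, 10.0]).fit(X_train, y_train)   # picks alpha by CV

    print("OLS   R^2:", ols.score(X_test, y_test))
    print("Ridge R^2:", ridge.score(X_test, y_test), "chosen alpha:", ridge.alpha_)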
  5. Linear Models – Logistic Regression 
  • Introduction 
  • Loading data from the UCI repository 
  • Viewing the Pima Indians diabetes dataset with pandas 
  • Looking at the UCI Pima Indians dataset web page 
  • Machine learning with logistic regression 
  • Examining logistic regression errors with a confusion matrix 
  • Varying the classification threshold in logistic regression 
  • Receiver operating characteristic – ROC analysis 
  • Plotting an ROC curve without context 
  • Putting it all together – UCI breast cancer dataset 
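
A minimal sketch in the spirit of this lesson: logistic regression on scikit-learn's copy of the UCI breast cancer dataset, examined with a confusion matrix and ROC AUC; the parameter values are illustrative assumptions.

    # Logistic regression with a confusion matrix and ROC AUC.
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import confusion_matrix, roc_auc_score

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

    clf = LogisticRegression(max_iter=5000).fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    y_score = clf.predict_proba(X_test)[:, 1]            # probability of the positive class

    print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
    print("ROC AUC:", roc_auc_score(y_test, y_score))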
  6. Building Models with Distance Metrics 
  • Introduction 
  • Using k-means to cluster data 
  • Optimizing the number of centroids 
  • Assessing cluster correctness 
  • Using MiniBatch k-means to handle more data 
  • Quantizing an image with k-means clustering 
  • Finding the closest object in the feature space 
  • Probabilistic clustering with Gaussian mixture models 
  • Using k-means for outlier detection 
  • Using KNN for regression 
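
A short sketch of k-means clustering with a silhouette-based quality check; the synthetic blob data and cluster count are illustrative assumptions.

    # k-means clustering: fit, inspect centroids, and score cluster quality with silhouette.
    from sklearn.datasets import make_blobs
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score

    X, _ = make_blobs(n_samples=500, centers=4, cluster_std=1.0, random_state=0)
    km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

    print("Centroids:\n", km.cluster_centers_)
    print("Silhouette score:", silhouette_score(X, km.labels_))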
  7. Cross-Validation and Post-Model Workflow 
  • Introduction 
  • Selecting a model with cross-validation 
  • K-fold cross validation 
  • Balanced cross-validation 
  • Cross-validation with ShuffleSplit 
  • Time series cross-validation 
  • Grid search with scikit-learn 
  • Randomized search with scikit-learn 
  • Classification metrics 
  • Regression metrics 
  • Clustering metrics 
  • Using dummy estimators to compare results 
  • Feature selection 
  • Feature selection on L1 norms 
  • Persisting models with joblib or pickle 
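
A compact sketch of the post-model workflow covered here: grid search over hyperparameters followed by persisting the best model with joblib; the parameter grid and file name are illustrative assumptions.

    # Grid search over hyperparameters, then persist the best model with joblib.
    import joblib
    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.1, 1.0]}

    search = GridSearchCV(SVC(), param_grid, cv=5)
    search.fit(X, y)
    print("Best parameters:", search.best_params_)
    print("Best CV score:", search.best_score_)

    joblib.dump(search.best_estimator_, "best_svc.joblib")  # reload later with joblib.load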
  8. Support Vector Machines 
  • Introduction 
  • Classifying data with a linear SVM 
  • Optimizing an SVM 
  • Multiclass classification with SVM 
  • Support vector regression 
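
A minimal sketch of support vector regression with an RBF kernel; the noisy sine data and parameter values are illustrative assumptions.

    # Support vector regression (SVR) with an RBF kernel on noisy sine data.
    import numpy as np
    from sklearn.svm import SVR

    rng = np.random.RandomState(0)
    X = np.sort(5 * rng.rand(200, 1), axis=0)
    y = np.sin(X).ravel() + 0.1 * rng.randn(200)          # noisy sine wave

    svr = SVR(kernel="rbf", C=10.0, epsilon=0.05).fit(X, y)
    print("Training R^2:", svr.score(X, y))
    print("Number of support vectors:", len(svr.support_))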
  9. Tree Algorithms and Ensembles 
  • Introduction 
  • Doing basic classifications with decision trees 
  • Visualizing a decision tree with pydot 
  • Tuning a decision tree 
  • Using decision trees for regression 
  • Reducing overfitting with cross-validation 
  • Implementing random forest regression 
  • Bagging regression with nearest neighbors 
  • Tuning gradient boosting trees 
  • Tuning an AdaBoost regressor 
  • Writing a stacking aggregator with scikit-learn 
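
A brief sketch comparing two tree ensembles, random forest and gradient boosting, with cross-validation; dataset and hyperparameter choices are illustrative assumptions.

    # Tree ensembles: random forest and gradient boosting compared with cross-validation.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_breast_cancer(return_X_y=True)
    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    gb = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05, random_state=0)

    print("Random forest CV accuracy:", cross_val_score(rf, X, y, cv=5).mean())
    print("Gradient boosting CV accuracy:", cross_val_score(gb, X, y, cv=5).mean())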
  10. Text and Multiclass Classification with scikit-learn 
  • Using LDA for classification 
  • Working with QDA – a nonlinear LDA 
  • Using SGD for classification 
  • Classifying documents with Naive Bayes 
  • Label propagation with semi-supervised learning 
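
A minimal sketch of document classification with naive Bayes using a bag-of-words pipeline; the tiny corpus below is an illustrative assumption.

    # Classifying documents with naive Bayes: bag-of-words features + MultinomialNB.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    docs = ["free money now", "meeting at noon", "win a free prize", "project status update"]
    labels = ["spam", "ham", "spam", "ham"]                # tiny illustrative corpus

    text_clf = make_pipeline(CountVectorizer(), MultinomialNB())
    text_clf.fit(docs, labels)
    print(text_clf.predict(["free prize meeting"]))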
  11. Neural Networks 
  • Introduction 
  • Perceptron classifier 
  • Neural network – multilayer perceptron 
  • Stacking with a neural network 
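
A short sketch of the feed-forward neural network available in scikit-learn, the multilayer perceptron classifier; the digits dataset and layer size are illustrative assumptions.

    # A feed-forward neural network in scikit-learn: the multilayer perceptron classifier.
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import make_pipeline

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    mlp = make_pipeline(StandardScaler(),
                        MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0))
    mlp.fit(X_train, y_train)
    print("Test accuracy:", mlp.score(X_test, y_test))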
  12. Create a Simple Estimator 
  • Introduction 
  • Create a simple estimator 
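
A minimal sketch of a custom estimator following the scikit-learn fit/predict contract; the "most frequent class" rule here is purely an illustrative assumption.

    # A simple custom estimator: predicts the most frequent class seen during fit.
    import numpy as np
    from sklearn.base import BaseEstimator, ClassifierMixin

    class MostFrequentClassifier(BaseEstimator, ClassifierMixin):
        """Follows the scikit-learn estimator contract: fit(X, y) then predict(X)."""

        def fit(self, X, y):
            values, counts = np.unique(y, return_counts=True)
            self.most_frequent_ = values[np.argmax(counts)]   # learned attributes end with "_"
            return self                                        # fit returns self by convention

        def predict(self, X):
            return np.full(shape=len(X), fill_value=self.most_frequent_)

    clf = MostFrequentClassifier()
    clf.fit([[0], [1], [2]], [1, 1, 0])
    print(clf.predict([[5], [6]]))                             # -> [1 1]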