Course Skill Level:

Foundational to Intermediate

Course Duration:

4 days

Course Delivery Format:

Live, instructor-led

Course Category:

AI / Machine Learning

Course Code:

SCALMLL21E09

Who should attend & recommended skills:

Those with basic Linux, IT, and machine learning skills

  • This course is geared for those with basic Linux and computing skills who wish to leverage Scala and Machine Learning to study and construct systems that can learn from data.
  • Skill level: Foundation-level machine learning skills for intermediate-skilled team members. This is not a basic class.
  • IT skills: Basic to Intermediate (1-2 years’ experience)
  • Machine Learning: Basic to Intermediate (1-2 years’ experience)
  • Linux: Basic (1-2 years’ experience), including familiarity with commands such as ls, cd, cp, and su

About this course

The discovery of information through data clustering and classification is becoming a key differentiator for competitive organizations. Machine learning applications are everywhere, from self-driving cars, engineering design, logistics, manufacturing, and trading strategies to the detection of genetic anomalies. This course is a one-stop guide to the functional capabilities of the Scala programming language, such as dependency injection and implicits, that are critical to building machine learning algorithms. You start by learning data preprocessing and filtering techniques. You then move on to unsupervised learning techniques such as clustering and dimension reduction, followed by probabilistic graphical models such as Naïve Bayes, hidden Markov models, and Monte Carlo inference. Next, the course covers discriminative algorithms such as linear and logistic regression with regularization, kernelization, support vector machines, neural networks, and deep learning. You then move on to evolutionary computing, multi-armed bandit algorithms, and reinforcement learning. Finally, the course gives a comprehensive overview of parallel computing in Scala and Akka, followed by a description of Apache Spark and its ML library. With updated code based on the latest version of Scala and comprehensive examples, this course will ensure that you have more than just a solid fundamental knowledge of machine learning with Scala.

Skills acquired & topics covered

  • Exploring a broad variety of data processing, machine learning, and genetic algorithms through diagrams, mathematical formulation, and updated source code in Scala
  • Taking your expertise in Scala programming to the next level by creating and customizing AI applications
  • Experimenting with different techniques and evaluating their benefits and limitations using real-world applications in a tutorial style
  • Building dynamic workflows for scientific computing
  • Leveraging open source libraries to extract patterns from time series
  • Writing your own classification, clustering, or evolutionary algorithm
  • Performing relative performance tuning and evaluation of Spark
  • Mastering probabilistic models for sequential data
  • Experimenting with advanced techniques such as regularization and kernelization
  • Diving into neural networks and some deep learning architectures
  • Applying some basic multi-armed bandit algorithms
  • Solving big data problems with Scala parallel collections, Akka actors, and Apache Spark clusters
  • Applying key learning strategies to a technical analysis of financial markets

Course breakdown / modules

  • Mathematical notations for the curious
  • Why machine learning?
  • Why Scala?
  • Model categorization
  • Taxonomy of machine learning algorithms
  • Leveraging Java libraries
  • Tools and frameworks
  • Source code
  • Let’s kick the tires

  • Modeling
  • Defining a methodology
  • Monadic data transformation
  • Workflow computational model
  • Profiling data
  • Assessing a model

  • Time series in Scala
  • Moving averages
  • Fourier analysis
  • The discrete Kalman filter
  • Alternative preprocessing techniques

  • K-means clustering
  • Expectation-Maximization (EM)

  • Challenging model complexity
  • The divergences
  • Principal components analysis (PCA)
  • Nonlinear models

  • Probabilistic graphical models
  • Naïve Bayes classifiers
  • Multivariate Bernoulli classification
  • Naïve Bayes and text mining
  • Pros and cons

  • Markov decision processes
  • The hidden Markov model (HMM)
  • Conditional random fields
  • Regularized CRF and text analytics
  • Comparing CRF and HMM
  • Performance considerations

  • The purpose of sampling
  • Gaussian sampling
  • Monte Carlo approximation
  • Bootstrapping with replacement
  • Markov Chain Monte Carlo (MCMC)

  • Linear regression
  • Regularization
  • Numerical optimization
  • Logistic regression

  • Feed-forward neural networks (FFNN)
  • The multilayer perceptron (MLP)
  • Evaluation
  • Benefits and limitations

  • Sparse autoencoder
  • Restricted Boltzmann Machines (RBMs)
  • Convolutional neural networks

  • Kernel functions
  • The support vector machine (SVM)
  • Performance considerations

  • Evolution
  • Genetic algorithms and machine learning
  • Genetic algorithm components
  • Implementation
  • GA for trading strategies
  • Advantages and risks of genetic algorithms

  • K-armed bandit
  • Thompson sampling
  • Upper confidence bound (UCB)

  • Learning classifier systems

  • Overview
  • Scala
  • Scalability with Actors
  • Akka

  • Overview
  • Apache Spark core
  • MLlib library
  • Reusable ML pipelines
  • Extending Spark
  • Streaming engine
  • Performance evaluation
  • Pros and cons