Machine Learning with Apache Spark

  • Course Code: Artificial Intelligence - Machine Learning with Apache Spark
  • Course Dates: Contact us to schedule.
  • Course Category: AI / Machine Learning
  • Duration: 2 Days
  • Audience: This course is geared for those who want to combine advanced analytics, including Machine Learning, Deep Learning Neural Networks and Natural Language Processing, with modern scalable technologies, including Apache Spark, to derive actionable insights from Big Data in real time.

Course Snapshot 

  • Duration: 2 days 
  • Skill-level: Foundation-level Machine Learning with Apache Spark skills for intermediate-skilled team members. This is not a basic class.
  • Targeted Audience: This course is geared for those who want to combine advanced analytics, including Machine Learning, Deep Learning Neural Networks and Natural Language Processing, with modern scalable technologies, including Apache Spark, to derive actionable insights from Big Data in real time.
  • Hands-on Learning: This course is approximately 50% hands-on lab and 50% lecture, combining engaging lecture, demos, group activities and discussions with machine-based student labs and exercises. Student machines are required.
  • Delivery Format: This course is available for onsite private classroom presentation. 
  • Customizable: This course may be tailored to target your specific training skills objectives, tools of choice and learning goals. 

Every person and every organization in the world manages data, whether they realize it or not. Data describes the world around us and can be used for almost any purpose, from analyzing consumer habits to fighting disease and serious organized crime. Ultimately, we manage data in order to derive value from it, and many organizations around the world have traditionally invested in technology to help process their data faster and more efficiently.

But we now live in an interconnected world driven by mass data creation and consumption, where data is no longer rows and columns restricted to a spreadsheet but an organic and evolving asset in its own right. With this realization come major challenges for organizations: how do we manage the sheer size of data being created every second (think not only spreadsheets and databases, but also social media posts, images, videos, music, blogs and so on)? And once we can manage all of this data, how do we derive real value from it?

The focus of Machine Learning with Apache Spark is to help us answer these questions in a hands-on manner. We introduce the latest scalable technologies to help us manage and process big data. We then introduce advanced analytical algorithms applied to real-world use cases in order to uncover patterns, derive actionable insights, and learn from this big data.

Working in a hands-on learning environment, led by our expert Machine Learning with Apache Spark instructor, students will:

  • Make a hands-on start in the fields of Big Data, Distributed Technologies and Machine Learning 
  • Learn how to design, develop and interpret the results of common Machine Learning algorithms 
  • Uncover hidden patterns in your data in order to derive real actionable insights and business value 

Topics Covered: This is a high-level list of topics covered in this course. Please see the detailed agenda below.

  • Understand how Spark fits in the context of the big data ecosystem 
  • Understand how to deploy and configure a local development environment using Apache Spark 
  • Understand how to design supervised and unsupervised learning models 
  • Build models to perform NLP, deep learning, and cognitive services using Spark ML libraries 
  • Design real-time machine learning pipelines in Apache Spark
  • Become familiar with advanced techniques for processing a large volume of data by applying machine learning algorithms 

Audience & Pre-Requisites 

This course is geared for attendees with Apache knowledge who wish to learn the latest scalable technologies for managing and processing big data, and to apply advanced analytical algorithms to real-world use cases in order to uncover patterns, derive actionable insights, and learn from that data.

Pre-Requisites: Students should have:

  • Basic big data and business intelligence skills
  • Good foundational mathematics or logic skills 
  • Basic Linux skills, including familiarity with common shell commands such as ls, cd, cp, and su

Course Agenda / Topics 

  1. The Big Data Ecosystem
     • A brief history of data
     • Big data ecosystem
  2. Setting Up a Local Development Environment
     • CentOS Linux 7 virtual machine
  3. Artificial Intelligence and Machine Learning
     • Artificial intelligence
     • Machine learning
     • Deep learning
     • NLP
     • Cognitive computing
     • Machine learning pipelines in Apache Spark
  4. Supervised Learning Using Apache Spark
     • Linear regression
     • Logistic regression
     • Classification and Regression Trees
  5. Unsupervised Learning Using Apache Spark
     • Clustering
     • Principal component analysis
  6. Natural Language Processing Using Apache Spark
     • Feature transformers
     • Feature extractors
     • Case study – sentiment analysis
  7. Deep Learning Using Apache Spark
     • Artificial neural networks
  8. Real-Time Machine Learning Using Apache Spark
     • Distributed streaming platform
     • Distributed stream processing engines
     • Stream processing pipeline