Data Analysis with Scala

  • Course Code: Data Analysis / BI - V
  • Course Dates: Contact us to schedule.
  • Course Category: Big Data & Data Science
  • Duration: 2 Days
  • Audience: This course is geared for developers with Python experience, analysts, and others who want to master Scala's advanced techniques to solve real-world data analysis problems and gain valuable insights from their data.

Course Snapshot 

  • Duration: 2 days 
  • Skill-level: Foundation-level Data Analysis with Scala skills for team members with intermediate skills. This is not a basic class.
  • Targeted Audience: This course is geared for developers with Python experience, analysts, and others who want to master Scala’s advanced techniques to solve real-world data analysis problems and gain valuable insights from their data.
  • Hands-on Learning: This course is approximately 50% hands-on lab to 50% lecture ratio, combining engaging lecture, demos, group activities and discussions with machine-based student labs and exercises. Student machines are required. 
  • Delivery Format: This course is available for onsite private classroom presentation, or remote instructor led delivery, or CBT/WBT (by request). 
  • Customizable: This course may be tailored to target your specific training skills objectives, tools of choice and learning goals. 

Sound business decisions, grounded in an accurate understanding of business data, help deliver better performance across products and services. This course helps you leverage popular Scala libraries and tools to perform core data analysis tasks with ease. The course begins with a quick overview of the building blocks of a standard data analysis process. You will learn to perform basic tasks such as extraction, staging, validation, cleaning, and shaping of datasets. You will then dive deep into the data exploration and visualization stages of the data analysis life cycle. You will use popular Scala libraries such as Saddle, Breeze, Vegas, and PredictionIO to process your datasets. You will learn statistical methods for deriving meaningful insights from data. You will also learn to create Apache Spark 2.x applications for complex, real-time data analysis. You will discover traditional machine learning techniques for data analysis, and you will be introduced to neural networks and deep learning from a data analysis standpoint. By the end of this course, you will be capable of handling large sets of structured and unstructured data, performing exploratory analysis, and building efficient Scala applications for discovering and delivering insights.
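To give a flavor of the extraction, validation, cleaning, and shaping steps described above, here is a minimal sketch that uses only the Scala standard library. It is not taken from the course labs: the `Sale` record, the sample lines, and the helper names `parse` and `totalsByRegion` are all hypothetical and chosen purely for illustration.

```scala
object PipelineSketch {
  // Hypothetical record type; the field names are illustrative only.
  final case class Sale(region: String, amount: Double)

  // Extraction: hypothetical raw CSV-style lines, including some dirty rows.
  val rawLines: Seq[String] = Seq(
    "north, 120.50",
    "South,80",
    "north,not-a-number",
    ",15.0",
    "south, 19.5"
  )

  // Validation and cleaning: trim whitespace, normalize the region to lower
  // case, and drop rows with a missing region or an unparseable amount.
  def parse(line: String): Option[Sale] =
    line.split(",").map(_.trim) match {
      case Array(region, amount) if region.nonEmpty =>
        scala.util.Try(amount.toDouble).toOption
          .map(value => Sale(region.toLowerCase, value))
      case _ => None
    }

  // Shaping: total amount per region.
  def totalsByRegion(sales: Seq[Sale]): Map[String, Double] =
    sales.groupBy(_.region).map { case (region, rows) =>
      region -> rows.map(_.amount).sum
    }

  def main(args: Array[String]): Unit = {
    val sales = rawLines.flatMap(parse)
    totalsByRegion(sales).toSeq.sortBy(_._1).foreach { case (region, total) =>
      println(s"$region: $total")
    }
  }
}
```

Returning `Option[Sale]` from `parse` lets `flatMap` discard invalid rows without exceptions. A real pipeline would more likely log or quarantine rejected rows than silently drop them.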

Working in a hands-on learning environment, led by our Data Analysis with Scala expert instructor, students will learn about and explore: 

  • How to perform data analysis, starting from a beginner’s level, through numerous rich, practical examples 
  • Popular Scala libraries such as Breeze and Saddle for efficient data manipulation and exploratory analysis 
  • How to develop applications in Scala for real-time analysis and machine learning in Apache Spark 

Topics Covered: This is a high-level list of topics covered in this course. Please see the detailed agenda below.

  • Techniques to determine the validity and confidence level of data 
  • Apply quartiles and n-tiles to datasets to see how data is distributed into many buckets 
  • Create data pipelines that combine multiple data lifecycle steps 
  • Use built-in features to gain a deeper understanding of the data 
  • Apply Lasso regression analysis method to your data 
  • Compare Apache Spark API with traditional Apache Spark data analysis 
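As an illustration of the quartile and n-tile topic in the list above, the following sketch assigns values to buckets by rank using only the Scala standard library. It is an assumption-laden example rather than course material: the `ntiles` helper and the sample values are hypothetical, and libraries such as Breeze or Spark provide their own statistics functions that may use different bucketing rules.

```scala
object NtilesSketch {
  // Assign each value to one of n roughly equal buckets by its rank in the
  // sorted data. n = 4 gives quartiles. Bucket numbers start at 1.
  // This rank-based rule is one simple convention among several.
  def ntiles(data: Seq[Int], n: Int): Map[Int, Seq[Int]] = {
    require(n > 0 && data.nonEmpty, "need a positive n and non-empty data")
    data.sorted.zipWithIndex
      .groupBy { case (_, rank) => rank * n / data.size + 1 }
      .map { case (bucket, pairs) => bucket -> pairs.map(_._1) }
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample data; the values are illustrative only.
    val values = Seq(12, 7, 3, 18, 25, 9, 14, 21)
    ntiles(values, 4).toSeq.sortBy(_._1).foreach { case (bucket, members) =>
      println(s"Bucket $bucket: ${members.mkString(", ")}")
    }
  }
}
```

With eight values and `n = 4`, each bucket holds two values. When the data size is not a multiple of `n`, the bucket sizes differ by at most one element.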

Audience & Pre-Requisites 

This course is geared for attendees with Python skills who wish to master Scala’s advanced techniques to solve real-world data analysis problems and gain valuable insights from their data.

Pre-Requisites: Students should be one of the following:

  • Developers with some knowledge of Python 
  • Experienced users of spreadsheet software who know the basics of Python 

Course Agenda / Topics 

  1. Scala Overview 
  • Getting started with Scala 
  • Overview of object-oriented and functional programming 
  • Scala case classes and the collection API 
  • Overview of Scala libraries for data analysis 
  2. Data Analysis Life Cycle 
  • Data journey 
  • Sourcing data 
  • Understanding data 
  • Using ML to learn from data 
  • Creating a data pipeline 
  3. Data Ingestion 
  • Data extraction 
  • Data staging 
  • Cleaning and normalizing 
  • Enriching 
  • Organizing and storing 
  4. Data Exploration and Visualization 
  • Sampling data 
  • Performing ad hoc analysis 
  • Finding a relationship between data elements 
  • Visualizing data 
  5. Applying Statistics and Hypothesis Testing 
  • Basics of statistics 
  • Vector level statistics 
  • Random data generation 
  • Hypothesis testing 
  6. Introduction to Spark for Distributed Data Analysis 
  • Spark setup and overview 
  • Spark Datasets and DataFrames 
  • Sourcing data using Spark 
  • Using Spark to explore data 
  7. Traditional Machine Learning for Data Analysis 
  • ML overview 
  • Decision trees 
  • Random forest 
  • Ridge and lasso regression 
  • k-means cluster analysis 
  • Natural language processing for data analysis 
  • Algorithm selections 
  8. Near Real-Time Data Analysis Using Streaming 
  • Overview of streaming 
  • Spark Streaming overview 
  • Streaming a k-means clustering algorithm using Spark 
  • Streaming linear regression using Spark 
  9. Working with Data at Scale 
  • Working with data at scale 
  • Cost considerations 
  • Reliability considerations 