Building Data Streaming Applications with Apache Kafka

  • Course Code: Big Data - Building Data Streaming Applications with Apache Kafka
  • Course Dates: Contact us to schedule.
  • Course Category: Big Data & Data Science
  • Duration: 3 Days
  • Audience: This course is geared for those who want to design and administer fast, reliable enterprise messaging systems with Apache Kafka.

Course Snapshot 

  • Duration: 3 days 
  • Skill-level: Foundation-level Apache Kafka skills for intermediate-skilled team members. This is not a basic class.
  • Targeted Audience: This course is geared for those who want to design and administer fast, reliable enterprise messaging systems with Apache Kafka.
  • Hands-on Learning: This course is approximately 50% hands-on lab and 50% lecture, combining engaging lecture, demos, group activities, and discussions with machine-based student labs and exercises. Student machines are required.
  • Delivery Format: This course is available for onsite private classroom presentation. 
  • Customizable: This course may be tailored to target your specific training skills objectives, tools of choice and learning goals. 

Apache Kafka is a popular distributed streaming platform that acts as a messaging queue or an enterprise messaging system. It lets you publish and subscribe to streams of records, and process them in a fault-tolerant way as they occur. This course is a comprehensive guide to designing and architecting enterprise-grade streaming applications using Apache Kafka and other big data tools. It includes best practices for building such applications, and tackles common challenges such as how to use Kafka efficiently and handle high data volumes.

The course first covers the types of messaging systems, then provides a thorough introduction to Apache Kafka and its internal details. The second part of the course takes you through designing streaming applications using various frameworks and tools such as Apache Spark, Apache Storm, and more. Once you grasp the basics, we will take you through more advanced concepts in Apache Kafka such as capacity planning and security. By the end of this course, you will have all the information you need to be comfortable using Apache Kafka and to design efficient streaming data applications with it.

Working in a hands-on learning environment, led by our Apache Kafka expert instructor, students will learn to:

  • Build efficient real-time streaming applications in Apache Kafka to process streams of data
  • Master the core Kafka APIs to set up Apache Kafka clusters and start writing message producers and consumers
  • Get a solid grasp of Apache Kafka concepts through practical examples

Topics Covered: This is a high-level list of topics covered in this course. Please see the detailed agenda below.

  • Learn the basics of Apache Kafka from scratch 
  • Use the basic building blocks of a streaming application 
  • Design effective streaming applications with Kafka using Spark, Storm, and Heron
  • Understand the importance of a low-latency, high-throughput, and fault-tolerant messaging system
  • Plan capacity effectively when deploying your Kafka application
  • Understand and implement security best practices

Audience & Pre-Requisites 

This course is geared for attendees who want to design and administer fast, reliable enterprise messaging systems with Apache Kafka.

Pre-Requisites: Students should have:

  • Basic to intermediate IT skills.
  • No previous exposure to large-scale data analysis or NoSQL tools is required.
  • Familiarity with traditional databases is helpful.

Course Agenda / Topics 

  1. Introduction to Messaging Systems
     • Understanding the principles of messaging systems
     • Understanding messaging systems
     • Peeking into a point-to-point messaging system
     • Publish-subscribe messaging system
     • Advanced Message Queuing Protocol (AMQP)
     • Using messaging systems in big data streaming applications
  2. Introducing Kafka the Distributed Messaging Platform
     • Kafka origins
     • Kafka’s architecture
     • Message topics
     • Message partitions
     • Replication and replicated logs
     • Message producers
     • Message consumers
     • Role of ZooKeeper
  3. Deep Dive into Kafka Producers
     • Kafka producer internals
     • Kafka producer APIs
     • Java Kafka producer example
     • Common message publishing patterns
     • Best practices
  4. Deep Dive into Kafka Consumers
     • Kafka consumer internals
     • Kafka consumer APIs
     • Java Kafka consumer
     • Scala Kafka consumer
     • Common message consuming patterns
     • Best practices
  5. Building Spark Streaming Applications with Kafka
     • Introduction to Spark
     • Spark Streaming
     • Use case: log processing – fraud IP detection
     • Producer
  6. Building Storm Applications with Kafka
     • Introduction to Apache Storm
     • Introduction to Apache Heron
     • Integrating Apache Kafka with Apache Storm – Java
     • Integrating Apache Kafka with Apache Storm – Scala
     • Use case: log processing in Storm, Kafka, Hive
  7. Using Kafka with Confluent Platform
     • Introduction to Confluent Platform
     • Deep diving into Confluent architecture
     • Understanding Kafka Connect and Kafka Streams
     • Playing with Avro using Schema Registry
     • Moving Kafka data to HDFS
  8. Building ETL Pipelines Using Kafka
     • Considerations for using Kafka in ETL pipelines
     • Introducing Kafka Connect
     • Deep dive into Kafka Connect
     • Introductory examples of using Kafka Connect
     • Kafka Connect common use cases
  9. Building Streaming Applications Using Kafka Streams
     • Introduction to Kafka Streams
     • Kafka Streams architecture
     • Integrated framework advantages
     • Understanding tables and streams together
     • Use case example of Kafka Streams
  10. Kafka Cluster Deployment
     • Kafka cluster internals
     • Capacity planning
     • Single cluster deployment
     • Multicluster deployment
     • Decommissioning brokers
     • Data migration
  11. Using Kafka in Big Data Applications
     • Managing high volumes in Kafka
     • Kafka message delivery semantics
     • Big data and Kafka common usage patterns
     • Kafka and data governance
     • Alerting and monitoring
     • Useful Kafka metrics
  12. Securing Kafka
     • An overview of securing Kafka
     • Wire encryption using SSL
     • Kerberos SASL for authentication
     • Understanding ACL and authorization
     • Understanding ZooKeeper authentication
     • Apache Ranger for authorization
  13. Streaming Application Design Considerations
     • Latency and throughput
     • Data and state persistence
     • Data sources
     • External data lookups
     • Data formats
     • Data serialization
     • Level of parallelism
     • Out-of-order events
     • Message processing semantics
