
Apache Hive Essentials

  • Course Code: Big Data - Apache Hive Essentials
  • Course Dates: Contact us to schedule.
  • Course Category: Big Data & Data Science
  • Duration: 3 Days
  • Audience: This course is geared for those who want to discover the attributes of big data using Apache Hive.

Course Snapshot 

  • Duration: 3 days 
  • Skill-level: Foundation-level Apache Hive skills for team members with intermediate IT skills. This is not a basic class.
  • Targeted Audience: This course is geared for those who want to discover the attributes of big data using Apache Hive.
  • Hands-on Learning: This course is approximately 50% hands-on lab and 50% lecture, combining engaging lecture, demos, group activities, and discussions with machine-based student labs and exercises. Student machines are required.
  • Delivery Format: This course is available for onsite private classroom presentation. 
  • Customizable: This course may be tailored to your specific skills objectives, tools of choice, and learning goals.

In this course, we prepare you for your journey into big data by first introducing the background of the big data domain and then walking you through setting up and getting familiar with your Hive working environment. Next, the course guides you through discovering and transforming the value in big data with the help of examples, and it hones your skills in using the Hive language efficiently. Toward the end, the course focuses on advanced topics such as performance, security, and extensions in Hive. By the end of the course, you will be familiar with Hive and able to work efficiently to find solutions to big data problems.

Working in a hands-on learning environment led by our Apache Hive expert instructor, students will:

  • Grasp the skills needed to write efficient Hive queries to analyze big data
  • Discover how Hive can coexist and work with other tools within the Hadoop ecosystem
  • Use practical, example-oriented scenarios that cover the newly released features of Apache Hive 2.3.3

Topics Covered: This is a high-level list of topics covered in this course. Please see the detailed agenda below.

  • Create and set up the Hive environment 
  • Discover how to use Hive’s definition language to describe data 
  • Discover interesting data by joining and filtering datasets in Hive 
  • Transform data by using Hive sorting, ordering, and functions 
  • Aggregate and sample data in different ways 
  • Boost Hive query performance and enhance data security in Hive 
  • Customize Hive to your needs by using user-defined functions and integrate it with other tools

Audience & Pre-Requisites 

This course is geared for attendees who want to discover the attributes of big data using Apache Hive.

Pre-Requisites: Students should have:

  • Basic to intermediate IT skills. Attendees without a background in a programming language such as Python may treat the labs as follow-along exercises or team with others to complete them.
  • Good foundational mathematics or logic skills 

Course Agenda / Topics 

  1. Overview of Big Data and Hive
      • A short history
      • Introducing big data
      • Relational and NoSQL databases versus Hadoop
      • Batch, real-time, and stream processing
      • Overview of the Hadoop ecosystem
      • Hive overview
  2. Setting Up the Hive Environment
      • Installing Hive from Apache
      • Installing Hive from vendors
      • Using Hive in the cloud
      • Using the Hive command
      • Using the Hive IDE
  3. Data Definition and Description
      • Understanding data types
      • Data type conversions
      • Data Definition Language
      • Database
      • Tables
      • Partitions
      • Buckets
      • Views
  4. Data Correlation and Scope
      • Projecting data with SELECT
      • Filtering data with conditions
      • Linking data with JOIN
      • Combining data with UNION
  5. Data Manipulation
      • Data exchange with LOAD
      • Data exchange with INSERT
      • Data exchange with EXPORT and IMPORT
      • Data sorting
      • Functions
      • Transactions and locks
  6. Data Aggregation and Sampling
      • Basic aggregation
      • Enhanced aggregation
      • Aggregation condition
      • Window functions
      • Sampling
  7. Performance Considerations
      • Performance utilities
      • Design optimization
      • Data optimization
      • Job optimization
  8. Extensibility Considerations
      • User-defined functions
      • HPL/SQL
      • Streaming
      • SerDe
  9. Security Considerations
      • Authentication
      • Authorization
      • Masking and encryption
  10. Working with Other Tools
      • The JDBC/ODBC connector
      • NoSQL
      • The Hue/Ambari Hive view
      • HCatalog
      • Oozie
      • Spark
      • Hivemall