Course Skill Level:

Foundational

Course Duration:

3 days

  • Course Delivery Format:

    Live, instructor-led.

  • Course Category:

    Big Data & Data Science

  • Course Code:

    APHIESL21E09

Who should attend & recommended skills:

Those with basic IT skills

Who should attend & recommended skills

  • This course is geared toward those who want to explore big data using Apache Hive.
  • Skill level: Foundation-level Apache Hive skills for intermediate-skilled team members. This is not a basic class.
  • IT Skills: Basic to Intermediate (1-5 years’ experience)
  • Attendees without a background in a programming language such as Python may treat the labs as follow-along exercises or team up with others to complete them.

About this course

This course begins with background on the big data domain and walks you through setting up and getting familiar with your Hive working environment. Next, it uses worked examples to show how to explore and transform big data, and it builds your skills in writing efficient Hive queries. Toward the end, the course covers advanced topics such as performance, security, and extensions in Hive. By the end of the course, you will be familiar with Hive and able to work efficiently to find solutions to big data problems.

Skills acquired & topics covered

Working in a hands-on learning environment, led by our Apache Hive expert instructor, participants will learn about and explore:

  • Grasping the skills needed to write efficient Hive queries to analyze big data
  • Discovering how Hive can coexist and work with other tools within the Hadoop ecosystem
  • Using practical, example-oriented scenarios to cover the features of Apache Hive 2.3.3
  • Creating and setting up the Hive environment
  • Discovering how to use Hive’s data definition language to describe data
  • Discovering interesting data by joining and filtering datasets in Hive
  • Transforming data by using Hive sorting, ordering, and functions
  • Aggregating and sampling data in different ways
  • Boosting Hive query performance and enhancing data security in Hive
  • Customizing Hive to your needs by using user-defined functions and integrating it with other tools

Course breakdown / modules

  • A short history
  • Introducing big data
  • The relational and NoSQL databases versus Hadoop
  • Batch, real-time, and stream processing
  • Overview of the Hadoop ecosystem
  • Hive overview

  • Installing Hive from Apache
  • Installing Hive from vendors
  • Using Hive in the cloud
  • Using the Hive command
  • Using the Hive IDE

  • Understanding data types
  • Data type conversions
  • Data Definition Language
  • Database
  • Tables
  • Partitions
  • Buckets
  • Views

  • Projecting data with SELECT
  • Filtering data with conditions
  • Linking data with JOIN
  • Combining data with UNION

  • Data exchange with LOAD
  • Data exchange with INSERT
  • Data exchange with [EX|IM]PORT

  • Data sorting
  • Functions
  • Transactions and locks

  • Basic aggregation
  • Enhanced aggregation
  • Aggregation condition
  • Window functions
  • Sampling

  • Performance utilities
  • Design optimization
  • Data optimization
  • Job optimization

  • User-defined functions
  • HPL/SQL
  • Streaming
  • SerDe

  • Security considerations
  • Authentication
  • Authorization
  • Mask and encryption