Let us help you find the training program you are looking for.

If you can't find what you are looking for, contact us and we'll help you find it. We have over 800 training programs to choose from.

Course Skill Level:

Foundational

Course Duration:

2 days

  • Course Delivery Format:

    Live, instructor-led.

  • Course Category:

    Big Data & Data Science

  • Course Code:

    PANDASL21E09

Who should attend & recommended skills:

Those with Python and intermediate spreadsheet software experience

  • Experienced Python developers, analysts, or others with Python skills who wish to use pandas to automate repetitive spreadsheet functionality and derive insight from data by sorting columns, filtering data subsets, and creating multi-leveled indices.
  • Skill level: Foundation-level pandas skills for team members with intermediate skills. This is not a basic class.
  • Python: Basic (1-2 years’ experience)
  • Spreadsheet software: Intermediate (3-5 years’ experience)

About this course

Pandas makes it easy to dive into Python-based data analysis. You’ll learn to use pandas to automate repetitive spreadsheet functionality and derive insight from data by sorting columns, filtering data subsets, and creating multi-leveled indices. Each lesson is a self-contained tutorial, letting you dip in when you need to troubleshoot tricky problems. Best of all, you won’t be learning from sterile or randomly created data. You’ll start with a variety of datasets that are big, small, incomplete, broken, and messy and learn how to clean and format them for proper analysis.

Skills acquired & topics covered

  • Importing a CSV, identifying issues with its data structures, and converting it to the proper format
  • Sorting, filtering, pivoting, and drawing conclusions from a dataset and its subsets
  • Identifying trends from text-based and time-based data
  • Organizing, grouping, merging, and joining separate datasets
  • Real-world datasets that are easy to download and explore
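
To give a feel for the skills listed above, here is a minimal sketch of importing a CSV, fixing a column's data type, then sorting, filtering, and grouping. The dataset, column names, and variable names are invented for illustration and are not course material; the CSV is inlined so the example runs without a file.

```python
import io

import pandas as pd

# Hypothetical CSV content, inlined so the example is self-contained.
raw_csv = io.StringIO(
    "order_id,region,order_date,amount\n"
    "1,East,2021-03-01,120.50\n"
    "2,West,2021-01-15,80.00\n"
    "3,East,2021-02-10,45.25\n"
    "4,West,2021-03-20,200.00\n"
)

# parse_dates converts the date text to a datetime column on import.
orders = pd.read_csv(raw_csv, parse_dates=["order_date"])

# Sort by date, then keep only the orders above 100.
by_date = orders.sort_values("order_date")
large_orders = by_date[by_date["amount"] > 100]

# Total amount per region.
totals = orders.groupby("region")["amount"].sum()

print(large_orders)
print(totals)
```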

Course breakdown / modules

  • Data in the 21st Century
  • Introducing pandas
  • Importing a Dataset
  • Manipulating a DataFrame
  • Counting Values in a Series
  • Filtering a Column by One or More Criteria
  • Grouping Data

  • Simple Data Types
  • Operators
  • Variables
  • Functions
  • Objects and Methods
  • Lists
  • Tuples
  • Dictionaries
  • Sets
  • Modules, Classes, and Datetimes

  • Dimensions
  • The ndarray Object
  • The nan Object

  • Overview of a Series
  • Create a Series from Python Objects
  • Retrieving the First and Last Rows
  • Mathematical Operations
  • Passing the Series to Python’s Built-In Functions
  • Coding Challenges / Exercises

  • Importing a Dataset with the read_csv Method
  • Sorting a Series
  • Overwriting a Series with the inplace Parameter
  • Counting Values with the value_counts Method
  • Invoking a Function on Every Series Value with the apply Method
  • Coding Challenge: Deriving Insights from a Series
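
As a taste of the Series methods named in this module, the following sketch sorts a Series, counts its values, and applies a function to each value. The data is made up for illustration and stands in for a column loaded with read_csv.

```python
import pandas as pd

# Hypothetical data standing in for a column loaded from a CSV file.
colors = pd.Series(["red", "blue", "red", "green", "blue", "red"])

# sort_values returns a new, sorted Series and leaves the original unchanged.
sorted_colors = colors.sort_values()

# value_counts tallies how often each distinct value occurs.
counts = colors.value_counts()

# apply calls the given function once per value; here, the built-in len.
lengths = colors.apply(len)

print(counts)
```

Assigning the result to a new variable, as done here, is one way to keep a sorted copy; the inplace parameter named in the outline is an alternative that modifies the existing object.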

  • Overview of a DataFrame
  • Similarities between Series and DataFrames
  • Sorting a DataFrame
  • Sort by Index
  • Setting a New Index
  • Selecting Columns or Rows from a DataFrame
  • Select Rows from a DataFrame
  • Extract Value from Series
  • Rename Column or Row
  • Resetting an Index
  • Coding Challenge
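
The DataFrame operations listed in this module can be sketched roughly as follows. The table and its column names are hypothetical, chosen only to keep the example small.

```python
import pandas as pd

# Hypothetical data for illustration only.
players = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Cy"],
        "team": ["Red", "Blue", "Red"],
        "points": [12, 30, 21],
    }
)

# Use the name column as the row labels, then sort by those labels (Z to A).
indexed = players.set_index("name").sort_index(ascending=False)

# Select one row by label, and one value by row and column label.
ben_row = indexed.loc["Ben"]
cy_points = indexed.loc["Cy", "points"]

# Rename a column, then move the index back into a regular column.
renamed = indexed.rename(columns={"points": "score"}).reset_index()

print(renamed)
```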

  • Optimizing a Dataset for Memory Usage
  • Filtering by a Single Condition
  • Filtering by Multiple Conditions
  • Filtering by Condition
  • Dealing with Duplicates
  • Coding Challenge
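
A short sketch of the filtering topics above, assuming a small invented staff table. It converts a repetitive text column to the category type (one common memory optimization), filters by one and by several conditions, and drops duplicate rows.

```python
import pandas as pd

# Hypothetical data; the last row duplicates the second one.
staff = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Cy", "Ben"],
        "dept": ["Sales", "IT", "Sales", "IT"],
        "salary": [50000, 65000, 72000, 65000],
    }
)

# A text column with few distinct values often takes less memory as a category.
staff["dept"] = staff["dept"].astype("category")

# Single condition: a Boolean Series selects the matching rows.
in_sales = staff[staff["dept"] == "Sales"]

# Multiple conditions: wrap each in parentheses and combine with & (and) or | (or).
well_paid_sales = staff[(staff["dept"] == "Sales") & (staff["salary"] > 60000)]

# Keep the first occurrence of each fully identical row.
unique_staff = staff.drop_duplicates()

print(unique_staff)
```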

  • String Casing
  • String Slicing
  • Boolean Methods
  • Splitting Strings
  • Coding Challenge
  • A Note on Regular Expressions
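
The text-handling topics in this module map onto the `.str` accessor of a Series. A minimal sketch, using invented names as sample data:

```python
import pandas as pd

# Hypothetical messy text values.
names = pd.Series(["  alice SMITH", "Bob jones ", "carol lee"])

# Casing: trim surrounding whitespace, then capitalize each word.
cleaned = names.str.strip().str.title()

# Slicing: the first three characters of every value.
prefixes = cleaned.str[:3]

# Boolean method: True where the value starts with "B".
starts_with_b = cleaned.str.startswith("B")

# Splitting: expand=True puts each piece in its own column.
parts = cleaned.str.split(" ", expand=True)

print(parts)
```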

  • The MultiIndex Object
  • MultiIndex DataFrames
  • Sorting a MultiIndex
  • Indexing with a MultiIndex
  • Cross Sections
  • Manipulating the Index
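
To illustrate the MultiIndex topics above, here is a small sketch that builds a two-level index, sorts it, and pulls out values by label and by cross section. The sales table and all names in it are hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration only.
sales = pd.DataFrame(
    {
        "region": ["West", "East", "West", "East"],
        "city": ["Reno", "Boston", "Fresno", "Albany"],
        "units": [5, 8, 3, 6],
    }
)

# Two columns become a two-level row index; sorting it makes lookups predictable.
by_place = sales.set_index(["region", "city"]).sort_index()

# Index with a tuple holding one label per level.
boston_units = by_place.loc[("East", "Boston"), "units"]

# A single outer label selects every row under that region.
east = by_place.loc["East"]

# xs takes a cross section on any level, here the inner one.
fresno = by_place.xs("Fresno", level="city")

# Manipulating the index: swap the two levels, then re-sort.
swapped = by_place.swaplevel().sort_index()

print(by_place)
```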