Course Skill Level:

Foundational

Course Duration:

3 days

  • Course Delivery Format:

    Live, instructor-led.

  • Course Category:

    Big Data & Data Science

  • Course Code:

    DAPANDL21E09

Who should attend & recommended skills:

Experienced Python developers, analysts, or others with basic Python, development, and spreadsheet software skills

Who should attend & recommended skills

  • This course is geared for experienced Python developers, analysts, or others with Python skills who want to get to grips with pandas, a versatile and high-performance Python library for data manipulation, analysis, and discovery, and to leverage its full potential to get unique, actionable insights from their data.
  • Skill level: Foundation-level Data Analysis with Pandas skills for intermediate-skilled team members. This is not a basic class.
  • Developers: Basic to Intermediate (1-5 years’ experience)
  • Python: Basic (1-2 years’ experience)
  • Spreadsheet software: Basic to Intermediate (1-5 years’ experience)

About this course

Data analysis has become a necessary skill in a variety of domains where knowing how to work with data and extract insights can generate significant value. Data Analysis with Pandas will show you how to analyze your data, get started with machine learning, and work effectively with Python libraries often used for data science, such as pandas, NumPy, matplotlib, seaborn, and scikit-learn. Using real-world datasets, you will learn how to use the powerful pandas library to perform data wrangling to reshape, clean, and aggregate your data. Then, you will be able to conduct exploratory data analysis by calculating summary statistics and visualizing the data to find patterns. In the concluding lessons, you will explore some applications of anomaly detection, regression, clustering, and classification using scikit-learn to make predictions based on past data. By the end of this course, you will be equipped with the skills you need to use pandas to ensure the veracity of your data, visualize it for effective decision-making, and reliably reproduce analyses across multiple datasets.
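
As a rough illustration of the wrangling workflow described above (reshape, clean, aggregate), here is a minimal sketch. The DataFrame contents and the column names `city`, `year`, `jan`, `feb`, and `temp_c` are invented for illustration and are not taken from the course materials.

```python
import pandas as pd

# Hypothetical wide-format temperature readings (illustrative data only).
wide = pd.DataFrame({
    "city": ["Oslo", "Lima", "Oslo", "Lima"],
    "year": [2020, 2020, 2021, 2021],
    "jan": [-3.0, 23.0, -1.0, 24.0],
    "feb": [-2.0, None, 0.0, 25.0],
})

# Reshape: turn the month columns into rows (wide -> long format).
tidy = wide.melt(id_vars=["city", "year"], var_name="month", value_name="temp_c")

# Clean: drop rows where the measurement is missing.
clean = tidy.dropna(subset=["temp_c"])

# Aggregate: summary statistics per city.
summary = clean.groupby("city")["temp_c"].agg(["mean", "min", "max"])
print(summary)
```

The same three steps apply to most tabular datasets; only the column names and the choice of aggregation functions change.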

Skills acquired & topics covered

  • Working in a hands-on learning environment, led by our expert Data Analysis with Pandas instructor, students will learn about and explore:
  • Performing efficient data analysis and manipulation tasks using pandas
  • Applying pandas to different real-world domains with the help of step-by-step demonstrations
  • Getting accustomed to using pandas as an effective data exploration tool
  • How data analysts and scientists gather and analyze data
  • Performing data analysis and data wrangling using Python
  • Combining, grouping, and aggregating data from multiple sources
  • Creating data visualizations with pandas, matplotlib, and seaborn
  • Applying machine learning (ML) algorithms to identify patterns and make predictions
  • Using Python data science libraries to analyze real-world datasets
  • Using pandas to solve common data representation and analysis problems
  • Building Python scripts, modules, and packages for reusable analysis code

Course breakdown / modules

  • Fundamentals of data analysis
  • Statistical foundations
  • Setting up a virtual environment

  • Pandas data structures
  • Bringing data into a pandas DataFrame
  • Inspecting a DataFrame object
  • Grabbing subsets of the data
  • Adding and removing data

  • What is data wrangling?
  • Collecting temperature data
  • Cleaning up the data
  • Restructuring the data
  • Handling duplicate, missing, or invalid data
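
To give a flavor of the cleaning topics listed above, the following sketch handles duplicate, missing, and invalid values in a small temperature table. The data, the `-999.0` sentinel for an invalid reading, and the choice of linear interpolation are all assumptions made for this example; the course may use different data and techniques.

```python
import numpy as np
import pandas as pd

# Hypothetical daily readings with a duplicate row, a missing value,
# and an invalid sentinel value (-999.0). Illustrative data only.
readings = pd.DataFrame({
    "date": ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"],
    "temp_c": [4.0, 4.0, np.nan, -999.0, 6.0],
})
readings["date"] = pd.to_datetime(readings["date"])

# Duplicates: keep the first copy of each fully identical row.
deduped = readings.drop_duplicates()

# Invalid data: treat the sentinel as missing.
deduped = deduped.assign(temp_c=deduped["temp_c"].replace(-999.0, np.nan))

# Missing data: fill the gaps by linear interpolation between neighbors.
filled = deduped.assign(temp_c=deduped["temp_c"].interpolate())
print(filled)
```

Interpolation is only one option; dropping the rows or filling with a summary statistic may be more appropriate depending on the dataset.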

  • Database-style operations on DataFrames
  • DataFrame operations
  • Aggregations with pandas and numpy
  • Time series
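
The topics above (database-style operations, aggregations, and time series) can be sketched together in a few lines. The tables `stations` and `obs` and their columns are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical lookup table and observation table (illustrative data only).
stations = pd.DataFrame({"station_id": ["A", "B"], "city": ["Oslo", "Lima"]})
obs = pd.DataFrame({
    "station_id": ["A", "A", "B", "B", "C"],
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-02", "2021-01-01", "2021-01-09", "2021-01-01"]
    ),
    "temp_c": [-3.0, -1.0, 23.0, 25.0, 10.0],
})

# Database-style inner join: station "C" has no match and is dropped.
joined = obs.merge(stations, on="station_id", how="inner")

# Aggregation: mean temperature per city.
by_city = joined.groupby("city")["temp_c"].mean()

# Time series: weekly mean across all matched observations.
weekly = joined.sort_values("date").set_index("date")["temp_c"].resample("W").mean()

print(by_city)
print(weekly)
```

Switching `how="inner"` to `how="left"` would keep the unmatched station with a missing `city`, which is often useful when checking data quality.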

  • An introduction to matplotlib
  • Plotting with pandas
  • The pandas.plotting subpackage

  • Utilizing seaborn for advanced plotting
  • Formatting
  • Customizing visualizations

  • Building a Python package
  • Data extraction with pandas
  • Exploratory data analysis
  • Technical analysis of financial instruments
  • Modeling performance

  • Simulating login attempts
  • Exploratory data analysis
  • Rule-based anomaly detection
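
A rule-based approach to anomaly detection on login attempts might look like the sketch below. The log data, the column names, and the thresholds `MIN_ATTEMPTS` and `MAX_FAILURE_RATE` are assumptions chosen for illustration, not the rules used in the course.

```python
import pandas as pd

# Hypothetical login log: one row per attempt (illustrative data only).
log = pd.DataFrame({
    "source_ip": ["10.0.0.1"] * 3 + ["10.0.0.2"] * 6,
    "success": [True, True, False, False, False, False, False, True, False],
})
log["failed"] = ~log["success"]

# Summarize attempts and failures per source IP.
per_ip = log.groupby("source_ip").agg(
    attempts=("failed", "size"),
    failures=("failed", "sum"),
)
per_ip["failure_rate"] = per_ip["failures"] / per_ip["attempts"]

# Assumed rule: flag IPs with enough attempts and a high failure rate.
MIN_ATTEMPTS = 5
MAX_FAILURE_RATE = 0.5
flagged = per_ip[
    (per_ip["attempts"] >= MIN_ATTEMPTS)
    & (per_ip["failure_rate"] > MAX_FAILURE_RATE)
]
print(flagged)
```

Fixed thresholds like these are easy to explain but need tuning against real traffic, which is where exploratory data analysis comes in.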

  • Learning the lingo
  • Exploratory data analysis
  • Preprocessing data
  • Clustering
  • Regression
  • Classification

  • Hyperparameter tuning with grid search
  • Feature engineering
  • Ensemble methods
  • Inspecting classification prediction confidence
  • Addressing class imbalance
  • Regularization
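
As one possible illustration of hyperparameter tuning with grid search, the sketch below tunes the regularization strength `C` of a scikit-learn logistic regression. The synthetic dataset, the candidate values, and the choice of model are assumptions for this example only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic classification data (illustrative; not a course dataset).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Candidate regularization strengths; smaller C means stronger regularization.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}

# 5-fold cross-validated grid search over the candidates.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

Grid search evaluates every combination in `param_grid`, so the grid should stay small when each model fit is expensive.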

  • Exploring the data
  • Unsupervised methods
  • Supervised methods
  • Online learning

  • Data resources
  • Practicing working with data
  • Python practice