Data Analysis with Pandas

  • Course Code: Data Analysis / BI - Data Analysis with Pandas
  • Course Dates: Contact us to schedule.
  • Course Category: Big Data & Data Science
  • Duration: 3 Days
  • Audience: This course is geared toward experienced Python developers, analysts, and others who want to get to grips with pandas, a versatile and high-performance Python library for data manipulation, analysis, and discovery.

Course Snapshot 

  • Duration: 3 days 
  • Skill-level: Foundation-level Data Analysis with Pandas skills for team members with intermediate skills. This is not a basic class.
  • Targeted Audience: This course is geared toward experienced Python developers, analysts, and others who want to get to grips with pandas, a versatile and high-performance Python library for data manipulation, analysis, and discovery.
  • Hands-on Learning: This course is approximately 50% hands-on lab and 50% lecture, combining engaging lecture, demos, group activities, and discussions with machine-based student labs and exercises. Student machines are required.
  • Delivery Format: This course is available for onsite private classroom presentation, remote instructor-led delivery, or CBT/WBT (by request).
  • Customizable: This course may be tailored to target your specific training skills objectives, tools of choice and learning goals. 

Data analysis has become a necessary skill in a variety of domains where knowing how to work with data and extract insights can generate significant value. Data Analysis with Pandas will show you how to analyze your data, get started with machine learning, and work effectively with Python libraries often used for data science, such as pandas, NumPy, matplotlib, seaborn, and scikit-learn. Using real-world datasets, you will learn how to use the powerful pandas library to perform data wrangling to reshape, clean, and aggregate your data. Then, you will be able to conduct exploratory data analysis by calculating summary statistics and visualizing the data to find patterns. In the concluding lessons, you will explore some applications of anomaly detection, regression, clustering, and classification using scikit-learn to make predictions based on past data. By the end of this course, you will be equipped with the skills you need to use pandas to ensure the veracity of your data, visualize it for effective decision-making, and reliably reproduce analyses across multiple datasets. 
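As a rough illustration of the wrangling steps described above (clean, aggregate, reshape, then summarize), here is a minimal sketch. The column names and temperature values are invented for this example and are not taken from the course materials.

```python
import pandas as pd

# Hypothetical daily temperature readings; all values are made up for illustration.
raw = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02", "2024-01-02"],
    "station": ["A", "B", "A", "B", "B"],
    "temp_c": [1.5, 2.0, None, 3.0, 3.0],
})

# Clean: parse dates, drop exact duplicate rows, drop rows with a missing reading.
clean = (
    raw.assign(date=pd.to_datetime(raw["date"]))
       .drop_duplicates()
       .dropna(subset=["temp_c"])
)

# Aggregate: mean temperature per station.
mean_by_station = clean.groupby("station")["temp_c"].mean()

# Reshape: one row per date, one column per station.
wide = clean.pivot(index="date", columns="station", values="temp_c")

# Summarize: count, mean, standard deviation, and quartiles of the cleaned readings.
summary = clean["temp_c"].describe()

print(mean_by_station)
print(wide)
print(summary)
```

Chaining the cleaning steps keeps the raw frame untouched, which makes it easier to rerun the same analysis on another dataset.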

Working in a hands-on learning environment led by our expert Data Analysis with Pandas instructor, students will learn to:

  • Perform efficient data analysis and manipulation tasks using pandas 
  • Apply pandas to different real-world domains with the help of step-by-step demonstrations 
  • Get accustomed to using pandas as an effective data exploration tool. 

Topics Covered: This is a high-level list of topics covered in this course. Please see the detailed agenda below.

  • Understand how data analysts and scientists gather and analyze data 
  • Perform data analysis and data wrangling using Python 
  • Combine, group, and aggregate data from multiple sources 
  • Create data visualizations with pandas, matplotlib, and seaborn 
  • Apply machine learning (ML) algorithms to identify patterns and make predictions 
  • Use Python data science libraries to analyze real-world datasets 
  • Use pandas to solve common data representation and analysis problems 
  • Build Python scripts, modules, and packages for reusable analysis code 
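To give a flavor of the "combine, group, and aggregate" topic, the sketch below joins two small tables and summarizes the result. The table contents, the `station_id` key, and the city names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical lookup table and measurements; the values are illustrative only.
stations = pd.DataFrame({
    "station_id": [1, 2, 3],
    "city": ["Oslo", "Lima", "Cairo"],
})
readings = pd.DataFrame({
    "station_id": [1, 1, 2, 4],
    "temp_c": [-2.0, 0.0, 19.0, 25.0],
})

# Combine: a left join keeps every reading; indicator=True adds a "_merge"
# column recording whether each row found a match in the stations table.
merged = readings.merge(stations, on="station_id", how="left", indicator=True)

# Readings whose station_id has no entry in the lookup table.
unmatched = merged[merged["_merge"] == "left_only"]

# Group and aggregate: mean and max temperature per city, matched rows only.
per_city = (
    merged.dropna(subset=["city"])
          .groupby("city")["temp_c"]
          .agg(["mean", "max"])
)

print(unmatched)
print(per_city)
```

Checking the unmatched rows before aggregating is one simple way to catch keys that exist in one source but not the other.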

Audience & Pre-Requisites 

This course is geared toward attendees with Python skills who wish to leverage the full potential of pandas to get unique, actionable insights from their data.

Pre-Requisites: Students should fit one of the following profiles:

  • Developers with some knowledge of Python.
  • Experienced spreadsheet users who know the basics of Python.

Course Agenda / Topics 

  1. Introduction to Data Analysis
  • Fundamentals of data analysis 
  • Statistical foundations 
  • Setting up a virtual environment 
  2. Working with Pandas DataFrames
  • Pandas data structures 
  • Bringing data into a pandas DataFrame 
  • Inspecting a DataFrame object 
  • Grabbing subsets of the data 
  • Adding and removing data 
  3. Data Wrangling with Pandas
  • What is data wrangling? 
  • Collecting temperature data 
  • Cleaning up the data 
  • Restructuring the data 
  • Handling duplicate, missing, or invalid data 
  4. Aggregating Pandas DataFrames
  • Database-style operations on DataFrames 
  • DataFrame operations 
  • Aggregations with pandas and numpy 
  • Time series 
  5. Visualizing Data with Pandas and Matplotlib
  • An introduction to matplotlib 
  • Plotting with pandas 
  • The pandas.plotting subpackage 
  6. Plotting with Seaborn and Customization Techniques
  • Utilizing seaborn for advanced plotting 
  • Formatting 
  • Customizing visualizations 
  7. Financial Analysis – Bitcoin and the Stock Market
  • Building a Python package 
  • Data extraction with pandas 
  • Exploratory data analysis 
  • Technical analysis of financial instruments 
  • Modeling performance 
  8. Rule-Based Anomaly Detection
  • Simulating login attempts 
  • Exploratory data analysis 
  • Rule-based anomaly detection 
  9. Getting Started with Machine Learning in Python
  • Learning the lingo 
  • Exploratory data analysis 
  • Preprocessing data 
  • Clustering 
  • Regression 
  • Classification 
  10. Making Better Predictions – Optimizing Models
  • Hyperparameter tuning with grid search 
  • Feature engineering 
  • Ensemble methods 
  • Inspecting classification prediction confidence 
  • Addressing class imbalance 
  • Regularization 
  11. Machine Learning Anomaly Detection
  • Exploring the data 
  • Unsupervised methods 
  • Supervised methods 
  • Online learning 
  12. The Road Ahead
  • Data resources 
  • Practicing working with data 
  • Python practice 
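
As a taste of the rule-based anomaly detection module in the agenda above, the following sketch flags users with too many failed logins in one hour. The log entries, usernames, and the threshold `MAX_FAILURES_PER_HOUR` are assumptions made up for this example; the course labs may take a different approach.

```python
import pandas as pd

# Hypothetical login log; users, timestamps, and outcomes are illustrative only.
logins = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2024-03-01 09:05", "2024-03-01 09:10", "2024-03-01 09:15",
        "2024-03-01 09:20", "2024-03-01 09:40", "2024-03-01 10:05",
    ]),
    "user": ["ana", "bot", "bot", "bot", "ana", "ana"],
    "success": [True, False, False, False, True, False],
})

# Assumed rule: three or more failures by one user within an hour is suspicious.
MAX_FAILURES_PER_HOUR = 3

# Bucket each attempt into its hour, then count attempts and failures
# per user and hour.
hourly = (
    logins.assign(
        hour=logins["timestamp"].dt.floor("60min"),
        failed=~logins["success"],
    )
    .groupby(["user", "hour"])["failed"]
    .agg(attempts="count", failures="sum")
)

# Apply the rule: keep only the user-hours that reach the threshold.
flagged = hourly[hourly["failures"] >= MAX_FAILURES_PER_HOUR]

print(hourly)
print(flagged)
```

A fixed threshold like this is easy to explain and audit, though it has to be tuned by hand for each dataset.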