Course Skill Level:

Foundational

Course Duration:

4 days

  • Course Delivery Format:

    Live, instructor-led.

  • Course Category:

    Big Data & Data Science

  • Course Code:

    EXDAPYL21E09

Who should attend & recommended skills:

Those with basic Python, development, and spreadsheet experience

  • This course is geared toward developers, analysts, and others with Python experience who wish to discover techniques for summarizing the characteristics of their data using PyPlot, NumPy, SciPy, and pandas.
  • Skill level: Foundation-level Exploratory Data Analysis with Python skills for intermediate-skilled team members. This is not a basic class.
  • Python: Basic (1-2 years’ experience)
  • Spreadsheet software: Basic (1-2 years’ experience)

About this course

Exploratory Data Analysis (EDA) is an approach to data analysis that applies a range of techniques to gain insights into a dataset. This course gives you practical knowledge of the main pillars of EDA: data cleaning, data preparation, data exploration, and data visualization. You’ll start by performing EDA on open-source datasets, carrying out simple to advanced analyses that turn data into meaningful insights. You’ll then learn descriptive statistical techniques for describing the basic characteristics of data and progress to performing EDA on time-series data. As you advance, you’ll learn how to apply EDA techniques to model development and evaluation, build predictive models, and visualize their results. Using Python for data analysis, you’ll work with real-world datasets, understand the data, summarize its characteristics, and visualize it for business intelligence. By the end of this EDA course, you’ll have the skills required to carry out a preliminary investigation of any dataset, draw insights from the data, present your results with visual aids, and build a model that predicts future outcomes.

Skills acquired & topics covered

  • Working in a hands-on learning environment, led by our Data Analysis with Python expert instructor, students will learn about and explore:
  • The fundamental concepts of exploratory data analysis using Python
  • Finding missing values in your data and identifying the correlation between different variables
  • Practicing graphical exploratory analysis techniques using Matplotlib and the Seaborn Python package
  • Importing, cleaning, and exploring data to perform preliminary analysis using powerful Python packages
  • Identifying and transforming erroneous data using different data wrangling techniques
  • Exploring the use of multiple regression to describe non-linear relationships
  • Discovering hypothesis testing and exploring techniques of time-series analysis
  • Understanding and interpreting results obtained from graphical analysis
  • Building, training, and optimizing predictive models to estimate results
  • Performing complex EDA techniques on open-source datasets
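
To give a flavor of two of the skills listed above (finding missing values and identifying the correlation between variables), here is a minimal sketch using pandas. The dataset and its column names (`hours_studied`, `exam_score`) are hypothetical and are not taken from the course materials.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "hours_studied": [1.0, 2.0, 3.0, 4.0, np.nan, 6.0],
    "exam_score": [52.0, 58.0, 65.0, np.nan, 80.0, 88.0],
})

# Count the missing values in each column.
missing_per_column = df.isna().sum()

# Pearson correlation between the two columns.
# Series.corr() ignores rows where either value is missing.
correlation = df["hours_studied"].corr(df["exam_score"])

print(missing_per_column)
print(f"Pearson r = {correlation:.3f}")
```

In practice you would load a real dataset (for example with `pd.read_csv`) and decide how to handle the missing rows before drawing conclusions from the correlation.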

Course breakdown / modules

  • Exploratory Data Analysis Fundamentals
  • Understanding data science
  • The significance of EDA
  • Making sense of data
  • Comparing EDA with classical and Bayesian analysis
  • Software tools available for EDA
  • Getting started with EDA

  • Technical requirements
  • Line chart
  • Bar charts
  • Scatter plot
  • Area plot and stacked plot
  • Pie chart
  • Table chart
  • Polar chart
  • Histogram
  • Lollipop chart
  • Choosing the best chart
  • Other libraries to explore

  • Technical requirements
  • Loading the dataset
  • Data transformation
  • Data analysis

  • Technical requirements
  • Background
  • Merging database-style dataframes
  • Transformation techniques
  • Benefits of data transformation

  • Technical requirements
  • Understanding statistics
  • Measures of central tendency
  • Measures of dispersion
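
As an illustration of the measures named in this module, the sketch below computes a few measures of central tendency and dispersion. It uses Python's standard `statistics` module so that it runs without extra packages; the course itself may use NumPy, SciPy, or pandas for the same calculations, and the sample values are made up.

```python
import statistics

# Hypothetical sample values, for illustration only.
values = [12, 15, 15, 18, 21, 24, 30]

# Measures of central tendency.
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

# Measures of dispersion.
value_range = max(values) - min(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)      # sample standard deviation

print(f"mean={mean:.2f}, median={median}, mode={mode}")
print(f"range={value_range}, variance={variance:.2f}, std_dev={std_dev:.2f}")
```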

  • Technical requirements
  • Understanding groupby()
  • Groupby mechanics
  • Data aggregation
  • Pivot tables and cross-tabulations

  • Technical requirements
  • Introducing correlation
  • Types of analysis
  • Discussing multivariate analysis using the Titanic dataset
  • Outlining Simpson’s paradox
  • Correlation does not imply causation

  • Technical requirements
  • Understanding the time series dataset
  • TSA with Open Power System Data

  • Technical requirements
  • Hypothesis testing
  • p-hacking
  • Understanding regression
  • Model development and evaluation

  • Technical requirements
  • Types of machine learning
  • Understanding supervised learning
  • Understanding unsupervised learning
  • Understanding reinforcement learning
  • Unified machine learning workflow

  • Technical requirements
  • Disclosing the wine quality dataset
  • Analyzing red wine
  • Analyzing white wine
  • Model development and evaluation