Course Skill Level:

Foundational to Intermediate

Course Duration:

3 days

  • Course Delivery Format:

    Live, instructor-led.

  • Course Category:

    Big Data & Data Science

  • Course Code:

    SADWRPL21E09

Who should attend & recommended skills:

Those with Python experience and basic IT & Linux skills

Who should attend & recommended skills

  • Developers, analysts, and others with Python experience who want to see statistical methods implemented and illustrated in both R and Python.
  • Skill level: Foundation-level R skills for intermediate-skilled team members. This is not a basic class.
  • IT skills: Basic to Intermediate (1-5 years’ experience)
  • Linux: Basic (1-2 years’ experience), including familiarity with command-line options such as ls, cd, cp, and su
  • Those without a background in a programming language such as Python may treat the labs as follow-along exercises or team up with others to complete them.

About this course

Statistical analysis involves collecting and examining data to describe its nature, explore the relationships within it, and build models that support better decisions. This course covers statistical concepts alongside R and Python from the very start. Almost every concept is accompanied by R code that demonstrates the strengths and applications of R, and each R program is paired with an equivalent Python program. You will first study data characteristics, descriptive statistics, and the exploratory mindset, which give you a firm footing in data analysis. Statistical inference then completes the technical foundation of statistical methods. Linear regression, logistic regression, and CART make up the essential modeling toolkit for tackling complex real-world problems. The course begins with a brief look at the nature of data and ends with modern statistical models such as CART. Every step works with data and R code, reinforced by Python. The data analysis journey starts with exploratory analysis, which goes beyond simple descriptive summaries; you will then apply linear regression modeling and finish with logistic regression, CART, and spatial statistics. By the end of this course, you will be able to apply what you have learned in major domains at work or in your own projects.

Skills acquired & topics covered

  • Understanding the nature of data through software, putting preliminary concepts to work right away in R and Python
  • Data modeling and visualization to perform efficient statistical analysis
  • Becoming well versed in techniques such as regression, clustering, classification, support vector machines, and more to learn the fundamentals of modern statistics
  • Exploring the nature of data through software, with preliminary concepts applied right away in R
  • Reading data from various sources and exporting R output to other software
  • Performing effective data visualization suited to the nature of the variables, with rich alternative options
  • Doing exploratory data analysis for a useful first understanding of the data, building toward the right attitude for effective inference
  • Performing statistical inference through simulation, combining classical inference with modern computational power
  • Delving into regression models, such as linear and logistic regression for continuous and discrete regressands, that form the fundamentals of modern statistics
  • An introduction to CART (classification and regression trees), a machine learning tool that is very useful when the data has intrinsic nonlinearity
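
As a rough illustration of what "statistical inference through simulation" can look like, the sketch below computes a percentile bootstrap confidence interval for a mean in plain Python. It is not taken from the course materials: the function name `bootstrap_mean_ci`, its parameters, and the sample values are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`.

    Hypothetical helper for illustration; not part of any course code bundle.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same interval
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, keeping the original sample size.
        resample = [rng.choice(sample) for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    # Take the empirical alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    # Made-up measurements, used only to exercise the function.
    data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.0, 4.8]
    low, high = bootstrap_mean_ci(data)
    print(f"sample mean = {statistics.fmean(data):.2f}")
    print(f"95% bootstrap CI = ({low:.2f}, {high:.2f})")
```

The classical alternative would be a t-based interval; the simulation approach trades that distributional assumption for computation, which is the combination the bullet above describes.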

Course breakdown / modules

  • Questionnaire and its components
  • Experiments with uncertainty in computer science
  • Installing and setting up R
  • Using R packages
  • Python installation and setup
  • IDEs for R and Python
  • The companion code bundle
  • Discrete distributions
  • Continuous distributions

  • Packages and settings – R and Python
  • Understanding data.frame and other formats
  • Using utils and the foreign packages
  • Exporting data/graphs

  • Packages and settings – R and Python
  • Visualization techniques for categorical data
  • Visualization techniques for continuous variable data
  • Pareto chart
  • A brief peek at ggplot2

  • Packages and settings – R and Python
  • Essential summary statistics
  • Techniques for exploratory analysis

  • Packages and settings – R and Python
  • Maximum likelihood estimator
  • Confidence intervals
  • Hypothesis testing

  • Packages and settings – R and Python
  • The essence of regression
  • The simple linear regression model
  • Multiple linear regression model
  • Regression diagnostics
  • Model selection

  • Packages and settings – R and Python
  • Model validation and diagnostics
  • Logistic regression for the German credit screening dataset

  • Packages and settings – R and Python
  • Regression spline
  • Ridge regression for linear models

  • Packages and settings – R and Python
  • Splitting the data

  • Packages and settings – R and Python
  • Understanding bagging