Data Analysis with R

Course Skill Level:

Foundational

Course Duration:

4 days

Course Delivery Format:

Live, instructor-led.

Course Category:

Big Data & Data Science

Course Code:

DAR000L21E09

Who should attend & recommended skills:

Developers, analysts, or others with basic Python and development experience

Who should attend & recommended skills

This course is geared toward Python-experienced developers, analysts, or others who want to learn, by example, the fundamentals of data analysis as well as several intermediate to advanced methods and techniques, ranging from classification and regression to Bayesian methods and MCMC, that can be put to immediate use.
Skill level: Foundation-level Data Analysis with R skills for intermediate-skilled team members. This is not a basic class.
Developers: Basic (1-2 years’ experience)
Python: Basic (1-2 years’ experience)

About this course

Frequently the tool of choice for academics, R has spread deep into the private sector and can be found in the production pipelines of some of the most advanced and successful enterprises. The power and domain-specificity of R allow the user to express complex analytics easily, quickly, and succinctly. Starting with the basics of R and statistical reasoning, this course dives into advanced predictive analytics, showing how to apply those techniques to real-world data through real-world examples. Packed with engaging problems and exercises, this course begins with a review of R and its syntax, along with packages like Rcpp, ggplot2, and dplyr. From there, get to grips with the fundamentals of applied statistics and build on this knowledge to perform sophisticated and powerful analytics. Solve the difficulties of performing data analysis in practice and find solutions for working with messy data, working with large data, communicating results, and facilitating reproducibility. This course is engineered to be an invaluable resource through many stages of anyone’s career as a data analyst.

Skills acquired & topics covered

Working in a hands-on learning environment, led by our Data Analysis with R expert instructor, students will learn about and explore:
Analyzing your data using R, one of the most powerful statistical programming languages
How to implement applied statistics using practical use-cases
Using popular R packages to work with unstructured and structured data
Gaining a thorough understanding of statistical reasoning and sampling theory
Employing hypothesis testing to draw inferences from your data
Bayesian methods for estimating parameters
Training regression, classification, and time series models
Handling missing data gracefully using multiple imputation
Identifying and managing problematic data points
How to scale your analyses to larger data with Rcpp, data.table, dplyr, and parallelization
Putting best practices into effect to make your job easier and facilitate reproducibility

Course breakdown / modules

Navigating the basics
Getting help in R
Vectors
Functions
Matrices
Loading data into R
Working with packages

Univariate data
Frequency distributions
Central tendency
Spread
Populations, samples, and estimation
Probability distributions
Visualization methods

Multivariate data
Relationships between a categorical and continuous variable
Relationships between two categorical variables
The relationship between two continuous variables
Visualization methods

Basic probability
A tale of two interpretations
Sampling from distributions
The normal distribution

Estimating means
The sampling distribution
Interval estimation
Smaller samples

The null hypothesis significance testing framework
Testing the mean of one sample
Testing two means
Testing more than two means
Testing independence of proportions
What if my assumptions are unfounded?

The big idea behind Bayesian analysis
Choosing a prior
Who cares about coin flips?
Enter MCMC – stage left
Using JAGS and runjags
Fitting distributions the Bayesian way
The Bayesian independent samples t-test

What’s… uhhh… the deal with the bootstrap?
Performing the bootstrap in R (more elegantly)
Confidence intervals
A one-sample test of means
Bootstrapping statistics other than the mean
Busting bootstrap myths

Linear models
Simple linear regression
Simple linear regression with a binary predictor
Multiple regression
Regression with a non-binary predictor
Kitchen sink regression
The bias-variance trade-off
Linear regression diagnostics
Advanced topics

k-Nearest neighbors
Logistic regression
Decision trees
Random forests
Choosing a classifier

What is a time series?
What is forecasting?
Creating and plotting time series
Components of time series
Time series decomposition
White noise
Autocorrelation
Smoothing
ETS and the state space model
Interventions for improvement
What we didn’t cover
Citations for the climate change data

Relational databases
Using JSON
XML
Other data formats
Online repositories

Analysis with missing data
Visualizing missing data
Types of missing data
Unsophisticated methods for dealing with missing data
So how does mice come up with the imputed values?

Checking unsanitized data
Regular expressions
Other tools for messy data

Wait to optimize
Using a bigger and faster machine
Be smart about your code
Using optimized packages
Using another R implementation
Using parallelization
Using Rcpp
Being smarter about your code

The data.table package
Using dplyr and tidyr to manipulate data
Functional programming as a main tidyverse principle
Reshaping data with tidyr

R scripting
R projects
Version control
Communicating results

Course breakdown / modules

RefresheR

The Shape of Data

Describing Relationships

Probability

Using Data To Reason About The World

Testing Hypotheses

Bayesian Methods

The Bootstrap

Predicting Continuous Variables

Predicting Categorical Variables

Predicting Changes with Time

Sources of Data

Dealing with Missing Data

Dealing with Messy Data

Dealing with Large Data

Working with Popular R Packages

Reproducibility and Best Practices
