- Duration: 3 days
- Skill Level: Foundation-level practical data science skills for intermediate-level team members. This is not a basic class.
- Targeted Audience: This course is geared toward those who want to learn how the R language and its associated tools provide a straightforward way to tackle day-to-day data science tasks without a lot of academic theory or advanced mathematics.
- Hands-on Learning: This course is approximately 50% hands-on lab and 50% lecture, combining engaging lecture, demos, group activities, and discussions with machine-based student labs and exercises. Student machines are required.
- Delivery Format: This course is available for onsite private classroom presentation.
- Customizable: This course may be tailored to target your specific training skills objectives, tools of choice and learning goals.
Practical Data Science with R shows you how to apply the R programming language and useful statistical techniques to everyday business situations. Using examples from marketing, business intelligence, and decision support, it shows you how to design experiments (such as A/B tests), build predictive models, and present results to audiences of all levels.
Working in a hands-on learning environment led by our expert data science instructor, students will:
- Learn basic principles without the theoretical mumbo-jumbo, jumping right to the real use cases they will face as they collect, curate, and analyze the data crucial to the success of their business.
- Apply the R programming language and statistical analysis techniques to carefully explained examples based in marketing, business intelligence, and decision support.
Topics Covered: This is a high-level list of topics covered in this course. Please see the detailed agenda below.
- Data science for the business professional
- Statistical analysis using the R language
- Project lifecycle, from planning to delivery
- Numerous instantly familiar use cases
- Keys to effective data presentations
Audience & Pre-Requisites
This course is for intermediate-level business analysts and developers, who are increasingly collecting, curating, analyzing, and reporting on crucial business data.
Pre-Requisites:
- No background in data science is required.
- Some familiarity with basic statistics, R, or another scripting language is assumed.
Course Agenda / Topics
- THE DATA SCIENCE PROCESS
- The roles in a data science project
- Stages of a data science project
- Setting expectations
- LOADING DATA INTO R
- Working with data from files
- Working with relational databases
- EXPLORING DATA
- Using summary statistics to spot problems
- Spotting problems using graphics and visualization
- MANAGING DATA
- Cleaning data
- Sampling for modeling and validation
- CHOOSING AND EVALUATING MODELS
- Mapping problems to machine learning tasks
- Evaluating models
- Validating models
- MEMORIZATION METHODS
- KDD and KDD Cup 2009
- Building single-variable models
- Building models using many variables
- LINEAR AND LOGISTIC REGRESSION
- Using linear regression
- Using logistic regression
- UNSUPERVISED METHODS
- Cluster analysis
- Association rules
- EXPLORING ADVANCED METHODS
- Using bagging and random forests to reduce training variance
- Using generalized additive models (GAMs) to learn non-monotone relationships
- Using kernel methods to increase data separation
- Using SVMs to model complicated decision boundaries
- DOCUMENTATION AND DEPLOYMENT
- The buzz dataset
- Using knitr to produce milestone documentation
- Using comments and version control for running documentation
- Deploying models
- PRODUCING EFFECTIVE PRESENTATIONS
- Presenting your results to the project sponsor
- Presenting your model to end users
- Presenting your work to other data scientists