Course Skill Level:

Foundational

Course Duration:

6 days

  • Course Delivery Format:

    Live, instructor-led.

  • Course Category:

    Big Data & Data Science

  • Course Code:

    DSBOCAL21E09

Who should attend & recommended skills:

Those with basic IT, programming, Python, & Linux skills

  • This course is for those who know the basics of Python.
  • No prior data science or machine learning skills required.
  • It is designed to test and build your knowledge of Python and to teach you to handle the kind of open-ended problems that professional data scientists work on daily.
  • Downloadable data sets and thoroughly explained solutions help you lock in what you’ve learned, building your confidence and preparing you for an exciting new data science career.
  • Skill level: Foundation-level Data Science Boot Camp skills for intermediate-skilled team members.
  • This is not a basic class.
  • IT skills: Basic to Intermediate (1-5 years’ experience)
  • Linux: Basic (1-2 years’ experience), including familiarity with command-line commands such as ls, cd, cp, and su
  • Programming: Attendees without a background in a programming language such as Python may treat the labs as follow-along exercises or team up with others to complete them.

About this course

Data Science Boot Camp is a comprehensive set of challenging projects carefully designed to grow your data science skills from novice to master. Veteran data scientist Leonard Apeltsin sets 10 increasingly difficult exercises that test your abilities against the kind of problems you’d encounter in the real world. As you solve each challenge, you’ll acquire and expand the data science and Python skills you’ll use as a professional data scientist. Ranging from text processing to machine learning, each project comes complete with a unique downloadable data set and a fully explained, step-by-step solution. Because these projects come from Dr. Apeltsin’s vast experience, each solution highlights the most likely failure points along with practical advice for getting past unexpected pitfalls. When you wrap up these 10 awesome exercises, you’ll have a diverse, relevant skill set that’s transferable to working in industry.

Skills acquired & topics covered

Working in a hands-on learning environment led by a Data Science Boot Camp expert instructor, students will learn about and explore:

  • Visualizing complex multi-variable datasets
  • Training a decision tree machine learning algorithm
  • 10 in-depth Python exercises with full downloadable data sets
  • Web scraping for text and images
  • Organizing data sets with clustering algorithms

Course breakdown / modules

  • Sample Space Analysis: An Equation-Free Approach for Measuring Uncertainty in Outcomes
  • Computing Non-Trivial Probabilities
  • Computing Probabilities Over Interval Ranges

  • Basic Matplotlib Plots
  • Plotting Coin-Flip Probabilities

  • Simulating Random Coin-Flips and Dice-Rolls Using NumPy
  • Computing Confidence Intervals Using Histograms and NumPy Arrays
  • Leveraging Confidence Intervals to Analyze a Biased Deck of Cards
  • Using Permutations to Shuffle Cards

  • Overview
  • Predicting Red Cards within a Shuffled Deck
  • Optimizing Strategies using the Sample Space for a 10-Card Deck
  • Key Takeaways
  • Case Study 2: Assessing Online Ad-Clicks for Significance

  • Exploring the Relationships between Data and Probability Using SciPy
  • Mean as a Measure of Centrality
  • Variance as a Measure of Dispersion

  • Manipulating the Normal Distribution Using SciPy
  • Determining Mean and Variance of a Population through Random Sampling
  • Making Predictions Using Mean

  • Assessing the Divergence Between Sample Mean and Population Mean
  • Data Dredging: Coming to False Conclusions through Oversampling
  • Bootstrapping with Replacement: Testing a Hypothesis When the Population Variance is Unknown
  • Permutation Testing: Comparing Means of Samples when the Population Parameters are Unknown

  • Storing Tables Using Basic Python
  • Exploring Tables Using Pandas
  • Retrieving Table Columns
  • Retrieving Table Rows
  • Modifying Table Rows and Columns
  • Saving and Loading Table Data
  • Visualizing Tables Using Seaborn

  • Processing the Ad-Click Table in Pandas
  • Computing P-values from Differences in Means
  • Determining Statistical Significance
  • Shades of Blue: A Real-Life Cautionary Tale
  • Key Takeaways
  • Case Study 3: Tracking Disease Outbreaks Using News Headlines

  • Using Centrality to Discover Clusters
  • K-Means: A Clustering Algorithm for Grouping Data into K Central Groups
  • Using the Elbow Method
  • Using Density to Discover Clusters
  • DBSCAN: A Clustering Algorithm for Grouping Data Based on Spatial Density
  • Analyzing Clusters Using Pandas

  • The Great-Circle Distance: A Metric for Computing Distances Between 2 Global Points
  • Plotting Maps Using Basemap
  • Location Tracking Using GeoNamesCache
  • Matching Location Names in Text

  • Overview
  • Extracting Locations from Headline Data
  • Visualizing and Clustering the Extracted Location Data
  • Extracting Insights from Location Clusters
  • Key Takeaways
  • Case Study 4: Using Online Job Postings to Improve Your Data Science Resume

  • Simple Text Comparison
  • Vectorizing Texts Using Word Counts
  • Matrix Multiplication for Efficient Similarity Calculation
  • Computational Limits of Matrix Multiplication

  • Clustering 2D Data in One Dimension
  • Dimension Reduction Using PCA and Scikit-Learn
  • Clustering 4D Data in Two Dimensions
  • Computing Principal Components Without Rotation
  • Efficient Dimension Reduction Using SVD and Scikit-Learn

  • The Structure of HTML Documents
  • Parsing HTML using Beautiful Soup
  • Downloading and Parsing Online Data

  • Overview
  • Extracting Skill Requirements from Job Posting Data
  • Filtering Jobs by Relevance
  • Clustering Skills in Relevant Job Postings
  • Conclusion
  • Key Takeaways