Homepage
Preface
About the Authors
1 Introduction
2 Soft Skills for Data Scientists
3 Introduction to The Data
4 Big Data Cloud Platform
5 Data Pre-processing
6 Data Wrangling
- 6.1 Summarize Data
  - 6.1.1 dplyr package
  - 6.1.2 apply(), lapply() and sapply() in base R
- 6.2 Tidy and Reshape Data
7 Model Tuning Strategy
- 7.1 Variance-Bias Trade-Off
- 7.2 Data Splitting and Resampling
  - 7.2.1 Data Splitting
  - 7.2.2 Resampling
8 Measuring Performance
- 8.1 Regression Model Performance
- 8.2 Classification Model Performance
9 Regression Models
- 9.1 Ordinary Least Square
  - 9.1.1 The Magic P-value
  - 9.1.2 Diagnostics for Linear Regression
- 9.2 Principal Component Regression and Partial Least Square
10 Regularization Methods
11 Tree-Based Methods
12 Deep Learning
Appendix
A Handling Large Local Data
- A.1 readr
- A.2 data.table: enhanced data.frame
B R code for data simulation
- B.1 Customer Data for Clothing Company
- B.2 Swine Disease Breakout Data
References

Introduction to Data Science

Acknowledgements