STAT 133 - Introduction to Exploratory Data Analysis
Course Description
Exploratory data analysis techniques; robust estimators for location and scale parameters.
Course Learning Outcomes
At the end of this course, students will:
- Recognize the need for and importance of exploratory techniques in studying
and summarizing the major characteristics of any data set;
- Apply the different exploratory techniques in studying and summarizing
data sets;
- Utilize robust estimators that are useful in cases where the assumptions of
an underlying probabilistic model are not satisfied; and
- Interpret the results derived from data sets after completing the analysis.
Course Outline
UNIT 1. Introduction
- What is EDA?
- Broad Phases of Data Analysis
- Main Themes of EDA
- Objectives of Graphical Methods
UNIT 2. Stem-and-Leaf Display
- The Basic Display
- Some Variations
UNIT 3. Letter Values
- Sorting and Ranking
- Letter Values and Letter-Value Displays
- Spreads
- Outside Cut-offs
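Illustrative sketch for Unit 3: the outline does not specify conventions, so the code below assumes the common depth rule (depth of median = (n + 1) / 2, depth of fourths = (floor(depth of median) + 1) / 2) and the conventional 1.5 multiplier for the outside cut-offs. The function names are hypothetical and chosen only for this example.

```python
import math

def value_at_depth(ordered, depth):
    """Value at a whole or half-integer depth, counted from the start of `ordered`."""
    lo = math.floor(depth)
    hi = math.ceil(depth)
    # A half-integer depth averages the two neighbouring order statistics.
    return (ordered[lo - 1] + ordered[hi - 1]) / 2

def letter_values(data, multiplier=1.5):
    """Median, fourths, fourth-spread, and outside cut-offs for one batch."""
    x = sorted(data)
    n = len(x)
    depth_median = (n + 1) / 2
    depth_fourth = (math.floor(depth_median) + 1) / 2

    median = value_at_depth(x, depth_median)
    lower_fourth = value_at_depth(x, depth_fourth)
    upper_fourth = value_at_depth(x[::-1], depth_fourth)  # count from the high end
    spread = upper_fourth - lower_fourth

    lower_cutoff = lower_fourth - multiplier * spread
    upper_cutoff = upper_fourth + multiplier * spread
    outside = [v for v in x if v < lower_cutoff or v > upper_cutoff]

    return {
        "median": median,
        "fourths": (lower_fourth, upper_fourth),
        "fourth_spread": spread,
        "cutoffs": (lower_cutoff, upper_cutoff),
        "outside": outside,
    }

batch = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
summary = letter_values(batch)
print(summary)
```

In this batch the value 100 falls beyond the upper cut-off, while the median and fourths are unaffected by it.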
UNIT 4. Boxplots and Batch Comparison
- The Boxplot for a Single Batch
- Comparing Batches Using Boxplots
- Quantile Plots and Empirical Quantile-Quantile Plots
- The Spread-versus-Level Plot
UNIT 5. Transforming Data
- Reasons for Transforming
- Power Transformations
- Transforming for Symmetry
- Transforming for Other Data Structures
- Matched Transformations
UNIT 6. Resistant Lines for y versus x
- Slope and Intercept
- Summary Points
- Finding the Slope and the Intercept
- Residuals
- Polishing the Fit
- Outliers
- Straightening Plots by Re-expression
UNIT 7. Analysis of Two-Way Tables
- Two-Way Tables
- Median Polish
- Non-Additivity and the Diagnostic Plot
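Illustrative sketch for Unit 7: one possible way to carry out median polish on a small two-way table, assuming the additive decomposition value = overall + row effect + column effect + residual and a fixed number of sweeps. The function name and the choice of ten sweeps are assumptions made for this example only.

```python
from statistics import median

def median_polish(table, n_sweeps=10):
    """Return (overall, row_effects, col_effects, residuals) for a two-way table."""
    resid = [[float(v) for v in row] for row in table]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols

    for _ in range(n_sweeps):
        # Row sweep: move each row's median out of the residuals.
        for i in range(n_rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
        # Re-centre the column effects on the overall value.
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]

        # Column sweep: move each column's median out of the residuals.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        # Re-centre the row effects on the overall value.
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]

    return overall, row_eff, col_eff, resid

# A perfectly additive table: 10 + row effect + column effect.
table = [[7, 9, 11],
         [8, 10, 12],
         [9, 11, 13]]
overall, rows, cols, resid = median_polish(table)
print(overall, rows, cols)
```

For a perfectly additive table such as this one, all residuals come out as zero; systematic patterns left in the residuals are what a diagnostic plot for non-additivity would examine.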
UNIT 8. Smoothing Data
- Data Sequences and Smooth Summaries
- Elementary Smoothers
- Compound Smoothers
UNIT 9. Examining Residuals
- Residuals and the Fit
- Residuals as Batches
- Residual Plots
- Rootograms
UNIT 10. Location Estimators
- Main Concepts
- Simple L-Estimators
- M-Estimators
- Distributions
- Choosing Robust Estimators
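Illustrative sketch for Unit 10: a trimmed mean as an example of a simple L-estimator and a Huber-type M-estimator of location computed by iterative reweighting. The tuning constant c = 1.5, the use of the rescaled median absolute deviation (MAD / 0.6745) as a fixed scale, and the function names are assumptions for this example, not choices stated in the outline.

```python
from statistics import mean, median

def trimmed_mean(data, proportion=0.1):
    """Mean after dropping the given proportion of values from each end."""
    x = sorted(data)
    k = int(proportion * len(x))
    return mean(x[k:len(x) - k])

def huber_location(data, c=1.5, tol=1e-8, max_iter=100):
    """Huber-type M-estimate of location with a fixed MAD-based scale."""
    x = list(data)
    mu = median(x)
    # MAD rescaled so that it estimates the standard deviation under normality.
    scale = median(abs(v - mu) for v in x) / 0.6745
    if scale == 0:
        return mu
    for _ in range(max_iter):
        # Full weight near the centre, reduced weight for distant values.
        weights = [
            1.0 if abs(v - mu) <= c * scale else c * scale / abs(v - mu)
            for v in x
        ]
        new_mu = sum(w * v for w, v in zip(weights, x)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu

batch = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(mean(batch), median(batch), trimmed_mean(batch), huber_location(batch))
```

On this batch the ordinary mean is pulled toward the value 100, whereas the trimmed mean and the Huber-type estimate stay close to the median.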
UNIT 11. Geographical Information Systems