Foundations of Data Science with Python
6.8. Conditioning and (In)dependence
Self-Assessment
Terminology Review