Free Statistics Help Book

An interactive multimedia introductory-level statistics book.

The book features interactive demos, simulations and case studies.

Table of Contents

- Introduction
- Graphing Distributions
- Summarizing Distributions
- Describing Bivariate Data
- Probability
- Normal Distributions
- Sampling Distributions
- Estimation
- Logic of Hypothesis Testing
- Testing Means
- Power
- Prediction
- ANOVA
- Chi Square
- Case Studies
- Calculators
- Glossary

Online Statistics Education

A Multimedia Course of Study

(http://www.onlinestatbook.com)

Project Lead: David M. Lane

Rice University

1. Introduction

Prerequisites: none

This first chapter begins by discussing what statistics are and why the study of statistics is important. Subsequent sections cover a variety of topics all basic to the study of statistics. One theme common to all of these sections is that they cover concepts and ideas important for other chapters in the book.

2. Graphing Distributions

Prerequisites: none

A. Qualitative Variables

B. Quantitative Variables

C. Exercises

Graphing data is the first and often most important step in data analysis. In this day of computers, researchers all too often see only the results of complex computer analyses without ever taking a close look at the data themselves. This is all the more unfortunate because computers can create many types of graphs quickly and easily.

The introductory section of this chapter gives an example in which a well-constructed graph makes it clear that there was a bias in the draft lottery of 1969. The two following sections discuss common graphs for qualitative and quantitative variables.

3. Summarizing Distributions

Prerequisites: none

A. Central Tendency

B. Variability

C. Shape

D. Comparing Distributions Demo

E. Effects of Transformations

F. Variance Sum Law I

G. Exercises

Descriptive statistics often involves using a few numbers to summarize a distribution. One important aspect of a distribution is where its center is located. Measures of central tendency are discussed first. A second aspect of a distribution is how spread out it is. In other words, how much the numbers in the distribution vary from one another. The second section describes measures of variability. Distributions can differ in shape. Some distributions are symmetric whereas others have long tails in just one direction. The third section describes measures of the shape of distributions. The final two sections concern (1) how transformations affect measures summarizing distributions and (2) the variance sum law, an important relationship involving a measure of variability.
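As a rough sketch, these summary measures can be computed with Python's standard library; the scores below are hypothetical, not from the book:

```python
import statistics

scores = [2, 3, 4, 4, 5, 5, 5, 6, 7, 9]    # hypothetical sample

center = statistics.mean(scores)            # measure of central tendency
spread = statistics.stdev(scores)           # measure of variability (sample SD)

# Effect of a transformation: adding a constant shifts the center
# but leaves the variability unchanged.
shifted = [x + 10 for x in scores]
shifted_center = statistics.mean(shifted)   # center + 10
shifted_spread = statistics.stdev(shifted)  # same as spread
```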

4. Describing Bivariate Data

Prerequisites: none

A. Introduction to Bivariate Data

B. Values of the Pearson Correlation

C. Guessing Correlations Simulation

D. Properties of Pearson’s r

E. Computing Pearson’s r

F. Restriction of Range Demo

G. Variance Sum Law II

H. Exercises

A dataset with two variables contains what is called bivariate data. This chapter discusses ways to describe the relationship between two variables. For example, you may wish to describe the relationship between the heights and weights of people to determine the extent to which taller people weigh more.

The introductory section gives more examples of bivariate relationships and presents the most common way of portraying these relationships graphically. The next five sections discuss Pearson’s correlation, the most common index of the relationship between two variables. The final section, “Variance Sum Law II” makes use of Pearson’s correlation to generalize this law to bivariate data.
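Pearson's correlation can be computed directly from its definition; here is a minimal Python sketch using made-up height and weight data (not from the book):

```python
import math

# Hypothetical heights (cm) and weights (kg) of five people.
heights = [161, 168, 173, 178, 184]
weights = [54, 63, 66, 72, 79]

n = len(heights)
mean_h = sum(heights) / n
mean_w = sum(weights) / n

# r is the sum of cross-products of deviations divided by the square
# root of the product of the two sums of squared deviations.
num = sum((h - mean_h) * (w - mean_w) for h, w in zip(heights, weights))
den = math.sqrt(sum((h - mean_h) ** 2 for h in heights)
                * sum((w - mean_w) ** 2 for w in weights))
r = num / den   # close to +1: taller people weigh more
```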

5. Probability

Prerequisites: none

A. Introduction

B. Basic Concepts

C. Conditional Probability Demo

D. Gambler's Fallacy Simulation

E. Binomial Distribution

F. Binomial Demonstration

G. Poisson Distribution (not available yet)

H. Multinomial Distribution (not available yet)

I. Hypergeometric Distribution (not available yet)

J. Base Rates

K. Bayes’ Theorem Demonstration

L. Monty Hall Problem Demonstration

M. Exercises

N. Probability Files (in .zip archive)

Probability is an important and complex field of study. Fortunately, only a few basic issues in probability theory are essential for understanding statistics at the level covered in this book. These basic issues are covered in this chapter.

The introductory section discusses the definitions of probability. This is not as simple as it may seem. The section on basic concepts covers how to compute probabilities in a variety of simple situations. The Gambler's Fallacy Simulation provides an opportunity to explore this fallacy by simulation. The Birthday Demonstration illustrates the probability of finding two or more people with the same birthday. The Binomial Demonstration shows the binomial distribution for different parameters. The section on base rates discusses an important but often-ignored factor in determining probabilities. It also presents Bayes' Theorem. The Bayes' Theorem Demonstration shows how a tree diagram and Bayes' Theorem result in the same answer. Finally, the Monty Hall Demonstration lets you play a game with a very counterintuitive result.
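Bayes' Theorem and the role of base rates can be illustrated with a small numerical sketch; the prevalence and test-accuracy figures below are hypothetical:

```python
# All numbers are hypothetical: a 1% base rate (prevalence), a test
# that is positive for 95% of affected people (sensitivity) and for
# 10% of unaffected people (false-positive rate).
base_rate = 0.01
p_pos_given_affected = 0.95
p_pos_given_unaffected = 0.10

# Total probability of a positive result.
p_pos = (base_rate * p_pos_given_affected
         + (1 - base_rate) * p_pos_given_unaffected)

# Bayes' Theorem: P(affected | positive test).
p_affected_given_pos = base_rate * p_pos_given_affected / p_pos
# Despite the accurate test, the low base rate keeps this below 9%.
```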

6. Normal Distributions

Prerequisites: none

A. Introduction

B. History

C. Areas of Normal Distributions

D. Varieties of Normal Distribution Demo

E. Standard Normal

F. Normal Approximation to the Binomial

G. Normal Approximation Demo

H. Exercises

Most of the statistical analyses presented in this book are based on the bell-shaped or normal distribution. The introductory section defines what it means for a distribution to be normal and presents some important properties of normal distributions. The interesting history of the discovery of the normal distribution is described in the second section. Methods for calculating probabilities based on the normal distribution are described in Areas of Normal Distributions. The Varieties of Normal Distribution Demo allows you to enter values for the mean and standard deviation of a normal distribution and see a graph of the resulting distribution. A frequently used normal distribution is called the Standard Normal distribution and is described in the section with that name. The binomial distribution can be approximated by a normal distribution. The section Normal Approximation to the Binomial shows this approximation. The Normal Approximation Demo allows you to explore the accuracy of this approximation.
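Areas under a normal distribution can be computed from the cumulative distribution function; a minimal Python sketch using `math.erf`:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under a normal curve to the left of x."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# About 68% of a normal distribution lies within one standard
# deviation of the mean.
within_one_sd = normal_cdf(1) - normal_cdf(-1)
```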

7. Sampling Distributions

Prerequisites: none

A. Introduction

B. Basic Demo

C. Sample Size Demo

D. Central Limit Theorem Demo

E. Sampling Distribution of the Mean

F. Sampling Distribution of Difference Between Means

G. Sampling Distribution of Pearson’s r

H. Difference Between r’s (not available yet)

I. Sampling Distribution of a Proportion

J. Difference Between Proportions (not available yet)

K. Law of Large Numbers (not available yet)

L. Exercises

The concept of a sampling distribution is perhaps the most basic concept in inferential statistics. It is also a difficult concept to teach because a sampling distribution is a theoretical distribution rather than an empirical distribution.

The introductory section defines the concept and gives an example for both a discrete and a continuous distribution. It also discusses how sampling distributions are used in inferential statistics.

The Basic Demo is an interactive demonstration of sampling distributions. It is designed to make the abstract concept of sampling distributions more concrete. The Sample Size Demo allows you to investigate the effect of sample size on the sampling distribution of the mean. The Central Limit Theorem (CLT) Demo is an interactive illustration of a very important and counter-intuitive characteristic of the sampling distribution of the mean.
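The idea behind these demos can be sketched as a simulation: draw many samples, compute each sample's mean, and examine the resulting distribution. A hypothetical Python sketch:

```python
import random
import statistics

random.seed(0)   # reproducible
n = 25           # size of each sample

# Draw 10,000 samples of size n from a uniform(0, 1) population and
# record each sample's mean.
means = [statistics.mean(random.uniform(0, 1) for _ in range(n))
         for _ in range(10_000)]

# The population has mean 0.5 and SD sqrt(1/12) ≈ 0.289, so the
# sampling distribution of the mean should be centered near 0.5 with
# a standard error near 0.289 / sqrt(25) ≈ 0.058, and (by the CLT)
# roughly normal even though the population is not.
```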

The remaining sections of the chapter concern the sampling distributions of important statistics: the Sampling Distribution of the Mean, the Sampling Distribution of the Difference Between Means, the Sampling Distribution of r, and the Sampling Distribution of a Proportion.

8. Estimation

Prerequisites: none

A. Introduction

B. Degrees of Freedom

C. Characteristics of Estimators

D. Bias and Variability Simulation

E. Confidence Intervals

1. Introduction

2. Confidence Interval for the Mean

4. Confidence Interval Simulation

5. Confidence Interval for the Difference Between Means

F. Exercises

One of the major applications of statistics is estimating population parameters from sample statistics. For example, a poll may seek to estimate the proportion of adult residents of a city who support a proposition to build a new sports stadium. Out of a random sample of 200 people, 106 say they support the proposition. Thus in the sample, 0.53 of the people supported the proposition. This value of 0.53 is called a point estimate of the population proportion. It is called a point estimate because the estimate consists of a single value or point.

The concept of degrees of freedom and its relationship to estimation is discussed in Section B. “Characteristics of Estimators” discusses two important concepts: bias and precision.

Point estimates are usually supplemented by interval estimates called confidence intervals. Confidence intervals are constructed using a method that captures the population parameter a specified proportion of the time. For example, if the pollster used a method that contains the parameter 95% of the time it is used, he or she would arrive at the following 95% confidence interval: 0.46 < π < 0.60. The pollster would then conclude that somewhere between 0.46 and 0.60 of the population supports the proposal. The media usually report this type of result by saying that 53% favor the proposition with a margin of error of 7%. The sections on confidence intervals show how to compute them for a variety of parameters.
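Using the poll numbers from this example (106 of 200 in favor), the 95% confidence interval can be sketched in Python with the usual normal-approximation formula for a proportion:

```python
import math

n, in_favor = 200, 106
p_hat = in_favor / n                           # point estimate: 0.53
se = math.sqrt(p_hat * (1 - p_hat) / n)        # standard error of a proportion
margin = 1.96 * se                             # half-width of a 95% interval
lower, upper = p_hat - margin, p_hat + margin  # about 0.46 and 0.60
```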

9. Logic of Hypothesis Testing

Prerequisites: none

A. Introduction

B. Significance Testing

C. Type I and Type II Errors

D. One- and Two-Tailed Tests

E. Interpreting Significant Results

F. Interpreting Non-Significant Results

G. Steps in Hypothesis Testing

H. Significance Testing and Confidence Intervals

I. Misconceptions

J. Exercises

When interpreting an experimental finding, a natural question arises as to whether the finding could have occurred by chance. Hypothesis testing is a statistical procedure for testing whether chance is a plausible explanation of an experimental finding. Misconceptions about hypothesis testing are common among practitioners as well as students. To help prevent these misconceptions, this chapter goes into more detail about the logic of hypothesis testing than is typical for an introductory-level text.

10. Testing Means

Prerequisites: none

A. Single Mean

B. t Distribution Demo

C. Difference between Two Means (Independent Groups)

D. Robustness Simulation

E. All Pairwise Comparisons Among Means

F. Specific Comparisons

G. Difference between Two Means (Correlated Pairs)

H. Correlated t Simulation

I. Specific Comparisons (Correlated Observations)

J. Pairwise Comparisons (Correlated Observations)

K. Exercises

Many, if not most, experiments are designed to compare means. The experiment may involve only one sample mean that is to be compared to a specific value. Or the experiment could be testing differences among many different experimental conditions, and the experimenter could be interested in comparing each mean with each other mean. This chapter covers methods of comparing means in many different experimental situations.
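For the simplest case, comparing a single sample mean with a specific value, the t statistic can be sketched as follows; the data are hypothetical:

```python
import math
import statistics

sample = [5.1, 4.8, 5.6, 5.0, 5.4, 4.9, 5.3, 5.2]  # hypothetical scores
mu_0 = 5.0                                          # hypothesized mean

m = statistics.mean(sample)
s = statistics.stdev(sample)       # sample standard deviation
se = s / math.sqrt(len(sample))    # standard error of the mean
t = (m - mu_0) / se                # compared with a t distribution, n - 1 = 7 df
```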

The topics covered here in sections D, E, G, and H are typically covered in other texts in a chapter on Analysis of Variance. We prefer to cover them here since they bear no necessary relationship to analysis of variance. As has been pointed out elsewhere, it is not logical to consider the procedures in this chapter to be tests performed subsequent to an analysis of variance. Nor is it logical to call them post-hoc tests, as some computer programs do.

11. Power

Prerequisites: none

A. Introduction

B. Example Calculations

C. Power Demo 1

D. Power Demo 2

E. Factors Affecting Power

F. Exercises

12. Prediction

Prerequisites: none

A. Introduction to Simple Linear Regression

B. Linear Fit Demo

C. Partitioning Sums of Squares

D. Standard Error of the Estimate

E. Prediction Line Demo

F. Inferential Statistics for b and r

G. Influential Observations

H. Regression Toward the Mean

I. Introduction to Multiple Regression

J. Exercises

Statisticians are often called upon to develop methods to predict one variable from other variables. For example, one might want to predict college grade point average from high school grade point average. Or, one might want to predict income from the number of years of education.
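Simple linear regression fits a least-squares prediction line; a minimal Python sketch with hypothetical GPA data:

```python
# Hypothetical data: high school GPA (predictor) and college GPA (criterion).
hs_gpa = [2.8, 3.0, 3.2, 3.6, 3.9]
col_gpa = [2.5, 2.9, 3.0, 3.4, 3.6]

n = len(hs_gpa)
mean_x = sum(hs_gpa) / n
mean_y = sum(col_gpa) / n

# Least-squares slope and intercept of the prediction line y' = a + b*x.
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(hs_gpa, col_gpa))
     / sum((x - mean_x) ** 2 for x in hs_gpa))
a = mean_y - b * mean_x

def predict(x):
    """Predicted college GPA for a given high school GPA."""
    return a + b * x
```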

13. ANOVA

Prerequisites: none

14. Chi Square

Prerequisites: none

A. Chi Square Distribution

B. One-Way Tables

C. Testing Distributions Demo

D. Contingency Tables

E. 2 x 2 Table Simulation

F. Exercises

Chi Square is a distribution that has proven to be particularly useful in statistics. The first section describes the basics of this distribution. The following two sections cover the most common statistical tests that make use of the Chi Square distribution. The section “One-Way Tables” shows how to use the Chi Square Distribution to test the difference between theoretically expected and observed frequencies. The section “Contingency Tables” shows how to use Chi Square to test the association between two nominal variables. This use of Chi Square is so common that it is often referred to as the “Chi Square Test.”
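The one-way table test compares observed frequencies with those expected under the null hypothesis; a minimal sketch with hypothetical frequencies:

```python
# Hypothetical one-way table: 60 observations over three categories,
# with equal frequencies expected under the null hypothesis.
observed = [25, 15, 20]
expected = [20.0, 20.0, 20.0]

# Chi Square statistic with (3 - 1) = 2 degrees of freedom; compare it
# with the Chi Square distribution to obtain a probability value.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```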

15. Case Studies

Prerequisites: none

A. Angry Moods

B. Flatulence

C. Physicians' Reactions to Patient Size

D. Teacher Ratings

E. Mediterranean Diet and Health

F. Smiles and Leniency

G. Animal Research

H. ADHD Treatment

I. Weapons and Aggression

J. SAT and College GPA

K. Stereograms

L. Driving

M. Stroop Interference

N. TV Violence

O. Bias Against Associates of the Obese

P. Shaking and Stirring Martinis

Copyright 2012