Statistics is all about understanding change and chance. At its core, it simply involves differences (subtraction) and similarities (averages). If you know how to subtract and find averages, you will be off to a great start!
Samples vs. Populations
With statistics, we use information we know about a sample or group to make predictions about its larger population. This process is called inference. It can also work the other way around: we can use information about the larger population to make predictions about a subset of that population.
Populations
Populations have a mean and a standard deviation. The mean is the average of the population. The standard deviation is the average distance between each member of the population and the mean.
μ = Population Mean
σ = Population Standard Deviation
The population statistics are represented by Greek symbols.
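As a minimal sketch of these two numbers, here is how μ and σ could be computed in Python for a small, made-up population (the scores are invented purely for illustration):

```python
# A minimal sketch of the two population statistics above.
import math

population = [68, 70, 72, 74, 75, 77, 79, 81]   # every member of a (tiny) hypothetical population

# mu: the average of the whole population
mu = sum(population) / len(population)

# sigma: the typical distance of each member from mu
# (the square root of the average squared distance)
sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / len(population))

print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
```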
Samples
Samples also have a mean and standard deviation. The sample mean is the average of all the data points in the sample. The sample standard deviation is the average distance between each data point and the mean. Both are examples of descriptive statistics, which describe the data set.
x - Individual data points
x̄ - Sample Mean (average of the data points)
s - Sample Standard Deviation (average distance between each data point and the mean)
The sample statistics are represented by letters.
Note: Sometimes the sample mean is represented by a capital X or M.
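And the matching sketch for a sample, again with made-up data points; note that the sample standard deviation conventionally divides by n - 1 rather than n:

```python
# A minimal sketch of the sample statistics above, using made-up data points (x values).
import math

sample = [70, 73, 75, 78, 74]   # individual data points, x

# x_bar: the average of the data points in the sample
x_bar = sum(sample) / len(sample)

# s: the typical distance of each data point from x_bar.
# By convention the sample version divides by n - 1 rather than n.
n = len(sample)
s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))

print(f"x_bar = {x_bar:.2f}, s = {s:.2f}")
```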
The Key to Statistics: Understanding the Differences Between Averages
In statistics, we want to understand the differences between the averages of samples and their populations (X̄ - μ). Specifically, we'd like to know whether or not those differences are large enough to be statistically significant (unlikely to have happened by chance alone).
To decide whether a difference between averages is meaningful, we must account for the variation among the data points (the standard deviation, σ, or average distance from the mean).
If the difference between the averages is substantially larger than the standard deviation of the data points, then there is a significant difference.
For example, a difference of 20 points between the averages of a sample and its population may be significant if the typical difference, or standard deviation, is only 4 points. However, if the standard deviation is, say, 30 points, then a difference of 20 points would not seem significant at all.
This relationship is expressed in the equation for a z-score:
The difference between the average of a sample and the average of its population (X̄ - μ), adjusted for (divided by) the standard deviation, σ.
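Written out (a sketch based on the description above; the second form, which divides by the standard error σ/√n, is the one actually used when comparing a sample mean of size n to a population mean, and n is not defined elsewhere in this text):

```latex
z = \frac{\bar{X} - \mu}{\sigma}
\qquad \text{or, for a sample mean of size } n, \qquad
z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}
```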
The bigger the standard deviation, the smaller your test statistic (Z). The smaller your test statistic (Z), the less significant the difference will be. If you think about it, this makes sense: the more variation there is (bigger σ), the less meaningful a given difference is.
Also, the larger your Z statistic, the farther it falls into the tails of the distribution, toward the rejection regions (often shaded in red). Landing in a rejection region indicates that the statistic is significant.
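As a quick check of the 20-point example above, here is a short Python sketch using the simplified z = (X̄ - μ) / σ form from the text (no sample-size adjustment); the numbers come straight from that example:

```python
# The same 20-point difference looks very different depending on
# the standard deviation (the numbers are from the example above).
difference = 20   # X̄ - μ, the gap between the sample and population averages

for sigma in (4, 30):
    z = difference / sigma
    print(f"sigma = {sigma:>2}: z = {z:.2f}")

# sigma =  4: z = 5.00  -> much larger than the typical variation: significant
# sigma = 30: z = 0.67  -> smaller than the typical variation: not significant
```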
Significance Tests / Hypothesis Testing
To determine whether the difference between the sample mean and the population mean is significant, we perform significance tests, aka hypothesis tests (the terms are used interchangeably). With significance tests, we first decide how confident we want to be in our conclusion (our confidence level, e.g. 95%); the leftover probability (e.g. 5%) is the significance level. Then we use the test to decide whether the evidence actually lets us be that confident.
One of the most confusing parts about statistics is that we actually test the opposite of what we want to show: we ask how likely our results would be if our hypothesis were not true (the null hypothesis). That's right - in order to be confident that our hypothesis is correct, we must show that the chance of seeing results like ours, if it were not correct, is incredibly (significantly) small.
If that probability (the p-value) is small enough, we reject the null hypothesis and conclude that our hypothesis is supported.
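Putting these pieces together, here is a hedged sketch of the full procedure as a one-sample z test in Python. The sample values, the assumed population mean and standard deviation, and the 0.05 significance level are all made up for illustration:

```python
# A minimal one-sample z test: is the sample mean significantly
# different from the assumed population mean?
import math

sample = [72, 75, 71, 78, 74, 69, 77, 73]   # hypothetical sample data
mu = 70          # population mean under the null hypothesis (assumed)
sigma = 4        # population standard deviation (assumed known)
alpha = 0.05     # significance level (i.e. a 95% confidence level)

n = len(sample)
x_bar = sum(sample) / n

# z = (x_bar - mu) / (sigma / sqrt(n)): the difference between the averages,
# adjusted for the variation expected in a mean of n data points.
z = (x_bar - mu) / (sigma / math.sqrt(n))

# Two-sided p-value: how likely a z at least this extreme would be
# if the null hypothesis were true (uses the standard normal CDF).
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
if p_value < alpha:
    print("Reject the null hypothesis: the difference is significant.")
else:
    print("Fail to reject the null hypothesis: the difference is not significant.")
```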
Types of Significance Tests
Z Tests
One Sample Z Test
What it is
Formula
How to Interpret
When to use
Two Sample Z Test
What it is
Formula
How to Interpret
When to use
One Proportion Z Test
What it is
Formula
How to Interpret
When to use
Two Proportion Z Test
What it is
Formula
How to Interpret
When to use
T Tests
One-Sample T Test
What it is
Formula
How to Interpret
When to use
Two-Sample T Test
What it is
Formula
How to Interpret
When to use
One Proportion T Test
What it is
Formula
How to Interpret
When to use
Two Proportion T Test
What it is
Formula
How to Interpret
When to use
Paired T Test (Repeated Measures)
What it is
Formula
How to Interpret
When to use
F Tests/ANOVA
What it is
Formula
How to Interpret
When to use
Pearson's Correlation Coefficient
What it is
Formula
How to Interpret
When to use
Simple Regression & Multiple Regression
What it is
Formula
How to Interpret
When to use