
Intro to Statistics


Statistics is all about understanding change and chance. At its core, it simply involves differences (subtraction) and similarities (averages). If you know how to subtract and find averages, you will be off to a great start!


Samples vs. Populations


With statistics, we use information we know about a sample or group to make predictions about its larger population. This process is called inference. This can also work the other way around. (We can take information about the larger population to make predictions about a subset of that population.)


Populations


Populations have a mean and a standard deviation. The mean is the average of the population. The standard deviation is roughly the average distance between each member of the population and the mean.


μ = Population Mean

σ = Population Standard Deviation


The population values (also called parameters) are represented by Greek symbols.
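As a rough illustration of μ and σ, here is a minimal Python sketch using only the standard library. The five scores are made-up numbers, and treating them as an entire population is an assumption made purely for the example.

```python
import statistics

# Hypothetical population: the test scores of every student in one small class.
population = [70, 75, 80, 85, 90]

mu = statistics.fmean(population)      # population mean (μ)
sigma = statistics.pstdev(population)  # population standard deviation (σ), divides by N

print(f"mu = {mu}, sigma = {sigma:.2f}")  # mu = 80.0, sigma = 7.07
```

Note that `pstdev` is the population version of the calculation. It is only appropriate when the data really do cover the whole population.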


Samples


Samples also have a mean and standard deviation. The sample mean is the average of all the data points in the sample. The sample standard deviation is the average distance between each data point and the mean. Both are examples of descriptive statistics, which describe the data set.


x - Individual data points


x̄ - Sample Mean (average of the data points)


s - Sample Standard Deviation (typical distance between each data point and the sample mean)


The sample statistics are represented by Latin letters.

Note: Sometimes the sample mean is represented by a capital X or M.
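To make x, x̄, and s concrete, here is a small Python sketch using the standard library. The four data points are invented for illustration. It also shows one detail the definitions above gloss over: the usual sample standard deviation divides by n - 1 rather than n.

```python
import statistics

# Hypothetical sample: four individual data points (x) drawn from a larger population.
sample = [72, 78, 81, 89]

x_bar = statistics.fmean(sample)  # sample mean (x̄)
s = statistics.stdev(sample)      # sample standard deviation (s), divides by n - 1

# For comparison only: dividing by n instead gives a slightly smaller number.
s_divide_by_n = statistics.pstdev(sample)

print(f"x_bar = {x_bar}, s = {s:.2f}, dividing by n = {s_divide_by_n:.2f}")
```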



 

The Key to Statistics: Understanding the Differences Between Averages.

In statistics, we want to understand the differences between the averages of samples and their populations (X̄ - μ). Specifically, we'd like to know whether or not those differences are large enough to be statistically significant (unlikely to happen by chance alone).


To decide whether a difference between the averages is likely to occur by chance, we must account for the variation among all of the data points (the standard deviation, σ, or typical difference).


If the difference between the averages is substantially larger than the standard deviation of the data points, then there is a significant difference.


For example, a difference of 20 points between the averages of a sample and its population may be significant if the standard deviation (the typical difference) is only 4 points. However, if the standard deviation is, say, 30 points, then a difference of 20 points would not seem significant at all.


This relationship is expressed in the equation for a z-score:


Z = (X̄ - μ) / σ


That is, the difference between the averages of the sample and its population (X̄ - μ), adjusted for (divided by) the standard deviation, σ.

The bigger the standard deviation, the smaller your test statistic (Z). The smaller your test statistic (Z), the less significant the difference will be. If you think about it, this makes sense because the more change that is happening (bigger σ), the less meaningful the difference.
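The 20-point example above can be sketched in a few lines of Python. The function name `z_statistic`, the means of 120 and 100, and the optional `n` argument are all assumptions added for illustration. With the default `n=1` the function is exactly the simplified formula described in the text. For a sample of n data points, a one-sample z test would normally divide by σ/√n instead.

```python
import math

def z_statistic(sample_mean, population_mean, sigma, n=1):
    """Difference between the means divided by sigma / sqrt(n).

    With n=1 (the default) this is the simplified formula from the text.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (sample_mean - population_mean) / (sigma / math.sqrt(n))

# Same 20-point difference, two different standard deviations (hypothetical means).
z_small_spread = z_statistic(120, 100, sigma=4)   # 20 / 4
z_large_spread = z_statistic(120, 100, sigma=30)  # 20 / 30

print(z_small_spread)            # 5.0
print(round(z_large_spread, 2))  # 0.67
```

The same difference gives a large Z when the data vary little and a small Z when they vary a lot.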




Also, the larger your Z statistic (in either direction), the further it sits in the tails of the distribution, where the rejection regions are (often shaded red in diagrams). A statistic that lands in a rejection region is considered significant.
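One way to picture the rejection regions is as a cutoff on each tail of the standard normal curve. The sketch below assumes a two-tailed test and a significance level of 0.05, neither of which is specified in the text. The Z values 5.0 and 0.67 correspond to the 20 / 4 and 20 / 30 cases from the earlier example.

```python
from statistics import NormalDist

def in_rejection_region(z, alpha=0.05):
    """Return True if z falls in either tail of a two-tailed test at level alpha."""
    # Cutoff that leaves alpha / 2 in each tail (about 1.96 when alpha is 0.05).
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical

print(in_rejection_region(5.0))   # True
print(in_rejection_region(0.67))  # False
```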



Significance Tests / Hypothesis Testing


To determine whether the difference between the sample mean and the population mean is significant, we perform significance tests, also known as hypothesis tests (the terms are used interchangeably). With significance tests, we first decide how confident we want to be about our hypothesis. This is our confidence level (for example, 95%). The significance level is its complement (5% in that case). Then we use the test to decide whether or not we can actually be that confident.


One of the most confusing parts about statistics is the fact that we actually test the opposite of our hypothesis (called the null hypothesis). That's right - in order to be confident in our hypothesis, we must show that results like ours would be incredibly (significantly) unlikely if our hypothesis were not true.


If the probability of seeing results like ours when our hypothesis is not true is small enough (below our significance level), we can conclude that the data support our hypothesis.
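That probability is usually called a p-value. A minimal sketch of the decision, assuming a two-tailed z test and a significance level of 0.05 (both assumptions, not stated in the text), might look like this. The Z values are the 20 / 4 and 20 / 30 cases from the earlier example.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """Probability of a Z at least this far from 0 if there is no real difference."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05  # assumed significance level

for z in (5.0, 0.67):
    p = two_tailed_p_value(z)
    decision = "significant" if p < alpha else "not significant"
    print(f"Z = {z}: p = {p:.4f} -> {decision}")
```

A small p-value does not prove the hypothesis. It only says that results like these would be rare if there were no real difference.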


Types of Significance Tests


Z Tests


One Sample Z Test

What it is

Formula

How to Interpret

When to use


Two Sample Z Test

What it is

Formula

How to Interpret

When to use


One Proportion Z Test

What it is

Formula

How to Interpret

When to use


Two Proportion Z Test

What it is

Formula

How to Interpret

When to use


T Tests


One-Sample T Test

What it is

Formula

How to Interpret

When to use


Two-Sample T Test

What it is

Formula

How to Interpret

When to use


One Proportion T Test

What it is

Formula

How to Interpret

When to use


Two Proportion T Test

What it is

Formula

How to Interpret

When to use


Paired T Test (Repeated Measures)

What it is

Formula

How to Interpret

When to use


F Tests/ANOVA

What it is

Formula

How to Interpret

When to use


Pearson's Correlation Coefficient

What it is

Formula

How to Interpret

When to use


Simple Regression & Multiple Regression

What it is

Formula

How to Interpret

When to use

