How to Calculate Correlation Between Two Interval-Level Variables
Correlation measures the statistical relationship between two interval-level variables. This guide explains how to calculate and interpret different types of correlation coefficients, with a worked example.
What is Correlation?
Correlation is a statistical measure of the relationship between two variables. When variables are correlated, changes in one are associated with changes in the other. Correlation does not imply causation: two variables can move together without one causing the other.
Correlation coefficients range from -1 to 1. A value of 1 indicates a perfect positive correlation, -1 indicates a perfect negative correlation, and 0 indicates no linear (Pearson) or monotonic (Spearman) association. A coefficient of 0 does not rule out other kinds of relationship, such as a U-shaped one.
Types of Correlation
There are several types of correlation coefficients, each suitable for different types of data:
- Pearson Correlation Coefficient - Measures linear correlation between two continuous variables
- Spearman's Rank Correlation - Measures monotonic relationships between two variables
- Kendall's Tau - Measures ordinal association between two variables
This guide focuses on Pearson and Spearman correlation, which are most commonly used for interval-level data.
Pearson Correlation Coefficient
The Pearson correlation coefficient (r) measures the linear relationship between two continuous variables. The formula is:
r = Σ[(X - X̄)(Y - Ȳ)] / √[Σ(X - X̄)²Σ(Y - Ȳ)²]
Where:
- X and Y are the paired observed values of the two variables
- X̄ and Ȳ are the means of X and Y
- Σ denotes summation over all paired observations
The Pearson coefficient is most appropriate when:
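- Both variables are approximately normally distributed (this matters mainly for significance tests and confidence intervals)
- The relationship between variables is linear
- There are no extreme outliers, since a single outlier can strongly inflate or deflate r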
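The Pearson formula above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `pearson_r` and the toy inputs are assumptions made for this example, not part of any particular package.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed directly from the deviation formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of the products of deviations from each mean
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the summed squared deviations
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return numerator / math.sqrt(ss_x * ss_y)

print(pearson_r([1, 2, 3], [2, 4, 6]))  # perfectly linear, increasing: 1.0
print(pearson_r([1, 2, 3], [6, 4, 2]))  # perfectly linear, decreasing: -1.0
```

The explicit check for a constant variable is a design choice for this sketch: without it, the denominator would be zero and the division would fail with a less informative error.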
Spearman's Rank Correlation
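Spearman's rank correlation (ρ) measures the monotonic relationship between two variables. It is computed from the ranks of the values rather than the values themselves. When there are no tied ranks, the formula is:
ρ = 1 - 6Σd² / [n(n² - 1)]
Where:
- d is the difference between the two ranks of each pair of observations
- n is the number of pairs
If ranks are tied, assign average ranks and apply the Pearson formula to the ranks instead.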
Spearman's correlation is appropriate when:
- The relationship between variables is not strictly linear
- Data is ordinal or not normally distributed
- There are outliers in the data
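As a rough sketch of the rank-difference formula, the Python below ranks each variable and then applies ρ = 1 - 6Σd² / [n(n² - 1)]. The helper names `ranks` and `spearman_rho` are hypothetical, and the code deliberately rejects tied values, because this form of the formula assumes every rank is distinct.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n. Assumes there are no ties."""
    if len(set(values)) != len(values):
        raise ValueError("tied values: the rank-difference formula does not apply")
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]

def spearman_rho(xs, ys):
    """Spearman's rho from squared rank differences (no-ties case only)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    n = len(xs)
    # d is the difference between the two ranks of each pair
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

# A curved but strictly increasing relationship is still perfectly monotonic
print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))  # 1.0
# A strictly decreasing relationship
print(spearman_rho([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]))    # -1.0
```

The first call illustrates why Spearman suits relationships that are not strictly linear: the points lie on a curve, yet the ranks agree exactly.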
How to Interpret Correlation Results
Interpreting a correlation coefficient means reading both its strength and its direction. Strength is judged from the absolute value of the coefficient; one common rule of thumb is:
- 0.00 to 0.19 - Very weak correlation
- 0.20 to 0.39 - Weak correlation
- 0.40 to 0.59 - Moderate correlation
- 0.60 to 0.79 - Strong correlation
- 0.80 to 1.00 - Very strong correlation
The sign of the coefficient indicates the direction:
- Positive coefficient (+) indicates a positive relationship
- Negative coefficient (-) indicates a negative relationship
Remember that correlation does not imply causation. A strong coefficient can also arise from a shared third factor or from chance, especially in small samples.
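The strength bands and sign rule above could be encoded as a small lookup, sketched here in Python. The function name `describe_correlation` is hypothetical, and treating each band as running up to (but not including) the next threshold is an assumption, since the listed ranges leave small gaps such as 0.19 to 0.20.

```python
def describe_correlation(r):
    """Label a coefficient using the rule-of-thumb bands from this guide."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if r == 0:
        return "no correlation"
    size = abs(r)  # strength depends on magnitude only
    if size < 0.20:
        strength = "very weak"
    elif size < 0.40:
        strength = "weak"
    elif size < 0.60:
        strength = "moderate"
    elif size < 0.80:
        strength = "strong"
    else:
        strength = "very strong"
    direction = "positive" if r > 0 else "negative"  # direction depends on sign only
    return f"{strength} {direction}"

print(describe_correlation(0.45))   # moderate positive
print(describe_correlation(-0.85))  # very strong negative
```

These labels are conventions rather than fixed rules, so the thresholds may need adjusting for a given field.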
Example Calculation
Let's calculate the Pearson correlation between hours studied (X) and exam scores (Y) for 5 students:
| Student | Hours Studied (X) | Exam Score (Y) |
|---|---|---|
| 1 | 2 | 85 |
| 2 | 4 | 90 |
| 3 | 3 | 88 |
| 4 | 5 | 92 |
| 5 | 1 | 80 |
Using the Pearson formula, we calculate:
r = Σ[(X - X̄)(Y - Ȳ)] / √[Σ(X - X̄)²Σ(Y - Ȳ)²]
Calculating the means: X̄ = 15 / 5 = 3, Ȳ = 435 / 5 = 87
Calculating the numerator: Σ[(X - X̄)(Y - Ȳ)] = 2 + 3 + 0 + 10 + 14 = 29
Calculating the denominator: √[Σ(X - X̄)²Σ(Y - Ȳ)²] = √(10 × 88) = √880 ≈ 29.66
Final result: r = 29 / 29.66 ≈ 0.98
This indicates a very strong positive correlation between hours studied and exam scores.
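To check the arithmetic, each intermediate quantity can be recomputed from the table with a short script. The Python below is a sketch using only the standard library; the variable names are chosen for this illustration.

```python
import math

# Data from the table: hours studied (X) and exam scores (Y)
hours = [2, 4, 3, 5, 1]
scores = [85, 90, 88, 92, 80]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

# Deviations of each observation from its mean
dev_x = [x - mean_x for x in hours]
dev_y = [y - mean_y for y in scores]

# Numerator: sum of the products of paired deviations
numerator = sum(dx * dy for dx, dy in zip(dev_x, dev_y))

# Denominator: square root of the product of the summed squared deviations
ss_x = sum(dx ** 2 for dx in dev_x)
ss_y = sum(dy ** 2 for dy in dev_y)
denominator = math.sqrt(ss_x * ss_y)

r = numerator / denominator

print(f"means: X = {mean_x}, Y = {mean_y}")
print(f"numerator = {numerator}")
print(f"sums of squares: X = {ss_x}, Y = {ss_y}")
print(f"denominator = {denominator:.2f}")
print(f"r = {r:.2f}")
```

Printing each step separately makes it easy to spot where a hand calculation diverges. Note that r can never exceed 1 in absolute value, so a numerator larger than the denominator signals an arithmetic slip.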