For The Following Data Set Calculate The Pearson Correlation

Pearson correlation is a measure of the linear relationship between two variables. This guide explains how to calculate Pearson correlation for your data set, interpret the results, and apply them in statistical analysis.

What is Pearson Correlation?

Pearson correlation (often referred to as Pearson's r) measures the linear relationship between two continuous variables. It ranges from -1 to +1:

+1 indicates a perfect positive linear relationship
0 indicates no linear relationship
-1 indicates a perfect negative linear relationship

The Pearson correlation coefficient is calculated using the following formula:

r = Σ[(xᵢ - x̄)(yᵢ - ȳ)] / √[Σ(xᵢ - x̄)²Σ(yᵢ - ȳ)²]

Where:

xᵢ and yᵢ are individual data points
x̄ and ȳ are the means of the x and y variables
Σ denotes summation over all n paired observations
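The formula translates almost term for term into code. The following is a minimal sketch using only the Python standard library; the function name pearson_r and the sample numbers are illustrative assumptions, not part of the guide's data.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed directly from the definition."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of the products of paired deviations from the means
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the summed squared deviations
    den = math.sqrt(
        sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys)
    )
    if den == 0:
        raise ValueError("r is undefined when either variable is constant")
    return num / den

# Hypothetical data in which y falls by 2 for every unit increase in x
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0
```

The explicit check for a zero denominator matters in practice: if either variable never changes, the coefficient is undefined rather than zero.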

Pearson correlation is widely used in fields such as economics, psychology, and biology to identify relationships between variables.

How to Calculate Pearson Correlation

Step 1: Organize Your Data

Create a table with two columns of paired data points. Each row represents one observation.

X Variable	Y Variable
2	4
4	6
6	8
8	10

Step 2: Calculate the Means

Find the mean (average) of each variable.

x̄ = (2 + 4 + 6 + 8) / 4 = 5
ȳ = (4 + 6 + 8 + 10) / 4 = 7
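As a quick check, the same means can be computed with the standard library. This sketch assumes the Step 1 table is stored in two lists named x and y (names chosen here for illustration).

```python
from statistics import fmean

x = [2, 4, 6, 8]
y = [4, 6, 8, 10]

mean_x = fmean(x)  # arithmetic mean of the X column
mean_y = fmean(y)  # arithmetic mean of the Y column
print(mean_x, mean_y)  # 5.0 7.0
```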

Step 3: Calculate Covariance

Compute the covariance between the variables.

Cov(x,y) = Σ[(xᵢ - x̄)(yᵢ - ȳ)] / n

For our example:

Cov(x,y) = [(2-5)(4-7) + (4-5)(6-7) + (6-5)(8-7) + (8-5)(10-7)] / 4
= [(-3)(-3) + (-1)(-1) + (1)(1) + (3)(3)] / 4
= [9 + 1 + 1 + 9] / 4
= 20/4 = 5
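The covariance arithmetic can be reproduced in a few lines. This is a sketch of the population form used here (dividing by n), with illustrative variable names.

```python
x = [2, 4, 6, 8]
y = [4, 6, 8, 10]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Product of the paired deviations for each row of the table
products = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]
cov_xy = sum(products) / n  # population covariance: divide by n

print(products)  # [9.0, 1.0, 1.0, 9.0]
print(cov_xy)    # 5.0
```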

Step 4: Calculate Standard Deviations

Compute the standard deviation for each variable.

σ_x = √[Σ(xᵢ - x̄)² / n]
σ_y = √[Σ(yᵢ - ȳ)² / n]

For our example:

σ_x = √[(9 + 1 + 1 + 9) / 4] = √(20/4) = √5 ≈ 2.236
σ_y = √[(9 + 1 + 1 + 9) / 4] = √(20/4) = √5 ≈ 2.236
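Python's statistics module offers both flavors of standard deviation. Because the formula above divides by n, the matching function is pstdev (population); stdev divides by n - 1 instead. A small sketch with illustrative variable names:

```python
from statistics import pstdev

x = [2, 4, 6, 8]
y = [4, 6, 8, 10]

sigma_x = pstdev(x)  # population standard deviation (divides by n)
sigma_y = pstdev(y)

print(round(sigma_x, 3), round(sigma_y, 3))  # 2.236 2.236
```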

Step 5: Calculate Pearson Correlation

Divide the covariance by the product of the standard deviations.

r = Cov(x,y) / (σ_x * σ_y) = 5 / (√5 * √5) = 5 / 5 = 1

Every point in this example lies exactly on the line y = x + 2, so the Pearson correlation is exactly 1.
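One detail worth checking is whether dividing by n or by n - 1 changes the answer. It does not, because the same divisor appears in the covariance and in both standard deviations and cancels out. The sketch below illustrates this; the function name and the ddof parameter (0 for n, 1 for n - 1) are assumptions made for this example.

```python
import math

def pearson_from_cov(x, y, ddof=0):
    """r = Cov(x, y) / (sd_x * sd_y), dividing by (n - ddof) throughout."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    denom = n - ddof
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / denom
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / denom)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / denom)
    return cov / (sd_x * sd_y)

x = [2, 4, 6, 8]
y = [4, 6, 8, 10]

r_population = pearson_from_cov(x, y, ddof=0)  # divide by n
r_sample = pearson_from_cov(x, y, ddof=1)      # divide by n - 1

# Both agree with r = 1 up to floating-point rounding
print(math.isclose(r_population, 1.0), math.isclose(r_sample, 1.0))
```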

Interpreting Pearson Correlation Results

The Pearson correlation coefficient (r) provides several important insights:

Direction: The sign (+ or -) indicates the direction of the relationship
Strength: The absolute value (0 to 1) indicates the strength of the relationship
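Significance: A correlation is statistically significant if a value that large is unlikely to arise by chance alone; this depends on the sample size as well as on r, so it requires a separate hypothesis test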

Common rule-of-thumb interpretations (exact cutoffs vary by field):

0.7 to 1.0: Strong positive relationship
0.3 to 0.7: Moderate positive relationship
0.0 to 0.3: Weak or no positive relationship
0.0 to -0.3: Weak or no negative relationship
-0.3 to -0.7: Moderate negative relationship
-0.7 to -1.0: Strong negative relationship
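Reading direction from the sign and strength from the absolute value can be expressed as a small helper. This is only a sketch: describe_r is a hypothetical function, the 0.3 and 0.7 cutoffs are the rule-of-thumb boundaries used in this guide, and treating a boundary value as belonging to the stronger band is an arbitrary choice made here.

```python
def describe_r(r):
    """Turn a correlation coefficient into a rough verbal label."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # strength comes from the absolute value
    if size >= 0.7:
        strength = "strong"
    elif size >= 0.3:
        strength = "moderate"
    else:
        strength = "weak or no"
    # Direction comes from the sign; exactly 0 has no direction
    direction = "positive " if r > 0 else "negative " if r < 0 else ""
    return f"{strength} {direction}relationship"

print(describe_r(0.85))  # strong positive relationship
print(describe_r(-0.5))  # moderate negative relationship
print(describe_r(0.1))   # weak or no positive relationship
```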

Correlation does not imply causation. A strong Pearson correlation between two variables does not prove that one causes the other.

Worked Example

Let's calculate Pearson correlation for the following data set:

Hours Studied (X)	Exam Score (Y)
2	50
4	60
6	70
8	80

Step 1: Calculate Means

x̄ = (2 + 4 + 6 + 8) / 4 = 5
ȳ = (50 + 60 + 70 + 80) / 4 = 65

Step 2: Calculate Covariance

Cov(x,y) = [(2-5)(50-65) + (4-5)(60-65) + (6-5)(70-65) + (8-5)(80-65)] / 4
= [(-3)(-15) + (-1)(-5) + (1)(5) + (3)(15)] / 4
= [45 + 5 + 5 + 45] / 4
= 100/4 = 25

Step 3: Calculate Standard Deviations

σ_x = √[(9 + 1 + 1 + 9) / 4] = √(20/4) = √5 ≈ 2.236
σ_y = √[(225 + 25 + 25 + 225) / 4] = √(500/4) = √125 ≈ 11.18

Step 4: Calculate Pearson Correlation

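r = 25 / (√5 * √125) = 25 / √625 = 25 / 25 = 1

This indicates a perfect positive linear relationship in this data set: every point lies on the line y = 5x + 40, so each additional hour studied corresponds to exactly 5 more points. Real data rarely fits this cleanly, and values below 1 are the norm.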
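The same steps can be checked in Python without rounding the intermediate values. The following is a minimal sketch with illustrative variable names; it takes a single square root of the product of the two variances so that no rounded standard deviations enter the result.

```python
import math

hours = [2, 4, 6, 8]
scores = [50, 60, 70, 80]
n = len(hours)

mean_h = sum(hours) / n   # 5.0
mean_s = sum(scores) / n  # 65.0

cov = sum((h - mean_h) * (s - mean_s) for h, s in zip(hours, scores)) / n  # 25.0
var_h = sum((h - mean_h) ** 2 for h in hours) / n   # 5.0
var_s = sum((s - mean_s) ** 2 for s in scores) / n  # 125.0

# sigma_h * sigma_s equals sqrt(var_h * var_s) = sqrt(625) = 25
r = cov / math.sqrt(var_h * var_s)
print(r)  # 1.0
```

Multiplying the rounded values 2.236 and 11.18 gives roughly 25.0 as well, so any result below 1 for this data set comes from rounding, not from the data.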

FAQ

What is the difference between Pearson and Spearman correlation?

Pearson correlation measures linear relationships between continuous variables. Spearman correlation measures monotonic relationships (whether linear or not) and is computed by applying the Pearson formula to the ranks of the data instead of the raw values.
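The difference shows up on data that always increases but not in a straight line. The sketch below uses hypothetical data (y = x cubed) and a simplified ranking helper that assumes no tied values; the function names are illustrative.

```python
import math

def pearson_r(xs, ys):
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(
        sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys)
    )
    return num / den

def ranks(values):
    """Rank from 1 (smallest) upward. Assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    # Spearman's rho: Pearson's r applied to the ranks
    return pearson_r(ranks(xs), ranks(ys))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but curved

print(round(pearson_r(x, y), 2))  # 0.94
print(spearman_rho(x, y))         # 1.0
```

Spearman reports a perfect monotonic relationship, while Pearson falls short of 1 because the points do not lie on a straight line.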

When should I use Pearson correlation?

Use Pearson correlation when both variables are continuous and you suspect a linear relationship; a scatter plot is a quick way to check this before computing r.

How do I know if my correlation is statistically significant?

You need to perform a hypothesis test to determine if your correlation is statistically significant. Common methods include using a t-test or looking up critical values in correlation tables.
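For the t-test route, the usual test statistic is t = r * √[(n - 2) / (1 - r²)] with n - 2 degrees of freedom, which is then compared against a critical value. The sketch below uses hypothetical inputs (r = 0.7 from n = 10 pairs) and the two-tailed 5% critical value for 8 degrees of freedom, 2.306, as read from a standard t table; the function name is illustrative.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing whether the true correlation is zero."""
    if n < 3:
        raise ValueError("need at least 3 paired observations")
    if abs(r) >= 1:
        raise ValueError("t is unbounded when |r| = 1")
    return r * math.sqrt((n - 2) / (1 - r ** 2))

r, n = 0.7, 10  # hypothetical sample correlation and sample size
t = correlation_t_statistic(r, n)

T_CRITICAL = 2.306  # two-tailed, alpha = 0.05, df = n - 2 = 8 (from a t table)
significant = abs(t) > T_CRITICAL

print(round(t, 2), significant)  # 2.77 True
```

The same r from a smaller sample would give a smaller t, which is why a coefficient cannot be judged significant without knowing n.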

What if my data doesn't meet the assumptions of Pearson correlation?

If your data is not normally distributed or has outliers, consider using Spearman correlation or transforming your data before analysis.