How to Calculate ICC with a Small N

Intraclass Correlation Coefficient (ICC) measures the consistency of measurements across different raters or time points. When dealing with small sample sizes (n), special considerations apply to ensure reliable results. This guide explains how to calculate ICC with small n, including formulas, interpretation, and practical examples.

What is ICC?

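The Intraclass Correlation Coefficient (ICC) is a statistical measure that quantifies the consistency of measurements across different raters, time points, or conditions. In theory it ranges from 0 to 1, although sample estimates can fall below 0 when within-subject variation exceeds between-subject variation. As a rough guide: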

0 indicates no consistency
0.5 indicates moderate consistency
0.75 or higher indicates good consistency

ICC is commonly used in reliability studies, medical research, and quality control assessments.

Why Small N Matters

When working with small sample sizes (typically n ≤ 20), several challenges arise:

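Reduced statistical power: Small samples may fail to show that reliability exceeds a target threshold, even when it does.
Increased variability: ICC estimates become less precise with fewer subjects, so confidence intervals are wide.
Misleading point estimates: A high or low ICC can arise by chance, so a single estimate is weak evidence on its own.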

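For small n, choose the ICC model to match the study design rather than the sample size (a two-way random effects model suits designs in which every subject is rated by the same raters, drawn from a larger pool), and report a confidence interval with every estimate. Bayesian approaches may also help stabilize the variance estimates.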
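The loss of precision can be seen in a small simulation. The sketch below is illustrative only: the true ICC of 0.75, the three raters, the absence of any rater bias, and the helper names `icc_2_1` and `simulate` are all assumptions made for the example, not values from a real study.

```python
import random
from statistics import fmean, stdev

def icc_2_1(ratings):
    """ICC(2,1) from a subjects x raters table (list of rows)."""
    n, k = len(ratings), len(ratings[0])
    grand = fmean(x for row in ratings for x in row)
    ss_rows = k * sum((fmean(row) - grand) ** 2 for row in ratings)
    ss_cols = n * sum((fmean(col) - grand) ** 2 for col in zip(*ratings))
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

def simulate(n_subjects, n_raters=3, true_icc=0.75, reps=500, seed=1):
    """Return ICC(2,1) estimates from `reps` simulated studies.

    Assumed data model: rating = subject level + independent noise,
    with total variance 1 and no systematic rater effect.
    """
    rng = random.Random(seed)
    sd_subject = true_icc ** 0.5        # between-subjects SD
    sd_error = (1 - true_icc) ** 0.5    # residual SD
    estimates = []
    for _ in range(reps):
        data = []
        for _ in range(n_subjects):
            level = rng.gauss(0, sd_subject)
            data.append([level + rng.gauss(0, sd_error) for _ in range(n_raters)])
        estimates.append(icc_2_1(data))
    return estimates

# Compare the spread of the estimates across sample sizes
for n in (5, 20, 50):
    est = simulate(n)
    print(f"n={n:2d}  mean={fmean(est):.2f}  sd={stdev(est):.2f}")
```

The standard deviation of the estimates shrinks as n grows, which is the practical meaning of "less precise" for small samples.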

Calculating ICC

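In a two-way random effects model, the single-rater, absolute-agreement coefficient ICC(2,1) is the share of total variance that comes from differences between subjects:

ICC(2,1) = σ_b² / (σ_b² + σ_w²)

Where:

σ_b² = between-subjects variance
σ_w² = within-subjects variance, which in the two-way model is the sum of the rater variance (σ_r²) and the residual error variance (σ_e²)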
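In practice the variance components are not observed directly. They are usually estimated from the mean squares of a two-way ANOVA without replication. As a sketch, with n subjects and k raters, and with MS_R, MS_C, and MS_E denoting the mean squares for subjects (rows), raters (columns), and residual error (this notation is an assumption of the example, not the article's), the estimators are:

```latex
\hat{\sigma}_b^2 = \frac{MS_R - MS_E}{k},
\qquad
\hat{\sigma}_w^2 = \frac{MS_C - MS_E}{n} + MS_E

\mathrm{ICC}(2,1)
  = \frac{\hat{\sigma}_b^2}{\hat{\sigma}_b^2 + \hat{\sigma}_w^2}
  = \frac{MS_R - MS_E}{MS_R + (k-1)\,MS_E + \frac{k}{n}\,(MS_C - MS_E)}
```

The second line follows from the first by multiplying the numerator and denominator by k.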

Step-by-Step Calculation

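Collect your data in a two-way table (subjects in rows, raters or conditions in columns)
Calculate the mean for each subject, the mean for each rater, and the grand mean
Run a two-way ANOVA without replication to obtain the mean squares for subjects, raters, and residual error
Estimate the between-subjects variance (σ_b²) and the within-subjects variance (σ_w²) from the mean squares
Plug the values into the ICC formula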

Example Calculation

Consider a study with 5 subjects rated by 3 raters:

Subject	Rater 1	Rater 2	Rater 3	Mean
1	4.2	4.5	4.0	4.23
2	3.8	4.0	3.9	3.90
3	5.1	5.3	5.0	5.13
4	2.9	3.0	2.8	2.90
5	4.7	4.8	4.6	4.70

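Applying the steps above to this table gives σ_b² ≈ 0.722 and σ_w² ≈ 0.023, so ICC(2,1) ≈ 0.722 / (0.722 + 0.023) ≈ 0.97, indicating excellent reliability. With only 5 subjects, however, this estimate is imprecise and should be read alongside a confidence interval.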
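The calculation for this table can be reproduced with a few lines of standard-library Python. This is a minimal sketch using the usual ANOVA mean-squares estimator. The function name `icc_2_1` and the variable names are hypothetical, and a statistics package would normally be used in practice.

```python
from statistics import fmean

def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows: one row per subject, one column per rater.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    grand = fmean(x for row in ratings for x in row)
    row_means = [fmean(row) for row in ratings]
    col_means = [fmean(col) for col in zip(*ratings)]

    # Sums of squares for a two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    # Mean squares: subjects, raters, residual error
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# The 5 subjects x 3 raters table from the example
table = [
    [4.2, 4.5, 4.0],
    [3.8, 4.0, 3.9],
    [5.1, 5.3, 5.0],
    [2.9, 3.0, 2.8],
    [4.7, 4.8, 4.6],
]
icc = icc_2_1(table)
print(f"ICC(2,1) = {icc:.3f}")
```

Because ICC(2,1) measures absolute agreement, a rater who scores consistently higher or lower than the others reduces the coefficient even when the ranking of subjects is identical.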

Interpreting Results

Interpret ICC results with these guidelines:

ICC < 0.5: Poor reliability - Consider improving measurement methods
ICC 0.5-0.75: Moderate reliability - Acceptable for screening purposes
ICC 0.75-0.9: Good reliability - Suitable for clinical or research use
ICC > 0.9: Excellent reliability - High confidence in measurements

For small n, interpret ICC values cautiously. Report a confidence interval with every estimate; if the interval spans several of the categories above, the data cannot distinguish between them.
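One simple way to obtain an interval is a percentile bootstrap that resamples subjects with replacement. The sketch below assumes this technique for illustration only: analytic intervals based on the F distribution are the more common choice, and with very few subjects a bootstrap interval is itself rough. The names `icc_2_1` and `bootstrap_ci` are hypothetical, and the data are the five-subject table from the example.

```python
import random
from statistics import fmean

def icc_2_1(ratings):
    """ICC(2,1) from a subjects x raters table (list of rows)."""
    n, k = len(ratings), len(ratings[0])
    grand = fmean(x for row in ratings for x in row)
    ss_rows = k * sum((fmean(row) - grand) ** 2 for row in ratings)
    ss_cols = n * sum((fmean(col) - grand) ** 2 for col in zip(*ratings))
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

def bootstrap_ci(ratings, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for ICC(2,1), resampling whole subjects."""
    rng = random.Random(seed)
    n = len(ratings)
    estimates = sorted(
        icc_2_1([ratings[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

table = [
    [4.2, 4.5, 4.0],
    [3.8, 4.0, 3.9],
    [5.1, 5.3, 5.0],
    [2.9, 3.0, 2.8],
    [4.7, 4.8, 4.6],
]
lower, upper = bootstrap_ci(table)
print(f"95% bootstrap interval: {lower:.2f} to {upper:.2f}")
```

Whole rows are resampled so that each subject's ratings stay together, which preserves the within-subject structure the ICC depends on.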

FAQ

What is the minimum sample size for ICC calculation?

For reliable ICC estimates, aim for at least 10 subjects and 2-3 raters/conditions. With very small n (n ≤ 5), ICC estimates become unreliable and should be interpreted with caution.

How does ICC differ from Cronbach's alpha?

ICC measures consistency across raters or time points, while Cronbach's alpha measures internal consistency across the items of a single scale or test. ICC is preferred when the question is how well multiple raters agree.

Can I use ICC with ordinal data?

Yes, if the ordinal scores can reasonably be treated as roughly equally spaced. Otherwise, consider a measure designed for ordinal data, such as polychoric correlation coefficients.