R Function to Calculate Relative Risk with Confidence Interval

Relative risk is a fundamental measure in epidemiology and medical research used to quantify the strength of association between an exposure and an outcome. Calculating relative risk with confidence intervals in R provides a statistically rigorous way to assess the significance of this association.

What is Relative Risk?

Relative risk (RR) is defined as the ratio of the probability of an event occurring in an exposed group to the probability of the event occurring in an unexposed group. It measures how much more (or less) likely an event is to occur in the exposed group compared to the unexposed group.

Formula: RR = (a/n) / (c/m)

Where:

a = number of cases in exposed group
n = total number in exposed group
c = number of cases in unexposed group
m = total number in unexposed group

Relative risk values can be interpreted as follows:

RR = 1: No difference in risk between groups
RR > 1: Higher risk in exposed group
RR < 1: Lower risk in exposed group
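The formula above can be sketched as a small base R function. The name relative_risk is a hypothetical helper chosen for illustration, not part of any package, and the counts in the example call are made up.

```r
# Relative risk from the four counts defined above.
# `relative_risk` is a hypothetical helper name used only for illustration.
relative_risk <- function(a, n, c, m) {
  # Basic sanity checks: non-empty groups, counts within group totals,
  # and at least one case in the unexposed group (otherwise RR is undefined).
  stopifnot(n > 0, m > 0, a >= 0, c > 0, a <= n, c <= m)
  risk_exposed   <- a / n   # probability of the event among the exposed
  risk_unexposed <- c / m   # probability of the event among the unexposed
  risk_exposed / risk_unexposed
}

# Made-up counts: 20 of 100 exposed and 10 of 100 unexposed have the event.
relative_risk(a = 20, n = 100, c = 10, m = 100)
```

With these illustrative counts the risks are 0.20 and 0.10, so the exposed group has twice the risk of the unexposed group.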

Confidence Interval for Relative Risk

A confidence interval provides a range of values that is likely to contain the true relative risk. It helps assess the precision of the estimate and the uncertainty around it. Common methods for calculating confidence intervals for relative risk include:

Wald interval
Score interval
Exact interval
Miettinen interval

The most commonly used method is the Wald interval, which is straightforward to calculate and widely implemented in statistical software.

The Wald confidence interval for relative risk is calculated on the log scale and then back-transformed:

CI = RR × exp(±1.96 × √(1/a - 1/n + 1/c - 1/m))

Where:

RR = relative risk
a, n, c, m = same as in the relative risk formula
1.96 = z-value for 95% confidence interval
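As a minimal sketch, the interval formula can be written in base R as follows. The function name rr_wald_ci and the conf.level argument are assumptions made for this example; qnorm() supplies the z-value, so levels other than 95% also work.

```r
# Log-scale Wald interval for a relative risk.
# `rr_wald_ci` is a hypothetical helper, not a function from any package.
rr_wald_ci <- function(a, n, c, m, conf.level = 0.95) {
  rr <- (a / n) / (c / m)
  # Standard error of log(RR), matching the term under the square root above
  se_log_rr <- sqrt(1 / a - 1 / n + 1 / c - 1 / m)
  # Two-sided z-value; approximately 1.96 when conf.level = 0.95
  z <- qnorm(1 - (1 - conf.level) / 2)
  list(estimate = rr,
       lower = rr * exp(-z * se_log_rr),
       upper = rr * exp(z * se_log_rr))
}

# Made-up counts for illustration
rr_wald_ci(a = 20, n = 100, c = 10, m = 100)
```

Note that the formula is undefined when a or c is zero, which is one reason other interval methods are sometimes used for sparse tables.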

R Function to Calculate Relative Risk

In R, you can calculate relative risk with confidence intervals using the epitools package, which provides functions specifically designed for epidemiological calculations.

Example R code:

# Install package if needed
install.packages("epitools")

# Load the package
library(epitools)

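# Create a 2x2 table in the layout riskratio() expects by default:
# rows = exposure (reference group first), columns = outcome (event last)
tab <- matrix(c(70, 30, 10, 50), nrow = 2, byrow = TRUE)
dimnames(tab) <- list(Exposure = c("Unexposed", "Exposed"),
                      Outcome = c("Control", "Case"))

# Calculate relative risk with 95% confidence interval
result <- riskratio(tab, method = "wald")
print(result)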

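The output is a list: result$data holds the table with margins, result$measure holds the relative risk estimate with its lower and upper confidence limits, and result$p.value holds p-values for the association.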

For more complex analyses, you may need to use other packages like epiR or implement custom functions, especially when dealing with exact methods or stratified data.

Example Calculation

Consider a hypothetical cohort study examining the relationship between smoking and lung cancer:

Group	Cases	Controls	Total
Smokers	50	10	60
Non-smokers	30	70	100

Using the formulas:

RR = (50/60) / (30/100) = 0.833 / 0.3 = 2.778

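SE of log(RR) = √(1/50 - 1/60 + 1/30 - 1/100) = √0.02667 ≈ 0.163

CI = 2.778 × exp(±1.96 × 0.163) ≈ 2.778 × (0.726, 1.377) ≈ (2.02, 3.83)

This indicates that smokers have approximately 2.78 times the risk of developing lung cancer compared to non-smokers, with a 95% confidence interval of (2.02, 3.83). Because the interval excludes 1, the association is statistically significant at the 5% level.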
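The arithmetic can be checked in base R directly from the table. This is a sketch that reads the counts from a matrix rather than typing each fraction by hand; the object names (counts, risks, se_log_rr, ci) are chosen for this example only.

```r
# Counts from the smoking table above: rows are groups, columns are outcomes.
counts <- matrix(c(50, 10,
                   30, 70),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("Smokers", "Non-smokers"),
                                 c("Cases", "Controls")))

# Risk in each group = cases / group total
risks <- counts[, "Cases"] / rowSums(counts)        # about 0.833 and 0.300
rr <- risks[["Smokers"]] / risks[["Non-smokers"]]   # about 2.778

# Standard error of log(RR): sqrt(1/a - 1/n + 1/c - 1/m)
se_log_rr <- sqrt(sum(1 / counts[, "Cases"]) - sum(1 / rowSums(counts)))  # about 0.163

# 95% Wald limits, back-transformed from the log scale
ci <- rr * exp(c(-1, 1) * 1.96 * se_log_rr)         # about 2.02 and 3.83
round(c(RR = rr, lower = ci[1], upper = ci[2]), 2)
```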

Interpreting Results

When interpreting relative risk with confidence intervals, consider the following:

Magnitude of RR: The further the RR is from 1, the stronger the association between exposure and outcome.
Width of CI: A narrow CI indicates a more precise estimate; a wide CI usually reflects a small sample or few events.
Inclusion of 1: If the 95% CI includes 1, the difference in risk is not statistically significant at the 5% level.
Direction: An RR above 1 suggests a harmful association, while an RR below 1 suggests a protective one.

Always consider the study design, potential confounders, and effect modification when interpreting results.
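The checklist above can be turned into a small helper that reports the direction of the estimate and whether the interval excludes 1. This is only a sketch: describe_rr is a hypothetical function, the inputs in the example call are made up, and the upper/lower ratio is just one rough way to gauge interval width on the ratio scale.

```r
# Hypothetical helper: summarise direction, significance, and interval width.
describe_rr <- function(rr, lower, upper) {
  stopifnot(lower <= rr, rr <= upper)
  direction <- if (rr > 1) {
    "higher risk in the exposed group"
  } else if (rr < 1) {
    "lower risk in the exposed group"
  } else {
    "no difference in risk"
  }
  # The interval excludes 1 when it lies entirely above or entirely below 1
  excludes_one <- lower > 1 || upper < 1
  list(direction = direction,
       excludes_one = excludes_one,
       ci_ratio = upper / lower)  # smaller ratio = narrower interval
}

# Made-up estimate and limits for illustration
describe_rr(rr = 0.5, lower = 0.3, upper = 0.9)
```

A summary like this does not replace judgment about study design or confounding; it only automates the mechanical checks.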

FAQ

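What is the difference between relative risk and odds ratio?: Relative risk is a ratio of probabilities, while the odds ratio is a ratio of odds. RR is easier to interpret and can be estimated directly in cohort studies and trials; OR is what case-control studies and logistic regression estimate. When the outcome is rare the two are close, but when the outcome is common the OR lies further from 1 than the RR.
How do I choose the right confidence interval method?: The Wald interval on the log scale is a reasonable default for large samples because it is simple, but score or exact methods tend to be more appropriate for small samples, rare outcomes, or tables with zero cells.
Can I calculate relative risk with only case-control data?: Not directly. The numbers of cases and controls are fixed by the design, so the risk in each exposure group cannot be estimated from the data alone. The odds ratio can be estimated instead, and it approximates the relative risk when the outcome is rare; otherwise external information on the baseline risk in the population is needed.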
What does a relative risk of 0.5 mean?: It means the exposed group has half the risk of the unexposed group, indicating a protective effect.
How do I handle missing data in my analysis?: Consider complete case analysis or multiple imputation, depending on the amount and pattern of missing data.
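On the first question above, the relationship between the two measures can be seen numerically. The sketch below uses made-up risks and a hypothetical helper, rr_and_or, to compute both measures for a rare and a common outcome.

```r
# Compare risk ratio and odds ratio for given risks (hypothetical helper).
rr_and_or <- function(p_exposed, p_unexposed) {
  odds <- function(p) p / (1 - p)   # convert a probability to odds
  c(RR = p_exposed / p_unexposed,
    OR = odds(p_exposed) / odds(p_unexposed))
}

rr_and_or(0.02, 0.01)   # rare outcome: RR = 2, OR about 2.02
rr_and_or(0.60, 0.30)   # common outcome: RR = 2, OR = 3.5
```

In both calls the risk doubles, but the odds ratio only stays near 2 when the outcome is rare.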