Calculate Degrees of Freedom for Likelihood Ratio

Determining the degrees of freedom (df) for a likelihood ratio test is essential for statistical hypothesis testing. This guide explains how to calculate df for likelihood ratio tests, including the formula, assumptions, and practical applications.

What is Degrees of Freedom?

Degrees of freedom (df) refer to the number of independent pieces of information that can vary in a statistical model. In the context of likelihood ratio tests, df represents the difference in the number of parameters between the two nested models being compared.

Understanding df is crucial because it determines the shape of the chi-square distribution used to evaluate the likelihood ratio test statistic. A higher df shifts the distribution to the right and spreads it out (its mean is df and its variance is 2 × df), so a larger test statistic is needed to reach significance at a given level.

Likelihood Ratio Test

The likelihood ratio test compares two nested statistical models to determine if the more complex model provides a significantly better fit to the data than the simpler model. The test statistic is calculated as:

Test statistic = -2 × (ln(L₀) - ln(L₁))

Where:

L₀ = Likelihood of the restricted (simpler) model
L₁ = Likelihood of the unrestricted (more complex) model

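Under the null hypothesis that the restricted model is adequate, and for sufficiently large samples, the test statistic approximately follows a chi-square distribution with degrees of freedom equal to the difference in the number of free parameters between the two models.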
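As a minimal sketch of the formula above, the statistic can be computed directly from the two maximized log-likelihoods. The function name lr_statistic and the log-likelihood values are hypothetical, chosen only for illustration.

```python
def lr_statistic(loglik_restricted, loglik_unrestricted):
    """Return -2 * (ln L0 - ln L1) from the two maximized log-likelihoods."""
    stat = -2.0 * (loglik_restricted - loglik_unrestricted)
    # For nested models fitted by maximum likelihood the statistic cannot be
    # negative, apart from tiny numerical error. A clearly negative value
    # suggests the models are not nested or a fit did not converge.
    if stat < -1e-8:
        raise ValueError("negative statistic: check nesting and convergence")
    return max(stat, 0.0)


# Hypothetical log-likelihoods (ln L0 and ln L1), not taken from real data.
ll0 = -120.5
ll1 = -115.2
print(lr_statistic(ll0, ll1))  # about 10.6
```

Working with log-likelihoods rather than raw likelihoods avoids numerical underflow, since likelihoods of realistic data sets are often extremely small numbers.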

Calculating Degrees of Freedom

The degrees of freedom for a likelihood ratio test are calculated as the difference between the number of parameters in the unrestricted model and the number of parameters in the restricted model:

df = k₁ - k₀

Where:

k₁ = Number of parameters in the unrestricted model
k₀ = Number of parameters in the restricted model

For example, if you're comparing a model with 5 parameters to a simpler model with 2 parameters, the degrees of freedom would be 3.

Note: The restricted model must be a special case of the unrestricted model. If this isn't true, the likelihood ratio test is not valid.
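The df rule can be written as a small helper that also rejects parameter counts that cannot come from a valid comparison. This is only a sketch: the name lrt_df is hypothetical, and counting parameters cannot prove that two models are nested, so that still has to be checked by hand.

```python
def lrt_df(k_unrestricted, k_restricted):
    """Return df = k1 - k0 for a likelihood ratio test of nested models."""
    if k_unrestricted <= k_restricted:
        # The unrestricted model must have strictly more free parameters.
        raise ValueError(
            "unrestricted model must have more parameters than the restricted one"
        )
    return k_unrestricted - k_restricted


print(lrt_df(5, 2))  # 3
```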

Example Calculation

Let's consider a scenario where you're comparing two nested logistic regression models:

Model 0 (restricted): Contains 3 parameters (intercept and two predictors)
Model 1 (unrestricted): Contains 5 parameters (intercept and four predictors)

The degrees of freedom would be calculated as:

df = 5 - 3 = 2

This means the likelihood ratio test statistic would follow a chi-square distribution with 2 degrees of freedom.
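To carry this example through to a p-value, the sketch below assumes two hypothetical log-likelihoods for the fitted models (they are not part of the example above). With exactly 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(-x / 2), so no statistics library is needed. The helper chi2_sf_df2 is an illustrative name and is valid only for df = 2.

```python
import math


def chi2_sf_df2(stat):
    """Upper-tail probability P(X > stat) for a chi-square variable with 2 df."""
    return math.exp(-stat / 2.0) if stat > 0 else 1.0


# Hypothetical maximized log-likelihoods for Model 0 and Model 1.
ll0 = -210.4
ll1 = -206.8

stat = -2.0 * (ll0 - ll1)  # likelihood ratio statistic, about 7.2
df = 5 - 3                 # parameter counts from the example: df = 2
p_value = chi2_sf_df2(stat)

print(f"statistic = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```

With these made-up values the p-value is roughly 0.027, so at the 0.05 level the two extra predictors would be judged to improve the fit.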

FAQ

What is the difference between degrees of freedom and sample size?

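Degrees of freedom are not the same as sample size. Sample size is the number of observations in your data, while degrees of freedom represent the number of independent pieces of information available for estimation. In some tests, such as the t-test, df is the sample size adjusted for the number of estimated parameters. In a likelihood ratio test, however, df depends only on how many parameters differ between the two models, not on the number of observations. Sample size still matters indirectly, because the chi-square approximation becomes more accurate as the sample grows.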

Can degrees of freedom be negative?

No, degrees of freedom cannot be negative. If you calculate a negative value, the model labeled as unrestricted has fewer parameters than the one labeled as restricted, which means the two models have been swapped or are not nested. A value of zero means both models have the same number of parameters, so there is no restriction to test.

How does df affect the likelihood ratio test?

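The degrees of freedom determine the shape of the chi-square distribution used to evaluate the test statistic. A higher df shifts the distribution to the right, so the critical value rises: at the 0.05 level it is about 3.84 for 1 df, 5.99 for 2 df, and 11.07 for 5 df. The same test statistic therefore yields a larger p-value when df is higher, which means a model that adds more parameters must improve the fit by more to be judged significantly better.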
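To see how df changes the verdict for a fixed test statistic, the sketch below evaluates the chi-square upper-tail probability for several df values using only the Python standard library. The function chi2_sf is a hypothetical helper based on the series expansion of the regularized incomplete gamma function. It is adequate for moderate statistics like these, but a statistics library would be the better choice in real work.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df > 0."""
    if df <= 0:
        raise ValueError("df must be positive")
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z):
    # P(a, z) = z^a * e^(-z) / Gamma(a) * sum_n z^n / (a * (a+1) * ... * (a+n))
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1.0 - lower)


stat = 6.0  # one fixed, hypothetical likelihood ratio statistic
for df in (1, 2, 5):
    print(f"df = {df}: p = {chi2_sf(stat, df):.4f}")
```

The p-value grows as df increases, so the same statistic that is significant at the 0.05 level with 1 or 2 df is far from significant with 5 df.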

What happens if the models are not nested?

The likelihood ratio test is only valid when the models are nested, meaning one model is a special case of the other. If the models are not nested, the test statistic may not follow a chi-square distribution, and the results may not be reliable.