How to Calculate Negative Log Likelihood in Python
Negative log likelihood is a fundamental concept in statistics and machine learning. It measures how well a statistical model fits observed data. This guide explains how to calculate negative log likelihood in Python, including the formula, implementation, and practical applications.
What is Negative Log Likelihood?
The negative log likelihood (NLL) is derived from the likelihood function, which gives the probability (or, for continuous data, the probability density) of the observed data under the model, viewed as a function of the model's parameters. The NLL is the negative natural logarithm of that likelihood.
In machine learning, minimizing the negative log likelihood is equivalent to maximizing the likelihood function. This is often used in training models like logistic regression, where the goal is to find parameters that maximize the likelihood of the observed data.
Key Points
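- For discrete data, NLL is non-negative because every probability is at most 1; for continuous data it can be negative, because a probability density can exceed 1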
- A lower NLL indicates a better fit of the model to the data
- It's commonly used in optimization problems
- In machine learning, it's often used as a loss function
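As a small illustration of the "lower NLL means better fit" point, the sketch below scores the same data under two candidate normal models. The data values, the two candidate means, and the helper name gaussian_nll are assumptions chosen for illustration only.

```python
import numpy as np
from scipy.stats import norm

# Illustrative data (made up for this sketch)
data = np.array([1.2, 1.5, 1.8, 2.1, 2.4])

def gaussian_nll(data, mu, sigma):
    """Hypothetical helper: summed negative log density under Normal(mu, sigma)."""
    return -np.sum(norm.logpdf(data, loc=mu, scale=sigma))

# Same data, two candidate parameter settings
nll_close = gaussian_nll(data, mu=1.8, sigma=0.5)  # mean near the centre of the data
nll_far = gaussian_nll(data, mu=5.0, sigma=0.5)    # mean far from the data

print(f"NLL with mu=1.8: {nll_close:.4f}")
print(f"NLL with mu=5.0: {nll_far:.4f}")
```

The model whose mean sits near the data should report the smaller NLL. The comparison is only meaningful because both values are computed on the same data.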
Negative Log Likelihood Formula
The negative log likelihood is calculated using the following formula:
Formula
NLL = -Σ[log(P(yᵢ|xᵢ;θ))]
Where:
- P(yᵢ|xᵢ;θ) is the probability of observing yᵢ given xᵢ and parameters θ
- Σ represents the sum over all observations
- θ represents the model parameters
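For a single observation, the negative log likelihood is -log(P(y|x;θ)), where log is the natural logarithm. For multiple observations that are assumed independent, the likelihood is a product of per-observation probabilities, so its logarithm is a sum and we add these values across all observations.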
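To make the formula concrete, here is a minimal sketch for a binary outcome. The labels in y and the predicted probabilities in p_pos are made-up values; in practice P(yᵢ|xᵢ;θ) would come from a fitted model.

```python
import numpy as np

# Hypothetical observed labels y_i and model-predicted P(y_i = 1 | x_i; theta)
y = np.array([1, 0, 1, 1])
p_pos = np.array([0.9, 0.2, 0.7, 0.6])

# Probability the model assigns to the label that was actually observed
p_observed = np.where(y == 1, p_pos, 1.0 - p_pos)

# Per-observation term -log P(y_i | x_i; theta), then the sum over observations
per_obs_nll = -np.log(p_observed)
nll = per_obs_nll.sum()

print("Per-observation NLL:", np.round(per_obs_nll, 4))
print(f"Total NLL: {nll:.4f}")
```

Each term grows as the model assigns less probability to what was actually observed, so confident wrong predictions dominate the total.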
Calculating Negative Log Likelihood in Python
Several Python libraries make calculating negative log likelihood straightforward. A common approach is to use the scipy.stats module, whose probability distributions provide built-in methods for log probabilities: logpdf for continuous distributions and logpmf for discrete ones.
Using scipy.stats
Here's an example of how to calculate negative log likelihood for a normal distribution using scipy.stats:
import numpy as np
from scipy.stats import norm
# Observed data
data = np.array([1.2, 1.5, 1.8, 2.1, 2.4])
# Model parameters (mean and standard deviation)
mu, sigma = 1.5, 0.5
# Calculate log likelihood for each observation
log_likelihoods = norm.logpdf(data, mu, sigma)
# Calculate negative log likelihood
nll = -np.sum(log_likelihoods)
print(f"Negative Log Likelihood: {nll:.4f}")
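The same pattern carries over to discrete distributions in scipy.stats, which expose logpmf instead of logpdf. The sketch below assumes a Poisson model; the count data and the rate value are made up for illustration.

```python
import numpy as np
from scipy.stats import poisson

# Hypothetical count data (e.g. events per interval)
counts = np.array([2, 3, 1, 4, 2])

# Assumed Poisson rate parameter
rate = 2.5

# Discrete distributions use logpmf rather than logpdf
log_probs = poisson.logpmf(counts, mu=rate)
nll = -np.sum(log_probs)

print(f"Poisson Negative Log Likelihood: {nll:.4f}")
```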
Custom Implementation
If you need to implement the calculation yourself, you can use the natural logarithm function from the math module:
import math
def negative_log_likelihood(data, mu, sigma):
    nll = 0
    for x in data:
        # Probability density function for normal distribution
        pdf = (1 / (sigma * math.sqrt(2 * math.pi))) * math.exp(-0.5 * ((x - mu) / sigma) ** 2)
        # Add to negative log likelihood
        nll -= math.log(pdf)
    return nll
# Example usage
data = [1.2, 1.5, 1.8, 2.1, 2.4]
mu, sigma = 1.5, 0.5
print(f"Negative Log Likelihood: {negative_log_likelihood(data, mu, sigma):.4f}")
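One caveat with computing the density first and taking its logarithm afterwards: for points far from the mean, math.exp can underflow to 0.0, and math.log(0.0) then raises a ValueError. A possible alternative, sketched below under the same normal-model assumption, works with the log density directly. The function name negative_log_likelihood_stable is hypothetical.

```python
import math

def negative_log_likelihood_stable(data, mu, sigma):
    """Normal NLL computed from the log density, avoiding exp() followed by log()."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # -log of the normalising constant 1 / (sigma * sqrt(2 * pi))
    log_norm_const = math.log(sigma) + 0.5 * math.log(2 * math.pi)
    nll = 0.0
    for x in data:
        z = (x - mu) / sigma
        # -log pdf(x) = log(sigma) + 0.5 * log(2 * pi) + 0.5 * z^2
        nll += log_norm_const + 0.5 * z * z
    return nll

data = [1.2, 1.5, 1.8, 2.1, 2.4]
print(f"Negative Log Likelihood: {negative_log_likelihood_stable(data, 1.5, 0.5):.4f}")

# An extreme point: the density underflows to 0.0, but the log form stays finite
print(f"NLL for an outlier at 100.0: {negative_log_likelihood_stable([100.0], 1.5, 0.5):.1f}")
```

This is algebraically the same quantity as the loop above; only the order of operations changes.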
Example Calculation
Let's walk through a concrete example to illustrate how negative log likelihood works. Suppose we have the following observed data points: [1.2, 1.5, 1.8, 2.1, 2.4]. We'll assume a normal distribution with mean (μ) = 1.5 and standard deviation (σ) = 0.5.
For each data point, we calculate the probability density using the normal distribution formula:
Normal Distribution PDF
P(x) = (1 / (σ√(2π))) * e^(-0.5 * ((x - μ)/σ)²)
Then we take the natural logarithm of each density value and sum the results. Finally, we take the negative of this sum to get the negative log likelihood.
Calculation Steps
- Calculate PDF for each data point
- Take natural logarithm of each PDF
- Sum all log probabilities
- Take the negative of the sum
The result is a single number, approximately 3.8290 for this example, that summarizes how well the normal distribution with μ=1.5 and σ=0.5 fits the observed data. When comparing parameter values on the same data, a lower value indicates a better fit.
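The four calculation steps can be traced one at a time with NumPy. This is a sketch that uses the same data and parameters as the example; the intermediate variable names are chosen here for readability.

```python
import numpy as np

data = np.array([1.2, 1.5, 1.8, 2.1, 2.4])
mu, sigma = 1.5, 0.5

# Step 1: PDF for each data point
pdf = np.exp(-0.5 * ((data - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Step 2: natural logarithm of each PDF value
log_pdf = np.log(pdf)

# Step 3: sum of the log values (the log likelihood)
log_likelihood = log_pdf.sum()

# Step 4: negate the sum
nll = -log_likelihood

for x, p, lp in zip(data, pdf, log_pdf):
    print(f"x={x:.1f}  pdf={p:.4f}  log(pdf)={lp:.4f}")
print(f"Log likelihood: {log_likelihood:.4f}")
print(f"Negative log likelihood: {nll:.4f}")
```

The point at x = 1.5 sits exactly on the mean, so it has the highest density and contributes the least to the NLL; the point at 2.4 contributes the most.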
Practical Applications
Negative log likelihood has several important applications in statistics and machine learning:
- Model Selection: Comparing different models to determine which one fits the data better
- Parameter Estimation: Finding the optimal parameters for a statistical model
- Machine Learning: Used as a loss function in algorithms like logistic regression
- Hypothesis Testing: Assessing the goodness-of-fit of a model
In machine learning, minimizing the negative log likelihood is equivalent to maximizing the likelihood function, which is the foundation of many probabilistic models.
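As a sketch of parameter estimation, the snippet below minimizes the normal NLL numerically with scipy.optimize.minimize. The starting point, the Nelder-Mead method, and the choice to optimize log(sigma) so that sigma stays positive are all assumptions of this example, not requirements.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

data = np.array([1.2, 1.5, 1.8, 2.1, 2.4])

def nll(params):
    """NLL of a normal model; params = (mu, log_sigma)."""
    mu, log_sigma = params
    sigma = np.exp(log_sigma)  # exponentiate so sigma is always positive
    return -np.sum(norm.logpdf(data, loc=mu, scale=sigma))

# Arbitrary starting guess: mu = 1.0, sigma = exp(0) = 1.0
result = minimize(nll, x0=np.array([1.0, 0.0]), method="Nelder-Mead")

mu_hat = result.x[0]
sigma_hat = np.exp(result.x[1])

print(f"Estimated mu:    {mu_hat:.4f}  (sample mean: {data.mean():.4f})")
print(f"Estimated sigma: {sigma_hat:.4f}  (population std: {data.std():.4f})")
```

For a normal model the maximum likelihood estimates have a closed form (the sample mean and the standard deviation with divisor n), so the numerical result can be checked against data.mean() and data.std().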
FAQ
What is the difference between log likelihood and negative log likelihood?
The log likelihood is the natural logarithm of the likelihood function. The negative log likelihood is simply the negative of the log likelihood. Both carry the same information about model fit; the negative form is more common in optimization because most optimization routines and machine learning frameworks are written to minimize a loss.
Why is negative log likelihood used in optimization?
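Maximizing the likelihood and minimizing the negative log likelihood give the same parameters, because the logarithm is monotonic. Working with the log turns a product of many small probabilities into a sum, which is numerically more stable and easier to differentiate, and negating it produces a loss to minimize. This makes it suitable for gradient-based optimization algorithms, which work by iteratively reducing the value of the function being optimized.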
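To illustrate the iterative reduction, here is a minimal gradient descent sketch on the normal NLL, estimating only the mean with sigma held fixed. The learning rate, step count, starting value, and data are assumptions chosen for this example.

```python
import numpy as np

data = np.array([1.2, 1.5, 1.8, 2.1, 2.4])
sigma = 0.5  # held fixed; only mu is optimised in this sketch

def nll(mu):
    """Normal NLL as a function of mu."""
    z = (data - mu) / sigma
    return np.sum(np.log(sigma) + 0.5 * np.log(2 * np.pi) + 0.5 * z ** 2)

def nll_grad(mu):
    """Derivative of the NLL with respect to mu."""
    return -np.sum(data - mu) / sigma ** 2

mu = 0.0               # arbitrary starting guess
learning_rate = 0.01   # assumed step size
history = []

for step in range(50):
    history.append(nll(mu))
    mu -= learning_rate * nll_grad(mu)  # move against the gradient

print(f"NLL at start: {history[0]:.4f}")
print(f"NLL at end:   {history[-1]:.4f}")
print(f"Estimated mu: {mu:.4f}  (sample mean: {data.mean():.4f})")
```

With this step size each update lowers the NLL, and mu approaches the sample mean, which is the minimizer when sigma is fixed.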
How does negative log likelihood relate to cross-entropy loss?
In the context of classification problems, negative log likelihood is equivalent to cross-entropy loss when the true labels are one-hot encoded. Both measures quantify the difference between the predicted probabilities and the true labels.
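The equivalence can be checked numerically. In the sketch below, the predicted class probabilities and the true labels are made-up values for a three-class problem.

```python
import numpy as np

# Hypothetical predicted class probabilities (each row sums to 1)
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.3, 0.5],
])
labels = np.array([0, 1, 2])      # true class index per example
one_hot = np.eye(3)[labels]       # one-hot encoding of the labels

# NLL: pick the probability assigned to the true class, take -log, and sum
nll = -np.sum(np.log(probs[np.arange(len(labels)), labels]))

# Cross-entropy with one-hot targets: -sum over examples and classes of t * log(p)
cross_entropy = -np.sum(one_hot * np.log(probs))

print(f"NLL:           {nll:.4f}")
print(f"Cross-entropy: {cross_entropy:.4f}")
```

The one-hot targets zero out every term except the true class, which is why the two sums coincide. Libraries often report the mean over examples rather than the sum; that changes the scale but not the minimizer.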