How to Calculate Negative Log Likelihood in Python

Negative log likelihood is a fundamental concept in statistics and machine learning. It measures how well a statistical model fits observed data. This guide explains how to calculate negative log likelihood in Python, including the formula, implementation, and practical applications.

What is Negative Log Likelihood?

The negative log likelihood (NLL) is the negative natural logarithm of the likelihood function. The likelihood is the probability (or, for continuous data, the probability density) of the observed data under the model, viewed as a function of the model parameters.

In machine learning, minimizing the negative log likelihood is equivalent to maximizing the likelihood function. This is often used in training models like logistic regression, where the goal is to find parameters that maximize the likelihood of the observed data.
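
A short sketch of why the logarithm is taken in practice, assuming NumPy and SciPy are installed. The simulated data set and the variable names are illustrative assumptions. With many observations the raw likelihood (a product of densities) underflows to zero in floating point, while the sum of log densities stays finite:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical data set: 2,000 draws from a standard normal (seed fixed for repeatability)
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=2000)

# Raw likelihood: a product of 2,000 densities, each below 0.4, so it underflows
likelihood = np.prod(norm.pdf(sample, loc=0.0, scale=1.0))

# Negative log likelihood: a sum of log densities, which stays finite
nll_sample = -np.sum(norm.logpdf(sample, loc=0.0, scale=1.0))

print(f"Likelihood: {likelihood}")
print(f"Negative log likelihood: {nll_sample:.2f}")
```

Because the logarithm is monotonic, the parameters that minimize the NLL are the same ones that maximize the likelihood, so nothing is lost by working on the log scale.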

Key Points

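Negative log likelihood is non-negative for discrete models, where each probability is at most 1; with continuous densities, which can exceed 1, it can be negative
For a fixed data set, a lower NLL indicates a better fit of the model to the data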
It's commonly used in optimization problems
In machine learning, it's often used as a loss function

Negative Log Likelihood Formula

The negative log likelihood is calculated using the following formula:

Formula

NLL = -Σ[log(P(yᵢ|xᵢ;θ))]

Where:

P(yᵢ|xᵢ;θ) is the probability of observing yᵢ given xᵢ and parameters θ
Σ represents the sum over all observations
θ represents the model parameters

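For a single observation, the negative log likelihood is simply -log(P(y|x;θ)). For multiple independent observations, the likelihood is the product of the individual probabilities, so its logarithm is the sum of the individual log probabilities, and the NLL is the negative of that sum.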
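
To make the formula concrete, here is a minimal sketch for a binary outcome, where P(yᵢ|xᵢ;θ) is the probability the model assigns to the label that was actually observed. The labels and predicted probabilities are made-up values for illustration only:

```python
import numpy as np

# Hypothetical observed labels and a model's predicted P(y = 1 | x) per observation
y = np.array([1, 0, 1, 1, 0])
p_hat = np.array([0.9, 0.2, 0.7, 0.6, 0.1])

# Probability the model assigned to the label that was actually observed
p_observed = np.where(y == 1, p_hat, 1.0 - p_hat)

# One term per observation, then the sum over all observations
per_observation_nll = -np.log(p_observed)
nll_bernoulli = per_observation_nll.sum()

print(per_observation_nll.round(4))
print(f"NLL: {nll_bernoulli:.4f}")
```

Confident, correct predictions (such as 0.9 for a label of 1) contribute small terms, while less confident ones (0.6 for a label of 1) contribute larger terms.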

Calculating Negative Log Likelihood in Python

Python provides several libraries that make calculating negative log likelihood straightforward. The most common approach is to use the scipy.stats module, whose probability distributions have built-in methods for log probabilities: logpdf for continuous distributions and logpmf for discrete ones.

Using scipy.stats

Here's an example of how to calculate negative log likelihood for a normal distribution using scipy.stats:

import numpy as np
from scipy.stats import norm

# Observed data
data = np.array([1.2, 1.5, 1.8, 2.1, 2.4])

# Model parameters (mean and standard deviation)
mu, sigma = 1.5, 0.5

# Calculate the log density of each observation (loc = mean, scale = standard deviation)
log_likelihoods = norm.logpdf(data, loc=mu, scale=sigma)

# Calculate negative log likelihood
nll = -np.sum(log_likelihoods)

print(f"Negative Log Likelihood: {nll:.4f}")

Custom Implementation

If you need to implement the calculation yourself, you can use the natural logarithm function from the math module:

import math

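def negative_log_likelihood(data, mu, sigma):
    """Negative log likelihood of data under a normal distribution."""
    # Log of the normalizing constant 1 / (sigma * sqrt(2 * pi))
    log_norm = -math.log(sigma * math.sqrt(2 * math.pi))
    nll = 0.0
    for x in data:
        # Log density computed directly; taking log(exp(...)) would fail
        # for extreme outliers, where exp() underflows to 0
        log_pdf = log_norm - 0.5 * ((x - mu) / sigma) ** 2
        nll -= log_pdf
    return nll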

# Example usage
data = [1.2, 1.5, 1.8, 2.1, 2.4]
mu, sigma = 1.5, 0.5
print(f"Negative Log Likelihood: {negative_log_likelihood(data, mu, sigma):.4f}")
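
As a cross-check, the loop can also be collapsed into a closed-form expression, since for a normal model the NLL equals n·log(σ√(2π)) + Σ(xᵢ - μ)² / (2σ²). The sketch below assumes NumPy and SciPy are available; the function name normal_nll is a hypothetical helper, not part of any library:

```python
import numpy as np
from scipy.stats import norm

def normal_nll(values, mu, sigma):
    """Closed-form normal NLL: n*log(sigma*sqrt(2*pi)) + sum((x - mu)^2) / (2*sigma^2)."""
    x = np.asarray(values, dtype=float)
    constant_term = x.size * np.log(sigma * np.sqrt(2 * np.pi))
    squared_error_term = np.sum((x - mu) ** 2) / (2 * sigma ** 2)
    return constant_term + squared_error_term

values = [1.2, 1.5, 1.8, 2.1, 2.4]
closed_form = normal_nll(values, 1.5, 0.5)
from_scipy = -norm.logpdf(values, loc=1.5, scale=0.5).sum()

print(f"Closed form: {closed_form:.4f}")
print(f"scipy.stats: {from_scipy:.4f}")
```

Both lines should print the same value, which is a quick way to validate a hand-written implementation against the library.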

Example Calculation

Let's walk through a concrete example to illustrate how negative log likelihood works. Suppose we have the following observed data points: [1.2, 1.5, 1.8, 2.1, 2.4]. We'll assume a normal distribution with mean (μ) = 1.5 and standard deviation (σ) = 0.5.

For each data point, we calculate the probability density using the normal distribution formula:

Normal Distribution PDF

P(x) = (1 / (σ√(2π))) * e^(-0.5 * ((x - μ)/σ)²)

Then we take the natural logarithm of each density value and sum the results. Finally, we take the negative of this sum to get the negative log likelihood.

Calculation Steps

Calculate PDF for each data point
Take natural logarithm of each PDF
Sum all log probabilities
Take the negative of the sum

The result is a single number representing how well the normal distribution with μ=1.5 and σ=0.5 fits the observed data; for this data set it is approximately 3.829. When comparing parameter choices on the same data, a lower value indicates a better fit.
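
The four steps can be sketched one at a time in NumPy, keeping each intermediate result visible. The variable names are illustrative:

```python
import numpy as np

data = np.array([1.2, 1.5, 1.8, 2.1, 2.4])
mu, sigma = 1.5, 0.5

# Step 1: PDF for each data point
pdf_values = np.exp(-0.5 * ((data - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Step 2: natural logarithm of each PDF value
log_pdf_values = np.log(pdf_values)

# Step 3: sum of the log values (this is the log likelihood)
log_likelihood = log_pdf_values.sum()

# Step 4: negate the sum
nll = -log_likelihood

for x, p, lp in zip(data, pdf_values, log_pdf_values):
    print(f"x = {x:.1f}  pdf = {p:.4f}  log pdf = {lp:.4f}")
print(f"Negative Log Likelihood: {nll:.4f}")
```

The point x = 1.5 sits exactly at the mean, so it has the highest density and the smallest contribution to the total; x = 2.4 is furthest from the mean and contributes the most.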

Practical Applications

Negative log likelihood has several important applications in statistics and machine learning:

Model Selection: Comparing different models to determine which one fits the data better
Parameter Estimation: Finding the optimal parameters for a statistical model
Machine Learning: Used as a loss function in algorithms like logistic regression
Hypothesis Testing: Assessing the goodness-of-fit of a model

In machine learning, minimizing the negative log likelihood is equivalent to maximizing the likelihood function, which is the foundation of many probabilistic models.
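
As a sketch of parameter estimation, the example below minimizes the normal NLL numerically with scipy.optimize.minimize. The function name nll_normal, the log(σ) parameterization, and the choice of the Nelder-Mead method are assumptions made for this illustration, not the only way to set the problem up:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

data = np.array([1.2, 1.5, 1.8, 2.1, 2.4])

def nll_normal(params, x):
    """NLL of x under a normal model; params = (mu, log_sigma)."""
    mu, log_sigma = params
    # Optimizing log(sigma) keeps sigma positive without explicit bounds
    return -np.sum(norm.logpdf(x, loc=mu, scale=np.exp(log_sigma)))

# Start from the parameters used earlier in this guide (mu = 1.5, sigma = 0.5)
start = np.array([1.5, np.log(0.5)])
result = minimize(nll_normal, start, args=(data,), method="Nelder-Mead")

mu_hat, sigma_hat = result.x[0], np.exp(result.x[1])
print(f"mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}, NLL = {result.fun:.4f}")

# Closed-form maximum likelihood estimates for comparison
print(f"sample mean = {data.mean():.3f}, sample std (ddof=0) = {data.std():.3f}")
```

For a normal model the maximum likelihood estimates have a closed form (the sample mean and the standard deviation computed with ddof=0), so the numerical result should land close to those values and reach a lower NLL than the starting parameters.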

FAQ

What is the difference between log likelihood and negative log likelihood?

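The log likelihood is the natural logarithm of the likelihood function. The negative log likelihood is simply the negative of the log likelihood. Both carry the same information about model fit; the negative form is more common in optimization because most optimization routines and loss functions are framed as minimization, and maximizing the log likelihood is the same as minimizing its negative. Note that the NLL is not guaranteed to be non-negative: with continuous densities greater than 1 it can drop below zero.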

Why is negative log likelihood used in optimization?

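Negative log likelihood is used in optimization for two reasons. Taking the logarithm turns a product of per-observation probabilities into a sum, which is easier to differentiate and far less prone to numerical underflow. Negating the result turns likelihood maximization into a minimization problem, which suits gradient-based optimization algorithms that work by iteratively reducing the value of the function being optimized.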
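
A minimal gradient descent sketch for a normal model with σ held fixed, so that only μ is optimized. The learning rate and the number of steps are assumed values chosen so that this small example converges; real problems usually need tuning:

```python
import numpy as np

data = np.array([1.2, 1.5, 1.8, 2.1, 2.4])
sigma = 0.5            # held fixed; only mu is optimized in this sketch
mu = 1.5               # starting guess
learning_rate = 0.01   # assumed step size, small enough to converge here

for step in range(100):
    # d(NLL)/d(mu) for a normal model: -sum(x - mu) / sigma^2
    grad = -np.sum(data - mu) / sigma ** 2
    # Move against the gradient to reduce the NLL
    mu -= learning_rate * grad

print(f"Estimated mu: {mu:.4f}")
print(f"Sample mean:  {data.mean():.4f}")
```

Each step moves μ toward the sample mean, which is the value that minimizes the NLL for a normal model.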

How does negative log likelihood relate to cross-entropy loss?

In the context of classification problems, negative log likelihood is equivalent to cross-entropy loss when the true labels are one-hot encoded. Both measures quantify the difference between the predicted probabilities and the true labels.
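
The equivalence can be checked numerically. In this sketch the predicted probabilities and labels are made-up values; with one-hot targets, every term of the cross-entropy sum vanishes except the one for the true class, which is exactly the NLL term:

```python
import numpy as np

# Hypothetical predicted class probabilities: 3 samples, 3 classes (rows sum to 1)
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
])
labels = np.array([0, 1, 2])   # true class index per sample
one_hot = np.eye(3)[labels]    # one-hot encoding of the same labels

# NLL: minus the log probability assigned to the true class
nll_per_sample = -np.log(probs[np.arange(len(labels)), labels])

# Cross-entropy: -sum_k q_k * log(p_k), with q the one-hot target
cross_entropy_per_sample = -np.sum(one_hot * np.log(probs), axis=1)

print(nll_per_sample)
print(cross_entropy_per_sample)
```

The two arrays match element by element. Libraries often report the mean over samples rather than the sum, which changes the scale of the loss but not the parameters that minimize it.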