Calculate True Positive Rate in Python
The true positive rate (TPR) is a key metric in binary classification that measures the proportion of actual positives correctly identified by a model. This guide explains how to calculate TPR in Python, including the formula, implementation, and practical interpretation.
What is True Positive Rate?
The true positive rate, also known as sensitivity or recall, measures how well a classification model identifies positive cases. It's calculated as the number of true positives divided by the total number of actual positives.
In medical testing, for example, TPR would represent the proportion of sick patients correctly identified as having the disease. A high TPR means the model misses few positive cases; it says nothing about how many negative cases are wrongly flagged as positive.
True Positive Rate Formula
True Positive Rate (TPR) = True Positives / (True Positives + False Negatives)
Where:
- True Positives (TP) - Correctly identified positive cases
- False Negatives (FN) - Positive cases incorrectly identified as negative
The result is a value between 0 and 1, where 1 means every actual positive case was detected. The ratio is undefined when there are no actual positives (TP + FN = 0).
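As a minimal sketch, the formula can be written as a small function. The name true_positive_rate and the choice to raise an error when there are no actual positives are assumptions made for illustration, not part of any library.

```python
def true_positive_rate(tp: int, fn: int) -> float:
    """Return TP / (TP + FN), the share of actual positives that were detected."""
    actual_positives = tp + fn
    if actual_positives == 0:
        # No actual positives: the ratio is undefined, so fail loudly
        # (an illustrative choice; returning NaN would also be reasonable).
        raise ValueError("TPR is undefined when there are no actual positive cases")
    return tp / actual_positives


print(true_positive_rate(tp=45, fn=5))  # 0.9
```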
Calculate True Positive Rate in Python
You can calculate TPR in Python from the confusion matrix that scikit-learn returns.
The example below uses small hard-coded label lists in place of a trained classifier's output. With a real model, y_true would hold the test labels and y_pred the model's predictions.
from sklearn.metrics import confusion_matrix
# Example labels: 1 = positive, 0 = negative
y_true = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 0, 0]
# Calculate confusion matrix (labels=[0, 1] fixes the class order, so ravel() yields tn, fp, fn, tp)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
# Calculate TPR
tpr = tp / (tp + fn)
print(f"True Positive Rate: {tpr:.2f}")
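Since TPR is the same quantity as recall, scikit-learn's recall_score should give the same number without unpacking the confusion matrix. The sketch below reuses the example labels from above and assumes the positive class is labelled 1, which is the function's default pos_label.

```python
from sklearn.metrics import recall_score

# Same example labels as above: 1 = positive, 0 = negative.
y_true = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 0, 0]

# Recall of the positive class (pos_label=1 by default) equals TPR.
tpr = recall_score(y_true, y_pred)
print(f"True Positive Rate: {tpr:.2f}")  # True Positive Rate: 0.67
```

Here 4 of the 6 actual positives are predicted as positive, so the result is 4 / 6.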
With a trained model, the workflow looks like the following. The classification report lists recall for each class, and the recall of the positive class is the TPR:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
# Load your data (load_your_data is a placeholder for your own loading code)
X, y = load_your_data()
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# Train model
model = RandomForestClassifier()
model.fit(X_train, y_train)
# Evaluate
y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred))
Worked Example
Let's calculate TPR for a medical test with these results:
| Actual | Predicted | Count |
|---|---|---|
| Positive | Positive | 80 |
| Positive | Negative | 20 |
| Negative | Positive | 15 |
| Negative | Negative | 85 |
Using the formula:
TPR = 80 / (80 + 20) = 0.80 or 80%
This means the test correctly identifies 80% of actual positive cases.
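The same arithmetic can be reproduced in a few lines of Python. This is only a sketch: the dictionary layout is an assumption chosen to mirror the table, and the counts are the ones given there.

```python
# Counts from the worked-example table, keyed by (actual, predicted).
counts = {
    ("Positive", "Positive"): 80,  # true positives
    ("Positive", "Negative"): 20,  # false negatives
    ("Negative", "Positive"): 15,  # false positives
    ("Negative", "Negative"): 85,  # true negatives
}

tp = counts[("Positive", "Positive")]
fn = counts[("Positive", "Negative")]

# Only the two "actual positive" rows enter the TPR calculation.
tpr = tp / (tp + fn)
print(f"TPR = {tp} / ({tp} + {fn}) = {tpr:.2f}")  # TPR = 80 / (80 + 20) = 0.80
```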
FAQ
What is the difference between TPR and precision?
True Positive Rate (TPR) measures how well a model identifies actual positives, while precision measures how accurate the positive predictions are. TPR is calculated as TP/(TP+FN), while precision is TP/(TP+FP).
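To make the difference concrete, the short sketch below computes both metrics from the worked example's counts (TP = 80, FN = 20, FP = 15). The two values differ because the denominators count different things.

```python
# Counts reused from the worked example: TP = 80, FN = 20, FP = 15.
tp, fn, fp = 80, 20, 15

tpr = tp / (tp + fn)        # share of actual positives that were detected
precision = tp / (tp + fp)  # share of positive predictions that were correct

print(f"TPR (recall): {tpr:.3f}")        # TPR (recall): 0.800
print(f"Precision:    {precision:.3f}")  # Precision:    0.842
```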
How do I interpret a TPR value?
A TPR of 0.8 means the model correctly identifies 80% of actual positive cases. Values closer to 1 indicate better performance at detecting positives.
Can TPR be higher than 1?
No, TPR is a proportion and must be between 0 and 1. A value greater than 1 would indicate an error in the calculation.
What's the relationship between TPR and false positive rate?
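TPR and false positive rate (FPR = FP / (FP + TN)) are paired metrics, not complements of each other; the complement of TPR is the false negative rate (FNR = 1 - TPR). Lowering a model's decision threshold typically raises both TPR and FPR, which is the trade-off between detecting positives and avoiding false alarms that a ROC curve plots.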
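The trade-off can be illustrated by sweeping a decision threshold over model scores. In the sketch below, the scores, the labels, and the helper rates_at are all made up for illustration; they do not come from a real model.

```python
# Hypothetical model scores (higher = more likely positive) and true labels.
scores = [0.95, 0.85, 0.70, 0.60, 0.45, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 1, 0, 0, 0]


def rates_at(threshold):
    """Return (TPR, FPR) when scores >= threshold are predicted positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    tn = sum(p == 0 and y == 0 for p, y in zip(preds, labels))
    return tp / (tp + fn), fp / (fp + tn)


# Lowering the threshold flags more cases as positive.
for threshold in (0.9, 0.65, 0.5, 0.25):
    tpr, fpr = rates_at(threshold)
    print(f"threshold={threshold:.2f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

On this toy data, TPR climbs from 0.25 to 1.00 as the threshold drops from 0.9 to 0.25, while FPR climbs from 0.00 to 0.75.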