MedAE
May 20, 2023
MedAE stands for Median Absolute Error, a measure of prediction error: the median of the absolute differences between predicted and actual values. It is used in Artificial Intelligence and Machine Learning to evaluate the performance of regression models. Because it is based on the median, MedAE is robust to outliers and does not assume any specific distribution of the errors. It is a useful metric for assessing how accurately a machine learning model predicts continuous variables.
Overview
In Machine Learning, the goal is to build a model that can predict the outcome of future events based on past data. A regression model is a type of machine learning model that predicts a continuous variable. The accuracy of a regression model is evaluated by measuring the difference between the predicted outcome and the actual outcome. MedAE is one of the metrics used to evaluate the performance of regression models.
Calculation
The Median Absolute Error is calculated by taking the median of the absolute differences between the predicted values and the actual values. The formula is as follows:
$$\mathrm{MedAE} = \operatorname{median}\left(\left|y_{\text{true}} - y_{\text{pred}}\right|\right)$$
Where \(y_{\text{true}}\) is the true value and \(y_{\text{pred}}\) is the predicted value.
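The formula can be sketched in a few lines of Python using only the standard library. The function name `medae` and the input checks are illustrative choices, not part of any particular library; scikit-learn users would typically call `sklearn.metrics.median_absolute_error` instead.

```python
from statistics import median


def medae(y_true, y_pred):
    """Return the median of the absolute differences |y_true - y_pred|."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("MedAE is undefined for empty input")
    # Absolute error for each (true, predicted) pair
    abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    # With an even number of samples, statistics.median averages
    # the two middle values
    return median(abs_errors)
```

For an even number of samples there is no single middle value, so this sketch follows the common convention of averaging the two middle errors.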
Interpretation
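The Median Absolute Error indicates the typical distance between the predicted values and the actual values. A lower MedAE indicates better performance of the model, as it means the predicted values are closer to the actual values. MedAE is a robust metric: because the median depends only on the middle of the sorted errors, a few outliers or extreme values have little or no effect on it. The trade-off is that occasional large errors can go unnoticed when MedAE is the only metric reported.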
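The robustness property can be illustrated with a small sketch that compares MedAE against the mean absolute error (MAE) on made-up numbers. The data and the helper name `abs_errors` are hypothetical, chosen only so that one prediction is a clear outlier.

```python
from statistics import median


def abs_errors(y_true, y_pred):
    """Absolute error for each (true, predicted) pair."""
    return [abs(t - p) for t, p in zip(y_true, y_pred)]


# Hypothetical data for illustration only
y_true = [10, 20, 30, 40, 50]
y_pred_clean = [12, 18, 33, 41, 47]     # every prediction is close
y_pred_outlier = [12, 18, 33, 41, 150]  # last prediction is far off

clean = abs_errors(y_true, y_pred_clean)      # [2, 2, 3, 1, 3]
outlier = abs_errors(y_true, y_pred_outlier)  # [2, 2, 3, 1, 100]

medae_clean = median(clean)
medae_outlier = median(outlier)
mae_clean = sum(clean) / len(clean)
mae_outlier = sum(outlier) / len(outlier)

print(medae_clean, medae_outlier)  # 2 2
print(mae_clean, mae_outlier)      # 2.2 21.6
```

The single bad prediction multiplies the MAE roughly tenfold while the MedAE stays the same, which is why the two metrics are often reported together.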
Example
Suppose we have a dataset of house prices and we want to build a regression model to predict the price of a house based on its features. We split the data into training and testing sets and train a regression model using the training set. We then evaluate the performance of the model using the testing set.
Suppose the true values of house prices in the testing set are:
[200,000, 300,000, 400,000, 500,000, 600,000]
And the predicted values of house prices by our model are:
[180,000, 320,000, 390,000, 520,000, 650,000]
The absolute differences between the true values and predicted values are:
[20,000, 20,000, 10,000, 20,000, 50,000]
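Sorting these values gives [10,000, 20,000, 20,000, 20,000, 50,000], and the middle value is 20,000, so the MedAE is 20,000. This means that the typical prediction is off by about $20,000: at least half of the model's predictions fall within $20,000 of the actual price. Note that this is a median, not an average; the single $50,000 error does not change the result.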
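The house-price example can be reproduced with a short Python sketch using only the standard library. The variable names are illustrative, and the mean absolute error is computed alongside purely for comparison.

```python
from statistics import median

# True and predicted house prices from the example
y_true = [200_000, 300_000, 400_000, 500_000, 600_000]
y_pred = [180_000, 320_000, 390_000, 520_000, 650_000]

# Absolute difference for each house
abs_diffs = [abs(t - p) for t, p in zip(y_true, y_pred)]
print(abs_diffs)  # [20000, 20000, 10000, 20000, 50000]

# Median absolute error: the middle value of the sorted differences
medae = median(abs_diffs)
print(medae)  # 20000

# Mean absolute error, for comparison; the 50,000 miss pulls it upward
mae = sum(abs_diffs) / len(abs_diffs)
print(mae)  # 24000.0
```

The mean absolute error (24,000) is higher than the MedAE (20,000) because the mean is pulled up by the largest error, while the median is not.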