When evaluating the performance of time series extrapolation methods, both researchers and practitioners typically focus on the average or median performance according to a specific error metric, such as the absolute error or the absolute percentage error. However, from a risk-assessment point of view, it is far more important to evaluate the distributions of such errors, and especially their tails. For instance, a lack of normality and symmetry in error distributions can have significant implications for decision making, such as in stock control. Moreover, these distributions can often only be constructed empirically, as they may result from a computationally intensive non-parametric approach, such as an artificial neural network. This study proposes an approach for evaluating the empirical error distributions of forecasting methods and uses it to assess eleven popular time series extrapolation approaches across two different datasets (M3 and ForeDeCk). The results highlight some interesting tales from the tails.