Forecasting Commodity Prices: A Comparative Analysis of Common and Mixed Frequency Approaches
Abstract
In this blog post, I rigorously analyze the forecasting performance of commodity price inflation models. I compare a conventional univariate autoregressive (AR) model with two mixed frequency models—the unrestricted MiDAS model and the HAR-MiDAS model. By integrating daily stock returns with monthly inflation data, I aim to capture market dynamics that standard techniques overlook. I evaluate model performance using robust statistical tests, RMSE comparisons, and the Diebold-Mariano test, and I explore the benefits of a rolling forecast origin. Throughout the analysis, I include detailed mathematical formulations, estimation results, tables, and graphical visualizations.
Introduction
Forecasting commodity prices is essential for companies operating in volatile markets. In my work, I focused on seven key commodities (including US Regular Conventional Gas, WTI crude oil, and Henry Hub Natural Gas) over the period from January 1992 to May 2025. The objective was to forecast the monthly energy inflation series, defined as the 12-month log change in price:
$$ \pi_t = \ln\!\Bigl(\tfrac{P_t}{P_{t-12}}\Bigr) \approx \frac{P_t - P_{t-12}}{P_{t-12}}, $$
where $ P_t $ represents the commodity price at time $ t $.
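As a small illustration of this definition, here is a minimal Python sketch (not the Julia code used for the analysis). The function name `yoy_log_inflation` and the price values are hypothetical and serve only to show how close the log change and the simple growth rate are for a moderate price move.

```python
import math

def yoy_log_inflation(prices, lag=12):
    """Return ln(P_t / P_{t-lag}) for every t that has a price `lag` months earlier."""
    return [math.log(prices[t] / prices[t - lag]) for t in range(lag, len(prices))]

# Hypothetical monthly prices: 13 months, rising from 100 to 112.
prices = [100.0 + i for i in range(13)]
pi = yoy_log_inflation(prices)

# Simple 12-month growth rate for the same month, for comparison.
simple = (prices[12] - prices[0]) / prices[0]
print(pi[0], simple)
```

The first 12 months produce no inflation value because they have no price from a year earlier, so the inflation series is 12 observations shorter than the price series.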
I formulated three distinct models:
- Univariate AR model: A standard autoregressive model using past inflation values.
- Unrestricted MiDAS model: Incorporates 21 daily stock returns (from Shell plc) into the forecasting equation.
- HAR-MiDAS model: Aggregates daily returns into three summary indicators (last day, weekly average, monthly average).
Data and Preprocessing
I began by constructing the year-over-year energy inflation series $\pi_t $ from monthly price data. Not all commodity series spanned the entire period, and the raw data required careful handling (e.g., decimal commas), so I standardized the preprocessing steps in Julia. Key steps included:
- Data Transformation: Calculating $\pi_t $ as the 12-month log difference of prices.
- Alignment: Adjusting series to ensure that monthly inflation and daily stock returns were time-synchronized.
- Missing Data Handling: For months with fewer than 21 trading days, I padded the missing values with zeros to maintain consistency.
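The preprocessing steps above can be sketched roughly as follows in Python (the original pipeline was written in Julia and is not reproduced here). The helper names are hypothetical. The post does not say where the zeros are placed, so this sketch assumes they go at the start of the month, which keeps the most recent return in the last position.

```python
def parse_decimal_comma(text):
    """Convert a string such as '1,25' to the float 1.25.

    Assumes the comma is a decimal separator and that there are no
    thousands separators in the input.
    """
    return float(text.strip().replace(",", "."))

def fix_length(returns, n_days=21):
    """Keep the last n_days returns; left-pad with zeros if the month is shorter."""
    recent = list(returns[-n_days:])
    # Assumption: padding goes in front, so the final entry is still the last trading day.
    return [0.0] * (n_days - len(recent)) + recent

# Hypothetical month with only 19 trading days.
short_month = [0.001 * d for d in range(1, 20)]
padded = fix_length(short_month)
print(len(padded), padded[:3])
```

A zero return is a neutral filler in the sense that it implies no price change, but it still shifts which calendar day each position refers to, so the placement of the padding matters for any coefficient tied to a specific position.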
Methodology
1. Univariate Autoregressive (AR) Model
I specified the AR model as:
$$ \pi_t = \alpha_0 + \sum_{i=1}^{p}\alpha_i \,\pi_{t-i} \;+\; \epsilon_t, $$
which can be expressed in matrix form as:
$$ \boldsymbol{\pi} = X \,\boldsymbol{\alpha} + \boldsymbol{\epsilon}. $$
The optimal lag order $ p $ was determined by estimating the model for each candidate lag order and selecting the one that minimizes the information criteria (AIC, BIC).
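One way to carry out this kind of lag selection is sketched below in Python with NumPy. It is an illustration under stated assumptions, not the estimation code behind the results: the series is simulated, the helper names are hypothetical, the Gaussian forms of AIC and BIC are used, and every candidate is fitted on the same observations so that the criteria are comparable.

```python
import numpy as np

def lag_matrix(y, p):
    """Return X = [1, y_{t-1}, ..., y_{t-p}] and the target y_t for t = p, ..., T-1."""
    T = len(y)
    cols = [np.ones(T - p)] + [y[p - i:T - i] for i in range(1, p + 1)]
    return np.column_stack(cols), y[p:]

def info_criteria(y, p, p_max):
    """Gaussian AIC and BIC of an AR(p) fitted by OLS on a common sample."""
    X, target = lag_matrix(y, p)
    # Drop the first (p_max - p) rows so all candidates use the same observations.
    X, target = X[p_max - p:], target[p_max - p:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    n, k = X.shape
    sigma2 = resid @ resid / n
    return n * np.log(sigma2) + 2 * k, n * np.log(sigma2) + k * np.log(n)

# Simulated AR(2) series standing in for an inflation series.
rng = np.random.default_rng(0)
y = np.zeros(400)
for t in range(2, len(y)):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + rng.normal(scale=0.1)

p_max = 6
scores = {p: info_criteria(y, p, p_max) for p in range(1, p_max + 1)}
best_aic = min(scores, key=lambda p: scores[p][0])
best_bic = min(scores, key=lambda p: scores[p][1])
print("AIC picks p =", best_aic, "and BIC picks p =", best_bic)
```

Because BIC penalizes each extra coefficient more heavily than AIC once the sample exceeds a handful of observations, it never selects a longer lag than AIC in this nested setup. When the two disagree, the choice between them is a judgment call.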
2. Unrestricted MiDAS Model
To incorporate high-frequency information, I augmented the AR model with daily returns. The unrestricted MiDAS model is formulated as:
$$ \pi_t = \alpha_0 + \sum_{i=1}^{p}\alpha_i \,\pi_{t-i} \;+\; \mathbf{Z}_t' \,\boldsymbol{\beta} + \epsilon_t, $$
where $\mathbf{Z}_t $ is a vector containing the last 21 daily stock returns for month $ t $. This leads to a regression with $1 + p + 21 $ coefficients: the intercept, $ p $ autoregressive terms, and one coefficient per daily return.
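To make the size of this regression concrete, the following Python sketch assembles the regressor rows for the unrestricted model. The function name `midas_design` and the input layout (one list of 21 returns per month) are assumptions made for illustration, and the data are placeholders.

```python
def midas_design(pi, daily_by_month, p):
    """Stack one regressor row per month t = p, ..., T-1.

    Each row is [1, pi_{t-1}, ..., pi_{t-p}, z_{t,1}, ..., z_{t,21}].
    """
    rows = []
    for t in range(p, len(pi)):
        lags = [pi[t - i] for i in range(1, p + 1)]
        rows.append([1.0] + lags + list(daily_by_month[t]))
    return rows

# Placeholder data: 10 months of inflation and 21 daily returns per month.
pi = [0.01 * m for m in range(10)]
daily_by_month = [[0.001 * (m + d) for d in range(21)] for m in range(10)]

X = midas_design(pi, daily_by_month, p=3)
print(len(X), "rows,", len(X[0]), "columns")
```

With $ p = 3 $ each row has 25 entries, so the unrestricted model spends 21 of its 25 coefficients on the daily returns. That cost in degrees of freedom is what the HAR-style aggregation is meant to reduce.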
3. HAR-MiDAS Model
The HAR-MiDAS model simplifies the daily return information by using three aggregated indicators:
- $r_{21} $: The return on the last day.
- $\overline{r}_5 $: The average return over the last 5 days.
- $\overline{r}_{21} $: The average return over all 21 days.
The model becomes:
$$ \pi_t = \alpha_0 + \sum_{i=1}^{p}\alpha_i \,\pi_{t-i} \;+\; \mathbf{W}_t' \,\boldsymbol{\beta} + \epsilon_t, $$
with $$ \mathbf{W}_t = \begin{bmatrix} r_{21} \\ \overline{r}_5 \\ \overline{r}_{21} \end{bmatrix}. $$
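The three indicators are simple functions of the 21 daily returns, which the Python sketch below computes. It also shows that the HAR specification is a restricted version of the unrestricted model: three coefficients imply a step-shaped set of 21 per-day weights. The function names and the numbers are hypothetical.

```python
def har_features(daily_returns):
    """Return [r_21, mean of the last 5 returns, mean of all 21 returns]."""
    if len(daily_returns) != 21:
        raise ValueError("expected exactly 21 daily returns")
    last_day = daily_returns[-1]
    weekly = sum(daily_returns[-5:]) / 5
    monthly = sum(daily_returns) / 21
    return [last_day, weekly, monthly]

def implied_daily_weights(b_day, b_week, b_month):
    """Per-day coefficients that the three HAR terms imply for the 21 returns."""
    weights = [b_month / 21] * 21
    for j in range(16, 21):      # the last five trading days
        weights[j] += b_week / 5
    weights[20] += b_day         # the last trading day
    return weights

# Hypothetical returns and coefficients.
returns = [0.01 * d for d in range(1, 22)]
b = (0.5, -0.2, 1.0)

W = har_features(returns)
har_term = sum(w * coef for w, coef in zip(W, b))
unrestricted_term = sum(r * w for r, w in zip(returns, implied_daily_weights(*b)))
print(har_term, unrestricted_term)
```

The two printed values agree up to floating-point rounding, which is the sense in which HAR-MiDAS trades flexibility in the daily weights for 18 fewer coefficients.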
Forecasting Strategy and Model Evaluation
Forecasting Procedure
For each model, I re-estimated the parameters after excluding the last 24 observations to simulate an out-of-sample forecasting scenario. I then generated one-step-ahead forecasts for each of these 24 periods in turn, treating the lagged values as known.
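A minimal Python sketch of this hold-out procedure for the AR case follows. It assumes that "lagged values known" means each forecast uses realized past values rather than earlier forecasts. The series is simulated and the function names are hypothetical.

```python
import numpy as np

def fit_ar(y, p):
    """OLS estimate of [alpha_0, alpha_1, ..., alpha_p] for an AR(p)."""
    T = len(y)
    cols = [np.ones(T - p)] + [y[p - i:T - i] for i in range(1, p + 1)]
    coef, *_ = np.linalg.lstsq(np.column_stack(cols), y[p:], rcond=None)
    return coef

def holdout_forecasts(y, p, h=24):
    """Fit once on y[:-h], then forecast each held-out point from realized lags."""
    coef = fit_ar(y[:-h], p)
    T = len(y)
    preds = []
    for t in range(T - h, T):
        x = np.concatenate(([1.0], y[t - p:t][::-1]))  # [1, y_{t-1}, ..., y_{t-p}]
        preds.append(x @ coef)
    return np.array(preds)

# Simulated AR(1) series standing in for one inflation series.
rng = np.random.default_rng(1)
y = np.zeros(200)
for t in range(1, len(y)):
    y[t] = 0.02 + 0.8 * y[t - 1] + rng.normal(scale=0.05)

forecasts = holdout_forecasts(y, p=1)
errors = y[-24:] - forecasts
print(forecasts.shape, float(np.abs(errors).mean()))
```

The coefficients are frozen at their values from the shortened sample, so only the inputs change across the 24 forecasts.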
Performance Metrics
The primary performance metric was the Root Mean Squared Error (RMSE):
$$ \text{RMSE} = \sqrt{\frac{1}{24} \sum_{t=1}^{24} \Bigl(\pi_t - \hat{\pi}_t\Bigr)^2}. $$
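For completeness, the RMSE formula translates directly into a few lines of Python. The function name and the example numbers are illustrative only.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical realized values and forecasts.
actual = [0.10, 0.12, 0.08, 0.15]
forecast = [0.11, 0.10, 0.09, 0.12]
print(rmse(actual, forecast))
```

Because the errors are squared before averaging, a single large miss weighs more heavily than several small ones, which is worth keeping in mind when comparing series with very different volatility.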
In addition, I compared models using the Diebold-Mariano (DM) test. The DM test assesses the statistical significance of forecast error differences by considering the loss differential $$ d_t = (\epsilon_t^A)^2 - (\epsilon_t^B)^2 $$ with the null hypothesis:
$$ H_0: c = 0 \quad \text{in} \quad d_t = c + \eta_t. $$
A p-value below 0.05 leads to rejecting $H_0$ at the 5% level, indicating that the forecast accuracy of the two models differs significantly.
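The regression form of the test amounts to a t-test on the mean of $d_t$. The Python sketch below is a simplified version under two assumptions that may differ from the analysis: it uses the plain sample variance of $d_t$ (reasonable for one-step-ahead errors, with no autocorrelation correction), and it takes the p-value from the standard normal distribution. The function name and error values are hypothetical.

```python
import math
from statistics import mean, stdev

def dm_test(errors_a, errors_b):
    """Diebold-Mariano statistic and two-sided p-value under squared-error loss."""
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    c_hat = mean(d)                      # OLS estimate of c in d_t = c + eta_t
    se = stdev(d) / math.sqrt(n)         # standard error of the mean
    stat = c_hat / se
    # Two-sided p-value from the normal approximation: 2 * (1 - Phi(|stat|)).
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_value

# Hypothetical forecast errors: model A misses by much more than model B.
errors_a = [0.50, -0.40, 0.60, -0.50, 0.45, -0.55]
errors_b = [0.10, -0.10, 0.12, -0.08, 0.10, -0.09]
stat, p_value = dm_test(errors_a, errors_b)
print(stat, p_value)
```

A positive statistic means model A has the larger squared errors. With only 24 hold-out points the normal approximation is rough, so borderline p-values deserve caution.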
Comparing 24-Step Forecasting of All Models
To quantify the effectiveness of the three models, I discarded the last 24 observations from each commodity series and re-estimated the models on the shorter samples. I then generated forecasts for these 24 hold-out points, comparing them to the realized values. The RMSE was used as the main accuracy metric.
The table below shows sample RMSE values for each of the three models across the seven series (numbered 1 through 7):
| Series | Model 1 (AR) | Model 2 (unrestricted MiDAS) | Model 3 (HAR-MiDAS) |
|---|---|---|---|
| 1 | 0.408975 | 0.072195 | 0.0748239 |
| 2 | 0.350842 | 0.132670 | 0.3302100 |
| 3 | 0.975234 | 0.276817 | 0.2780127 |
| 4 | 0.120989 | 0.140900 | 0.1159080 |
| 5 | 1.385870 | 0.664415 | 0.6660156 |
| 6 | 0.699356 | 0.286881 | 0.0522019 |
| 7 | 0.683761 | 0.408906 | 0.0768535 |
From these results, Model 1 (univariate AR) has a higher RMSE than both mixed frequency models in six of the seven series; the exception is series 4, where it beats Model 2 but not Model 3. Between Model 2 (unrestricted MiDAS) and Model 3 (HAR-MiDAS), there is no consistently dominant model. Model 2 is marginally ahead in series 1, 3, and 5 and clearly ahead in series 2, while Model 3 is ahead in series 4 and by a wide margin in series 6 and 7. This is broadly in line with the Diebold-Mariano tests, which suggest that both mixed frequency approaches significantly outperform the AR model, but neither is conclusively superior to the other.
Sample Diagnostic Results
In addition to the forecasting performance, I tested the underlying model assumptions. Below is a representative table summarizing lag selection and diagnostic tests (normality, autocorrelation, heteroskedasticity) for the seven commodities under the AR model:
| Commodity | Selected $p$ | Normality p-value | Autocorrelation p-value | Heteroskedasticity p-value |
|---|---|---|---|---|
| US Regular Conventional Gas | 3 | 0.12 | 0.08 | 0.15 |
| WTI Crude Oil | 4 | 0.20 | 0.10 | 0.18 |
| Henry Hub Natural Gas | 3 | 0.18 | 0.12 | 0.20 |
| … | … | … | … | … |
Empirical Results and Discussion
The analysis showed that the univariate AR model underperforms the two mixed frequency models in nearly every series, which is consistent with the expectation that higher-frequency data (e.g., daily stock returns) carries information about monthly price dynamics. Although both the unrestricted MiDAS (Model 2) and HAR-MiDAS (Model 3) models generally outperform Model 1, their head-to-head comparison yields no definitive winner: some commodities favor the full 21-day vector of returns, while others benefit from the aggregated returns in the HAR-MiDAS approach.
Rolling Forecast Origin:
To mitigate the potential compounding of forecast errors when predicting multiple periods ahead, I explored a rolling forecast origin approach. By continuously updating the model parameters as new data becomes available, the model adapts to market changes and reduces reliance on outdated information. This method is especially beneficial in highly volatile commodity markets.
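A rolling forecast origin can be sketched in Python as below. The post does not specify whether the estimation window expands or has fixed length, so this version assumes an expanding window by default and offers a fixed-length window through the hypothetical `window` parameter. The data are simulated and the function names are illustrative.

```python
import numpy as np

def fit_ar(y, p):
    """OLS estimate of [alpha_0, alpha_1, ..., alpha_p] for an AR(p)."""
    T = len(y)
    cols = [np.ones(T - p)] + [y[p - i:T - i] for i in range(1, p + 1)]
    coef, *_ = np.linalg.lstsq(np.column_stack(cols), y[p:], rcond=None)
    return coef

def rolling_origin_forecasts(y, p, h=24, window=None):
    """Re-fit before every forecast, using only data observed before month t."""
    T = len(y)
    preds = []
    for t in range(T - h, T):
        # Expanding window by default; otherwise keep the latest `window` points.
        start = 0 if window is None else max(0, t - window)
        coef = fit_ar(y[start:t], p)
        x = np.concatenate(([1.0], y[t - p:t][::-1]))  # [1, y_{t-1}, ..., y_{t-p}]
        preds.append(x @ coef)
    return np.array(preds)

# Simulated AR(1) series standing in for one inflation series.
rng = np.random.default_rng(2)
y = np.zeros(200)
for t in range(1, len(y)):
    y[t] = 0.02 + 0.8 * y[t - 1] + rng.normal(scale=0.05)

expanding = rolling_origin_forecasts(y, p=1)
fixed = rolling_origin_forecasts(y, p=1, window=60)
print(expanding.shape, fixed.shape)
```

An expanding window uses all available history and gives more stable estimates, while a fixed-length window forgets old data and adapts faster after a structural change. Which one suits a given commodity is an empirical question.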
Conclusion
My findings strongly support the use of mixed frequency models over traditional autoregressive methods in forecasting commodity price inflation. Integrating daily financial data with monthly inflation data yields substantial improvements in forecast accuracy. While the unrestricted MiDAS and HAR-MiDAS models each have their advantages, the Diebold-Mariano test indicates that they both significantly outperform the simpler AR model, yet they do not statistically differ from one another in a consistent way across all commodities.
Future Work
- Expansion of High-Frequency Variables: Incorporate additional daily or weekly macroeconomic indicators, such as exchange rates or inventory data.
- Alternative Aggregation Schemes: Experiment with different weighting strategies for the daily returns in HAR-MiDAS.
- Real-Time Rolling Forecasts: Implement a fully automated rolling forecast origin that updates model parameters monthly, reducing forecast error accumulation.