
Time Series Forecasting: Models, Techniques, and a Hands-On Example in Python

By Samul Black

Time series forecasting focuses on predicting future values from historical, time-ordered data and is a core technique in data science and applied machine learning. It is widely used in domains such as finance, demand forecasting, energy systems, healthcare analytics, and system monitoring.


In this article, we will explore different types of forecasting, popular methodologies, benchmark datasets, and practical modelling approaches, culminating in a Python demonstration of time series forecasting with an autoregressive model on the S&P 500 index.



What Is Time Series Forecasting?

Time series forecasting is the process of predicting future observations using historical, time-ordered data. Unlike standard predictive modeling, the sequence of observations matters, as each value depends on past behavior and underlying temporal structure. Common examples include sales records, stock prices, sensor readings, and system performance metrics. Key characteristics of time series data include:


  1. Trend – long-term upward or downward movement in the data

  2. Seasonality – recurring patterns over fixed intervals

  3. Cyclic behavior – irregular fluctuations influenced by external factors

  4. Noise – random variation not explained by patterns


Effective forecasting relies on identifying these components before applying statistical or learning-based models. By accounting for temporal dependencies and structural patterns, forecasting techniques produce predictions that are both analytically sound and practically useful across real-world applications.
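To make these components concrete, the short sketch below splits a synthetic monthly series into trend, seasonal, and residual parts using seasonal_decompose from statsmodels. The series itself is a hypothetical stand-in for real data.

import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Hypothetical monthly series: upward trend + yearly cycle + noise
idx = pd.date_range("2018-01-01", periods=72, freq="MS")
values = (
    np.linspace(100, 160, 72)                          # trend
    + 10 * np.sin(2 * np.pi * np.arange(72) / 12)      # seasonality
    + np.random.normal(0, 2, 72)                       # noise
)
sales = pd.Series(values, index=idx)

# Separate the series into trend, seasonal, and residual components
result = seasonal_decompose(sales, model="additive", period=12)
print(result.trend.dropna().head())
print(result.seasonal.head(12))   # the repeating yearly pattern
print(result.resid.dropna().head())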



Types of Time Series Forecasting

Time series forecasting problems come in many forms, but a handful of types account for most of the work done by students, analysts, and practitioners. We cover these common types first, then group the more specialized or emerging approaches afterwards.


1. Univariate Time Series Forecasting

Univariate time series forecasting focuses on predicting future values using only a single historical variable. Despite its simplicity, it can produce highly accurate predictions when the series exhibits clear trends, seasonality, or cyclical behavior. This approach is widely applied in stock price prediction, sales forecasting, energy consumption analysis, and sensor monitoring. Its interpretability and straightforward implementation make it a common starting point for beginners and researchers alike. Common techniques and methods include:


  1. Autoregressive (AR) Models: Predict future values as a linear combination of past observations, capturing momentum and trend in the series.

  2. Moving Average (MA) Models: Smooth out fluctuations by modeling the forecast as a combination of past forecast errors, helping reduce noise.

  3. ARMA / ARIMA Models: Combine AR and MA components; ARIMA adds differencing to handle non-stationary data with trends or varying mean.

  4. Exponential Smoothing: Assigns exponentially decreasing weights to older observations, allowing the model to adapt quickly to recent changes.

  5. Simple Moving Average (SMA): Calculates the average of past data points over a fixed window, providing a baseline smoothing technique.

  6. Holt-Winters Method: Extends exponential smoothing to model both trend and seasonal patterns, ideal for series with strong seasonality.


These techniques form the foundation of most time series forecasting projects and are often implemented first to establish baseline models. Once a baseline is created, forecasts can be refined using advanced statistical methods or machine learning models for improved accuracy, especially when preparing for Python-based implementations.
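As a concrete starting point, here is a minimal sketch of two of the baseline techniques above, a simple moving average via pandas and the Holt-Winters method via statsmodels, applied to a synthetic daily series. The data and the weekly seasonal period are assumptions made for illustration.

import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Hypothetical daily series with a mild trend and weekly cycle
idx = pd.date_range("2022-01-01", periods=365, freq="D")
series = pd.Series(
    50 + 0.05 * np.arange(365) + 5 * np.sin(2 * np.pi * np.arange(365) / 7),
    index=idx,
)

# Baseline 1: Simple Moving Average over a 7-day window
sma = series.rolling(window=7).mean()

# Baseline 2: Holt-Winters with additive trend and weekly seasonality
hw_fit = ExponentialSmoothing(
    series, trend="add", seasonal="add", seasonal_periods=7
).fit()
hw_forecast = hw_fit.forecast(30)   # 30 days ahead

print(sma.tail(3))
print(hw_forecast.head(3))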


2. Multivariate Time Series Forecasting

Multivariate time series forecasting involves predicting future values using two or more related historical variables. By capturing interdependencies between variables, this approach can significantly improve predictive accuracy compared to univariate methods. It is commonly applied in scenarios such as electricity demand forecasting (using past consumption, weather, and holiday data), financial markets (using multiple correlated assets), and industrial process monitoring (tracking several sensor readings simultaneously). Common techniques and methods include:


  1. Vector Autoregression (VAR): Extends univariate AR models to multiple variables, modeling each series as a linear function of past values of all series in the system.

  2. Vector Error Correction Models (VECM): Suitable for multivariate non-stationary data that share long-term equilibrium relationships.

  3. Multivariate LSTM (Long Short-Term Memory): Deep learning models that capture complex temporal dependencies and interactions across multiple series.

  4. Transformers for Time Series: Modern architectures capable of modeling long-range dependencies and cross-variable interactions efficiently.

  5. Regression-Based Forecasting: Uses explanatory variables in linear or non-linear regression frameworks to predict the target series.

  6. Hybrid Models: Combines classical statistical models and machine learning approaches to leverage the strengths of both.


Multivariate forecasting is particularly powerful when multiple factors influence the target series, but it also introduces higher complexity in data preprocessing, feature engineering, and model selection. Python libraries like statsmodels, scikit-learn, and PyTorch make it easier to implement these models, allowing analysts to experiment with both statistical and deep learning approaches in practical forecasting projects.
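As an illustration, the sketch below fits a Vector Autoregression with statsmodels on two synthetic, related series. The variable names and data are hypothetical; in a real project you would test for stationarity first and difference the series if needed.

import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

# Two hypothetical related series (e.g., demand driven by temperature)
rng = np.random.default_rng(0)
n = 300
temperature = 20 + rng.normal(0, 0.5, n).cumsum()
demand = 100 + 0.8 * temperature + rng.normal(0, 1.0, n)
data = pd.DataFrame({"demand": demand, "temperature": temperature})

# Fit a VAR, letting AIC choose the lag order up to 10
results = VAR(data).fit(maxlags=10, ic="aic")

# Forecast 5 steps ahead from the most recent k_ar observations
last_obs = data.values[-results.k_ar:]
print(results.forecast(y=last_obs, steps=5))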


3. Short-Term vs Long-Term Forecasting

Forecasting can also be categorized based on the prediction horizon. Short-term and long-term forecasts differ in the data patterns they rely on, the methods used, and their practical applications. Understanding these differences is critical for selecting appropriate models and achieving accurate predictions.

Short-Term Forecasting

Focuses on near-future predictions and relies heavily on recent trends, seasonality, and short-term patterns.

Common models / techniques: Exponential Smoothing (SES, Holt), ARIMA, short-horizon LSTM, regression with lag features.

Typical applications: operational planning, daily/weekly sales, inventory management, short-term energy demand.

Long-Term Forecasting

Extends further into the future, emphasizing trend modeling and structural assumptions.

Common models / techniques: trend decomposition, Structural Time Series Models, Prophet, hybrid statistical-ML models.

Typical applications: strategic planning, long-term financial forecasts, capacity planning, long-range sales projections.

Selecting the right forecast horizon ensures that models are appropriately tuned for accuracy and reliability.
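For the long-horizon side, the sketch below shows the typical Prophet workflow under the assumption that your history is a DataFrame with the two columns Prophet expects, ds (date) and y (value); the placeholder data here is purely illustrative.

import pandas as pd
from prophet import Prophet

# Placeholder history; replace "y" with real observations
history = pd.DataFrame({
    "ds": pd.date_range("2019-01-01", periods=1000, freq="D"),
    "y": range(1000),
})

m = Prophet()   # fits trend plus weekly/yearly seasonality by default
m.fit(history)

# Extend one year beyond the training data and predict
future = m.make_future_dataframe(periods=365)
forecast = m.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())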


4. Seasonal vs Non-Seasonal Forecasting

Time series data can either exhibit repeating patterns at regular intervals or simply show trends and irregular fluctuations without cycles. Distinguishing between seasonal and non-seasonal series helps in selecting the right models, preprocessing steps, and evaluation methods for accurate forecasts.

Seasonal Forecasting

Captures repeating patterns over fixed intervals, such as weekly, monthly, or yearly cycles.

Common models / techniques: SARIMA, Seasonal Decomposition, Holt-Winters Exponential Smoothing.

Typical applications: retail demand, energy consumption, web traffic, climate data.

Non-Seasonal Forecasting

Focuses on trend and irregular components without recurring cycles.

Common models / techniques: ARIMA, regression-based forecasting, Exponential Smoothing.

Typical applications: stock prices, single-variable sensor readings, inventory demand without seasonal patterns.

By identifying whether a series is seasonal or non-seasonal, you can avoid mis-specifying models and improve forecast accuracy.
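To illustrate the seasonal branch, the sketch below fits a SARIMA model using SARIMAX from statsmodels on a synthetic monthly series. The (1, 1, 1)(1, 1, 1, 12) orders are illustrative defaults rather than a recommendation; for a non-seasonal series you would drop the seasonal_order entirely.

import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Hypothetical monthly series with trend and a 12-month cycle
idx = pd.date_range("2015-01-01", periods=96, freq="MS")
y = pd.Series(
    100 + 0.5 * np.arange(96) + 15 * np.sin(2 * np.pi * np.arange(96) / 12),
    index=idx,
)

# seasonal_order=(P, D, Q, s) handles the yearly cycle (s=12)
model = SARIMAX(y, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
results = model.fit(disp=False)

print(results.forecast(steps=12))   # one-year-ahead forecast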


5. Intermittent Forecasting

Intermittent time series forecasting deals with data that is sparse or irregular, often containing many zero or near-zero values. This type of forecasting is common in inventory management, spare parts demand, service calls, or event-driven datasets where occurrences are irregular and unpredictable. Standard time series models often fail on intermittent data because they assume continuous observations, so specialised approaches are required to capture underlying demand patterns effectively. Common techniques and methods:


  1. Croston’s Method: Specifically designed for intermittent demand, separates modelling of demand size and timing to improve accuracy.

  2. Syntetos-Boylan Approximation (SBA): An enhancement of Croston’s method, adjusting for bias in small-sample forecasts.

  3. Bootstrapping Methods: Simulate potential future demand using resampling techniques to capture irregular patterns.

  4. Exponential Smoothing Variants: Modified to handle sparse occurrences and prevent overreaction to zeros.

  5. Machine Learning Approaches: Tree-based models or neural networks that incorporate lag features, event indicators, or external variables to predict sparse events.


Intermittent forecasting is particularly challenging due to the unpredictability of zero-demand periods. In practice, combining statistical methods with machine learning techniques often yields better results. Python implementations using libraries like croston, scikit-learn, or TensorFlow/Keras allow practitioners to experiment with both traditional and modern approaches, making it easier to model irregular patterns accurately.
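Because library support for Croston's method varies, the following is a minimal hand-rolled sketch of the core idea: smooth the non-zero demand sizes and the intervals between demands separately, then forecast their ratio. The smoothing constant and demand series are illustrative assumptions.

import numpy as np

def croston(demand, alpha=0.1):
    """Minimal Croston's method: returns the expected demand per period."""
    demand = np.asarray(demand, dtype=float)
    z = p = None   # smoothed demand size and inter-demand interval
    q = 1          # periods elapsed since the last non-zero demand
    for y in demand:
        if y > 0:
            if z is None:   # initialize on the first observed demand
                z, p = y, q
            else:
                z = alpha * y + (1 - alpha) * z
                p = alpha * q + (1 - alpha) * p
            q = 1
        else:
            q += 1
    return z / p if z is not None else 0.0

# Sparse demand with many zero periods
sparse = [0, 0, 3, 0, 0, 0, 2, 0, 4, 0, 0, 1]
print(croston(sparse))   # flat forecast of demand per period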


Other Types of Time Series Forecasting

Beyond the most commonly used types, there are several specialized or less frequent forecasting approaches that address unique data structures, external influences, or advanced prediction requirements. Grouping these “other” types helps maintain clarity while ensuring that practitioners and researchers are aware of the full range of methods available.


1. Hierarchical Forecasting

Hierarchical (multi-level) forecasting involves predicting values at several aggregation levels, such as product → category → total sales. Techniques include bottom-up, top-down, and optimal reconciliation methods, which ensure that forecasts are consistent across all levels of the hierarchy, as the small sketch below illustrates.
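A minimal bottom-up sketch with pandas: starting from hypothetical product-level forecasts, each higher level is the sum of the level below it, so the hierarchy is consistent by construction.

import pandas as pd

# Hypothetical product-level forecasts with their category labels
product_forecasts = pd.DataFrame({
    "category": ["electronics", "electronics", "grocery", "grocery"],
    "product":  ["tv", "laptop", "milk", "bread"],
    "forecast": [120.0, 80.0, 300.0, 250.0],
})

# Bottom-up reconciliation: aggregate product forecasts to categories,
# then categories to the total
category_forecasts = product_forecasts.groupby("category")["forecast"].sum()
total_forecast = category_forecasts.sum()

print(category_forecasts)
print("total:", total_forecast)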


2. Cross-Sectional / Panel Forecasting

These models handle multiple entities observed over time, such as several stores, machines, or patients. Forecasting methods exploit both temporal trends and cross-entity correlations. Popular approaches include panel regression, VAR for multiple entities, and deep learning models that incorporate entity embeddings.


3. Spatial-Temporal Forecasting

Combines temporal and spatial dependencies to predict values across locations and time. Useful in traffic flow, weather prediction, or IoT sensor networks. Graph-based neural networks, spatio-temporal LSTMs, and convolutional models are commonly applied to capture both spatial and temporal patterns.


4. Event-Based / Exogenous Factor Forecasting

Incorporates external events or interventions that influence the series, such as promotions, holidays, or policy changes. Regression models with exogenous variables, Prophet with holidays, or hybrid ML-statistical models are often used to account for these factors.


5. Real-Time / Online Forecasting

Continuously updates predictions as new data streams in, critical for high-frequency environments like stock trading, web traffic, or sensor monitoring. Incremental learning algorithms, online ARIMA, and streaming deep learning models enable real-time adaptation.


6. Probabilistic / Uncertainty Forecasting

Produces ranges or confidence intervals rather than single-point predictions, helping quantify forecast uncertainty. Techniques include Bayesian structural time series, quantile regression, and Monte Carlo simulations. Probabilistic forecasts are particularly important in risk-sensitive domains like finance, energy, and healthcare.
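As a small illustration of interval forecasts, the sketch below fits SARIMAX from statsmodels to a synthetic random walk and asks get_forecast for a 95% confidence band; the model order and data are assumptions chosen for brevity.

import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Hypothetical series: a random walk with drift
rng = np.random.default_rng(1)
y = pd.Series(np.cumsum(rng.normal(0.2, 1.0, 300)))

results = SARIMAX(y, order=(1, 1, 0)).fit(disp=False)

# get_forecast returns point forecasts plus an uncertainty estimate
pred = results.get_forecast(steps=10)
print(pred.predicted_mean.head(3))
print(pred.conf_int(alpha=0.05).head(3))   # 95% interval bounds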


These specialized approaches extend the capabilities of standard forecasting techniques, allowing analysts and data scientists to tackle complex, structured, or uncertain datasets.


Benchmark Datasets for Time Series Forecasting

Benchmark datasets are essential for practicing time series forecasting, evaluating model performance, and comparing results against standard baselines. They cover a wide range of domains such as finance, retail, energy, traffic, and general competitions. Below, we provide popular datasets along with descriptions and typical use cases.


1. Finance & Economics

Financial time series datasets are widely used to model trends, volatility, and correlations between economic variables. They often feature high-frequency data and are ideal for testing both statistical and machine learning forecasting methods.


  1. S&P 500 Index - Daily stock prices for the S&P 500 index, commonly used for financial forecasting and trend analysis. Useful for experimenting with ARIMA, LSTM, and Prophet models.

  2. NASDAQ / Dow Jones Historical Data - Historical daily closing prices of major stock indices. Ideal for modeling market trends, volatility, and returns over time.

  3. Exchange Rates - Daily or monthly currency exchange rates, such as USD/EUR or USD/JPY. Frequently used for multivariate forecasting studies and macroeconomic predictions.


2. Retail & Demand Forecasting

Retail and demand datasets are perfect for testing forecasting models in real-world business scenarios. They often include hierarchical structures, promotions, events, and intermittent demand, which make them challenging and suitable for benchmarking.


  1. M5 Forecasting Dataset - Comprehensive retail sales data from Walmart, including hierarchical structures (store → category → product). Contains promotions, events, and intermittent demand, making it ideal for benchmarking complex forecasting models.

  2. Rossmann Store Sales Dataset - Daily sales for multiple stores with additional features like promotions and holidays. Commonly used for regression-based or deep learning forecasting exercises.

  3. Favorita Grocery Sales Forecasting - Sales dataset from Kaggle competitions including stores, product categories, and special events. Great for practicing multivariate and hierarchical forecasting methods.


3. Energy & Weather

Energy and weather datasets allow forecasting of consumption, load, and environmental variables. They often involve high-frequency readings and strong seasonality, making them ideal for short-term and long-term forecasting experiments.


  1. Electricity Load Diagrams Dataset - Hourly electricity consumption data from multiple clients. Often used to benchmark short-term load forecasting models.

  2. UCI Household Power Consumption Dataset - Data recording household electricity usage at one-minute intervals. Includes active and reactive power, voltage, and sub-meter readings, ideal for sensor-level forecasting and anomaly detection.

  3. NOAA Weather Datasets - Temperature, wind speed, and precipitation measurements from the National Oceanic and Atmospheric Administration. Useful for modeling trends, seasonality, and extreme events.


4. Tourism & Traffic

Tourism and traffic datasets involve time series with strong temporal and sometimes spatial dependencies. These datasets are often used to evaluate models that must account for seasonality, events, and spatial interactions.


  1. Tourism Monthly Dataset (M3 Competition) - Monthly tourist arrivals used for benchmarking long-term forecasting models. Includes trend, seasonality, and irregular components.

  2. METR-LA / PEMS-BAY Traffic Flow Datasets - High-resolution traffic sensor data for multiple locations, capturing spatial-temporal dependencies. Often used to benchmark deep learning models like graph neural networks and LSTMs.


5. General Benchmark Competitions

General benchmark datasets from forecasting competitions provide standardized series across domains and frequencies. They are widely used in research to compare new methods against established baselines.


  1. M1, M3, M4 Competition Datasets - Classic datasets with yearly, quarterly, monthly, and weekly series across multiple domains. Widely used for benchmarking forecasting algorithms in research papers.

  2. NN5 Forecasting Challenge Dataset - Daily cash demand at 111 ATMs. Contains missing values and irregular patterns, making it ideal for testing models handling intermittent and noisy data.


These benchmark datasets provide a solid foundation for evaluating time series forecasting models across diverse real-world scenarios. Working with them helps build intuition around data behavior, model selection, and performance trade-offs, while also enabling fair comparison with established approaches used in research and industry.



Time Series Forecasting and Autoregressive Modelling of the S&P 500 Index in Python

We walk through a basic, practical implementation of an autoregressive (AR) model using the S&P 500 index as a real-world benchmark dataset. The example focuses on how historical price movements can be leveraged to model short-term dependencies in financial time series. It connects core forecasting concepts with a simple Python workflow, covering data loading, model training, forecasting, and evaluation to help readers translate theory into practice.


1. Data Acquisition and Library Setup

This step initializes the required Python libraries for data handling, visualization, modeling, and evaluation. The S&P 500 index data is then fetched directly from Yahoo Finance using yfinance, ensuring access to reliable historical market prices. The adjusted closing price series is extracted and cleaned to remove missing values, forming the input time series that will be used for training and evaluating the autoregressive model.

import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from statsmodels.tsa.ar_model import AutoReg
from sklearn.metrics import mean_absolute_error, mean_squared_error

# --------------------------------------------------
# 1. Download S&P 500 Data
# --------------------------------------------------
symbol = "^GSPC"
df = yf.download(symbol, start="2015-01-01", end="2024-01-01")

# Use the closing price (recent yfinance versions auto-adjust Close by default)
series = df["Close"].dropna()

print(series.head())

Output:
Ticker            ^GSPC
Date                   
2015-01-02  2058.199951
2015-01-05  2020.579956
2015-01-06  2002.609985
2015-01-07  2025.900024
2015-01-08  2062.139893

2. Train–Test Split and Autoregressive Model Fitting

The time series is first divided into training and testing segments using an 80–20 split, preserving the temporal order of observations to avoid data leakage. The training portion is then used to fit an autoregressive model with a fixed lag order, where future values are modeled as a linear combination of past observations. This step establishes the core forecasting model and provides a statistical summary that helps assess parameter significance and overall model fit.

# --------------------------------------------------
# 2. Train-Test Split
# --------------------------------------------------
train_size = int(len(series) * 0.8)
train, test = series[:train_size], series[train_size:]

# --------------------------------------------------
# 3. Fit Autoregressive Model
# --------------------------------------------------
# AR order (lag)
lag_order = 5

ar_model = AutoReg(
    train,
    lags=lag_order
)

ar_fit = ar_model.fit()

print(ar_fit.summary())

The following explanation highlights the key components of the autoregressive model summary and how to interpret them in the context of financial time series forecasting. The focus is on understanding what the estimated parameters and diagnostics reveal about the temporal behavior of the S&P 500 index and the reliability of the resulting forecasts.


1. Model Overview

The summary output explains how past values of the S&P 500 index are used to predict its current level. An AR(5) model is applied, meaning the index is modeled using its previous five observations. The model is estimated on 1,811 data points using conditional maximum likelihood, with information criteria such as AIC and BIC provided to assess model quality and compare different lag choices.

                            AutoReg Model Results                             
===================================================================
Dep. Variable:                  ^GSPC   No. Observations:                 1811
Model:                     AutoReg(5)   Log Likelihood               -8836.074
Method:               Conditional MLE   S.D. of innovations             32.255
Date:                Tue, 13 Jan 2026   AIC                          17686.147
Time:                        16:57:55   BIC                          17724.639
Sample:                             5   HQIC                         17700.354
                                 1811  

2. Lag Coefficients and Temporal Effects

The estimated coefficients reveal strong short-term dependence in the series. The first lag shows a large positive effect, indicating that the most recent past value heavily influences the current index level. The second lag remains positive and statistically significant but contributes less than the first. The third and fourth lags are negative, suggesting partial mean reversion after short-term momentum. The fifth lag becomes positive again, indicating a delayed reinforcing effect from older observations. The constant term is not statistically significant, showing that historical values drive most of the model’s explanatory power.


3. Residual Variability, Stability, Stationarity and Model Fit

The standard deviation of the innovations reflects the typical size of unexplained movements after accounting for the autoregressive structure. As is common with financial time series, a substantial amount of variability remains, highlighting the noisy nature of market data and the limitations of point forecasts derived from linear models.

The autoregressive roots all have moduli at or above one, consistent with a stable, stationary specification, although the first root's modulus of 1.0001 sits essentially on the unit circle, reflecting the near-random-walk behavior typical of price levels. In a stable model, shocks decay over time rather than growing, which is a necessary condition for reliable forecasting.

===================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
-------------------------------------------------------------------
const          1.8205      2.928      0.622      0.534      -3.919       7.560
^GSPC.L1       0.8575      0.023     36.525      0.000       0.811       0.904
^GSPC.L2       0.2224      0.031      7.194      0.000       0.162       0.283
^GSPC.L3      -0.0789      0.031     -2.515      0.012      -0.140      -0.017
^GSPC.L4      -0.0793      0.031     -2.555      0.011      -0.140      -0.018
^GSPC.L5       0.0781      0.024      3.315      0.001       0.032       0.124
                                    Roots                                    
===================================================================
                  Real          Imaginary           Modulus         Frequency
-------------------------------------------------------------------
AR.1            1.0001           -0.0000j            1.0001           -0.0000
AR.2           -1.4291           -1.1063j            1.8072           -0.3952
AR.3           -1.4291           +1.1063j            1.8072            0.3952
AR.4            1.4370           -1.3628j            1.9804           -0.1208
AR.5            1.4370           +1.3628j            1.9804            0.1208
-------------------------------------------------------------------

Overall, the AR(5) results show that the S&P 500 index exhibits strong short-term memory, with recent observations exerting the greatest influence and older lags providing smaller corrective effects. While the model serves as a solid baseline for understanding temporal dependencies, the remaining uncertainty emphasizes why more advanced forecasting techniques are often explored for financial markets.
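The lag order above was fixed at five for simplicity. One way to choose it from the data, sketched here on the same train series, is ar_select_order from statsmodels, which searches candidate lag orders against an information criterion.

from statsmodels.tsa.ar_model import ar_select_order

# Search lag orders up to 15 and keep the specification minimizing AIC
selection = ar_select_order(train, maxlag=15, ic="aic")
print(selection.ar_lags)   # the chosen lags, e.g. [1, 2, 3, 4, 5]

# Refit the model with the selected lags
ar_fit_best = selection.model.fit()
print(ar_fit_best.aic)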


3. Forecast Generation and Model Evaluation

Once the autoregressive model is fitted, forecasts are generated for the test period by projecting the learned lag relationships forward in time. The prediction starts immediately after the end of the training data and spans the entire test window, ensuring a clear separation between observed and predicted values. Note that because the test observations are never supplied to the model, these out-of-sample, multi-step forecasts are built recursively from the model's own earlier predictions; the dynamic=False flag only affects in-sample prediction.

Model performance is then evaluated using standard error metrics. Mean Absolute Error (MAE) measures the average magnitude of prediction errors, while Root Mean Squared Error (RMSE) penalizes larger deviations more heavily. Together, these metrics provide a concise assessment of how closely the autoregressive model’s forecasts align with the true S&P 500 index values over the evaluation period.

# --------------------------------------------------
# 4. Forecast
# --------------------------------------------------
forecast = ar_fit.predict(
    start=len(train),
    end=len(train) + len(test) - 1,
    dynamic=False
)

# --------------------------------------------------
# 5. Evaluation
# --------------------------------------------------
mae = mean_absolute_error(test, forecast)
rmse = np.sqrt(mean_squared_error(test, forecast))

print(f"MAE: {mae:.2f}")
print(f"RMSE: {rmse:.2f}")

Output:
MAE: 303.27
RMSE: 352.75

The Mean Absolute Error (MAE) of 303.27 indicates that, on average, the forecasts deviate from the actual S&P 500 values by around 303 points. The Root Mean Squared Error (RMSE) of 352.75 highlights that larger deviations are penalized more heavily, reflecting occasional larger discrepancies between predicted and actual values.
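To put these numbers in context, it helps to compare against a naive persistence baseline that simply carries the last training value forward. The snippet below reuses train and test from above; whether the AR model beats this baseline depends on the evaluation window, so the comparison is illustrative rather than definitive.

# Naive persistence baseline: repeat the last observed training value
last_value = np.asarray(train).ravel()[-1]
naive_forecast = pd.Series(np.full(len(test), last_value), index=test.index)

naive_mae = mean_absolute_error(test, naive_forecast)
naive_rmse = np.sqrt(mean_squared_error(test, naive_forecast))

print(f"Naive MAE:  {naive_mae:.2f}")
print(f"Naive RMSE: {naive_rmse:.2f}")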


4. Visualization of Forecast Results

To better understand the model’s performance, the forecasts are plotted alongside the training and test data. The training series provides context for the historical behavior of the S&P 500 index, while the test series shows the actual observed values the model aims to predict. The autoregressive forecast is overlaid as a dashed line, making it easy to visually compare predicted versus actual movements.

# --------------------------------------------------
# 6. Visualization
# --------------------------------------------------
plt.figure(figsize=(10, 5))
plt.plot(train.index, train, label="Train")
plt.plot(test.index, test, label="Test", color="black")
plt.plot(test.index, forecast, label="AR Forecast", linestyle="--")

plt.title("Autoregressive (AR) Forecasting on S&P 500 Index")
plt.xlabel("Date")
plt.ylabel("Adjusted Close Price")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

Output: a line chart showing the training series, the actual test series, and the dashed AR forecast overlaid on the same axes.

This visualization highlights how the AR(5) model captures short-term trends and dependencies, while also revealing periods where predictions deviate from the true index values. It provides an intuitive way to assess the model’s strengths and limitations, showing both the predictive power and the inherent volatility in financial time series.


Conclusion

Time series forecasting is a critical component of quantitative analysis, enabling the modeling of temporal dependencies, trend dynamics, and seasonality in sequential data. Accurate forecasts rely on properly identifying the underlying structure of the series, selecting suitable modeling approaches, and rigorously validating predictive performance using metrics such as MAE and RMSE.

Autoregressive models, along with more advanced techniques like ARIMA, LSTM, and Transformer-based architectures, provide frameworks to capture short- and long-term dependencies. The predictive capability of these models depends on careful parameter selection, feature engineering, and an understanding of residual behavior. Properly applied, time series forecasting transforms raw temporal data into actionable insights for decision-making, risk management, and strategic planning.

