STATE-OF-THE-ART DEEP LEARNING FOR MULTI-PRODUCT INTERMITTENT TIME SERIES FORECASTING

Open Access
- Author:
- Raval, Ronish Samir
- Graduate Program:
- Industrial Engineering
- Degree:
- Master of Science
- Document Type:
- Master Thesis
- Date of Defense:
- July 03, 2021
- Committee Members:
- Robert Voigt, Program Head/Chair
Soundar Kumara, Thesis Advisor/Co-Advisor
Saurabh Basu, Committee Member
- Keywords:
- Time Series
Forecasting
Deep Learning
Statistical Analysis
- Abstract:
- Deep learning has attracted considerable attention because of the state-of-the-art results it has achieved in computer vision, object detection, natural language processing, sequential analysis, and many other domains. The literature suggests that time series analysis is a good candidate for modeling with deep learning techniques. Time series analysis has applications ranging from finance to supply chains and is critical in driving organizations' profit and strategic growth. In a retail setting, product demand forecasting helps minimize inventory, optimize service levels, and maximize revenue. Demand forecasting also gives rise to a much more complex class of problems: intermittent demand profiles. The standard options for forecasting time series are statistical learning methods such as ARIMA and exponential smoothing. However, when demand is intermittent and multiple time series must be forecast at once, statistical learning methods fail to provide a high level of accuracy and can become computationally expensive. Deep learning algorithms can be applied to forecast intermittent sales in a computationally frugal manner. This study addresses these two problems, intermittency and multi-series forecasting, using state-of-the-art approaches. It asks two questions: how can neural networks be implemented in a value-adding manner, and which models and architectures work best for our time series prediction problem and similar real-world applications? The study finds that recurrent and convolutional architectures exhibit versatility and value in solving this problem, which helps us understand deep learning models and their architectures in real-world scenarios. The data was obtained from Kaggle for the M5 forecasting competition.
The dataset consists of daily Walmart sales of 3,000 products across 10 stores. The data comprises 3 categories and 7 sub-categories, making this a multiple time series forecasting problem. We apply both statistical learning and deep learning methods to it. The statistical models implemented are the naïve method, moving average, ARIMA, and Croston's method. For deep learning, we first use a deep feed-forward neural network to forecast sales, then apply the recurrent architectures RNN, LSTM, and GRU. Sequence learning and attention mechanisms are also implemented. We also experiment with the convolutional architectures CNN, WaveNet, and temporal convolutional network. For the methodology, we first select a single time series from the dataset and apply the statistical and deep learning models to it. This step provides a strong fundamental understanding of how deep learning models are tuned to obtain the best architecture. Then, using the single-series results, we shortlist the best-performing deep learning models and their architectures to solve the time series forecasting problem. We conclude that recurrent architectures provide the best solutions in our analysis (we define optimality through error minimization). State-of-the-art models such as attention mechanisms and sequence learning give results within an acceptable range, but they are too computationally expensive to train for multiple epochs and forecasts. We close by identifying important areas of focus for future work on deep learning for time series forecasting.
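
For readers unfamiliar with intermittent-demand baselines, the following is a minimal sketch of the classic Croston method named in the abstract. It is not taken from the thesis: the function name `croston`, the smoothing parameter `alpha`, and the initialization from the first nonzero demand are all assumptions made for illustration. The idea is to smooth the nonzero demand sizes and the intervals between them separately, then forecast their ratio.

```python
def croston(demand, alpha=0.1):
    """One-step-ahead Croston forecast for a sequence of demand values.

    Hypothetical illustration: initialization and alpha are assumptions,
    not the configuration used in the thesis.
    """
    size = None        # smoothed nonzero demand size
    interval = None    # smoothed number of periods between nonzero demands
    periods_since = 1  # periods elapsed since the last nonzero demand

    for d in demand:
        if d > 0:
            if size is None:
                # Initialize from the first nonzero observation.
                size = float(d)
                interval = float(periods_since)
            else:
                # Update both estimates only when demand occurs.
                size += alpha * (d - size)
                interval += alpha * (periods_since - interval)
            periods_since = 1
        else:
            periods_since += 1

    if size is None:
        return 0.0  # no demand observed yet
    return size / interval  # expected demand per period


if __name__ == "__main__":
    history = [0, 0, 3, 0, 0, 3]
    print(croston(history, alpha=0.5))
```

Because the estimates are updated only in periods with nonzero demand, long runs of zeros do not drag the forecast toward zero the way a simple moving average would.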