Time series data, meaning sequential observations recorded over time, is central to many fields, including business and environmental studies. Numerous models and tools exist for time series analysis, but their differing APIs and complexity make them hard to combine. To address this, the company Unit8 developed Darts, an open-source Python library that aims to simplify time series processing and forecasting.
Data scientists have typically needed a different library for each step of time series analysis: pandas for preprocessing, statsmodels for detecting seasonality, Facebook Prophet for forecasting, and custom scripts for backtesting and model selection. Integrating advanced models such as neural networks could require yet another toolkit, such as TensorFlow or PyTorch, making the process complicated and tedious. Darts was developed to offer a more user-friendly and streamlined alternative.
Like scikit-learn, a popular machine learning library in Python, Darts provides a unified API, in this case for working with time series data. It brings data manipulation, model fitting, forecasting, and backtesting into one framework. With Darts, users can switch between models and approaches without worrying about compatibility issues.
At its core, Darts has a dedicated TimeSeries data type that can represent multivariate and probabilistic time series. It guarantees a proper time index, can hold multiple samples per time step, and can be created directly from a pandas DataFrame. Darts also adopts the scikit-learn model interface: a fit() method trains a model and a predict() method produces forecasts. This lets users try out a variety of models, ranging from traditional methods to neural network-based models such as RNNs and Transformers.
In terms of functionality, Darts offers a wide range of built-in models, including Exponential Smoothing, (V)ARIMA, Facebook Prophet, and various deep learning models. Its deep learning models can be trained on multiple time series and can use covariates. For larger datasets, they can run on GPUs for faster training.
Darts also includes tools for backtesting and model evaluation. For example, a model's historical_forecasts() method produces the forecasts the model would have made at successive points in the past for a specified horizon, and these can be compared with the actual values to compute error metrics. This lets users assess a model's accuracy and reliability over time and refine it accordingly. Other features include filtering, grid search for hyperparameter tuning, and automatic model selection, all of which reduce the amount of custom code users would otherwise have to write.
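The idea behind this kind of backtesting can be sketched in plain Python. The code below is not Darts' implementation, and the function names, the seasonal-naive forecaster, and the data are all hypothetical; it only shows the expanding-window principle, in which each forecast uses only observations that precede its cut-off point.

```python
from statistics import mean


def naive_seasonal_forecast(history, horizon, period):
    """Repeat the last full season of `history` for `horizon` steps."""
    last_season = history[-period:]
    return [last_season[i % period] for i in range(horizon)]


def expanding_window_backtest(values, start, horizon, period):
    """Forecast `horizon` steps from every cut-off >= `start`.

    Each forecast sees only the data before its cut-off. Returns the mean
    absolute error over all forecast points.
    """
    errors = []
    for cutoff in range(start, len(values) - horizon + 1):
        history = values[:cutoff]
        forecast = naive_seasonal_forecast(history, horizon, period)
        actual = values[cutoff:cutoff + horizon]
        errors.extend(abs(f - a) for f, a in zip(forecast, actual))
    return mean(errors)


# Made-up data: a 4-step pattern repeated six times.
pattern = [10.0, 12.0, 15.0, 11.0]
clean = pattern * 6
print(expanding_window_backtest(clean, start=8, horizon=4, period=4))  # 0.0
```

On perfectly periodic data the seasonal-naive forecaster makes no errors, so the score is 0.0; any deviation from the pattern raises the mean absolute error.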
In summary, Darts is a comprehensive framework designed to streamline time series analysis. It simplifies model training, forecasting, and evaluation by bringing these functions together under a single, consistent API. Because it is open source and under ongoing development, it can continue to gain new features and improvements, which makes it a valuable tool for data scientists and analysts working with time series data.