🔥 News
- 🔥🔥🔥 We released MOMENT research code, so you can pre-train your own time series foundation model, with your own data, and reproduce experiments from our paper!
- We fixed an issue with Classification where MOMENT was unable to handle multi-channel inputs.
- MOMENT was accepted at ICML 2024!
- Interested in multimodal time series & text foundation models? Check out our preliminary work on JoLT (Jointly Learned Representations for Time series & Text) [AAAI 2024 Student Abstract, NeurIPS 2023 DGM4H Workshop]. JoLT won the best student abstract presentation at AAAI! Stay tuned for multimodal time series & text foundation models!
📖 Introduction
We introduce MOMENT, a family of open-source foundation models for general-purpose time-series analysis. Pre-training large models on time-series data is challenging due to (1) the absence of a large and cohesive public time-series repository, and (2) diverse time-series characteristics which make multi-dataset training onerous. Additionally, (3) experimental benchmarks to evaluate these models, especially in scenarios with limited resources, time, and supervision, are still in their nascent stages. To address these challenges, we compile a large and diverse collection of public time-series, called the Time-series Pile, and systematically tackle time-series-specific challenges to unlock large-scale multi-dataset pre-training. Finally, we build on recent work to design a benchmark to evaluate time-series foundation models on diverse tasks and datasets in limited supervision settings. Experiments on this benchmark demonstrate the effectiveness of our pre-trained models with minimal data and task-specific fine-tuning. We also present several interesting empirical observations about large pre-trained time-series models.
MOMENT: One Model, Multiple Tasks, Datasets & Domains
MOMENT on different datasets and tasks, without any parameter updates:
- Imputation: Better than statistical imputation baselines
- Anomaly Detection: Second-best $F_1$ among the compared methods
- Classification: More accurate than 11 / 16 compared methods
- Short-horizon Forecasting: Better than ARIMA on some datasets
By linear probing (fine-tuning the final linear layer):
- Imputation: Better than baselines on 4 / 6 datasets
- Anomaly Detection: Best $F_1$
- Long-horizon Forecasting: Competitive in some settings
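As a rough illustration of what linear probing means here, the sketch below keeps a feature extractor frozen and fits only a linear layer on top of it. It is not the MOMENT training code: `frozen_features` is a hypothetical stand-in for a pre-trained encoder, the toy data is synthetic, and the closed-form ridge fit is just one simple way to train a linear head.

```python
import numpy as np


def frozen_features(x):
    """Hypothetical stand-in for a frozen encoder (NOT MOMENT): fixed summary statistics."""
    return np.stack(
        [x.mean(axis=1), x.std(axis=1), np.abs(np.diff(x, axis=1)).mean(axis=1)],
        axis=1,
    )


def fit_linear_head(feats, labels, n_classes, l2=1e-3):
    """Fit only a linear layer (weights + bias) on frozen features via ridge regression."""
    a = np.hstack([feats, np.ones((len(feats), 1))])  # append a bias column
    y = np.eye(n_classes)[labels]                      # one-hot targets
    return np.linalg.solve(a.T @ a + l2 * np.eye(a.shape[1]), a.T @ y)


def predict(feats, w):
    """Apply the linear head and pick the highest-scoring class."""
    a = np.hstack([feats, np.ones((len(feats), 1))])
    return (a @ w).argmax(axis=1)


# Toy data (assumption): class 0 is low-variance noise, class 1 is high-variance noise.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 0.5, (100, 64)), rng.normal(0, 2.0, (100, 64))])
y = np.array([0] * 100 + [1] * 100)

feats = frozen_features(x)          # the encoder is never updated
w = fit_linear_head(feats, y, n_classes=2)
acc = (predict(feats, w) == y).mean()
```

Only `w` is learned; swapping `frozen_features` for real pre-trained embeddings would leave the rest of the procedure unchanged.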
MOMENT Captures the Language of Time Series
Principal components of the embeddings of synthetically generated sinusoids suggest that MOMENT can capture subtle trend, scale, frequency, and phase information. In each experiment, $c$ controls the factor of interest, for example the power of the trend polynomial $c \in [\frac{1}{8}, 8)$ (Oreshkin et al., 2020). We generate multiple sine waves by varying $c$, derive their sequence-level representations using MOMENT, and visualize them in a 2-dimensional space using PCA.
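The procedure described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the paper's figures: `placeholder_embed` is a hypothetical stand-in for MOMENT's sequence-level embeddings (it uses a magnitude spectrum so the example runs without the model), and the frequency range and series length are assumed values.

```python
import numpy as np


def make_sinusoids(freqs, length=512):
    """Generate one sine wave per value of c, here interpreted as frequency."""
    t = np.linspace(0.0, 1.0, length, endpoint=False)
    return np.stack([np.sin(2 * np.pi * c * t) for c in freqs])


def placeholder_embed(x):
    """Hypothetical stand-in for MOMENT embeddings: magnitude spectrum of each series."""
    return np.abs(np.fft.rfft(x, axis=1))


def pca_2d(z):
    """Project rows of z onto their first two principal components using SVD."""
    zc = z - z.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(zc, full_matrices=False)
    return zc @ vt[:2].T


freqs = np.linspace(1.0, 32.0, 50)   # assumed range for the factor c
x = make_sinusoids(freqs)            # shape: [n_series, length]
points = pca_2d(placeholder_embed(x))  # shape: [n_series, 2], ready to scatter-plot
```

Coloring each point in `points` by its value of `c` gives the kind of plot described; with real MOMENT embeddings in place of `placeholder_embed`, the rest of the pipeline stays the same.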
MOMENT Learns Meaningful Representation of Data
PCA visualizations of representations learned by MOMENT on the ECG5000 dataset in UCR Classification Archive. Here different colors represent different classes. Even without dataset-specific fine-tuning, MOMENT learns distinct representations for different classes.
Architecture in a Nutshell
A time series is broken into disjoint fixed-length sub-sequences called patches, and each patch is mapped into a D-dimensional patch embedding. During pre-training, we mask patches uniformly at random by replacing their patch embeddings with a special mask embedding [MASK]. The goal of pre-training is to learn patch embeddings which can be used to reconstruct the input time series using a light-weight reconstruction head.
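To make the patching and masking steps concrete, here is a small pure-Python sketch. It is only an illustration under stated assumptions: the patch length, embedding size, masking ratio, random linear projection, and all-zero mask embedding are placeholder choices, and the function names are hypothetical rather than part of `momentfm`.

```python
import random


def make_patches(series, patch_len):
    """Split a series into disjoint fixed-length patches."""
    if len(series) % patch_len != 0:
        raise ValueError("series length must be a multiple of patch_len")
    return [series[i:i + patch_len] for i in range(0, len(series), patch_len)]


def embed_patch(patch, weights):
    """Map one patch to a D-dimensional vector with a linear projection."""
    return [sum(w * v for w, v in zip(row, patch)) for row in weights]


def mask_embeddings(embeddings, mask_embedding, mask_ratio, rng):
    """Replace a uniformly random subset of patch embeddings with the mask embedding."""
    n_masked = round(len(embeddings) * mask_ratio)
    masked_idx = set(rng.sample(range(len(embeddings)), n_masked))
    masked = [mask_embedding if i in masked_idx else e for i, e in enumerate(embeddings)]
    return masked, sorted(masked_idx)


rng = random.Random(0)
patch_len, d_model = 8, 4                     # illustrative sizes (assumptions)
series = [float(t % 16) for t in range(64)]   # toy sawtooth series
weights = [[rng.uniform(-1, 1) for _ in range(patch_len)] for _ in range(d_model)]
mask_embedding = [0.0] * d_model              # fixed placeholder for [MASK]

patches = make_patches(series, patch_len)
embeddings = [embed_patch(p, weights) for p in patches]
masked, masked_idx = mask_embeddings(embeddings, mask_embedding, 0.25, rng)
```

A reconstruction head would then be trained to recover the original `series` from `masked`; that part is omitted here.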
🧑💻 Usage
Recommended Python Version: Python 3.11 (support for additional versions is expected soon).
You can install the momentfm package using pip:
pip install momentfm
Alternatively, to install the latest version directly from the GitHub repository:
pip install git+https://github.com/mononitogoswami/MOMENT.git
To load the pre-trained model for one of the tasks, use one of the following code snippets:
Forecasting
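from momentfm import MOMENTPipeline

# Load the pre-trained checkpoint and configure it for forecasting
model = MOMENTPipeline.from_pretrained(
    "AutonLab/MOMENT-1-large",
    model_kwargs={
        "task_name": "forecasting",
        "forecast_horizon": 96,  # number of future time steps to predict
    },
)
model.init()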
Classification
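from momentfm import MOMENTPipeline

# Load the pre-trained checkpoint and configure it for classification
model = MOMENTPipeline.from_pretrained(
    "AutonLab/MOMENT-1-large",
    model_kwargs={
        "task_name": "classification",
        "n_channels": 1,  # number of input channels
        "num_class": 2,   # number of target classes
    },
)
model.init()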
Anomaly Detection, Imputation, and Pre-training
from momentfm import MOMENTPipeline

# Load the pre-trained checkpoint with the reconstruction task
model = MOMENTPipeline.from_pretrained(
    "AutonLab/MOMENT-1-large",
    model_kwargs={"task_name": "reconstruction"},
)
model.init()
Representation Learning
from momentfm import MOMENTPipeline

# Load the pre-trained checkpoint to extract embeddings
model = MOMENTPipeline.from_pretrained(
    "AutonLab/MOMENT-1-large",
    model_kwargs={"task_name": "embedding"},
)
model.init()
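Before passing data to any of the models loaded above, shorter or longer series usually need to be brought to a common length. The sketch below assumes the model consumes fixed-length windows of 512 time steps and a binary input mask marking observed values; that assumption comes from the MOMENT paper, not from this README. The helper name `pad_to_window` is hypothetical and not part of `momentfm`.

```python
import numpy as np


def pad_to_window(series, window=512):
    """Left-pad a [n_channels, length] array with zeros to `window` steps.

    Returns (padded, mask), where mask is 1.0 for observed steps and 0.0 for padding.
    Series longer than `window` keep only their most recent `window` steps.
    """
    series = np.asarray(series, dtype=np.float32)
    n_channels, length = series.shape
    if length > window:
        series, length = series[:, -window:], window
    padded = np.zeros((n_channels, window), dtype=np.float32)
    mask = np.zeros(window, dtype=np.float32)
    padded[:, window - length:] = series
    mask[window - length:] = 1.0
    return padded, mask


short = np.ones((1, 100))
padded_short, mask_short = pad_to_window(short)

long = np.arange(600, dtype=np.float32).reshape(1, 600)
padded_long, mask_long = pad_to_window(long)
```

Stacking several padded series along a new first axis gives a batch of shape `[batch_size, n_channels, 512]`; check the tutorials for the exact arguments the model's forward pass expects.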
🧑🏫 Tutorials
Here is the list of tutorials and reproducible experiments to get started with MOMENT for various tasks:
All these experiments can be reproduced on a single NVIDIA A6000 GPU with 48 GiB RAM.
[!TIP]
Have more questions about using MOMENT? Check out Frequently Asked Questions, and you might find your answer!
BibTeX
@inproceedings{goswami2024moment,
title={MOMENT: A Family of Open Time-series Foundation Models},
author={Mononito Goswami and Konrad Szafer and Arjun Choudhry and Yifu Cai and Shuo Li and Artur Dubrawski},
booktitle={International Conference on Machine Learning},
year={2024}
}
⛑️ Research Code
We designed this codebase to be extremely lightweight, and in the process removed a lot of code! We released the complete but messier research code here. This includes code to handle different datasets, and scripts for pre-training, fine-tuning and evaluating MOMENT alongside other baselines. An early version of this code was available on Anonymous Github.
➕ Contributions
We encourage researchers to contribute their methods and datasets to MOMENT. We are actively working on contributing guidelines. Stay tuned for updates!
📰 Coverage
🤟 Contemporary Work
There's a lot of cool work on building time series forecasting foundation models! Here's an incomplete list. Check out Table 9 in our paper for qualitative comparisons with these studies:
- TimeGPT-1 by Nixtla, [Paper, API]
- Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting by Morgan Stanley and ServiceNow Research, [Paper, Code, Hugging Face]
- Tiny Time Mixers (TTMs): Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of Multivariate Time Series by IBM, [Paper, Hugging Face]
- Moirai: A Time Series Foundation Model for Universal Forecasting [Paper, Code, Hugging Face]
- A decoder-only foundation model for time-series forecasting by Google, [Paper, Code, Hugging Face]
- Chronos: Learning the Language of Time Series by Amazon, [Paper, Code, Hugging Face]
There's also some recent work on solving multiple time series modeling tasks in addition to forecasting:
- TOTEM: TOkenized Time Series EMbeddings for General Time Series Analysis [Paper, Code]
🪪 License
MIT License
Copyright (c) 2024 Auton Lab, Carnegie Mellon University
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
See MIT LICENSE for details.