Recent research highlights the value of Selective State Space Layers, also known as Mamba models, across language processing, image processing, medical imaging, and data analysis. These models are noted for their linear-complexity training and fast inference, which substantially increases throughput and enables efficient handling of long-range dependencies. However, challenges remain in understanding their information-flow dynamics, learning mechanisms, and interpretability, restricting their applicability, particularly in sensitive domains that require explainability.
A variety of methods have been developed to enhance explainability in deep neural networks, particularly for attention-based models in Natural Language Processing (NLP) and computer vision. Attention Rollout, for example, aggregates inter-layer pairwise attention paths into a single attribution map. Later work integrates LRP (Layer-wise Relevance Propagation) scores with attention gradients to produce class-specific relevance, and treating output token representations as the states of a Markov chain, with certain operators held constant, further improves attributions.
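For context, the sketch below is a minimal NumPy version of the standard Attention Rollout procedure; the head averaging, the identity term for the residual connection, and the row normalization are the commonly used choices and are illustrative here rather than taken from the Mamba paper.

```python
import numpy as np

def attention_rollout(attentions):
    """Attention Rollout: aggregate per-layer attention maps into a single
    token-to-token attribution matrix by following pairwise attention paths.

    attentions: list of arrays of shape (heads, tokens, tokens), one per layer.
    Returns an array of shape (tokens, tokens).
    """
    rollout = None
    for layer_attn in attentions:
        # Average over heads, then add the identity to account for the
        # residual (skip) connection around each attention block.
        attn = layer_attn.mean(axis=0) + np.eye(layer_attn.shape[-1])
        # Re-normalize rows so each row remains a distribution.
        attn = attn / attn.sum(axis=-1, keepdims=True)
        # Multiply through the layers to chain the attention paths.
        rollout = attn if rollout is None else attn @ rollout
    return rollout

# Toy usage: 2 layers, 4 heads, 5 tokens of random "attention" weights.
rng = np.random.default_rng(0)
maps = [rng.random((4, 5, 5)) for _ in range(2)]
maps = [m / m.sum(axis=-1, keepdims=True) for m in maps]
print(attention_rollout(maps).shape)  # (5, 5)
```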
In order to bridge gaps in understanding Mamba models, Tel Aviv University researchers proposed reformulating Mamba computation through a data-controlled linear operator. This reformulation exposes hidden attention matrices within the Mamba layer and allows interpretability techniques from the transformer domain to be applied to Mamba models. The approach sheds light on the fundamental nature of Mamba models, provides interpretability tools based on hidden attention matrices, and enables direct comparison between Mamba models and transformers.
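To make the reformulation concrete, the sketch below unrolls a selective state space (S6) recurrence into the causal, data-controlled linear operator described above, whose entries act as a hidden attention matrix. It assumes a single channel, a diagonal state transition, and already-discretized parameters (the names A_bar, B_bar, C and the shapes are illustrative); it is a conceptual sketch rather than the authors' implementation.

```python
import numpy as np

def hidden_attention(A_bar, B_bar, C):
    """Unroll a selective SSM (S6) recurrence into its hidden attention matrix.

    For h_t = diag(A_bar_t) h_{t-1} + B_bar_t x_t and y_t = C_t^T h_t,
    the output is y_t = sum_{s<=t} alpha[t, s] * x_s with
        alpha[t, s] = C_t^T diag(prod_{k=s+1..t} A_bar_k) B_bar_s,
    i.e. a causal (lower-triangular), data-controlled linear operator.

    Shapes (single channel, state size N, sequence length L):
        A_bar, B_bar, C: (L, N)
    Returns alpha: (L, L), lower triangular.
    """
    L, N = A_bar.shape
    alpha = np.zeros((L, L))
    for t in range(L):
        for s in range(t + 1):
            # Cumulative decay from step s+1 to t; empty product equals 1.
            decay = np.prod(A_bar[s + 1 : t + 1], axis=0)
            alpha[t, s] = np.sum(C[t] * decay * B_bar[s])
    return alpha

# Sanity check: applying alpha to the input reproduces the recurrence.
rng = np.random.default_rng(1)
L, N = 6, 4
A_bar = rng.uniform(0.5, 1.0, (L, N))
B_bar, C = rng.standard_normal((L, N)), rng.standard_normal((L, N))
x = rng.standard_normal(L)

alpha = hidden_attention(A_bar, B_bar, C)

h, y_rec = np.zeros(N), []
for t in range(L):
    h = A_bar[t] * h + B_bar[t] * x[t]
    y_rec.append(C[t] @ h)
print(np.allclose(alpha @ x, y_rec))  # True
```

The point of the sanity check is that applying the lower-triangular matrix alpha to the input reproduces the recurrence exactly, which is what justifies reading its rows as attention weights.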
The researchers restated Selective State Space (S6) layers as a form of self-attention, enabling the extraction of attention matrices. These matrices are then used to develop class-agnostic and class-specific explainability tools for Mamba models. Casting S6 layers as data-controlled linear operators also simplifies the hidden matrices, making them easier to interpret.
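As an illustration of how such tools might be assembled from the extracted matrices, the sketch below aggregates per-layer hidden attention maps into a single relevance matrix, optionally weighting them by gradients of a target-class score in the spirit of gradient-weighted attention explainability. The aggregation and weighting scheme shown here is an assumption for illustration, not the paper's exact formula.

```python
import numpy as np

def mamba_relevance(hidden_attn, grads=None):
    """Aggregate per-layer hidden attention matrices into a token relevance map.

    hidden_attn: list of (L, L) matrices extracted from each Mamba layer
                 (e.g. via the hidden_attention sketch above).
    grads:       optional list of (L, L) gradients of the target-class score
                 with respect to each hidden attention matrix; when given,
                 the maps are gradient-weighted, making the result class-specific.
    Returns an (L, L) aggregated relevance matrix.
    """
    n_tokens = hidden_attn[0].shape[0]
    relevance = np.eye(n_tokens)
    for i, attn in enumerate(hidden_attn):
        a = np.abs(attn)  # class-agnostic: magnitude of the hidden attention
        if grads is not None:
            # Class-specific: keep only positively contributing,
            # gradient-weighted entries.
            a = np.maximum(grads[i] * attn, 0.0)
        # Add the identity for the residual path and normalize rows.
        a = a + np.eye(n_tokens)
        a = a / a.sum(axis=-1, keepdims=True)
        relevance = a @ relevance
    return relevance
```

A row of the returned matrix (for example the one associated with a class token, or an average over rows) can then be reshaped to the patch grid to obtain a segmentation-style heatmap.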
Visualizations of the attention matrices indicate parallels between Mamba and Transformer models in how they capture dependencies. In perturbation tests, Mamba models perform comparably to Transformers, demonstrating similar sensitivity to the removal of relevant tokens. S6 models achieve superior pixel accuracy and mean Intersection over Union in segmentation tests, though Transformer-Attribution consistently outperforms Mamba-Attribution.
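For reference, a perturbation test of this kind can be scored roughly as in the sketch below: tokens are masked in order of attributed relevance and the resulting accuracy curve (whose area is typically reported) is tracked. The model_fn interface, the zero-masking choice, and the step count are illustrative assumptions rather than the paper's exact protocol.

```python
import numpy as np

def perturbation_curve(model_fn, inputs, labels, relevance,
                       steps=10, most_relevant=True):
    """Positive/negative perturbation test for an attribution method.

    Progressively masks tokens in order of relevance (most-relevant first for
    the positive test, least-relevant first for the negative test) and records
    accuracy at each step; the area under this curve scores the attribution.

    model_fn:  callable mapping a (batch, tokens, dim) array to class logits.
    inputs:    (batch, tokens, dim) token embeddings.
    labels:    (batch,) ground-truth class ids.
    relevance: (batch, tokens) per-token relevance scores.
    """
    order = np.argsort(-relevance if most_relevant else relevance, axis=1)
    n_tokens = inputs.shape[1]
    accuracies = []
    for step in range(steps + 1):
        k = int(n_tokens * step / steps)
        masked = inputs.copy()
        for b in range(inputs.shape[0]):
            masked[b, order[b, :k]] = 0.0  # zero out the selected tokens
        preds = model_fn(masked).argmax(axis=-1)
        accuracies.append((preds == labels).mean())
    return np.array(accuracies)  # area under this curve is the reported score
```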
The researchers concluded that Mamba layers can be reformulated as an implicit form of causal self-attention, establishing a direct link between Mamba and self-attention layers. This understanding enables the development of explainability techniques for Mamba models and deepens insight into their inner representations. These contributions offer valuable tools for evaluating Mamba models' performance, fairness, and robustness, while also paving the way for weakly supervised downstream tasks.
The research is a meaningful step toward better understanding Mamba models and their inner workings. Readers can access further details through the research paper and GitHub link provided by the researchers.