Measuring Modality Utilization in Multi-Modal Neural Networks

Published in 2023 IEEE Conference on Artificial Intelligence (CAI), 2023

Multimodal data provides information from different sensor types about the same underlying phenomenon and can enhance machine learning performance. However, neural networks trained end-to-end on all modalities tend to rely mostly on a single dominant modality. The black-box nature of neural networks makes it difficult to assess how much a network relies on each modality. This work presents a novel modality utilization metric that quantifies the network's reliance on different modalities. The proposed metric is validated on the NTIRE-21 (classification) and MCubeS (image segmentation) datasets. The modality utilization metric contributes to the explainability of multimodal neural networks and is useful for multimodal data fusion. To compute modality utilization, the samples of a modality M_i are permuted (shuffled) across the dataset, which breaks the association between the input modality M_i and the output label Y.
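
The permutation step described in the abstract can be illustrated with a minimal sketch. This is not the paper's exact metric, whose formula is not given on this page: it only measures the accuracy drop when one modality is shuffled across samples. The function name `permutation_drops`, the toy data, and `toy_predict` are hypothetical and chosen for illustration.

```python
import numpy as np


def permutation_drops(predict, modalities, labels, seed=0):
    """Accuracy drop when each modality is shuffled across samples.

    Assumption: `predict` takes a list of per-modality arrays (first axis is
    the sample axis) and returns one predicted label per sample.
    """
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(modalities) == labels)
    drops = []
    for i in range(len(modalities)):
        # Shuffle only modality i, so it no longer lines up with the labels
        # (or with the other modalities) of the same sample.
        perm = rng.permutation(len(labels))
        shuffled = list(modalities)
        shuffled[i] = modalities[i][perm]
        accuracy = np.mean(predict(shuffled) == labels)
        drops.append(baseline - accuracy)
    return baseline, drops


# Toy data (hypothetical): modality 0 carries the label, modality 1 is noise.
data_rng = np.random.default_rng(1)
labels = data_rng.integers(0, 2, size=1000)
m0 = labels[:, None] + 0.1 * data_rng.normal(size=(1000, 1))
m1 = data_rng.normal(size=(1000, 1))


def toy_predict(mods):
    # Stand-in for a trained network that relies on modality 0 only.
    return (mods[0][:, 0] > 0.5).astype(int)


baseline, drops = permutation_drops(toy_predict, [m0, m1], labels)
print("baseline accuracy:", baseline)
print("accuracy drop per modality:", drops)
```

In this toy setup the drop is large for the modality the model depends on and zero for the one it ignores, which is the kind of reliance signal that a permutation-based measure exposes.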

Recommended citation: S. Singh, P. P. Markopoulos, E. Saber, J. D. Lew and J. Heard, "Measuring Modality Utilization in Multi-Modal Neural Networks," 2023 IEEE Conference on Artificial Intelligence (CAI), Santa Clara, CA, USA, 2023, pp. 11-14.
Download Paper | Download Slides