An Explainable Multiclass Alzheimer’s Disease Classification Method using Vision Transformers
Abstract
This article presents a novel model for Alzheimer's Disease (AD) classification that combines Vision Transformers (ViTs) with Explainable AI (XAI) to maximize accuracy, interpretability, and clinical usability in AD diagnostics. The proposed ViT-based model was evaluated on data from the Alzheimer's Disease Neuroimaging Initiative (ADNI), with patients grouped into four categories: Healthy Control (HC), Early Mild Cognitive Impairment (EMCI), Late Mild Cognitive Impairment (LMCI), and Alzheimer's Dementia (AD). The model achieves an overall accuracy of 90%, a precision of 0.89, a recall of 0.91, and an F1 score of 0.90, outperforming traditional Convolutional Neural Networks (CNNs) such as ResNet (85% accuracy, 0.82 F1 score), DenseNet (83% accuracy, 0.81 F1 score), and VGG16 (80% accuracy, 0.79 F1 score). The ViT model was most effective at identifying the HC (95% accuracy, 0.94 precision, 0.92 recall, 0.93 F1 score) and AD (90% accuracy) classes, with slightly weaker performance on the EMCI (80% accuracy, 0.82 precision, 0.78 recall, 0.80 F1 score) and LMCI (85% accuracy, 0.88 precision, 0.87 recall, 0.87 F1 score) stages. The ViT's advantage can be attributed to its ability to model long-range dependencies across brain scans, in contrast to conventional CNNs, which are limited to local receptive fields. In addition, XAI techniques such as Grad-CAM and LIME provide understandable and transparent predictions that enhance clinical confidence and decision-making. These findings suggest that the model could support early and accurate AD diagnosis with improved interpretability.
Keywords: Alzheimer's Disease, Vision Transformer, Explainable AI, Multi-Class Classification, Neuroimaging


