Human-Machine Interaction (HMI) combined with Explainable Deep Learning (xDL) can be a powerful approach for the diagnosis of Alzheimer’s disease in patients. Alzheimer’s disease is a complex neurodegenerative disorder that affects cognitive function, and early diagnosis is crucial for better management and treatment.

Explainable Deep Learning refers to the use of deep learning models that not only make predictions but also provide transparent explanations for their decisions. This is particularly important in medical applications where the interpretability of the model’s decisions is critical to gain trust from healthcare professionals and patients.


Alzheimer’s disease is a degenerative neurological condition, the most prevalent cause of dementia, characterized by progressive cognitive decline, memory impairment, and behavioral alterations. The disease is thought to be linked to the buildup of beta-amyloid plaques and tau protein tangles in the brain, leading to the degeneration of nerve cells. Diagnosis involves a thorough assessment, including cognitive evaluations and brain imaging, to differentiate Alzheimer’s from other forms of dementia. Although a cure remains elusive, current treatments like cholinesterase inhibitors and memantine can assist in symptom management and slowing disease progression. Ongoing research aims to uncover the disease’s underlying mechanisms and develop potential therapies that can modify its course. Providing care for Alzheimer’s patients poses significant challenges for caregivers, emphasizing the importance of support and education in patient care.

Whole slide images (WSI)

Whole Slide Imaging (WSI), also referred to as virtual microscopy or digital neuropathology, is a cutting-edge technology transforming the field of neuropathology. WSI involves digitizing entire glass slides containing tissue samples using high-resolution scanning devices. The resulting digital images represent the entire slide at microscopic levels, preserving the intricate details of the tissue specimens.

With WSI, neuropathologists can access and review these digitized slides on a computer or other digital display. This technology offers numerous advantages over traditional microscopy, such as remote access to slides, easy sharing of cases for consultations, and the ability to annotate and measure regions of interest digitally. Additionally, the digital format allows for computer-aided image analysis and computational neuropathology techniques, potentially enhancing diagnostic accuracy and efficiency.

WSI has the potential to revolutionize neuropathology practices by streamlining workflow, facilitating teleneuropathology for remote consultations, and contributing to large-scale research endeavors through the creation of vast digital neuropathology archives. Moreover, it can serve as an invaluable educational tool, enabling trainees and students to learn and practice with digitized slides.

However, the adoption of WSI in routine neuropathology practice requires careful consideration of factors such as data storage, integration with laboratory information systems, standardization, and regulatory compliance. Ensuring data security and privacy is also of paramount importance when handling patient information in a digital format.

As WSI technology continues to evolve, its integration into routine clinical practice is becoming more widespread, holding the potential to improve diagnostic accuracy, efficiency, and collaboration among pathologists worldwide.

This figure shows the process of whole slide imaging and magnification

Digital Representation of Images

Digital representation of images refers to the conversion of visual information from the analog domain into a digital format, allowing images to be stored, manipulated, and transmitted using binary code. In this process, images are divided into a grid of pixels, with each pixel assigned a numerical value representing its color and intensity. The level of detail and accuracy in the digital representation is determined by the resolution of the image, which is defined by the number of pixels per unit area. Digital image formats, such as NDPI, enable efficient compression and storage of images while maintaining visual quality. This digitization of images has played a pivotal role in various fields, including photography, medical imaging, computer vision, and multimedia applications.
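As a minimal illustration of this idea (not part of the thesis pipeline), the sketch below represents a tiny grayscale image as a grid of pixel intensities and computes its raw storage size; the values and dimensions are hypothetical:

```python
# A grayscale image as a grid of pixels: each value (0-255) encodes intensity,
# where 0 = black and 255 = white. A color image would store three such
# values per pixel (one each for red, green, and blue).
image = [
    [0,   64,  128],
    [64,  128, 192],
    [128, 192, 255],
]

width = len(image[0])
height = len(image)
bits_per_pixel = 8                              # 256 intensity levels fit in one byte
raw_size_bytes = width * height * bits_per_pixel // 8

print(width, height, raw_size_bytes)
```

Formats such as NDPI then compress this raw pixel grid so that gigapixel whole-slide images remain practical to store and transmit.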

This figure shows the process of image digitization

Convolutional Neural Network (CNN)

A Convolutional Neural Network (CNN) is a specialized deep learning model used for image recognition and computer vision tasks. It applies filters to images to learn and extract features, uses activation functions for non-linearity, and pooling layers for downsampling. After several layers, it makes high-level decisions through fully connected layers. CNNs are powerful in recognizing complex patterns in images and have widespread applications in computer vision tasks. Pre-trained CNN models are commonly used as a starting point for various image-related tasks.
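The three core operations named above (filtering, activation, pooling) can be sketched in plain Python; this is an illustrative toy example with a hypothetical edge-detection kernel, not the architecture used later in this work:

```python
def convolve2d(image, kernel):
    """Slide a filter (kernel) over the image and sum element-wise products (valid padding)."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            out[i][j] = sum(image[i + m][j + n] * kernel[m][n]
                            for m in range(kh) for n in range(kw))
    return out

# A vertical-edge filter responds strongly where intensity changes left to right.
image = [
    [255, 255, 0, 0],
    [255, 255, 0, 0],
    [255, 255, 0, 0],
    [255, 255, 0, 0],
]
edge_kernel = [[1, 0, -1],
               [1, 0, -1],
               [1, 0, -1]]

feature_map = convolve2d(image, edge_kernel)
activated = [[max(v, 0) for v in row] for row in feature_map]   # ReLU non-linearity
pooled = max(max(row) for row in activated)                     # max pooling (downsampling)
```

In a real CNN the kernel weights are learned during training rather than hand-crafted, and many such filter/activation/pooling stages are stacked before the fully connected layers.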

This figure shows the process of CNN

Vision transformers (ViTs)

Vision transformers (ViTs) are widely used in image recognition tasks such as object detection, image segmentation, and image classification. In ViTs, images are represented as sequences of patches from which class labels are predicted, allowing the model to learn image structure independently. Input images are processed as a series of patches, with each patch flattened into a single vector by concatenating the channels of all pixels in the patch and then linearly projecting it to the appropriate input dimension.

The ViT architecture first splits an image into patches, flattens the patches, produces lower-dimensional linear embeddings from the flattened patches, and adds positional embeddings. The resulting sequence is fed as input to a standard transformer encoder; the model is pre-trained with image labels (fully supervised on a huge dataset) and finally fine-tuned on the downstream dataset for image classification.
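The first two steps above (splitting into patches and flattening) can be sketched as follows; the 4x4 single-channel image and 2x2 patch size are hypothetical, chosen only to keep the example readable:

```python
def image_to_patches(image, patch_size):
    """Split a 2-D image (list of rows) into flattened, non-overlapping patches."""
    patches = []
    for top in range(0, len(image), patch_size):
        for left in range(0, len(image[0]), patch_size):
            patch = [image[top + i][left + j]
                     for i in range(patch_size) for j in range(patch_size)]
            patches.append(patch)
    return patches

# A 4x4 single-channel image split into four 2x2 patches -> four 4-element vectors.
image = [
    [ 1,  2,  3,  4],
    [ 5,  6,  7,  8],
    [ 9, 10, 11, 12],
    [13, 14, 15, 16],
]
patches = image_to_patches(image, 2)
# Each flattened patch would next be linearly projected to the model dimension,
# and a positional embedding added, before entering the transformer encoder.
```

For a multi-channel image, the channel values of each pixel would also be concatenated into the patch vector before projection.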

Ground truth data (GTD)

Ground truth data refers to the authentic, accurate, and reliable data that serves as a reference or benchmark for evaluating the performance and accuracy of a particular system, model, algorithm, or process. In various fields such as machine learning, computer vision, natural language processing, and data analysis, ground truth data is crucial for training, validating, and testing models or algorithms.

This table describes the annotation classes per file, together with the total number of annotations:

Project | File name | No. of annotations | Type of annotation


To ensure accuracy down to the last pixel, the HSA KIT program provides a straightforward user interface and high-end professional annotation tools. This can improve the whole analytic process by standardizing techniques, preserving consistency, and fostering repeatability.

The program focuses on object identification and recognition, both of which are essential components of routine image processing. Object-identification algorithms employ image analysis to detect and categorize objects of interest, yielding usable data for further research and decision-making.

This figure shows the Vacuoles with high opacity.


Cost and loss function

Cost and loss functions are crucial in deep learning, as they drive network training. These functions assess how accurately the model's predictions match the data, guiding the optimization process to improve the model's performance by adjusting its parameters. Although "cost function" and "loss function" are often used interchangeably, there is a distinction between them: the loss function calculates the error for a single training example, whereas the cost function represents the average of the loss over all training examples.
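The distinction can be made concrete with a short sketch; the squared-error loss and the sample values are illustrative choices, not the functions used in this work:

```python
def squared_error_loss(y_true, y_pred):
    """Loss: the error for a single training example."""
    return (y_true - y_pred) ** 2

def cost(y_true_all, y_pred_all):
    """Cost: the average of the loss over the whole training set."""
    losses = [squared_error_loss(t, p) for t, p in zip(y_true_all, y_pred_all)]
    return sum(losses) / len(losses)

# Hypothetical targets and model predictions for four training examples.
targets = [1.0, 0.0, 1.0, 1.0]
predictions = [0.9, 0.2, 0.8, 0.6]

per_example = squared_error_loss(targets[0], predictions[0])   # loss for one example
overall = cost(targets, predictions)                           # mean loss over all examples
```

During training, the optimizer adjusts the model's parameters in the direction that reduces this cost.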


In the field of deep learning, metrics play a crucial role in quantitatively assessing the performance of a model. These metrics are essential for gaining insight into the model's effectiveness, identifying areas of weakness, and finding ways to enhance it. Here is a brief overview of the basic quantities from which commonly used evaluation metrics are built.

  1. True Positive (TP): instances where the classification model correctly predicts the positive class.
  2. False Negative (FN): instances where the model predicts a negative outcome but the actual outcome is positive. These are also known as Type II errors.
  3. False Positive (FP): instances where the model predicts a positive outcome but the actual outcome is negative. These are also known as Type I errors.
  4. True Negative (TN): instances where the model correctly predicts the negative class.
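The four quantities above can be counted directly from paired label lists; the labels here are hypothetical (1 = vacuole present, 0 = absent), chosen only for illustration:

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FN, FP, TN from true and predicted class labels."""
    tp = fn = fp = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1          # correct positive prediction
        elif t == positive and p != positive:
            fn += 1          # Type II error: a real positive was missed
        elif t != positive and p == positive:
            fp += 1          # Type I error: a false alarm
        else:
            tn += 1          # correct negative prediction
    return tp, fn, fp, tn

# Hypothetical ground-truth labels and model predictions for six objects.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
tp, fn, fp, tn = confusion_counts(y_true, y_pred)
```

Metrics such as precision (TP / (TP + FP)) and recall (TP / (TP + FN)) are then derived from these counts.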

Mean Average Precision (mAP)

Mean Average Precision (mAP) is a composite metric calculated as the mean of the Average Precision (AP) over different classes or queries. It provides a single-figure measure of the quality of a model’s predictions across multiple classes, making it easier to compare different models or algorithms.


The calculation of mAP involves several steps:

  1. Precision-Recall Curve: For each class, a precision-recall curve is plotted.
  2. Average Precision (AP): The area under the precision-recall curve is calculated for each class.
  3. Mean Average Precision: The AP values for all classes are then averaged to get the mAP.
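The three steps above can be sketched with a simple interpolation-free AP, where precision is accumulated at each recall point (each correct detection in the confidence-ranked list); the two classes and their rankings are hypothetical:

```python
def average_precision(ranked_relevance):
    """AP for one class: detections sorted by confidence, 1 = correct, 0 = wrong.
    Precision is sampled each time recall increases (each correct detection)."""
    total_positives = sum(ranked_relevance)
    if total_positives == 0:
        return 0.0
    tp = 0
    precision_sum = 0.0
    for k, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            tp += 1
            precision_sum += tp / k    # precision at this point on the P-R curve
    return precision_sum / total_positives

def mean_average_precision(per_class_rankings):
    """mAP: the mean of the per-class AP values."""
    aps = [average_precision(r) for r in per_class_rankings]
    return sum(aps) / len(aps)

# Two hypothetical classes with confidence-ranked detections (1 = correct match).
rankings = [
    [1, 0, 1, 1],   # AP = (1/1 + 2/3 + 3/4) / 3
    [0, 1, 1, 0],   # AP = (1/2 + 2/3) / 2
]
map_score = mean_average_precision(rankings)
```

Benchmark implementations (e.g. COCO-style evaluation) additionally interpolate the precision-recall curve and average over IoU thresholds, but the structure of the computation is the same.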

Creation of ground-truth data

The samples utilized in this experiment were prepared by a client of HS Analysis GmbH. They were obtained from the brains of deceased persons by slicing the tissue, making slides, and labeling the cells. Common methods include Hematoxylin and Eosin (H&E) for general tissue architecture, Immunohistochemistry (IHC) for specific prion protein detection, and Periodic Acid-Schiff (PAS) for identifying amyloid plaques; in this case, vacuoles served as our annotation target. The slides were then scanned to create digital NDPI files, which were provided to the firm to develop a DL model. To create the GTD, we upload the NDPI files to the HSA KIT software, select the area of interest using the ROI tool, and finally annotate the vacuoles within the targeted ROI region.

Two models were trained once the GTD was created (the results are discussed in Chapter 4): one using ViT and the other using Mask R-CNN, both of which perform instance segmentation. Both models were trained on the same amount of GTD, namely 12,913 annotations. Table 1 shows the GTD and the type of annotation that was used:

File name | Amount of GTD | Type of annotations
Prion Vacuoles | 12,913 | Vacuoles

The amount of data used in this work

Selection of the data set

After creation of the GTD, the settings in Table 2 were used to train a model with each of the two architectures.

Model type | Epochs | Learning rate | Batch size | Tile size
Instance Segmentation | 100 | 0.0001 | 1 | 1024

Settings for the trained models

Workflow with HSA KIT

Analyzing samples and digitizing slides has never been simpler. HSA KIT offers an unparalleled experience to consumers who wish to keep up with today's "better" alternatives and improve workflow efficiency. The HSA team goes above and beyond to satisfy its clients, from software installation and integration through ongoing support and upgrades.

HSA KIT-based AI analysis would provide:

  • A standardized procedure with objective rather than subjective evaluation
  • Extraction of significant characteristics from raw data and creation of meaningful representations for AI model training
  • Module selection and configuration with little code
  • Software that is simple to use: Train, Annotate, and Automate
  • Rapid and efficient examination of various medical images, reducing the time required for diagnosis or treatment
  • Automatically generated reports

The HSA SCANNER with the HSA SCAN software

For more information or to place an order, contact:

Note: This website will be updated in the future.