Diagnosing Endometrial Diseases from Tissue Images using AI

Discover how Artificial Intelligence can help diagnose endometrial diseases from tissue images

Thu 17 Jun 2021

The Problem

A terrible disease

Endometrial cancer is the sixth most commonly occurring cancer in women and the 15th most common cancer worldwide. In 2018, over 380,000 new cases were reported, and in 2017 the cancer was responsible for nearly 90,000 deaths worldwide. The malignant tumor usually arises from the layer of cells that forms the lining of the uterus, and it is the most common cancer of the female reproductive system.

Why early detection is important

As the disease grows, it spreads through the uterus and into nearby tissue. In the latest stages, the cancer can reach neighboring organs, such as the bladder. From that point on, the chances of recovery are very slim. If the disease is detected at an early stage, the patient has a high chance of surviving, with a 5-year survival rate estimated at 80%. Thus, early diagnosis and treatment are crucial to protect women’s lives and fertility.

How do we detect it

The diagnosis of this disease is often made by histopathologists, who analyze biopsies from the uterus. In the most common workflow, these biopsies are placed on a glass slide, stained using specialized chemicals, and finally scanned using one of a variety of so-called “whole slide scanners” that produce very high-resolution images of the sampled tissue. This exam also allows the detection of other conditions with malignancy potential, such as endometriosis or hyperplasia.

Detection is very hard, even for human experts

Making a diagnosis from the analysis of tissue samples is a complex task, carried out by pathologists who have gone through long and costly training. With biopsies becoming cheaper and less invasive, the practice is now very common, especially in gynecology.

As a consequence, these exams now represent a large proportion of the histopathologist’s workload. The difficulty and repetitiveness of the task can induce fatigue or complacency, which can cause experts to miss important features on certain slides. Several studies have confirmed a lack of consistency in experts’ diagnoses, for example:

The same expert might give a different diagnosis for the same slide if given at different times;
The same slide might be given a different diagnosis when analyzed by different pathologists.

How can AI help

Technologies relying on Artificial Intelligence (AI) and, more specifically, deep learning have been achieving strong performance on medical tasks thanks to their ability to identify very subtle patterns and features in image data. Our previous research in the area has focused on developing solutions for blood cell analysis and male fertility diagnosis. Because of the promising results in a variety of fields, we believe that the diagnosis of uterine cancer using deep learning algorithms could significantly help both medical experts and patients.

Challenges

However, creating AI-based solutions that can help with the diagnosis of uterine cancer comes with many complexities that need to be taken into account.

A disease with many shapes

Cancer is a complex disease that takes different shapes and is hard for non-experts to recognize accurately from tissue samples. The correct diagnosis of uterine cancer and other uterine diseases is made even harder by the fact that tissues can show significantly different patterns depending on the phase of the menstrual cycle.

Many ways to prepare samples

Because of optical differences in the scanners, the use of different microscopic technologies, and differences in staining chemicals, samples taken from the same patient can look vastly different.

Finding data that captures all this variability

A deep learning model needs data to learn. Creating data that captures the variability mentioned above is a challenging task that ideally requires two to three experts to label hundreds of samples from different patients and cross-validate these labels to have a reliable ground truth.
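
One simple way to cross-validate labels from several experts is a majority vote that flags slides without sufficient agreement for another review. The sketch below is only an illustration under assumptions: the function name, the slide identifiers, the label strings, and the `min_agreement` parameter are hypothetical and do not describe the actual labeling protocol.

```python
from collections import Counter


def consensus_label(expert_labels, min_agreement=2):
    """Return the most frequent label, or None if too few experts agree.

    With three experts and min_agreement=2 the result is unambiguous;
    other panel sizes may need an explicit tie-breaking rule.
    """
    label, votes = Counter(expert_labels).most_common(1)[0]
    return label if votes >= min_agreement else None


# Hypothetical annotations: three expert labels per slide.
annotations = {
    "slide_01": ["normal", "normal", "hyperplasia"],
    "slide_02": ["adenocarcinoma", "adenocarcinoma", "adenocarcinoma"],
    "slide_03": ["polyp", "normal", "hyperplasia"],
}

# Slides with a consensus become ground truth; the rest go back for review.
ground_truth = {slide: consensus_label(labels) for slide, labels in annotations.items()}
needs_review = [slide for slide, label in ground_truth.items() if label is None]
```

Here `slide_03` receives no consensus label, so it would be sent back to the experts rather than used for training.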

Locating region of interest on whole slide images

To detect uterine cancer in its early stages, it is crucial that the area of the uterus sampled for biopsy contains tissue affected by cancer. Moreover, cancerous tissue is often present only in patches of the sample rather than across all of it. A full search of the complete biopsied tissue is therefore essential, and this task requires both sustained attention and time from the experts involved.
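
For an algorithm, a full search of a whole slide image is commonly done by cutting the image into fixed-size patches and examining each one. The sketch below only computes the patch coordinates; the patch size, the stride, and the function name are assumptions made for illustration, not a description of our pipeline.

```python
def patch_origins(slide_width, slide_height, patch_size=512, stride=512):
    """Yield (x, y) top-left corners of patches that cover the whole slide."""

    def axis_positions(length):
        # A slide smaller than one patch gets a single patch at the origin
        # (the caller would need to pad it).
        if length <= patch_size:
            return [0]
        positions = list(range(0, length - patch_size + 1, stride))
        # Add a final patch flush with the edge so no tissue is skipped.
        if positions[-1] != length - patch_size:
            positions.append(length - patch_size)
        return positions

    for y in axis_positions(slide_height):
        for x in axis_positions(slide_width):
            yield (x, y)


# A hypothetical 1100 x 512 pixel region needs three overlapping patches.
origins = list(patch_origins(1100, 512))
```

Each patch can then be classified separately, so that a small cancerous area is not diluted by the healthy tissue around it.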

Stakes are very high

A wrong diagnosis of this disease can lead, at best, to long, unnecessary, and extremely invasive treatment (in the case of a false positive) and, at worst, to death (in the case of a false negative). As such, getting the diagnosis right is literally a matter of life or death.

Our solution

An AI that can detect different types of diseases

Kantify has built an AI-powered algorithm that focuses on the detection of three different uterine pathologies:

Uterine polyps: Unusual growths of the endometrium that are in most cases benign. However, on rare occasions they can lead to cancer and be a cause of infertility.
Hyperplasia: The endometrium grows unusually thick, which can in certain cases lead to cancer.
Uterine adenocarcinoma: Cancer that arises from the lining of the uterus.

An AI algorithm for more informed decisions

This algorithm has been designed to help medical professionals make better-informed decisions in less time. Our AI algorithm brings a valuable and explainable second opinion to the histopathologists in a laboratory, advising them in order to reduce the risk of a wrong diagnosis.

Can be mounted on every device in a lab

Our solution can be applied to whole-slide images and can be deployed in the cloud or embedded directly into medical devices, facilitating the work of pathologists through high availability and state-of-the-art support.

Our results

Metrics: how can we compare with experts

To measure how well our AI performed, we compare its diagnosis with the diagnosis from three experts to generate an accuracy metric. We then pay particular attention to the evaluation of the samples where our algorithm disagrees with expert opinion, in order to understand where the model still makes mistakes and focus our efforts there.

In general, our model agrees with the experts’ opinion in 84% of cases, and in 90% of the cases where the expert diagnosis was cancer.
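
The two figures above are agreement rates: one over all slides, and one restricted to the slides that the experts diagnosed as cancer. A minimal sketch of how such rates can be computed is shown below; the function name and the toy labels are hypothetical and are not our evaluation data.

```python
def agreement_rate(model_labels, expert_labels, only_class=None):
    """Share of slides where the model matches the expert diagnosis.

    If only_class is given, only slides whose expert diagnosis equals
    that class are counted.
    """
    pairs = [
        (model, expert)
        for model, expert in zip(model_labels, expert_labels)
        if only_class is None or expert == only_class
    ]
    if not pairs:
        raise ValueError("no samples to evaluate")
    return sum(model == expert for model, expert in pairs) / len(pairs)


# Toy example with five slides (illustrative labels only).
expert = ["normal", "adenocarcinoma", "hyperplasia", "adenocarcinoma", "polyp"]
model = ["normal", "adenocarcinoma", "normal", "adenocarcinoma", "polyp"]

overall = agreement_rate(model, expert)
on_cancer = agreement_rate(model, expert, only_class="adenocarcinoma")
```

In this toy example the model agrees on four of five slides overall and on both slides diagnosed as cancer.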

How do these results relate to the real-world problem

While the results are promising, the level of disagreement underlines the fact that the AI faces the same challenges as histopathologists when it comes to identifying features that are extremely difficult to differentiate from others. This explains the model’s main areas of confusion, for example:

Uterine cancer/hyperplasia: hyperplasia can take four different forms (simple or complex, each with or without atypia). Complex hyperplasia with atypia has a significantly higher chance of developing into cancer (between 25% and 40%).
Normal/hyperplasia: hyperplasia can be confused with benign mimics.
Normal/endometrial polyp: polypoid features can closely resemble non-polypoid features of a normal endometrium.

Increasing the AI’s reliability

Knowing that such confusions exist, and with the stakes in mind, we wanted our model to explain itself and to raise a warning when it is unsure of its decision. During our research, we confirmed that the model is more likely to make a mistake when its confidence level is low. We therefore added a feature that allows the model to abstain from making a prediction whenever its confidence level is below a human-set threshold.

The figure above shows that the AI can improve its reliability at the cost of not being able to make predictions on every slide. Whenever the model abstains, the expert is warned that the studied sample may contain features that induce confusion and is provided with the two most likely conditions with the associated confidence level.
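
The abstention behavior described above can be sketched as follows. This is a minimal illustration under assumptions: the function name, the output format, and the example confidence values are hypothetical, and the default threshold of 0.7 is only one possible human-set value.

```python
def predict_or_abstain(probabilities, threshold=0.7):
    """Return a prediction, or abstain and report the two likeliest conditions.

    probabilities: dict mapping each condition to the model's confidence.
    """
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    best_label, best_confidence = ranked[0]
    if best_confidence >= threshold:
        return {"abstained": False, "prediction": best_label, "confidence": best_confidence}
    # Below the threshold: warn the expert and show the top two candidates.
    return {"abstained": True, "candidates": ranked[:2]}


confident = predict_or_abstain(
    {"normal": 0.05, "polyp": 0.05, "hyperplasia": 0.10, "adenocarcinoma": 0.80}
)
unsure = predict_or_abstain(
    {"normal": 0.45, "hyperplasia": 0.40, "polyp": 0.10, "adenocarcinoma": 0.05}
)
```

In the second call no condition reaches the threshold, so the result flags the slide and lists normal and hyperplasia, with their confidence levels, for the expert to weigh.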

If, for example, an expert needs the model to make correct predictions at least 90% of the time, they must select a probability threshold of 0.7, which implies that the model will discard 19% of its predictions.

This operation significantly improves the model’s reliability, as 49% of the wrong diagnoses are discarded at the cost of 13% of the correct diagnoses.
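
These figures can be checked against each other with a short calculation. The sketch below assumes that the 49% and 13% apply to the baseline in which the model agrees with experts on 84% of slides; that link is our reading of the numbers rather than something measured separately.

```python
base_accuracy = 0.84      # agreement with experts before abstention
wrong_discarded = 0.49    # share of wrong diagnoses the model abstains on
correct_discarded = 0.13  # share of correct diagnoses the model abstains on

# Fractions of all slides that still receive a prediction.
kept_correct = base_accuracy * (1 - correct_discarded)
kept_wrong = (1 - base_accuracy) * (1 - wrong_discarded)

coverage = kept_correct + kept_wrong
abstention_rate = 1 - coverage
accuracy_when_answering = kept_correct / coverage

print(f"abstention rate: {abstention_rate:.1%}")
print(f"accuracy on answered slides: {accuracy_when_answering:.1%}")
```

Under that assumption, the model abstains on roughly 19% of slides and is right on about 90% of the slides it does answer, which matches the threshold example above.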

To provide even more explainability, the AI is also capable of highlighting on the image the different features that have helped it make its decision.

Improving through collaboration

Thanks to the use of state-of-the-art deep learning techniques, the AI was able to make very significant progress compared to what has been achieved before, with an increase of 7% in overall accuracy. Even with the help of model abstention, we believe that the AI needs to mature further before it is ready to be deployed in production. We have been working with a small amount of low-resolution data and have been relying on published content to interpret the results.

By combining cutting-edge technology, high-quality data, and an expert’s insight, our solution could achieve unprecedented results and consistently reach the prediction quality of an expert. Moreover, this technology has the potential to work with any histopathological data and therefore be applied to many other diseases. If you are interested in collaborating, let’s get in touch!