Case Study: AI & phenotypic assays for target identification

Discover how Kantify and I-Stem have combined artificial intelligence and phenotypic assays to perform unbiased target identification in a short timeframe

The context

The cost and risk of drug development in rare diseases

Identifying safe and effective drugs for a disease is a long, challenging and costly process. For most diseases, the cost of developing a successful drug, which can range from one to three billion dollars, is largely offset by the potential financial returns: blockbuster drugs can generate tens or hundreds of billions of dollars in revenue.

Sadly, this financial incentive breaks down for rare diseases (mostly defined as diseases that affect fewer than 1 in 2000 people). While the cost and risk of developing a drug for a rare disease are largely equivalent to those for a non-rare disease, the returns are usually much smaller, because only a limited number of patients could benefit from the drug. As a result, rare diseases have mostly been ignored by drug developers. This is a real issue, as there are over 7000 rare diseases, affecting over 300 million people worldwide, many with life-threatening or debilitating effects. While regulatory efforts are being made to make the development of these drugs more attractive, in the form of Orphan Drug designations, it is also urgent to develop techniques and tools that reduce the time, cost and risk involved in finding safe and effective drugs for rare diseases.

The time, cost and risk of drug development can be reduced in two key areas: first, the “discovery and preclinical development” phase, where a drug is discovered and its safety and efficacy are tested on biological or animal models, and second, the “clinical validation” phase, where the safety and efficacy of a drug are validated in humans. Around half of the cost of developing a drug is linked to the discovery and preclinical development stage.

A core challenge

One of the steps of drug discovery and preclinical development is the identification, through drug screening, of promising compounds (molecules) for a specific disease. Two approaches exist: phenotypic drug screening and target-based screening.

Phenotypic drug screening, in contrast with target-based drug screening, is a solid, unbiased way to assess whether a compound acts on a disease phenotype. Phenotypic screening can be based on cell-based assays, animal models, organs on a chip, etc.

Once a promising small molecule (i.e. one that hits on a disease phenotype) is discovered, a complex challenge follows: identifying the relevant target(s) and Mechanism of Action (MoA) of the successful compound.

This process is lengthy and costly, commonly taking 1 to 3 years, which delays drug discovery and regulatory approval. This is the challenge we tackle with our Artificial Intelligence algorithm Zepto.Target. By combining phenotypic drug screenings with Artificial Intelligence, Zepto.Target “deconvolutes” results from phenotypic drug screenings to identify likely targets and related Mechanisms of Action (MoA).

This case study presents how we combine two state-of-the-art technologies, Artificial Intelligence and phenotypic assays, to accelerate the confirmation of targets and the discovery of promising novel targets. Combined, the two technologies enable the discovery of promising compounds, the unbiased identification of targets, and the generation of novel intellectual property.

The case

Project partners

It is in this context that I-Stem and our team at Kantify decided to define a pilot case using AI and stem cells to accelerate the discovery and preclinical development stage for Limb-Girdle Muscular Dystrophy, a rare neuromuscular disease.

  • I-Stem, the Institute of Stem Cell Therapy and Exploration of Monogenic Diseases, is the largest French laboratory for research and development dedicated to human pluripotent stem cells. I-Stem is part of the Biotherapy Institute for Rare Diseases, with the Institute of Myology and Généthon, funded by AFM-Telethon, the largest patient organization in the field of neuromuscular diseases. As such, discovery of targets is a challenge that is particularly important to I-Stem. In addition, I-Stem has unique expertise in developing induced pluripotent stem cells (iPSCs) and using them in high-throughput phenotypic assays.

  • Kantify is a Belgian startup that focuses on developing novel Artificial Intelligence technologies to accelerate each step of drug discovery. Kantify aims to bridge the gap between Artificial Intelligence and drug discovery. To do so, Kantify has developed Zeptomics, an in silico drug discovery technology that accelerates the discovery of small molecules through target identification, hit discovery, ADMET prediction and more. Zepto.Target, the focus of this article, is one of the key pillars of Zeptomics and can be used independently of the other Zeptomics applications.

Objective of the project

By working together, I-Stem and Kantify wanted to demonstrate how the combination of phenotypic assays with Kantify’s Artificial Intelligence technology Zepto.Target could lead to the fast identification of known or unknown targets for a disease with extremely low prevalence.

More specifically, the partners set the following objectives:

  • Identify new therapeutic targets
  • Identify the associated biological pathways
  • Evaluate the potential to pre-select safe and efficient drugs interacting with these newly discovered targets
  • Tackle these objectives in a short timeframe

The disease

A particular sub-type of limb-girdle muscular dystrophy (LGMD2D or LGMDR3) was selected as the disease of interest for this first case. LGMDs are a large group of genetic muscular dystrophies causing progressive weakness and wasting of the shoulder and pelvic girdle muscles. Limb-girdle muscular dystrophy R3 (LGMDR3) is caused by mutations in the SGCA gene coding for α-sarcoglycan (α-SG). Sarcoglycan (SG) proteins are known to form a transmembrane complex involved in protecting muscle membranes against contraction-induced damage. As previously described, the most frequent mutation causing LGMDR3 (R77C) results in the production of misfolded α-SG proteins. This mutation leads to the loss of the protective function: the misfolded R77C-α-SG, despite being partially functional, seems to be quickly recognized by the cell's quality control and degraded by the proteasome, so that the α-SG protein is absent from the cell membrane.

Prevalence and Prognosis

LGMDR3 is severely disabling. It is of unknown but likely very low prevalence, and is currently incurable.

Project steps

Design of the assay

Fibroblasts carrying the c.229C>T mutation in the SGCA gene, which leads to the synthesis of the misfolded R77C-α-SG protein, were collected from a patient suffering from LGMDR3. In collaboration with the team of Isabelle Richard at Généthon, this cell line was further genetically modified to overexpress an mCherry-tagged R77C-α-SG protein, in order to follow its subcellular localization and facilitate the analysis of the presence or absence of the R77C-α-SG protein at the membrane level.

Screening and Evaluation

The presence of the R77C-α-SG protein was quantified through immunofluorescence experiments to evaluate the potential of a drug to rescue the R77C-α-SG protein expression at the membrane level of the cells. If a drug rescues the membrane expression of the R77C-α-SG protein, it can potentially counter the symptoms of LGMDR3 by restoring the protective function of the protein on muscular cell membranes.

Using a High Throughput Screening fluorescence assay

A High Throughput Screening fluorescence assay was used to quantify the rescue (presence) of a specific protein in the cells when exposed to many different small molecule compounds in parallel. In this case, the capacity of a small molecule compound to rescue R77C-α-SG protein expression at the membrane level in the cell cultures was evaluated through a dedicated metric, i.e. the percentage of cells within a culture showing fluorescent signs of rescued α-SG protein.
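As a minimal sketch of such a metric (not I-Stem's actual image-analysis pipeline), the snippet below assumes that each cell in a well has already been segmented and assigned a membrane fluorescence intensity, and that a cell counts as "rescued" when that intensity exceeds a threshold. The function name, the threshold and the sample values are all hypothetical.

```python
from typing import Sequence


def percent_rescued(intensities: Sequence[float], threshold: float) -> float:
    """Return the percentage of cells whose membrane signal exceeds `threshold`.

    Both the per-cell intensities and the threshold are illustrative
    assumptions; the source does not describe how rescue is called per cell.
    """
    if not intensities:
        raise ValueError("a well must contain at least one segmented cell")
    rescued = sum(1 for value in intensities if value > threshold)
    return 100.0 * rescued / len(intensities)


# Hypothetical wells: one untreated control and one compound-treated well.
control_well = [0.10, 0.20, 0.15, 0.05, 0.30, 0.12, 0.08, 0.11]
treated_well = [0.10, 0.90, 0.85, 0.05, 0.70, 0.95, 0.08, 0.60]

control_pct = percent_rescued(control_well, threshold=0.5)
treated_pct = percent_rescued(treated_well, threshold=0.5)
```

In a real screen, the threshold would typically be calibrated against control wells rather than fixed by hand.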

Outcome of the High Throughput Screening fluorescence assay

958 FDA-approved drugs and pharmacologically active small molecule compounds selected by I-Stem were tested in this high-throughput phenotypic assay for their capacity to rescue α-SG protein expression at the membrane level in the cell cultures. Among these 958 compounds, 42 were found to rescue expression of the protein of interest, and so to potentially restore the lost protective function.

Despite the existence of some compound-protein interaction knowledge for the library used, establishing the mode of action of these 42 hits was a real challenge. 38 of the identified hits were already known to be active on one or several protein targets or protein families. However, determining whether the effects of the compounds in this assay were actually related to these known compound-protein interactions, or were instead due to unknown off-target effects, would have required further investigation for each case.

Using Zepto.Target for the discovery of the target

Kantify's drug discovery technology Zeptomics

Zepto.Target is an Artificial Intelligence algorithm which is part of Kantify’s technology Zeptomics. Zeptomics is an Artificial Intelligence based solution, which relies on millions of curated data points from libraries, datasets and literature, about assays, proteins and compounds. Thanks to a number of algorithmic innovations, Zeptomics is designed to extend what it learns from these data to compounds and targets it has not seen before. In AI terms, this is called “generalization”.

For example, for any new compound, or any new target, Zeptomics can:

  • predict what the related hits will be, even unknown ones
  • determine whether a possible lead compound will be safe for future patients in terms of ADMET and off-target effects

Zepto.Target, Zeptomics' application for Target Identification

For this project, Kantify used the target prediction algorithm of Zeptomics, called Zepto.Target.

Zepto.Target is a novel model relying on Deep Learning, a subset of Machine Learning. Its approach differs markedly from what is currently described in the literature and used in the industry.

Instead of relying on a literature review, however advanced, to discover possible targets, Zepto.Target is capable of deconvoluting, fully computationally, the likely targets of a disease and the corresponding disease pathway.

To be able to generate the results that we present below, the only data that we requested from I-Stem were the results of their phenotypic drug screening.

For reasons of confidentiality, we don’t disclose in this article how Zepto.Target is designed.

Evaluation of Zepto.Target results

To evaluate the predictions of Zepto.Target, we use a set of performance metrics, including a metric called the F1 score. Each protein target considered gets an F1 score characterizing how likely its predicted interactions with the tested compounds are to actually explain the in vitro results.
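For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below shows one way it could be computed for a single candidate target, assuming (hypothetically) that a model outputs a yes/no interaction prediction per compound and that the assay gives a yes/no hit label per compound. How Zepto.Target actually derives its scores is not disclosed, so the function and data here are illustrative only.

```python
from typing import Sequence


def f1_score(predicted: Sequence[bool], observed: Sequence[bool]) -> float:
    """F1 = 2 * precision * recall / (precision + recall)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)       # predicted binder, assay hit
    fp = sum(1 for p, o in pairs if p and not o)   # predicted binder, no hit
    fn = sum(1 for p, o in pairs if o and not p)   # assay hit, not predicted
    if tp == 0:
        return 0.0  # no overlap between predictions and hits
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical data for ten compounds and one candidate target.
observed_hits = [True, True, True, True, False, False, False, False, False, False]
predicted_binders = [True, True, True, False, True, False, False, False, False, False]

score = f1_score(predicted_binders, observed_hits)
```

With these made-up labels there are three true positives, one false positive and one false negative, so precision and recall are both 0.75 and so is the F1 score.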

In the distribution graph below, we can see:

  • On the X axis the value of the observed F1 scores on all the potential protein targets

  • On the Y axis the count of targets per ranges of F1 score

Zepto.Target screened 3000 proteins known to be expressed in the cells of the assay, and identified approximately 30 targets of interest, visible in the right tail of the graph, with the top-scoring proteins considered particularly interesting. Four targets stood out from the analysis, as they maximized most of the metrics observed; one of them is a fully novel target.
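The kind of distribution and right-tail selection described above can be outlined as follows. This is a sketch under stated assumptions: the per-target scores are randomly generated placeholders, not results from the study, and the bin width and the choice of keeping the 30 highest-scoring targets are illustrative.

```python
import random
from collections import Counter

random.seed(0)  # reproducible placeholder data

# Placeholder F1 scores for 3000 hypothetical targets (NOT study results).
scores = {f"target_{i}": random.betavariate(2, 8) for i in range(3000)}


def histogram(values, bin_width=0.1):
    """Count values per F1 range; keys are the lower edge of each bin."""
    n_bins = round(1 / bin_width)
    # Clamp a score of exactly 1.0 into the last bin.
    counts = Counter(min(int(v / bin_width), n_bins - 1) for v in values)
    return {round(b * bin_width, 10): counts.get(b, 0) for b in range(n_bins)}


def top_targets(score_by_target, k=30):
    """Return the k highest-scoring targets, i.e. the right tail."""
    return sorted(score_by_target, key=score_by_target.get, reverse=True)[:k]


counts_per_bin = histogram(scores.values())  # X axis: F1 range, Y axis: count
right_tail = top_targets(scores)             # candidates for expert review
```

A fixed top-k cut-off is only one possible choice; a score threshold or a combination of several metrics, as the text suggests, would work equally well in this outline.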

The top-scoring proteins were reviewed by experts of the disease working at I-Stem, who confirmed that:

  • Some of the top-scoring proteins had previously been hypothesized in the literature to be involved in the disease mechanisms or related pathways (HDAC inhibitors, as previously published by the group).

  • Other top-scoring proteins were entirely new in the context of the disease and represent new potential therapeutic targets.

These initial results are particularly promising and are currently being investigated further.

What’s next?

Although gain- and loss-of-function experiments are still required to validate the mechanism of action of these targets and their role in the degradation of R77C-α-SG, this study highlights a novel set of molecules and targets of interest for the treatment of LGMDR3 in particular, and illustrates an approach to target confirmation or discovery in general.

I-Stem and Kantify

Together, I-Stem and Kantify combine complementary expertise and technology. Building on this fruitful project, the two partners are now collaborating further on using phenotypic assays and AI to accelerate drug discovery.

Further collaborations

Combined with phenotypic screening, Zepto.Target can tackle a key frontier of AI-based drug discovery: bias-free target identification. Since this project, Zepto.Target has demonstrated its accuracy in other target discovery projects in rare diseases. Whereas conventional target identification is a long and biased process, Zepto.Target is fast (a few days at most), unbiased, and disease-agnostic.

Zepto.Target is a useful model for biotech and pharma companies that use phenotypic assays and want to confirm or identify the likely targets involved in a disease pathway. Contact us for an initial exchange.