Ersilia's ecosystem
Learn about the work of Ersilia and where to start using/contributing to our tools
Ersilia develops and implements AI/ML tools for infectious disease research. This documentation will be useful if you are:
A chemist or biologist looking to use some of our AI/ML platforms for your projects.
An open-source developer aiming to contribute to our tools.
A data scientist developing AI/ML tools and wishing to incorporate them in our platform.
An Ersilia enthusiast looking forward to learning more about our work.
All of our work is openly available through our . Below we will summarise the main repositories and where to find the most important software tools. For a complete catalog of all our organisation repositories, please see .
The Ersilia Model Hub is our main platform. It serves ready-to-use AI models related to the drug discovery cascade. Models can be browsed in our , and can be run locally (see Installation instructions). We also offer a selection of them for online inference (please select those available Online through our ) as well as an based on GitHub.
Detailed information about the Ersilia Model Hub and its components, including how to use it, how to contribute to its backend and how to contribute models, can be found in . Developers may look into the for an in-depth view of the code.
The repositories linked to the Ersilia Model Hub are:
: This is the main repository, corresponding to a CLI to fetch and run models locally.
: Template for new model incorporation. This template uses GitHub Actions workflows specified in .
: Collection of statistics around the Hub and its usage, such as scientific publications, disease areas covered, etc.
: GitHub Actions-based repository to check for integrity of the models within Ersilia.
: GitHub Actions-based online inference for all models. Data and results are publicly available through GitHub issues. Please do not submit IP-sensitive data.
: LLM-based interface to easily interact with the Ersilia Model Hub.
: Model packaging for serving through FastAPI.
: Standardised inputs for model testing.
: Pipeline to store model inference results in AWS, creating an open database of pre-calculations (cache).
: repositories labelled with an Ersilia (eos) identifier contain individual models. A full list of models, their identifiers and relevant information is available in .
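As a rough illustration of the fetch-and-run workflow that the main CLI repository supports, the following Python sketch only assembles the command lines one might run; it does not call Ersilia itself. The subcommand names (`fetch`, `serve`, `run`, `close`), the `-i`/`-o` options, the placeholder model identifier `eos0000` and the file names are assumptions made for illustration, and should be checked against the Installation instructions and the CLI's own help before use.

```python
import shlex
from typing import List


def build_ersilia_commands(model_id: str, input_csv: str, output_csv: str) -> List[List[str]]:
    """Assemble (but do not execute) a fetch/serve/run/close command sequence.

    The subcommands and flags below are assumptions about the CLI, not a
    definitive reference; verify them against the official documentation.
    """
    # Model repositories are labelled with an Ersilia (eos) identifier.
    if not model_id.startswith("eos"):
        raise ValueError("expected an Ersilia (eos) identifier")
    return [
        ["ersilia", "fetch", model_id],                          # download the model locally
        ["ersilia", "serve", model_id],                          # start serving the fetched model
        ["ersilia", "run", "-i", input_csv, "-o", output_csv],   # run inference on an input file
        ["ersilia", "close"],                                    # stop the served model
    ]


if __name__ == "__main__":
    # "eos0000" is a hypothetical placeholder, not a real model identifier.
    for argv in build_ersilia_commands("eos0000", "molecules.csv", "predictions.csv"):
        print(shlex.join(argv))
```

Once the CLI is installed, each list could be passed to `subprocess.run(argv, check=True)` to execute the sequence, replacing the placeholder with an identifier taken from the model catalog.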
Event Fund: A one-week course we developed in collaboration with the H3D Centre, with the support of the Wellcome Trust and Code for Science and Society.
ZairaChem is an automated pipeline for ML model training. Read more about it in its dedicated , as well as the and code repository (). Coupled to ZairaChem, we have developed Olinda, a model distillation framework that converts the high-performing but heavy ZairaChem models into portable ONNX models suitable for large-scale calculations and online deployment ().
ChemSampler is a pipeline based on the generative AI models available in the Ersilia Model Hub. Given a starting molecule, it performs several rounds of generative chemistry and produces a list of molecular candidates. ChemSampler can be constrained using several parameters. Please read its dedicated or check the code repository ().
ChemSampler is still under development. Please open a if you want to use this tool, and we will try to assist you accordingly.
As part of our mission, we provide training in AI and Data Science to researchers across the Global South. All our training materials are documented and freely available. Check out the section if you are interested, and have a look at the following code repositories:
AI2050 courses: A 2-hour introduction to drug discovery () and a full-week course for more advanced students (), developed in collaboration with the H3D Foundation.
Python 101: An introduction to the Python programming language geared towards scientists (focusing on data analysis, plotting and basic Pythonic operations; ). Inspired by the Carpentries!
In addition to our software tools, we have a number of repositories related to scientific research projects. Those repositories typically contain the data and code needed to reproduce an analysis reported in a research paper. For a full overview of our research projects and publications, please have a look at our . Below are a few example projects, either completed or currently under development:
ADDA4TB: Targeted protein degradation for Mycobacterium tuberculosis, in collaboration with Stellenbosch University ().
GRADIENT Pharmacogenetics in Africa: Analysis of potential pharmacogenes related to antimalarial and anti-TB drugs, in collaboration with H3D ().
SARS-CoV-2 Chemical Space: Analysis of the chemical space associated with curated COVID-19 therapy data, done in collaboration with UB-CeDD ().