Model usage
This section shows how to download models and run predictions using the Ersilia Model Hub.
You can explore the available models through our website or by running the following command in the CLI:
Each model is identified by:
EOS-ID: eos[1-9][a-z0-9]{3}
Slug: a one- to three-word reference for the model
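The EOS-ID pattern above is a regular expression, so an identifier can be checked programmatically before it is passed to any command. A minimal sketch in Python; the helper name is_eos_id is ours for illustration and is not part of Ersilia:

```python
import re

# Pattern quoted from the text: "eos", one digit from 1 to 9,
# then three lowercase letters or digits.
EOS_ID_PATTERN = re.compile(r"eos[1-9][a-z0-9]{3}")


def is_eos_id(identifier: str) -> bool:
    """Return True if the whole string matches the EOS-ID pattern."""
    return EOS_ID_PATTERN.fullmatch(identifier) is not None


print(is_eos_id("eos2r5a"))                       # an EOS-ID
print(is_eos_id("retrosynthetic-accessibility"))  # a slug, not an EOS-ID
```

Using fullmatch rather than search means that longer strings which merely contain a valid identifier are rejected.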
In this example, we show how to run predictions with the AI/ML model developed in the paper Retrosynthetic accessibility score (RAscore) – rapid machine learned synthesizability classification from AI driven retrosynthetic planning by Thakkar et al. (2021). The RA Score is particularly useful for pre-screening large libraries of compounds, for example those produced by generative models.
Use model through CLI
Fetch model and install it locally
The first step is to download the model to your local device and install it along with its dependencies. By default, a ~/eos directory (for Ersilia Open Source) will be created in your HOME. This folder will contain all fetched models along with additional files to manage the AI/ML content available locally.
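If you want to confirm where this folder lives on your machine, a short Python check is enough. This sketch only relies on the default location stated above (an eos folder inside your home directory) and makes no assumption about the folder's internal layout:

```python
from pathlib import Path

# Default location described in the text: an "eos" folder in the user's HOME.
eos_dir = Path.home() / "eos"

if eos_dir.is_dir():
    print(f"Ersilia folder found at {eos_dir}")
else:
    print(f"No folder at {eos_dir} yet; it should appear once a model is fetched")
```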
To download and install the RA Score prediction model, simply use the fetch command. In the Ersilia Model Hub, the RA Score prediction model has the identifier eos2r5a and the slug retrosynthetic-accessibility. You can use either one to refer to this model in all of the commands below.
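For readers who script their workflows, the fetch step can also be driven from Python. The sketch below assumes the command takes the form ersilia fetch followed by the model identifier, and that the ersilia executable is on your PATH; the helper names fetch_command and fetch are ours, not part of Ersilia:

```python
import shutil
import subprocess

EOS_ID = "eos2r5a"
SLUG = "retrosynthetic-accessibility"


def fetch_command(model: str) -> list[str]:
    """Build the argument list for the fetch command (assumed command shape)."""
    return ["ersilia", "fetch", model]


def fetch(model: str) -> bool:
    """Run the fetch command; return False if ersilia is not on PATH."""
    if shutil.which("ersilia") is None:
        return False
    subprocess.run(fetch_command(model), check=True)
    return True


# Per the text, either identifier refers to the same model.
print(" ".join(fetch_command(EOS_ID)))
print(" ".join(fetch_command(SLUG)))
```

Nothing is downloaded until fetch() is actually called, so the block is safe to run as a dry check of the command line.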
Get model information
Once the model is downloaded, you can get more information through the model card:
We do our best to keep users out of dependency hell. Models are automatically installed with the necessary degree of isolation from the system. All models are available through GitHub and also as Docker images.
Serve model
Once the model has been fetched, it should be ready to be used. A model in the Ersilia Model Hub can be thought of as a set of APIs. You can serve the model like this:
A URL and a process ID (PID) will be displayed. These can be relevant if you are an advanced user and want low-level control of the tool. The most important piece of information, however, is the list of available APIs. By default, all models use the run API.
Make predictions
The RA Score prediction model takes chemical structures as input and provides a score (ranging from 0 to 1). The higher the score, the more synthetically accessible the molecule is predicted to be.
Ideally, in the chemistry models, the input molecules are specified as SMILES strings. SMILES strings can be easily found online. For instance, we can find an antibiotic, Halicin, in PubChem, and then predict its retrosynthetic accessibility as follows:
It is also possible to use an InChIKey or even a molecule name (through the Chemical Identifier Resolver) instead of SMILES. Ersilia will take care of the conversion automatically. However, please take into account that this requires an internet connection and will slow down the process, as requests to external tools are necessary.
You can make multiple predictions in batch mode. This is typically much faster than running predictions one by one in a loop. For instance, we can predict the RA Score of Halicin and Ibuprofen.
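One way to assemble a batch input is to serialise the SMILES strings as a single JSON list. The sketch below assumes that the model accepts such a list as its input string; the SMILES are the ones commonly listed for Halicin and Ibuprofen and should be verified in PubChem before use:

```python
import json

# SMILES as commonly listed in PubChem (verify before use).
smiles = {
    "halicin": "C1=C(SC(=N1)SC2=NN=C(S2)N)[N+](=O)[O-]",
    "ibuprofen": "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O",
}

# Serialise the whole batch as one JSON list, preserving insertion order.
batch_input = json.dumps(list(smiles.values()))
print(batch_input)
```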
Passing many molecules directly on the command line quickly becomes impractical, so you may prefer to provide an input file instead. Let's name this file input.csv.
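Such a file can be written by hand or generated from a script. The sketch below writes one SMILES string per line with no header row; that layout is an assumption on our part, so check it against the input format your model expects:

```python
import csv
from pathlib import Path

# Halicin and Ibuprofen, as commonly listed in PubChem (verify before use).
smiles_list = [
    "C1=C(SC(=N1)SC2=NN=C(S2)N)[N+](=O)[O-]",
    "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O",
]

input_path = Path("input.csv")

# One molecule per row; newline="" lets the csv module control line endings.
with input_path.open("w", newline="") as handle:
    writer = csv.writer(handle)
    for smi in smiles_list:
        writer.writerow([smi])

print(input_path.read_text())
```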
The terminal command now becomes much cleaner:
By default, predictions are returned in the standard output of the terminal. We favour the widely used JSON format because it offers great flexibility and interoperability. However, many of the model APIs return an output that can be naturally expressed in tabular format, for example, in a CSV file. If this is what you want, simply specify an output file with the .csv extension.
At the moment, the available formats are JSON (.json), CSV (.csv), TSV (.tsv) and HDF5 (.h5). The latter is appropriate for large-scale numerical data and is relevant for the lake of pre-computed predictions available in the Isaura resource.
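Because the output format follows from the file extension, it can be useful to validate a file name before launching a long run. The helper below is a hypothetical illustration of that idea built only from the four extensions listed above; it is not Ersilia's own code:

```python
from pathlib import Path

# Formats listed in the text, keyed by file extension.
OUTPUT_FORMATS = {".json": "JSON", ".csv": "CSV", ".tsv": "TSV", ".h5": "HDF5"}


def output_format(filename: str) -> str:
    """Return the format implied by the file extension, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    if suffix not in OUTPUT_FORMATS:
        supported = ", ".join(sorted(OUTPUT_FORMATS))
        raise ValueError(f"Unsupported extension {suffix!r}; expected one of: {supported}")
    return OUTPUT_FORMATS[suffix]


print(output_format("output.csv"))
print(output_format("predictions.h5"))
```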
Close model
Once you are done with predictions, it is advised to stop the model server:
Delete model
If you are sure you don't want to use a model anymore, you may want to remove it from your computer. This includes deleting all model files and specific dependencies:
As a Python package
Models can be fetched from the Ersilia Model Hub, served, and run as a Python package. The main class is called ErsiliaModel:
Then, you can perform the same actions as in the CLI. To serve:
To make predictions for Halicin and Ibuprofen:
To close the model:
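The three steps (serve, predict, close) can be sketched together in one function. The class name ErsiliaModel comes from the text; the method names serve(), run() and close() are our assumption, based on the steps and the run API described in this section, and may differ in your installed version. The wrapper predict_ra_scores is a hypothetical helper:

```python
try:
    from ersilia import ErsiliaModel  # class named in the text
except ImportError:  # the ersilia package is not installed here
    ErsiliaModel = None

# Halicin and Ibuprofen, as commonly listed in PubChem (verify before use).
SMILES = [
    "C1=C(SC(=N1)SC2=NN=C(S2)N)[N+](=O)[O-]",
    "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O",
]


def predict_ra_scores(smiles, model_id="retrosynthetic-accessibility"):
    """Serve the model, run predictions, and always close the server.

    Assumes ErsiliaModel exposes serve(), run() and close().
    """
    if ErsiliaModel is None:
        raise RuntimeError("The ersilia package is not installed")
    mdl = ErsiliaModel(model_id)
    mdl.serve()
    try:
        return mdl.run(smiles)
    finally:
        # Stop the model server even if the prediction step fails.
        mdl.close()
```

Calling predict_ra_scores(SMILES) requires the model to have been fetched beforehand.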
Using the with statement
A more concise way to run predictions is to use the with clause:
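The appeal of the with clause is that cleanup happens automatically when the block ends, even if an error occurs inside it. The sketch below illustrates the mechanism with a stand-in class, DemoModel, which is not the Ersilia implementation; we assume ErsiliaModel behaves analogously by serving on entry and closing on exit:

```python
class DemoModel:
    """Stand-in for a served model; NOT the Ersilia implementation."""

    def __init__(self, model_id):
        self.model_id = model_id
        self.served = False

    def serve(self):
        self.served = True

    def close(self):
        self.served = False

    def __enter__(self):
        # Entering the with block serves the model.
        self.serve()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the block closes the model, whether or not an error occurred.
        self.close()
        return False  # do not swallow exceptions


with DemoModel("retrosynthetic-accessibility") as mdl:
    inside = mdl.served  # the model is served inside the block

after = mdl.served  # and closed once the block has ended
print(inside, after)
```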
Using Ersilia through Colab
We have prepared a ready-to-go Google Colaboratory (Colab) notebook to run models and store predictions in Google Drive. Read more about Colab in our training materials and get started by clicking on the button Open in Colab from our GitHub.
Please note we do not extensively maintain the Colaboratory implementation and some models might not work. Please open an issue on GitHub if you encounter any problems.