Ersilia Book

Model usage

This section shows how to download models and run predictions using the Ersilia Model Hub.
You can explore the available models through our website or by running the following command in the CLI:
# display ready to use models
ersilia catalog
Each model is identified by:
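  • EOS-ID: an identifier matching the pattern eos[1-9][a-z0-9]{3} (for example, eos2r5a)
  • Slug: a one- to three-word reference for the model (for example, retrosynthetic-accessibility)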
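The EOS-ID pattern above is a regular expression, so an identifier can be checked before it is passed to the CLI. The following is a minimal Python sketch; the helper name is_eos_id is hypothetical and is not part of Ersilia.

```python
import re

# Pattern quoted from the identifier description above.
EOS_ID_PATTERN = re.compile(r"eos[1-9][a-z0-9]{3}")


def is_eos_id(text: str) -> bool:
    """Return True if the whole string has the shape of an EOS-ID."""
    return EOS_ID_PATTERN.fullmatch(text) is not None


print(is_eos_id("eos2r5a"))                       # identifier used in this section
print(is_eos_id("retrosynthetic-accessibility"))  # a slug, not an EOS-ID
```

The first call prints True and the second prints False, since a slug does not follow the identifier pattern. A check like this only validates the shape of the string; it does not tell you whether the model exists in the catalog.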
In this example, we show how to run predictions with the AI/ML model developed in the paper "Retrosynthetic accessibility score (RAscore) – rapid machine learned synthesizability classification from AI driven retrosynthetic planning" by Thakkar et al., 2021. The RA Score is particularly useful for pre-screening large libraries of compounds, for example those produced by generative models.

Use model through CLI

Fetch model and install it locally

The first step is to download the model to your local device and install it along with its dependencies. By default, an eos directory (for Ersilia Open Source) will be created in your home directory, at ~/eos. This folder will contain all fetched models along with additional files to manage the AI/ML content available locally.
To download and install the RA Score prediction model, simply use the fetch command. In the Ersilia Model Hub, the RA Score prediction model has the identifier eos2r5a and the slug retrosynthetic-accessibility. You can use either one to refer to this model in all of the commands below.
# fetch model from remote repository using slug ...
ersilia fetch retrosynthetic-accessibility
# ... or using ersilia identifier
ersilia fetch eos2r5a

Get model information

Once the model is downloaded, you can get more information through the model card:
# display model card using slug...
ersilia card retrosynthetic-accessibility
# ... or using ersilia identifier
ersilia card eos2r5a
We do our best to keep users out of dependency hell. Models are automatically installed with the necessary degree of isolation from the system. While some models have no dependencies at all and can be run using the system Python installation, others need to be containerized using Docker.

Serve model

Once the model has been fetched, it should be ready to be used. A model in the Ersilia Model Hub can be thought of as a set of APIs. You can serve the model like this:
# serve model
ersilia serve retrosynthetic-accessibility
A URL and a process ID (PID) will be displayed. These can be relevant if you are an advanced user and want low-level control of the tool. Most important, however, is the list of available APIs. In this case, we want to compute the RA Score through the predict API.
The predict API is one of the most common across our catalog of models. Other common APIs are transform and interpret.

Make predictions

The RA Score prediction model takes chemical structures as input and returns a score ranging from 0 to 1. The higher the score, the more synthetically accessible the molecule is predicted to be.
Ideally, for chemistry models, input molecules are specified as SMILES strings, which can easily be found online. For instance, we can look up the antibiotic Halicin in PubChem and then predict its retrosynthetic accessibility as follows:
# Halicin
ersilia api predict -i "C1=C(SC(=N1)SC2=NN=C(S2)N)[N+](=O)[O-]"
It is also possible to use an InChIKey or even a molecule name (resolved through the Chemical Identifier Resolver) instead of SMILES. Ersilia will take care of this automatically. However, please take into account that this requires an internet connection and will slow down the process, as requests to external tools are necessary.
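Because identifiers other than SMILES need to be resolved, it can help to know in advance which inputs have the shape of an InChIKey. The sketch below flags strings with the standard InChIKey layout (14 uppercase letters, a hyphen, 10 uppercase letters, a hyphen, one uppercase letter). It is an illustrative pre-check only: the function name looks_like_inchikey is hypothetical, and this is not a description of how Ersilia itself detects input types. The aspirin InChIKey is used purely as a well-known example and does not appear elsewhere in this section.

```python
import re

# Standard InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_SHAPE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(text: str) -> bool:
    """Return True if the string has the shape of a standard InChIKey."""
    return INCHIKEY_SHAPE.fullmatch(text.strip()) is not None


halicin_smiles = "C1=C(SC(=N1)SC2=NN=C(S2)N)[N+](=O)[O-]"
aspirin_inchikey = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"  # illustrative example only

print(looks_like_inchikey(halicin_smiles))
print(looks_like_inchikey(aspirin_inchikey))
```

The SMILES string is reported as False and the InChIKey as True. A shape check cannot confirm that a key corresponds to a real compound; it only separates the two notations.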
You can make multiple predictions in batch mode. This is typically much faster than running predictions one by one in a loop. For instance, we can predict the RA Score of Halicin and Ibuprofen.
# Halicin and Ibuprofen
ersilia api predict -i "['C1=C(SC(=N1)SC2=NN=C(S2)N)[N+](=O)[O-]','CC(C)CC1=CC=C(C=C1)C(C)C(=O)O']"
Passing long lists on the command line quickly becomes impractical, so you may prefer to provide an input file instead. Let's name this file input.csv.
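As a sketch of what such a file could look like, the Python snippet below writes the two SMILES strings from the batch example to input.csv, one per line. The layout (a single column with no header row) is an assumption made for illustration; check the Ersilia documentation for the exact layouts accepted by your version.

```python
from pathlib import Path

# SMILES strings taken from the batch example in this section.
smiles = [
    "C1=C(SC(=N1)SC2=NN=C(S2)N)[N+](=O)[O-]",  # Halicin
    "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O",           # Ibuprofen
]

# Assumed layout: one SMILES string per line, no header row.
input_path = Path("input.csv")
input_path.write_text("\n".join(smiles) + "\n", encoding="utf-8")

# Show what was written.
print(input_path.read_text(encoding="utf-8"))
```

Writing the file from a script keeps shell quoting out of the picture, which matters for SMILES strings because they contain brackets, parentheses, and equals signs.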
The terminal command now becomes much cleaner:
# predict using an input file
ersilia api predict -i input.csv
By default, predictions are returned in the standard output of the terminal. We favour the widely used JSON format because it offers great flexibility and interoperability. However, many of the model APIs return an output that can be naturally expressed in tabular format, for example, in a CSV file. If this is what you want, simply specify an output file with the .csv extension.
# save output in a CSV file
ersilia api predict -i input.csv -o output.csv
At the moment, the available formats are JSON (.json), CSV (.csv), TSV (.tsv) and HDF5 (.h5). The latter is appropriate for large-scale numerical data and is relevant for the lake of pre-computed predictions available in the Isaura resource.
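Since the output format is chosen from the file extension, a script that calls the CLI can validate the output path up front. The following Python sketch encodes the four extensions listed above. The function name output_format is hypothetical, and the mapping would need updating if the set of supported formats changes.

```python
from pathlib import Path

# Extensions and formats as listed in this section.
SUPPORTED_OUTPUTS = {
    ".json": "JSON",
    ".csv": "CSV",
    ".tsv": "TSV",
    ".h5": "HDF5",
}


def output_format(path: str) -> str:
    """Return the format name implied by the file extension of `path`."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_OUTPUTS:
        supported = ", ".join(sorted(SUPPORTED_OUTPUTS))
        raise ValueError(f"Unsupported extension {suffix!r}; expected one of: {supported}")
    return SUPPORTED_OUTPUTS[suffix]


print(output_format("output.csv"))
```

Failing early on an unsupported extension avoids running a long batch of predictions only to discover that the output file name was wrong.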

Close model

Once you are done with predictions, it is advised to stop the model server:
# close model
ersilia close
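The CLI steps shown so far (fetch, serve, predict, close) can also be chained from a script. The sketch below only assembles the commands that appear in this section and, optionally, runs them with subprocess. The function names build_workflow and run_workflow are hypothetical, and running the workflow assumes that the ersilia command is installed and available on your PATH.

```python
import subprocess
from typing import List


def build_workflow(model: str, input_path: str, output_path: str) -> List[List[str]]:
    """Assemble the CLI commands from this section as argument lists."""
    return [
        ["ersilia", "fetch", model],
        ["ersilia", "serve", model],
        ["ersilia", "api", "predict", "-i", input_path, "-o", output_path],
        ["ersilia", "close"],
    ]


def run_workflow(commands: List[List[str]]) -> None:
    """Run every step in order, always attempting the final close step."""
    *steps, close = commands
    try:
        for command in steps:
            subprocess.run(command, check=True)  # stop at the first failure
    finally:
        subprocess.run(close, check=False)  # best-effort cleanup


commands = build_workflow("eos2r5a", "input.csv", "output.csv")
for command in commands:
    print(" ".join(command))
```

Calling run_workflow(commands) would execute the steps for real. The close command sits in a finally clause so that the model server is not left running if a prediction fails.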

Delete model

If you are sure you don't want to use a model anymore, you may want to remove it from your computer. This includes deleting all model files and specific dependencies:
# delete model
ersilia delete retrosynthetic-accessibility

As a Python package

Models can be fetched from the Ersilia Model Hub, served, and run as a Python package. The main class is called ErsiliaModel:
# import main class
from ersilia import ErsiliaModel
# instantiate the model
mdl = ErsiliaModel("retrosynthetic-accessibility")
Then, you can perform the same actions as in the CLI. To serve:
# serve model (method name assumed to mirror the CLI command)
mdl.serve()
To make predictions for Halicin and Ibuprofen:
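# Halicin and Ibuprofen
input = [
    "C1=C(SC(=N1)SC2=NN=C(S2)N)[N+](=O)[O-]",
    "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O",
]
# predict (method name assumed to mirror the predict API of the CLI)
mdl.predict(input)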
To close the model:
# close model (method name assumed to mirror the CLI command)
mdl.close()

Using the with statement

A more concise way to run predictions is to use the with statement:
# use with statement
with ErsiliaModel("retrosynthetic-accessibility") as mdl:
    # method name assumed to mirror the predict API of the CLI
    mdl.predict(input)