Ersilia Book

2025, Ersilia Open Source Initiative


Troubleshooting models

This page describes a few steps you can take when an Ersilia model is not working.


Last updated 8 days ago


This documentation covers technical issues only, such as package dependencies and the like. It does not address whether a model's predictions are more or less accurate for a specific use case.

When Ersilia fetches a model from our online repository, it automatically tests it using a random 3-molecule input. If the model is unable to produce an output, you will receive the following error and the model will not be fetched:

"Error occured while fetching model: eos0abc"

If that is the case, go through the following steps:

1. Sanity checks

Make sure to:

  1. Use the latest version of Ersilia. In your local clone of the Ersilia repository, please run git pull --rebase. If you installed Ersilia in the ersilia conda environment using pip install -e . you don't need to reinstall it, because the environment will pick up the updated codebase. If you installed it with a plain pip install command, please reinstall Ersilia in your conda environment.

  2. Install and activate git-lfs. Git Large File Storage allows large files to be stored in GitHub repositories. It is possible that git-lfs is installed on your system but not activated. Typically, Ersilia will download the model from our AWS S3 storage, but some essential files might be stored in git-lfs.

  3. Check that the eos folder is being created in your Home directory. A dest folder with the model subfolders should appear there. This is the directory where Ersilia will clone the model. If the eos folder does not exist, you might need to review your user permissions for creating new folders.

  4. Identify whether there is a connection error. Run the fetch command in verbose mode (ersilia -v fetch eos0abc) and scan the output printed in the terminal for errors such as TimeoutError: [Errno 60] Operation timed out or NewConnectionError. Typically, you will receive errors from the urllib3 library if there is an internet issue. Please switch networks and make sure you are not using a VPN or firewall before trying again.

  5. Make sure you have enough free space on your system. If running the fetch command in verbose mode prints a message like No Space Left on Device, free up disk space before proceeding. We recommend at least 10 GB of free disk space.
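Some of the checks above can be scripted. The following is a minimal sketch, not part of the Ersilia tooling: the function names are illustrative, and it only covers the git-lfs, eos folder and disk space checks.

```shell
#!/bin/sh
# Sketch of pre-fetch sanity checks (function names are illustrative).
HOME_DIR="${HOME:-/}"

# Free space, in whole GB, on the filesystem that holds the home directory.
free_gb() {
    df -Pk "$HOME_DIR" | awk 'NR==2 {print int($4 / 1024 / 1024)}'
}

# Report whether the git-lfs binary is on the PATH.
lfs_status() {
    if command -v git-lfs >/dev/null 2>&1; then
        echo "git-lfs: installed (run 'git lfs install' once to activate it)"
    else
        echo "git-lfs: NOT installed"
    fi
}

# Report whether the eos folder exists in the home directory.
eos_status() {
    if [ -d "$HOME_DIR/eos" ]; then
        echo "eos folder: found at $HOME_DIR/eos"
    else
        echo "eos folder: missing (check your folder permissions)"
    fi
}

lfs_status
eos_status
if [ "$(free_gb)" -lt 10 ]; then
    echo "disk: only $(free_gb) GB free, at least 10 GB recommended"
else
    echo "disk: $(free_gb) GB free"
fi
```

Connection problems are easier to spot in the verbose fetch output itself, so they are left out of this sketch.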

Once you have completed the above steps, try fetching the model one more time. If the problem persists, stop trying to fetch the model from the online repository and follow the instructions below. They describe, step by step, what a contributor can do to identify the source of the problem. We are using the model from the Example Workflow to run this debugging demo.

2. Fetch the model from local

The first step is to download the model to your system and use the fetch from local path option. This avoids connection issues and failures in downloading from git-lfs or S3, and it speeds up the debugging process. Fork the repository and clone it.

git clone https://github.com/contributor-user/eos9ei3.git
ersilia -v fetch eos9ei3 --repo_path eos9ei3 > out.log 2>&1

Always:

  • Run the fetch command in verbose mode (-v) to get the full output

  • Use the --repo_path flag to fetch from the locally cloned repository instead of its online version

  • Redirect the output, including errors, to a log file (> out.log 2>&1)

We then need to carefully read the output of the fetch step. At the end, it typically says:

Model API eos9ei3:predict did not produce an output

This is not the final error. It simply states that the model could not calculate the test molecules, hence the empty output. We need to read the whole log file. Look for error messages and for package dependency, memory or connection issues, since they can give you a good hint of what is happening. If it is an easy fix, you can go ahead and try it out by following the steps below.
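A quick way to locate suspicious lines in a long log is to search for common failure signatures. This is only a sketch: the pattern list is an illustrative assumption, not an exhaustive catalogue of Ersilia errors, and the scan_log helper is hypothetical.

```shell
# List log lines that match common failure signatures, with line numbers.
# The patterns are illustrative; extend them as needed.
scan_log() {
    grep -n -i -E 'error|traceback|timed out|no space left|no module named|killed' "$1" || true
}

LOG_FILE="${LOG_FILE:-out.log}"
if [ -f "$LOG_FILE" ]; then
    scan_log "$LOG_FILE"
else
    echo "No log file found at $LOG_FILE"
fi
```

The line numbers in the output make it easier to jump to the surrounding context in the full log, which should still be read in its entirety.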

If you encounter serious problems, please open a Bug Report issue on the Ersilia page and paste as much information as possible there, including the log file and the system you are using.

3. Test the model locally

1. Create the conda environment

Since the fetch has failed, we do not have the necessary conda environment created. Depending on the step at which fetch is failing, you might have a conda environment with the model name, but it won't be complete. We recommend deleting it and starting anew, following the install.yml instructions.

In our example case, the install.yml looks like:

python: "3.10"

commands:
  - ["pip", "rdkit", "2023.3.1"]

It seems the model runs on Python 3.10 and only requires the RDKit package to work. If more than one package needs to be installed, please install them one by one to identify any dependency conflicts or deprecated packages.

conda create -n eos9ei3 python=3.10
conda activate eos9ei3
pip install rdkit==2023.3.1
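When the install.yml lists many packages, the one-by-one install commands can be derived from the file. The sketch below assumes every pip entry has the same three-element form as the example above; entries in any other form are ignored. The pip_commands helper is hypothetical, and it only prints the commands so that each one can be run and checked individually.

```shell
# Print one "pip install" command per pip entry found in an install.yml file.
# Only entries of the form ["pip", "<package>", "<version>"] are handled.
pip_commands() {
    sed -n 's/^ *- *\["pip", *"\([^"]*\)", *"\([^"]*\)".*$/pip install \1==\2/p' "$1"
}

INSTALL_YML="${INSTALL_YML:-install.yml}"
if [ -f "$INSTALL_YML" ]; then
    pip_commands "$INSTALL_YML"
fi
```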

2. Run the model from run.sh

We have replicated the first steps of the Ersilia fetch command. If there are no package dependency issues and the conda environment is complete, we can move on to testing the model directly from the run.sh file. Remember to create a mock input file for this purpose.
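A mock input file could be created along these lines. This is a minimal sketch: the single smiles header column and the default file name test.csv are assumptions to adapt to the model at hand, and the three molecules (ethanol, aspirin and caffeine) are arbitrary examples.

```shell
# Write a three-molecule mock input file.
# Assumption: the model expects a CSV with a single "smiles" header column.
INPUT_FILE="${INPUT_FILE:-test.csv}"
cat > "$INPUT_FILE" <<'EOF'
smiles
CCO
CC(=O)Oc1ccccc1C(=O)O
CN1C=NC2=C1C(=O)N(C(=O)N2C)C
EOF
echo "Wrote $(($(wc -l < "$INPUT_FILE") - 1)) molecules to $INPUT_FILE"
```

The path of this file is then passed to run.sh as the input argument.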

cd eos9ei3/model/framework
bash run.sh . ~/Desktop/test.csv ~/Desktop/output.csv

This command will print its output on the terminal, which will give us hints about what the problem might be. Most typically, the errors are due to:

  • Relative imports: if the import paths are not well specified, please rewrite them as absolute paths to avoid future clashes.

  • Input and output adapters: add print statements in the code to check that the input and output are in the right format, and modify them if needed.

  • GPU - CPU issues: models that use PyTorch or other packages might have GPU-specific configurations that need to be tweaked in order to work on most systems.

Once we are able to run the model successfully through run.sh, we need to try it with Ersilia. Hopefully, the local fetch will now work. If it does not, go through the log file to obtain a hint of what might be going wrong.

ersilia -v fetch eos9ei3 --repo_path eos9ei3 > out.log 2>&1

4. Update the model

  • Remove any temporary edits, like print statements.

  • Ensure the packages listed in the install.yml are updated to the versions that are working.

  • Review the .gitattributes file.
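Leftover debugging statements can be located with a simple search before committing. The sketch below is an assumption about how one might do it, with a hypothetical find_prints helper. It lists every print( call in the Python files under a directory, so legitimate prints will show up too and each match needs a manual look.

```shell
# List lines containing "print(" in Python files under a directory.
# Matches are candidates only: legitimate prints are reported as well.
find_prints() {
    find "$1" -type f -name '*.py' -exec grep -Hn 'print(' {} + || true
}

CODE_DIR="${CODE_DIR:-model/framework}"
if [ -d "$CODE_DIR" ]; then
    find_prints "$CODE_DIR"
fi
```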

5. Check the Docker image

It may happen that workflows fail to build a Docker image, or that a Docker image is built and uploaded to DockerHub, but nonetheless the model does not successfully run. Below, we try to cover various scenarios and provide guidelines on how to troubleshoot them.

Docker image is available in DockerHub, but it does not run successfully

If an image is already available in Ersilia's DockerHub but fails to run successfully, the best approach is to pull the image with the Docker CLI and inspect it from the command line.

# docker pull ersiliaos/$MODEL_ID:[latest,dev-amd64,dev-arm64]
docker pull ersiliaos/eos5axz:latest
docker run -it --entrypoint /bin/bash ersiliaos/eos5axz:latest

Now your terminal should let you inspect the content of the Docker container.

  • If your model did not require any Conda installation, then your system Python should already have all dependencies installed. You can check this by typing python and trying to import any of the model packages (for example, import rdkit).

  • If your model did require Conda installation, then in principle you should already be inside a default Conda environment with all your packages available.

To test the model inside the Docker container, the best approach is to go to the framework folder and run it using the run.sh file:

# cd /bundles/eos5axz/[random_id]/model/framework
cd /bundles/eos5axz/20250507-a54cd98a-fcb7-4989-9e99-1f10018ac004/model/framework
bash run.sh . examples/run_input.csv examples/run_output.csv
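After run.sh finishes, a quick sanity check is to compare the number of lines in the input and output files. This is a sketch under two assumptions: both files carry a header row, and the model writes exactly one output row per input molecule. The same_row_count helper is hypothetical.

```shell
# Succeed only if the output CSV has as many lines as the input CSV.
# Assumption: both files have a header and there is one output row per input.
same_row_count() {
    in_rows="$(wc -l < "$1" | tr -d ' ')"
    out_rows="$(wc -l < "$2" | tr -d ' ')"
    echo "input lines: $in_rows, output lines: $out_rows"
    [ "$in_rows" -eq "$out_rows" ]
}

if [ -f examples/run_input.csv ] && [ -f examples/run_output.csv ]; then
    same_row_count examples/run_input.csv examples/run_output.csv \
        || echo "Row counts differ: inspect the run.sh output"
fi
```

Note that wc -l counts newline characters, so a file without a trailing newline is reported as one line shorter.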

Docker image works on AMD64, but it fails in ARM64

Oftentimes, models build successfully on AMD64 platforms but, unfortunately, fail on ARM64. There can be several reasons for the failure:

  • Different packages between AMD64 and ARM64: In that case, it will be necessary to specify the packages in a platform-specific way in the model repository. Broadly speaking, there are two scenarios:

    • Compilers are not available on ARM64

    • Model-specific packages have different versions/binaries on AMD64 vs ARM64

  • Different model code between AMD64 and ARM64: This is a rarer scenario, but the code for running the model on ARM64 can differ from that for AMD64. In that case, it will be necessary to provide separate code for ARM64 and AMD64.

Generally, the best way to troubleshoot ARM64-related issues is to build models from source on an ARM64 platform, such as a machine with an Apple M1/M2/M3 chip.

Note that we do not have a well-established way to specify platform-specific dependencies in the model repository, nor to specify platform-specific code. We are currently working on this.

ARM64 platforms (for example, Apple M1 chips) do not work with Python versions below 3.8. Please check that the Python version of your model is 3.8 or above. Otherwise, the model will not get packaged successfully for ARM64 architectures.
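The Python version requirement can be checked directly against the model's install.yml. The sketch below assumes the version is declared on a line such as python: "3.10", as in the example shown earlier on this page. The helper names are hypothetical.

```shell
# Extract the Python version declared in install.yml, e.g. python: "3.10".
declared_python() {
    sed -n 's/^python: *"\{0,1\}\([0-9][0-9.]*\)"\{0,1\}.*$/\1/p' "$1" | head -n 1
}

# Succeed if a MAJOR.MINOR version is 3.8 or above.
supports_arm64() {
    major="${1%%.*}"
    rest="${1#*.}"
    minor="${rest%%.*}"
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 8 ]; }
}

INSTALL_YML="${INSTALL_YML:-install.yml}"
if [ -f "$INSTALL_YML" ]; then
    version="$(declared_python "$INSTALL_YML")"
    if supports_arm64 "$version"; then
        echo "Python $version: OK for ARM64 packaging"
    else
        echo "Python $version: below 3.8, ARM64 packaging is expected to fail"
    fi
fi
```

The comparison is done on the major and minor numbers separately, because comparing version strings as decimals would rank 3.10 below 3.8.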

Docker image was not pushed to DockerHub successfully

If your Docker image was not pushed to DockerHub successfully by the workflows, you can try to install the model from source interactively, starting from one of the Ersilia Pack base images. They are named as follows:

  • ersiliaos/ersiliapack-py38 contains Ersilia Pack and Python 3.8 as the system Python. Similarly, ersiliaos/ersiliapack-py312 contains Python 3.12. Consider this kind of image if your model does not have any Conda packages.

  • ersiliaos/ersiliapack-conda-py38 contains Ersilia Pack along with Conda (Python 3.8). Consider this kind of image if your model has Conda packages.

First, pull the image and run it in interactive mode:

docker pull ersiliaos/ersiliapack-py312:latest
docker run -it --entrypoint /bin/bash ersiliaos/ersiliapack-py312:latest

By default, you will be placed in the /root directory of the container. You will notice that this directory is essentially empty.

You'll need to download the model source to work with it. The easiest way is to get it from S3:

apt-get update
apt-get install -y wget unzip

wget https://ersilia-models-zipped.s3.eu-central-1.amazonaws.com/eos5axz.zip
unzip eos5axz.zip

Now you should have the model folder available. Ersilia is not installed inside the base image, so you need to try and package the model with Ersilia Pack:

# ersilia_model_pack --repo_path $REPO_PATH --bundles_repo_path $BUNDLE_PATH
ersilia_model_pack --repo_path eos5axz --bundles_repo_path bundles

When you are ready, push the changes to your fork of the model and open a PR to the main branch. As in the Model Incorporation workflow, a series of automated tests will be triggered. Please check their results before moving on.

The base images used in this section contain the Ersilia Pack repository. For more information, visit the Ersilia Pack documentation.
