Skills: using OS models

This section serves as a guideline for Skills Development Session 4.

GitHub

GitHub is an internet hosting service for software development and version control using Git. It makes it easy for several developers to collaborate, including across organisations. Ersilia, like most organisations and computational laboratories, centralises its open source code on GitHub.

Each project on GitHub is stored in its own "Repository". Most academic publications cite code that is deposited on GitHub. When you land on a new GitHub repository, it is important to look at:

  • License file: states whether the code is released under an approved open source (OS) license and can be used for our problem of interest

  • Readme file: featured on the landing page of the repository, it highlights the main information about the code you can find there. It often also contains links to relevant publications and instructions on how to cite the software

You can read more about how to work with GitHub, write issues to authors and clone repositories in the extra section about Git and GitHub and its associated presentation.

Accessing the Ersilia Model Hub

We have prepared a simpler environment in a Google Colab notebook that provides an easy-to-use API to prepare your data and access several predictions from the Hub.

The main steps featured in the notebook are:

  1. Connection to Google Drive (where we will centralize our data)

  2. Standardisation of the SMILES according to ChEMBL rules using the standardiser package (more information on the steps performed by the standardiser here).

  3. Selection of the model and running the basic commands:

    1. Fetch

    2. Serve

    3. Predict

  4. Visualise the model output in tabular format and, if possible, the distribution of the output variable in a histogram.
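As an illustration of step 4, the sketch below reads a prediction table and counts how many values fall into each bin, which is what a histogram displays. It is only a minimal example that uses the Python standard library rather than the notebook's own code. The column names (smiles, score) and the values in EXAMPLE_OUTPUT are hypothetical and do not come from any Ersilia model.

```python
import csv
import io

# Hypothetical model output: column names and values are made up for illustration.
EXAMPLE_OUTPUT = """smiles,score
CCO,0.10
CCN,0.35
c1ccccc1,0.80
CC(=O)O,0.90
CCCC,
"""


def load_column(csv_text, column):
    """Read one numeric column from CSV text, skipping empty or non-numeric cells."""
    values = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            values.append(float(row[column]))
        except (KeyError, TypeError, ValueError):
            continue
    return values


def histogram(values, n_bins=4):
    """Return (bin_edges, counts) for equal-width bins spanning min..max."""
    if not values:
        return [], []
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / n_bins or 1.0
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


scores = load_column(EXAMPLE_OUTPUT, "score")
edges, counts = histogram(scores, n_bins=4)

# Text rendering of the histogram: one row per bin, one '#' per molecule.
for i, count in enumerate(counts):
    print(f"{edges[i]:.2f} to {edges[i + 1]:.2f} | {'#' * count}")
```

The row with an empty score is skipped rather than counted as zero, so missing predictions do not distort the distribution. In a notebook, a plotting library would normally draw the histogram for you; the binning logic is the same.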

If the runtime disconnects, remember to run all the cells again.

When writing paths and names (strings in Python), pay attention to upper and lower case and to possible misspellings: the string must match the file or model name exactly.
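To show why case matters, here is a small sketch that checks whether a file name exists in a folder and, if it does not, looks for a name that differs only in upper and lower case. The helper check_name and the file name Molecules.csv are hypothetical and exist only for this example. A temporary folder is used so the code runs anywhere.

```python
import tempfile
from pathlib import Path


def check_name(folder, name):
    """Describe whether `name` exists in `folder`, flagging case-only mismatches."""
    folder = Path(folder)
    if not folder.is_dir():
        return f"Folder not found: {folder}"
    # Compare against the names as stored on disk, so the check is exact.
    names = [p.name for p in folder.iterdir()]
    if name in names:
        return f"Found: {name}"
    close = sorted(n for n in names if n.lower() == name.lower())
    if close:
        return f"Not found: {name}. Did you mean {close[0]}?"
    return f"Not found: {name}"


# Demonstration in a temporary folder with a hypothetical input file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Molecules.csv").write_text("smiles\nCCO\n")
    exact = check_name(tmp, "Molecules.csv")
    wrong_case = check_name(tmp, "molecules.csv")
    missing = check_name(tmp, "compounds.csv")

print(exact)
print(wrong_case)
print(missing)
```

Running a check like this before loading data makes a wrong path easier to spot than a generic "file not found" error later in the notebook.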
