Skills: using OS models
This section serves as a guide for Skills Development Session 4.
GitHub
GitHub is an internet hosting service for software development and version control using Git. It makes it easy for several developers to collaborate, including across organisations. Ersilia, like most organisations and computational laboratories, centralises its open source code on GitHub.
Each project in GitHub is stored in a unique "Repository". Most academic publications cite code that is deposited in GitHub. When you land in a new GitHub repository, it is important to look at:
License file: states whether the code is released under an approved open source (OS) license and can be applied to our problem of interest.
Readme file: featured on the landing page of the repository, it highlights the main information about the code you can find there. It often also contains relevant links to publications and instructions on how to cite the software.
You can read more about how to work with GitHub, including writing issues to authors and cloning repositories, in the extra section about Git and GitHub and its associated presentation.
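As a quick way to apply the checklist above to a repository you have already cloned, the following minimal sketch looks for a license and a readme file at the top level of a local folder. The function name `find_repo_docs` and the list of accepted file name prefixes are assumptions made for illustration; they are not part of GitHub or of any Ersilia tool.

```python
from pathlib import Path


def find_repo_docs(repo_dir):
    """Return the names of the license and readme files found at the top level of repo_dir.

    A value of None means no matching file was found.
    """
    found = {"license": None, "readme": None}
    for entry in sorted(Path(repo_dir).iterdir()):
        if not entry.is_file():
            continue
        name = entry.name.lower()
        # Common naming conventions (assumed, not exhaustive).
        if found["license"] is None and name.startswith(("license", "licence", "copying")):
            found["license"] = entry.name
        elif found["readme"] is None and name.startswith("readme"):
            found["readme"] = entry.name
    return found


if __name__ == "__main__":
    # Check the current folder; replace "." with the path of a cloned repository.
    print(find_repo_docs("."))
```

Finding a license file only tells you that one exists. You still need to read it to confirm that the terms allow your intended use.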
Accessing the Ersilia Model Hub
We have prepared a simpler environment in a Google Colab notebook that provides an easy-to-use API to prepare your data and access several predictions from the Hub.
The main steps featured in the notebook are:
Connection to Google Drive (where we will centralize our data)
Standardisation of the SMILES according to ChEMBL rules using the standardiser package (see the standardiser documentation for more information on the steps it performs).
Selection of the model and running the basic commands:
Fetch
Serve
Predict
Visualise the model output in tabular format and, if possible, the distribution of the output variable in a histogram.
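To make the last step concrete, here is a minimal sketch of how model outputs could be shown as a table and binned into a histogram. It uses only the Python standard library, whereas the notebook itself may rely on other packages for tables and plots. The function names and the example rows are hypothetical placeholders, not real predictions from the Hub.

```python
def histogram_counts(values, n_bins=5):
    """Split values into n_bins equal-width bins and return (bin_edges, counts)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 when all values are equal
    counts = [0] * n_bins
    for v in values:
        # The maximum value belongs to the last bin rather than a new one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


def render(rows, n_bins=5):
    """Return a text table of (input, output) rows followed by a text histogram."""
    lines = [f"{'input':<12}{'output':>8}"]
    for smiles, value in rows:
        lines.append(f"{smiles:<12}{value:>8.2f}")
    edges, counts = histogram_counts([value for _, value in rows], n_bins)
    lines.append("")
    for i, count in enumerate(counts):
        lines.append(f"{edges[i]:.2f} to {edges[i + 1]:.2f} | {'#' * count}")
    return "\n".join(lines)


# Hypothetical placeholder values, for illustration only.
rows = [("CCO", 0.12), ("c1ccccc1", 0.55), ("CC(=O)O", 0.31), ("CCN", 0.90)]
print(render(rows, n_bins=3))
```

A histogram is only meaningful for numerical outputs, which is why the step above says "if possible".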
If the runtime disconnects, remember to run all the cells again.
When writing paths and names (strings in Python), you must match upper and lower case exactly and watch out for misspellings.
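The point about case and misspellings can be illustrated with a small helper that compares a file name you typed against the files actually present in a folder. This is a sketch under stated assumptions: `suggest_filename` is a hypothetical helper, not part of the notebook, and the similarity cutoff of 0.8 is an arbitrary choice.

```python
import difflib
from pathlib import Path


def suggest_filename(folder, filename):
    """Return filename if it exists in folder with exactly that spelling.

    Otherwise return the closest existing name (ignoring case), or None
    if nothing in the folder looks similar.
    """
    names = [p.name for p in Path(folder).iterdir()]
    if filename in names:
        return filename
    # Compare in lower case so that "Data.csv" and "data.csv" count as a match.
    lowered = {name.lower(): name for name in names}
    close = difflib.get_close_matches(filename.lower(), list(lowered), n=1, cutoff=0.8)
    return lowered[close[0]] if close else None
```

For example, if a folder contains `Molecules.csv` and you type `molecules.csv`, the helper returns `Molecules.csv`, which is the spelling you should use in the notebook.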