Tools
Online chat open during and after the conference. Each participant should receive a private sign-in for the #ai-workshop channel.
Python is the programming language that we will use throughout the course. It is characterized as an "object-oriented" language.
pandas is a Python package to work with tabular data (e.g. .xlsx, .csv). It is loaded into a Google Colab or a Jupyter Notebook with the following command (pd is the common pandas abbreviation):
import pandas as pd
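As a brief illustration of working with tabular data once pandas is loaded, the sketch below reads a small CSV table into a DataFrame and filters it. The table contents and the column names (name, smiles, activity) are made up for this example; in practice you would pass a file path to pd.read_csv instead of in-memory text.

```python
import io

import pandas as pd

# Hypothetical CSV content; a real dataset would normally live in a file.
csv_text = """name,smiles,activity
ethanol,CCO,0
aspirin,CC(=O)Oc1ccccc1C(=O)O,1
caffeine,Cn1cnc2c1c(=O)n(C)c(=O)n2C,1
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())   # first rows of the table
print(df.shape)    # (number of rows, number of columns)

# Keep only the rows where the (made-up) activity label equals 1.
actives = df[df["activity"] == 1]
print(actives["name"].tolist())
```

For Excel files, pd.read_excel plays the same role as pd.read_csv.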
matplotlib is a Python package for creating all kinds of plots. It is loaded into a Google Colab or a Jupyter Notebook with the following command (plt is the common matplotlib abbreviation):
import matplotlib.pyplot as plt
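To give an idea of what a plot looks like in code, here is a minimal scatter plot sketch. The data points, labels and colours are arbitrary choices for illustration only.

```python
import matplotlib.pyplot as plt

# Made-up data points for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

# Create a figure with a single set of axes and draw the points on it.
fig, ax = plt.subplots(figsize=(4, 3))
ax.scatter(x, y, color="tab:blue", label="example data")
ax.set_xlabel("x value")
ax.set_ylabel("y value")
ax.set_title("A minimal scatter plot")
ax.legend()

# In Colab or Jupyter the figure is rendered below the cell automatically.
```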
RDKit is the largest chemoinformatics package. Among many other functions, it allows the user to standardise molecules, draw chemical structures and create fingerprints for molecular representation. It is loaded into a Google Colab or a Jupyter Notebook with the following command:
import rdkit
RDKit is a very large package, so we usually import only the specific modules we need to work with:
from rdkit import Chem
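The following sketch shows one way the Chem module is typically used: parsing a molecule from a SMILES string and creating a fingerprint from it. It assumes a reasonably recent RDKit release that provides the rdFingerprintGenerator module; the molecule (aspirin) and the fingerprint settings (radius 2, 2048 bits) are example choices, not requirements of the course.

```python
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator

# Aspirin written as a SMILES string (example molecule).
smiles = "CC(=O)Oc1ccccc1C(=O)O"

# Parse the SMILES into an RDKit molecule. MolFromSmiles returns None
# when the string cannot be parsed, so it is worth checking the result.
mol = Chem.MolFromSmiles(smiles)
if mol is None:
    raise ValueError(f"Could not parse SMILES: {smiles}")

# Canonical SMILES gives one consistent text form per molecule.
canonical_smiles = Chem.MolToSmiles(mol)

# Morgan (circular) fingerprint as a fixed-length bit vector.
# radius=2 and fpSize=2048 are common but arbitrary choices.
generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)
fingerprint = generator.GetFingerprint(mol)

print(canonical_smiles)
print(mol.GetNumAtoms(), "heavy atoms")
print(fingerprint.GetNumOnBits(), "bits set out of", fingerprint.GetNumBits())
```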
UMAP and scikit-learn are loaded into a Google Colab or a Jupyter Notebook with the following commands:
import umap
import sklearn
While pandas and matplotlib come preinstalled in Google Colab, RDKit, UMAP and scikit-learn must be installed using the !pip install command.
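The name used in an import statement is not always the name used for installation: on PyPI, UMAP is published as umap-learn and sklearn as scikit-learn. The sketch below, built around a hypothetical helper called missing_packages, checks which of the three packages cannot be imported yet and prints the matching !pip install line to run in a notebook cell.

```python
import importlib.util

# Import name -> package name on PyPI (they differ for UMAP and scikit-learn).
PACKAGES = {
    "rdkit": "rdkit",
    "umap": "umap-learn",
    "sklearn": "scikit-learn",
}


def missing_packages(packages=PACKAGES):
    """Return the PyPI names of the packages that cannot be imported yet."""
    return [
        pypi_name
        for import_name, pypi_name in packages.items()
        if importlib.util.find_spec(import_name) is None
    ]


to_install = missing_packages()
if to_install:
    # The leading "!" only works inside a Colab or Jupyter cell.
    print("Run in a notebook cell: !pip install " + " ".join(to_install))
else:
    print("All packages are already available.")
```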
Colab is not directly connected to the user's Google Drive, but this can easily be achieved by running the following command:
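The usual way to do this is drive.mount from the google.colab module, which only exists inside Colab. The sketch below wraps that call in a hypothetical helper, mount_drive, so that the same cell simply does nothing in a plain Jupyter Notebook; /content/drive is the conventional mount point, not a requirement.

```python
def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running in Colab; return False elsewhere."""
    try:
        # This module is only available inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        return False
    # Colab asks for permission to access the Drive of the signed-in account.
    drive.mount(mount_point)
    return True


if mount_drive():
    print("Drive is available under /content/drive")
else:
    print("Not running in Colab; nothing was mounted.")
```

Inside Colab, the two essential lines are the import and the drive.mount call; the rest only makes the cell safe to run elsewhere.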
If you close Colab or disconnect the runtime, any installed package will be lost, so you will need to run all the relevant cells again.
The Ersilia Model Hub is a platform of open source, pretrained AI/ML models for drug discovery, developed and maintained by the Ersilia Open Source Initiative. It is released under the GPLv3 open source license.
The command line interface (CLI) is a text-based user interface used to run programs, manage computer files and interact with the computer. On UNIX-like systems (Linux and macOS) the default CLI is called Terminal; on Windows it is the Command Prompt or Windows PowerShell.
umap is a Python package to perform dimensionality reduction with Uniform Manifold Approximation and Projection (UMAP). In our case, it is very convenient for visualising a dataset of chemical entities as a single 2D scatter plot, where each dot represents a molecule.
Scikit-learn (sklearn) is a Python package containing several algorithms to perform supervised and unsupervised machine learning.
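As an example of the supervised case, the sketch below trains a random forest classifier on a synthetic dataset and measures its accuracy on held-out samples. The dataset, the choice of model and all parameter values are illustrative assumptions rather than the setup used in the course.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic dataset: 200 samples, 20 numeric features, 2 classes.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# Hold out 25% of the samples to evaluate the model on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Supervised learning: fit a random forest on the labelled training set.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Compare predictions on the held-out samples with their true labels.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```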
Google Colaboratory (Colab) is a hosted Jupyter notebook service that allows users to write and execute Python code for free in the Google cloud. It runs entirely in the browser (for example, Chrome) and requires no additional installation, only a Google account and an internet connection. If needed, you can also run Colab on your own computer.
of available models
Open source
If you use the Ersilia Model Hub in your research, please cite it.
Git is free and open source software for distributed version control. It tracks changes in any set of files, which speeds up collaborative work. It needs to be installed on your local system.
A platform for software development and version control based on Git, where users can collaborate and contribute to open source projects.