Tools

Slack

Online chat that stays open during and after the conference. Each participant should receive a private sign-in for the #ai-workshop channel.

Python

Programming language that we will use throughout the course. It is a general-purpose language that supports object-oriented programming.

Pandas

Python package for working with tabular data (e.g. .xlsx and .csv files). It is loaded into a Google Colab or a Jupyter Notebook with the following command (pd is the common pandas abbreviation):

import pandas as pd
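As a minimal sketch of what working with tabular data looks like, the snippet below builds a small table in memory and filters it. The column names and values are made up for illustration; in practice the table would usually be read from a file with pd.read_csv or pd.read_excel.

```python
import pandas as pd

# Hypothetical toy data: names and molecular weights of three molecules
df = pd.DataFrame(
    {
        "name": ["water", "ethanol", "benzene"],
        "mol_weight": [18.02, 46.07, 78.11],
    }
)

# Keep only the rows with a molecular weight above 40
heavy = df[df["mol_weight"] > 40]

# Number of rows kept and their average molecular weight
n_heavy = len(heavy)
mean_weight = heavy["mol_weight"].mean()

print(heavy)
```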

Matplotlib

Python package for creating all kinds of plots. It is loaded into a Google Colab or a Jupyter Notebook with the following command (plt is the common matplotlib abbreviation):

import matplotlib.pyplot as plt
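A short, illustrative sketch of a scatter plot follows. The data points are invented for the example, and the labels are placeholders.

```python
import matplotlib.pyplot as plt

# Hypothetical toy data, made up for illustration
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

# Create a figure with a single set of axes
fig, ax = plt.subplots()

# Draw one dot per (x, y) pair and label the axes
ax.scatter(x, y)
ax.set_xlabel("x")
ax.set_ylabel("x squared")
ax.set_title("Example scatter plot")
```

In a notebook the figure is normally displayed when the cell finishes running; in a standalone script, plt.show() or fig.savefig(...) would be needed.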

RDKit

One of the largest cheminformatics Python packages. Among many other functions, it allows the user to standardise molecules, draw chemical structures and create fingerprints for molecular representation. It is loaded into a Google Colab or a Jupyter Notebook with the following command:

import rdkit

RDKit is a very large package, so we usually import only the specific modules we need:

from rdkit import Chem
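The sketch below illustrates two of the tasks mentioned above: reading a molecule and creating a fingerprint. The molecule (ethanol) and the fingerprint settings (Morgan, radius 2, 2048 bits) are arbitrary choices for the example, not course requirements, and the generator API assumes a reasonably recent RDKit release.

```python
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator

# Parse a SMILES string (ethanol, chosen only as an example) into a molecule object
mol = Chem.MolFromSmiles("CCO")

# Write the molecule back out as a canonical SMILES string
canonical = Chem.MolToSmiles(mol)

# Count the heavy (non-hydrogen) atoms
n_atoms = mol.GetNumAtoms()

# Build a Morgan fingerprint generator (assumed settings: radius 2, 2048 bits)
generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)
fingerprint = generator.GetFingerprint(mol)

print(canonical, n_atoms, fingerprint.GetNumBits())
```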

Umap-learn

Umap-learn is a Python package for dimensionality reduction using Uniform Manifold Approximation and Projection (UMAP). In our case, it is very convenient for visualising a dataset of chemical entities as a single 2D scatter plot, where each dot represents a molecule.

It is loaded into a Google Colab or a Jupyter Notebook with the following command:

import umap

Scikit-learn

Scikit-learn (sklearn) is a Python package containing many algorithms for supervised and unsupervised machine learning. It is loaded into a Google Colab or a Jupyter Notebook with the following command:

import sklearn
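In practice, specific submodules are usually imported rather than the whole package. The following is a minimal supervised learning sketch on a synthetic dataset; the choice of a random forest classifier and all parameter values are assumptions made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic dataset: 200 samples, 10 features, 2 classes
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hold out 25% of the samples to evaluate the model
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Train a random forest on the training set
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Fraction of held-out samples that are classified correctly
accuracy = model.score(X_test, y_test)
print(accuracy)
```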

Pandas, Matplotlib and scikit-learn usually come preinstalled in Google Colab. RDKit and umap-learn may need to be installed first with the !pip install command (for example, !pip install rdkit umap-learn).

Google Colaboratory

Google Colaboratory (Colab) is a hosted Jupyter notebook service that allows users to write and execute Python code for free in Google's cloud. It runs entirely in a web browser such as Chrome and requires no installation, only a Google account and an internet connection. If needed, Colab can also be connected to a runtime on your local hardware.

Colab is not connected to the user's Google Drive by default, but the Drive can easily be mounted by running the following commands:

from google.colab import drive
drive.mount('/content/drive')

If you close Colab or disconnect the runtime, any installed packages will be lost, so you will need to run all the relevant cells again.

Ersilia Model Hub

The Ersilia Model Hub is a platform of open source, pretrained AI/ML models for drug discovery, developed and maintained by the Ersilia Open Source Initiative. It is released under the GPLv3 open source license.

If you use the Ersilia Model Hub in your research, please cite us.

Command Line Interface

Text-based user interface used to run programs, manage computer files and interact with the computer. The default CLI in UNIX-like systems (Linux and macOS) is called Terminal; in Windows it is the Command Prompt or Windows PowerShell.

Git

Free and open source software for distributed version control. It allows users to track changes in any set of files, which speeds up collaborative work. It needs to be installed on your local system.

GitHub

Internet hosting service for software development and version control based on Git. It is a platform where users can collaborate and contribute to open source projects.
