Tools
Slack
Online chat open during and after the conference. Each participant should receive a private sign-in to the #ai-workshop channel.
Python
Programming language that we will use throughout the course. It is characterised by being an "object-oriented" language.
Pandas
Python package to work with tabular data (e.g. .xlsx, .csv ...). It is loaded into a Google Colab or a Jupyter Notebook with the following command (pd is the common pandas abbreviation):
import pandas as pd
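As a minimal sketch of what working with tabular data looks like, the example below reads a small CSV table and filters its rows. The table contents and column names are made up for illustration; in practice you would pass the path of your own file to `pd.read_csv`.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file such as "molecules.csv"
csv_text = """name,smiles,mol_weight
ethanol,CCO,46.07
benzene,c1ccccc1,78.11
aspirin,CC(=O)Oc1ccccc1C(=O)O,180.16
"""

# pd.read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

# Keep only the rows with a molecular weight above 50
heavy = df[df["mol_weight"] > 50]
print(heavy["name"].tolist())
```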
Matplotlib
Python package for creating all kinds of plots. It is loaded into a Google Colab or a Jupyter Notebook with the following command (plt is the common matplotlib abbreviation):
import matplotlib.pyplot as plt
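A short sketch of a scatter plot, using made-up values chosen only to illustrate the plotting calls:

```python
import matplotlib.pyplot as plt

# Made-up values used only to illustrate the plotting calls
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

# Create a figure with a single set of axes and draw the points on it
fig, ax = plt.subplots()
ax.scatter(x, y, label="y = x squared")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
```

In a Colab or Jupyter notebook the figure is normally displayed when the cell finishes running; in a plain Python script you would call `plt.show()` at the end.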
RDKit
One of the most widely used cheminformatics Python packages. Among many other functions, it allows the user to standardise molecules, draw chemical structures and create fingerprints for molecular representation. It is loaded into a Google Colab or a Jupyter Notebook with the following command:
import rdkit
RDKit is a very large package, so we usually import only the specific modules we need:
from rdkit import Chem
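The sketch below shows one possible way to go from a SMILES string to a molecule object and a fingerprint. The molecule (ethanol) and the fingerprint settings (Morgan fingerprint, radius 2, 2048 bits) are illustrative choices, not settings prescribed by the course.

```python
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator

# Parse a SMILES string (ethanol) into an RDKit molecule object
mol = Chem.MolFromSmiles("OCC")

# MolFromSmiles returns None when the SMILES cannot be parsed
if mol is None:
    raise ValueError("Could not parse SMILES")

# Write the molecule back as a canonical SMILES string
canonical = Chem.MolToSmiles(mol)

# Morgan fingerprint as a fixed-length bit vector
# (radius and size are illustrative assumptions)
generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)
fingerprint = generator.GetFingerprint(mol)

print(canonical, mol.GetNumAtoms(), fingerprint.GetNumBits())
```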
Umap-learn
Umap-learn is a Python package for dimensionality reduction using Uniform Manifold Approximation and Projection (UMAP). In our case, it is a convenient way to visualise a dataset of chemical entities as a single 2D scatter plot, where each dot represents a molecule.
It is loaded into a Google Colab or a Jupyter Notebook with the following command:
import umap
Scikit-learn
Scikit-learn (sklearn) is a Python package containing several algorithms to perform supervised and unsupervised machine learning. It is loaded into a Google Colab or a Jupyter Notebook with the following command:
import sklearn
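In practice, specific submodules are usually imported rather than the whole package. The sketch below trains a supervised classifier on a synthetic dataset; the choice of a random forest and all parameter values are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic dataset: 200 samples, 10 features, 2 classes
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hold out 25% of the samples to evaluate the model
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit the classifier on the training set only
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Fraction of held-out samples classified correctly
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```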
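While pandas and matplotlib come preinstalled in Google Colab, packages such as RDKit and umap-learn must be installed using the !pip install command. The same command can be used for scikit-learn if it is not already available in your runtime.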
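Before installing anything, it can help to check which packages are actually missing from the current runtime. The sketch below is one way to do that; the `missing_packages` helper and the `PACKAGES` mapping are hypothetical names introduced for this example. Note that the name used in an `import` statement can differ from the name used with pip.

```python
import importlib.util

# Import name -> name used with pip (the two can differ)
PACKAGES = {"rdkit": "rdkit", "umap": "umap-learn", "sklearn": "scikit-learn"}


def missing_packages(packages):
    """Return the pip names of the packages that cannot be imported."""
    return [
        pip_name
        for import_name, pip_name in packages.items()
        if importlib.util.find_spec(import_name) is None
    ]


missing = missing_packages(PACKAGES)
if missing:
    print("Run in a notebook cell: !pip install " + " ".join(missing))
else:
    print("All packages are already installed")
```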
Google Colaboratory
Google Colaboratory (Colab) is a hosted Jupyter notebook service that allows users to write and execute Python code for free in the Google cloud. It runs entirely in the web browser and does not require additional installations, aside from a Google account and an internet connection. If needed, you can also run Colab on your local hardware.
Colab is not connected to the user's Google Drive by default, but the drive can easily be mounted from within a notebook using the google.colab package.
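The usual way to do this is `drive.mount` from the `google.colab` package, which is only available inside Colab. The sketch below wraps that call in a hypothetical `mount_drive` helper so that it also runs, without mounting anything, outside Colab; the mount point `/content/drive` is the conventional location and is an assumption here.

```python
def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab; return True on success."""
    try:
        # google.colab only exists inside a Colab runtime
        from google.colab import drive
    except ImportError:
        print("Not running in Colab, skipping Drive mount")
        return False
    # Opens an authorisation prompt the first time it runs
    drive.mount(mount_point)
    return True


mounted = mount_drive()
```

Inside Colab, the two lines `from google.colab import drive` and `drive.mount("/content/drive")` are enough on their own.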
If you close Colab or disconnect the runtime, any installed packages will disappear, so you will need to run all the relevant cells again.
Ersilia Model Hub
The Ersilia Model Hub is a platform of open source, pretrained AI/ML models for drug discovery, developed and maintained by the Ersilia Open Source Initiative. It is licensed under the GPLv3 open source license.
If you use the Ersilia Model Hub in your research, please cite us.
Command Line Interface
Text-based user interface used to run programs, manage computer files and interact with the computer. The default CLI in UNIX systems (Linux and macOS) is called Terminal; in Windows it is the Command Prompt or Windows PowerShell.
Git
Free and open source software for distributed version control. It allows you to track changes in any set of files, speeding up collaborative work. It needs to be installed on your local system.
GitHub
Internet hosting service for software development and version control based on Git. It is a platform where users can collaborate and contribute to open source projects.