Specific instructions for the summer Ersilia interns (June 2022 - September 2022)
Every Monday by 9 pm CET, one member of the team will suggest a topic, and the rest of us will prioritize searching for models on that topic during the week. Topics can be related to a task of biomedical relevance, or to a particular family of algorithms. Valid topics could be:
By biomedical relevance:
- Antimalarial activity prediction
- Broad spectrum antibiotic activity prediction
- Drug toxicity prediction
- Synthetic accessibility of compounds
By algorithmic relevance:
- Graph neural networks
- Reinforcement learning methods
- Chemistry language transformer models
In the Slack #internships channel, one person (for example @Miquel) will write a message like this:
@channel This is the topic of the week!
Topic: Antimalarial activity prediction
Why? We are currently working with Medicines for Malaria Venture and they have asked for predictions on some antimalarial candidates.
In this case, @Miquel should pin the message so that everybody can find it easily during the week.
It would be great if the rest of the @channel could comment, ask questions, or give feedback on the topic choice, or simply confirm that you've read the message.
It is important that you communicate your research to the rest of the team. We suggest the following three steps:
First, write a quick note in the Ersilia Slack #literature channel. Simply copy the link to the publication as soon as you discover the model, or even a link to a tweet. Before the link, add the emoji so that we know it is about a model. For example:
Compound price prediction with deep learning! [link]
Once you've posted the model in the #literature channel, read about it in more detail and try to figure out whether code is available. Then, add the model as a new entry in the Ersilia Model Hub Spreadsheet. Please request edit rights from @Gemma if you don't have them.
In the Spreadsheet, you will find two sheets:
- Hub: Contains complete documentation about the models, including an Ersilia Open Source identifier (e.g. eos4e40), a slug (e.g. chemprop-antibiotic) and a status (e.g. Done).
- Raw List: Contains a backlog of models that could be of interest to Ersilia. This sheet contains minimal information about the models. There is an Approved and a Selected tickbox. Approved means that Ersilia is willing to incorporate this model. Selected means that someone is already working on the model or the model has been successfully incorporated in the Ersilia Model Hub.
You should start with the Raw List sheet. You are always free to add models there (relevant to the Topic of the Week or not). @Miquel is responsible for curating this list and approving the models. He will use the Slack #internships channel to ask questions or discuss the relevance of a model before approving it.
Start by adding your model in the Raw List sheet and wait for approval. As soon as the model has been approved, you are ready to move forward. Please tick the Selected column so that nobody else picks your model of interest.
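The rule for the two tickboxes can be summarized in a few lines of Python. This is only an illustrative sketch: the `RawListEntry` class, its field names, and the helper functions are hypothetical and are not part of the Spreadsheet or of any Ersilia tooling. It assumes that a model may be picked only when it is Approved and nobody has ticked Selected yet.

```python
from dataclasses import dataclass


@dataclass
class RawListEntry:
    """One row of the Raw List sheet. Field names are hypothetical."""
    title: str
    approved: bool = False  # ticked by the curator when Ersilia wants the model
    selected: bool = False  # ticked when someone is working on it (or it is already in the Hub)


def can_select(entry: RawListEntry) -> bool:
    """A model is free to pick only if it is approved and not yet selected."""
    return entry.approved and not entry.selected


def select(entry: RawListEntry) -> RawListEntry:
    """Tick the Selected box, refusing if the model is not available."""
    if not can_select(entry):
        raise ValueError(
            f"'{entry.title}' is not available: it must be approved and not yet selected"
        )
    entry.selected = True
    return entry
```

In practice you simply tick the box in the Spreadsheet; the sketch only makes the order of the two steps explicit.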
Now you are ready to start filling in the Hub sheet. Fill in as much information as possible, and if something is missing, make sure you provide relevant links and enough information for others to understand what the model is about.
Please write To do, In progress or Done in the Status column:
- To do means that you have filled in the information but have not started the model incorporation per se. In other words, you have not started to work on the coding part yet.
- In progress means that you are already working on the code.
- Done means that the model has been successfully incorporated in the Ersilia Model Hub.
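The three Status values above can be read as a simple lifecycle. The following Python sketch is a hypothetical illustration (the `Status` enum and the `advance` helper are not part of the Ersilia codebase), and it assumes that a model always moves forward one step at a time, which the instructions imply but do not state as a strict rule.

```python
from enum import Enum


class Status(Enum):
    """Values accepted in the Status column of the Hub sheet."""
    TO_DO = "To do"
    IN_PROGRESS = "In progress"
    DONE = "Done"


# Assumed forward-only order: To do -> In progress -> Done
NEXT_STATUS = {
    Status.TO_DO: Status.IN_PROGRESS,
    Status.IN_PROGRESS: Status.DONE,
}


def advance(status: Status) -> Status:
    """Return the status that follows the given one; Done has no successor."""
    if status not in NEXT_STATUS:
        raise ValueError(f"'{status.value}' is the final status")
    return NEXT_STATUS[status]
```

Writing the values exactly as spelled here (To do, In progress, Done) keeps the column consistent for anyone who later reads it programmatically.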
Try to provide a Title and a Description, and perhaps suggest a Slug. The Ersilia Open Source identifiers are already predefined, so there is no need to worry about them. Don't forget to write your name in the Contributor column.
For now, @Miquel will be responsible for curating the models listed in the Spreadsheet. He will reach out to you if he has questions or needs more information about the model. He will copy the information from the Spreadsheet to an AirTable Base. This Base is accessed programmatically by the Ersilia CLI and our (provisional) hub interface.
As soon as you feel ready to start coding, you should change the Status in the Spreadsheet Hub sheet from To do to In progress. Please go to the next page to learn more about how to Incorporate models to the hub.
In brief, this is a suggested routine that you can follow:
- 5. Wait for approval.
- 6. Select an approved model.
- 9. When the model is successfully added, change the Status to Done! 🎉