Outreachy Summer 2022
Specific instructions for the summer Ersilia interns (June 2022 - September 2022)
Follow the Topic of the Week
Every Monday by 9 pm CET, one team member will suggest a topic, and the rest of us will prioritize searching for models on that topic during the week. Topics can be related to a task of biomedical relevance or to a particular family of algorithms. Valid topics could be:
By biomedical relevance:
Antimalarial activity prediction
Broad spectrum antibiotic activity prediction
Drug toxicity prediction
Synthetic accessibility of compounds
By algorithmic relevance:
Graph neural networks
Reinforcement learning methods
Chemistry language transformer models
Please note that the Topic of the Week should be taken as a soft guideline. Discovery and selection of models should not be blocked by the existence of the Topic of the Week. If you find an interesting model that is unrelated to the Topic of the Week, feel free to select it and work on it. It is still a valid choice!
Announcing the Topic of the Week
In the Slack #internships channel, one person (for example @Miquel) will write a message like this:
📆 @channel This is the topic of the week!
🤖 Topic: Antimalarial activity prediction
🤔 Why? We are currently working with Medicines for Malaria Venture and they have asked for predictions on some antimalarial candidates.
⏭️ Next: @Gemma
In this case, @Miquel should pin 📌 the message so that everybody can find it easily during the week.
It would be great if the rest of the @channel could comment, ask questions, or give feedback about the topic choice, or even just confirm that you've read the message (🙌, 👍, ❤️, ...).
Note that @Miquel has nominated @Gemma, so @Gemma will be responsible for selecting the topic next week. If you are eager to suggest a topic, simply contact the person currently responsible so that they can nominate you 😉.
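The announcement format above can be captured in a small template. The following is a minimal Python sketch, not an Ersilia tool: the function name format_topic_announcement and its parameters are hypothetical, and it only builds the text to paste into Slack.

```python
def format_topic_announcement(topic: str, reason: str, next_person: str) -> str:
    """Build a Topic of the Week message in the format shown above.

    Hypothetical helper for illustration only; it does not post to Slack.
    """
    # Accept the nominee with or without a leading "@".
    handle = next_person if next_person.startswith("@") else f"@{next_person}"
    lines = [
        "📆 @channel This is the topic of the week!",
        f"🤖 Topic: {topic}",
        f"🤔 Why? {reason}",
        f"⏭️ Next: {handle}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        format_topic_announcement(
            topic="Antimalarial activity prediction",
            reason="We are currently working with Medicines for Malaria Venture.",
            next_person="Gemma",
        )
    )
```

Keeping the nominee as an explicit argument makes it hard to forget the Next line, which is what keeps the weekly rotation going.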
Notify and keep track of models
It is important that you communicate your research to the rest of the team. We suggest the following three steps:
Ersilia Slack #literature channel
First, write a quick note in the Ersilia Slack #literature channel. Simply copy the link to the publication as soon as you discover the model, or even a link to a tweet. Before the link, add the 🤖 emoji so that we know it is about a model. For example: 🤖 Compound price prediction with deep learning! [link]
Ersilia Model Hub Spreadsheet
Once you've posted the model in the #literature channel, read about it in more detail and try to figure out whether code is available. Then, add the model as a new entry in the Ersilia Model Hub Spreadsheet. Please request edit rights from @Gemma if you don't have them.
In the Spreadsheet, you will find two sheets:
Hub: Contains complete documentation about the models, including an Ersilia Open Source identifier (e.g. eos4e40), a slug (e.g. chemprop-antibiotic) and a status (e.g. Done).
Raw List: Contains a backlog of models that could be of interest to Ersilia. This sheet contains minimal information about the models. There is an Approved and a Selected tickbox. Approved means that Ersilia is willing to incorporate this model. Selected means that someone is already working on the model or the model has been successfully incorporated in the Ersilia Model Hub.
You should start with the Raw List sheet. You are always free to add models there (relevant to the Topic of the Week or not). @Miquel is responsible for curating this list and approving the models. He will use the Slack #internships channel to ask questions or discuss the relevance of a model before approving it.
Start by adding your model in the Raw List sheet and wait for approval. As soon as the model has been approved, you are ready to move forward. Please tick the Selected column so that nobody else picks your model of interest.
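The two tickboxes in the Raw List sheet can be read as a simple rule: a model is available to you only once it is approved and nobody has selected it yet. The sketch below illustrates that rule in Python; the class and function names (RawListEntry, can_select, select) are hypothetical and do not correspond to any Ersilia code or to the actual spreadsheet columns beyond Approved and Selected.

```python
from dataclasses import dataclass


@dataclass
class RawListEntry:
    """One row of the Raw List sheet (field names are illustrative)."""

    title: str
    approved: bool = False  # ticked by the curator of the list
    selected: bool = False  # ticked once someone takes the model


def can_select(entry: RawListEntry) -> bool:
    """A model can be picked only if it is approved and not yet taken."""
    return entry.approved and not entry.selected


def select(entry: RawListEntry) -> None:
    """Tick the Selected box, refusing models that are not available."""
    if not can_select(entry):
        raise ValueError(f"{entry.title!r} is not available for selection")
    entry.selected = True


if __name__ == "__main__":
    entry = RawListEntry(title="Compound price prediction")
    print(can_select(entry))  # not approved yet
    entry.approved = True
    select(entry)
    print(entry)
```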
Now you are ready to start filling in the Hub sheet. Fill in as much information as possible, and if something is missing, make sure you provide relevant links and enough information for others to understand what the model is about.
Please write To do, In progress or Done in the Status column:
To do means that you have filled in the information but have not started the model incorporation per se. In other words, you have not started to work on the coding part yet.
In progress means that you are already working on the code.
Done means that the model has been successfully incorporated in the Ersilia Model Hub.
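The three Status values form a short, ordered workflow. As a rough illustration, the Python sketch below models them as an enum with forward-only transitions. The names Status, NEXT_STATUS and advance are hypothetical, and the forward-only restriction is an assumption made for the example, not a rule stated by Ersilia.

```python
from enum import Enum


class Status(Enum):
    """Values allowed in the Status column of the Hub sheet."""

    TO_DO = "To do"
    IN_PROGRESS = "In progress"
    DONE = "Done"


# Assumed forward-only moves, following the definitions above.
NEXT_STATUS = {
    Status.TO_DO: Status.IN_PROGRESS,
    Status.IN_PROGRESS: Status.DONE,
}


def advance(status: Status) -> Status:
    """Return the status that follows the given one."""
    if status not in NEXT_STATUS:
        raise ValueError(f"{status.value!r} is the final status")
    return NEXT_STATUS[status]


if __name__ == "__main__":
    current = Status.TO_DO
    while current in NEXT_STATUS:
        following = advance(current)
        print(f"{current.value} -> {following.value}")
        current = following
```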
Try to provide a Title and a Description, and perhaps suggest a Slug. The Ersilia Open Source identifiers are already predefined, so there is no need to worry about them. Don't forget to write your name in the Contributor column.
The goal of the Contributor column is simply to know who the person of reference is for each model. Some models are more difficult to add than others, so you should not be stressed about the number of models that you contribute. Ersilia is a safe and collaborative space, and we do not monitor these kinds of metrics 🤗.
Ersilia Model Hub AirTable
For now, @Miquel will be responsible for curating the models listed in the Spreadsheet. He will reach out to you if he has questions or needs more information about the model. He will copy the information from the Spreadsheet to an AirTable Base. This Base is accessed programmatically by the Ersilia CLI and our (provisional) hub interface.
You don't have to worry about the AirTable base for now. This database is fully managed by @Miquel.
Start coding! 🚀
As soon as you feel ready to start coding, you should change the Status in the Spreadsheet Hub sheet from To do to In progress. Please go to the next page to learn more about how to Incorporate models to the hub.
TL;DR
In brief, this is a suggested routine that you can follow:
Check the Topic of the Week.
Search for models related to the topic.
Post your findings in the #literature channel.
Choose one or a few models and add them to the backlog (Raw List sheet of the Ersilia Model Hub Spreadsheet).
Wait for approval.
Select an approved model.
Move to the Hub sheet of the Spreadsheet and fill in as much information as you can. For now, set the Status to To do.
When you are ready to start coding, change the Status to In progress.
When the model is successfully added, change the Status to Done! 🎉