Altuna Akalin
Max Delbrück Center,
Berlin Institute for Medical Systems Biology

Capstone Projects

Vedran Franke
Max Delbrück Center,
Berlin Institute for Medical Systems Biology
Bora Uyar
Max Delbrück Center,
Berlin Institute for Medical Systems Biology

September-October 2022
Berlin Institute for Medical Systems Biology, Max Delbrück Center
Berlin, Germany
Application deadline
8 August 2022

Modules and tentative schedule

General Info

The general aim of the course is to equip participants with practical and technical knowledge to deploy machine learning methods on genomic data sets. With this aim in mind, we will go through certain statistical concepts and move on to unsupervised and supervised machine learning methods to analyze high-dimensional data sets.

There will be theoretical lectures followed by practical sessions where students directly apply what they have learned. These sessions will be provided online in succession. Participants will have a week to work on each module in their own time. Interaction will take place over the online teaching platform. The programming will be done mainly in R.
A more detailed course plan is here.

The course will be beneficial for first-year computational biology PhD students, and for experimental biologists and medical scientists who want to begin data analysis or are seeking a better understanding of computational genomics and the analysis of popular sequencing methods. The course is open to anyone interested in the subjects we teach.

We have run similar courses in 2015, 2016, 2017, 2018 and 2019. Here is a link to an article about the 2015 course.

Duration and Structure

This will be an online training event which will be mostly asynchronous. A typical module will consist of lectures followed by hands-on exercises and a quiz. Participants will have a week to complete the lectures and exercises for each module, at their own pace and at a time of their choosing within that week. Only participants who complete the exercises and the quiz in a timely manner, and who complete at least 50% of the tasks in the exercises, will be invited to the capstone project. The capstone project tasks are designed using data from a real-world problem.

We will provide online virtual machines. However, most tasks can be done on a regular laptop. Recent versions of R and R Markdown are necessary to complete the hands-on exercises. The online platform will provide an R instance with these dependencies and the data pre-installed.

About the instructors

The instructors have decades of experience in data analysis for genomics. They are developers of Bioconductor packages such as methylKit, genomation and RCAS. In addition, they have been developing PiGx, a set of end-to-end genomics data analysis pipelines for RNA-seq, ChIP-seq, Bisulfite-seq, and scRNA-seq.


Application deadline: 8 August 2022

Candidates must apply by filling in the online form. The form includes a section for an abstract (max 500 words) describing your current research projects and interests, so please have one prepared.

There will also be questions on programming experience. These questions will help us organize a better course, adjusted to applicant competencies, so it is important that you answer them truthfully.


There is no course fee. However, only participants who complete certain tasks are eligible for a certificate.

Course Location

The course will be held online and will be partially self-paced. An online compute platform will be provided. The link to the online teaching platform will be sent to accepted candidates.


Questions and suggestions should be sent to the following e-mail:

We look forward to hearing from you!


Scientific Organization: Altuna Akalin