Career Summary
- I am a dedicated researcher interested in applying ML/AI to computational biomedicine and drug discovery problems, currently focused on leveraging machine learning techniques for morphological profiling and cell painting data. I defended my PhD on 16.01.2024.
Work Experience
- Course: Machine Learning for the Master of Public Policy at the Hertie School’s Data Science Lab
- Assisting Prof. Dr. Drew Dimmery in conducting the Machine Learning course and tutorials.
- Phenotypic drug discovery through cell painting profiles using Machine Learning and AI.
- Unsupervised learning and clustering to discover biological patterns and relevant mode-of-action information in cell painting data from two cancer cell lines, across multiple partner sites.
- I am currently assisting my colleagues at the Leibniz-Forschungsinstitut für Molekulare Pharmakologie in publishing a morphological profiling dataset generated with the EU-OPENSCREEN bioactive compound library, and exploring the clusters in this cell painting data to assess its further potential for drug discovery research.
- Supervised a group of 6 pre-university students in performing computational biomedicine research for one week at the virtual Computational Biology (CB) Summer Camp.
- Helped students finish a research project within one week, while learning and applying new computational biology tools every day.
- In the CB camp, my group researched genes of interest in breast cancer and answered questions related to pathway and enrichment analysis of these genes.
- The tools and lectures in the CB camp included OMIM, UCSC Genome Browser, Probability and Statistics, RNA expression and microarrays, GEOR, GEO2R, String-DB, gene ontologies, gene regulation, enriched function and pathway analysis from GO and KEGG, looking up genes of interest in GeneCards, and preliminaries of AI.
- Supervised a group of 6 pre-university students in computational biomedicine research using R and machine learning for one week at the virtual Machine Learning with R summer camp.
- In the R camp, we identified Liver Cancer Hepatocellular Carcinoma (LIHC) biomarkers and categorized cancer samples using microRNA expression data from TCGA (The Cancer Genome Atlas).
- We used the ggplot2 R package for data analysis, visualization and machine learning.
- Supervised a group of 6 pre-university students in performing computational biomedicine research for one week in two virtual Computational Biology (CB) Summer camps.
- In the first CB camp, we researched genes of interest in pancreatic cancer; in the second, we researched biomarkers in multiple sclerosis. In both camps, my group answered questions related to pathway and enrichment analysis of these genes.
- The tools and lectures in these CB camps were similar to those used in the 2023 miRcore CB camp mentioned above.
- Supervised a group of 6 pre-university students in computational biomedicine research at the virtual Machine Learning with R summer camp.
- In the R camp, as in 2023, we identified Liver Cancer Hepatocellular Carcinoma (LIHC) biomarkers.
- We studied important miRNAs marking the disease using a data table containing miRNAs across tumor samples and control samples, and we identified significant miRNAs as those with the smallest p-values.
- We learned that LIHC is an aggressive form of cancer with a low chance of survival once the disease has progressed, and we found genes related to unregulated miRNAs, including CDKs and RAS genes.
- We used the ggplot2 R package for data analysis, visualization and machine learning using Random Forests.
- My PhD thesis is titled “Machine learning methods for prediction of protein-protein interactions hotspot residues”, and my supervisors were Prof. Dr. Paolo Carloni and Prof. Dr. Geraldine Zimmer-Bensch.
- Hot spots are key residues for protein-protein binding. Their identification may shed light on the impact of disease-associated mutations on protein complexes and help design protein-protein interaction inhibitors for therapy.
- I developed a pipeline that reduces noise in data from biological repositories using Robust Principal Component Analysis, then performs dimensionality reduction using Extreme Gradient Boosting.
- I applied this pipeline to data containing sequence- and structure-based features of protein-protein interaction residues to obtain a reduced dataset.
- I then trained and validated ML classifiers on this reduced data and finally predicted hot spot residues on an independent test set to quantify the predictive power of my pipeline when used with ML classifiers.
- The code and data are publicly available on GitHub.
- During my PhD studies I participated in a plethora of Deep Learning, Machine Learning, Computational biology and Bioinformatics conferences, workshops and schools.
- I presented a poster titled “Impact of Disease associated variants in the protein interactome at the human synapse” at the Summer School on Machine Learning in Drug Design at K. U. Leuven in August 2018.
- I gave an oral presentation titled “RPCA (Robust Principal Component Analysis) based approach for protein-protein interaction hot-spot prediction” at AdvCompBio 2019 in Barcelona in November 2019.
- Coming from an engineering background before my PhD, it was essential for me to gain deeper insights into Computational Biomedicine, particularly in applying ML/AI to life sciences. To this end, these are a few of the many courses that I attended:
- Online course on HPC-based Computational Bio-Medicine at Barcelona Supercomputing Center, 2022.
- Artificial Intelligence For Science Bootcamp organised by NVIDIA and Leibniz-Rechenzentrum der Bayerischen Akademie der Wissenschaften (LRZ), 2022.
- Computational biology online course by Center for Neuroscience and Cell Biology (CNC), University of Coimbra, 2022.
- I attended DeepLearn 2019, the 3rd International Summer School on Deep Learning, in Warsaw in July 2019.
- I was hosted by Prof. Dr. Ira Assent to research the application of ML, in particular ensemble methods, for my PhD thesis.
- I gave a presentation titled “Impact of disease-associated variants in the protein interactome at the human synapse” at the computer science department at Aarhus University.
- Project: Semi-supervised learning and co-training with bilateral views for video object segmentation. Supervisor: Dr. Hrishikesh Sharma.
- Teaching Assistant for Linear Algebra, Probability and Statistics, Multivariable Calculus and Complex Variables during my M.Tech. studies.
- Helped the course professors by mentoring tutorial sessions (theoretical and coding) and grading the course assignments and exams.
- M.Tech. thesis titled “Visual Tracking using Analysis Dictionary Learning”.
- The main areas of research were Computer Vision and Machine Learning, and the supervisors were Dr. A.V. Subramanyam and Dr. Angshul Majumdar.
Hackathons
I actively take part in hackathons on applying AI/ML to bio-image analysis, drug discovery, and computational biomedicine. This helps me stay connected with the latest developments in the field, work on new and challenging problems, and network with like-minded individuals. With a strong background in electronics and communications engineering, I bring a different perspective to computational biology and look for innovative solutions to complex problems.
Honours and Awards
Publications
- Addressing persistent challenges in digital image analysis of cancerous tissues, doi:10.1101/2023.07.21.548450, Prabhakaran et al. & Participants of the Cell Imaging Hackathon 2022, bioRxiv, 2023.
- Robust principal component analysis‐based prediction of protein‐protein interaction hot spots, doi:10.1002/prot.26047, Sitani et al., Proteins: Structure, Function, and Bioinformatics 89, no. 6 (2021): 639-647.
- Online single and multiple analysis dictionary learning-based approach for visual object tracking, doi:10.1117/1.JEI.28.1.013004, Sitani et al., Journal of Electronic Imaging 28, no. 1 (2019): 013004.