Project: Using Deep Learning to Automatically Learn Feature Representation and Build a Better Classification Model on Protein Sequential Data
Started: Summer 2015
Funding: Bucknell University PUR
Deep learning is not new in theory, but it has recently become one of the most exciting directions in machine learning, with a tremendous impact on image classification. Few methods, however, have investigated its use on strictly sequential data, such as biological sequences. This study investigates whether deep learning can induce a protein sequence classifier that outperforms existing methods.
- Poster Presentation – Sigma Xi 2015 Summer Research Symposium
- Poster Presentation – Fifth Annual Susquehanna Valley Undergraduate Research Symposium, SVURS 2015, August 4, Bucknell University, Lewisburg, PA
- Poster Presentation – 15th Annual Kalman Research Symposium, April 2, 2016, Bucknell University, Lewisburg, PA
POST GRADUATION UPDATES
Son graduated with his degrees in Computer Science and Engineering, together with Digital Studio Arts. He went on to work for Amazon as a Software Engineering Intern, then took a position at Google working with machine learning. At graduation, his aim was to return to graduate school within 1-2 years.
Project: A novel ensemble classifier for protein contact map prediction
Duration: Summer 2013 – Spring 2015
Funding: Bucknell University Program for Undergraduate Research, BRK Startup Fund, Geisinger BGRI Grant, CS Dept. Fund
One of the greatest challenges in bioinformatics is predicting the 3-D structure of a protein by understanding the relationship between its amino acid sequence and its structure. A protein contact map is a useful way of representing protein 3-D conformations. It is based on a distance matrix: a symmetric matrix that contains the Euclidean distance between the C-alpha atoms of each pair of residues in the folded protein. Two residues are typically considered to be in contact when that distance falls below a chosen threshold.
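As a rough illustration of the distance matrix and contact map described above, here is a minimal Python sketch. It is not the project's code. The function names, the toy coordinates, and the 8 angstrom cutoff are all assumptions chosen for illustration; the cutoff in particular varies between studies.

```python
import math

def distance_matrix(ca_coords):
    """Pairwise Euclidean distances between C-alpha coordinates.

    ca_coords is a list of (x, y, z) tuples, one per residue.
    The result is a symmetric matrix with zeros on the diagonal.
    """
    return [[math.dist(a, b) for b in ca_coords] for a in ca_coords]

def contact_map(ca_coords, threshold=8.0):
    """Boolean contact map: True where two residues are closer than threshold.

    The 8.0 default is an assumed cutoff, not one taken from the project.
    """
    dist = distance_matrix(ca_coords)
    return [[d < threshold for d in row] for row in dist]

# Hypothetical toy chain: four residues on a straight line, 3.8 units apart.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (11.4, 0.0, 0.0),
]
toy_contacts = contact_map(toy_coords)
```

In this toy chain, residues 0 and 2 are 7.6 units apart and count as a contact under the assumed cutoff, while residues 0 and 3 are 11.4 units apart and do not. A contact map predictor tries to recover this boolean matrix from the sequence alone, without the coordinates.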
Our goal is to improve existing machine learning algorithms for predicting a protein contact map from a protein sequence, and to develop a novel algorithm that outperforms existing contact map predictors.
- Honors Thesis – Successfully defended, April 2015
- Short paper and poster – ACM BCB ’14 – ACM International Conference on Bioinformatics, Computational Biology and Biomedicine, Sept 20-23, Newport Beach, CA
- Poster Presentation – Fourth Annual Susquehanna Valley Undergraduate Research Symposium, SVURS 2014, August 5, Geisinger Research, Danville, PA
- Poster – Kalman Research Symposium 2014, March 29, Bucknell University, Lewisburg, PA.
POST GRADUATION UPDATES
Chuqiao successfully defended her honors thesis in April 2015. She is staying a bit longer this summer to help finish and submit a journal publication before she leaves us. She plans to pursue a graduate degree in computer science at Columbia University, starting Fall 2015. Congratulations, Chuqiao!